Prateek Papriwal - Developing Accurate Probability Distribution Functions

Description

This is the report of Prateek Papriwal for the GSOC 2012 on the project "Distribution functions" detailed at:

Contributor - stats

Scilab is free, open-source software for numerical computation that provides a powerful computing environment for engineering and scientific applications. The list of distribution functions currently implemented is very small compared to that of Matlab. My proposal is to add more Matlab-like PDFs, CDFs, inverse CDFs and random number generators, which would extend the functionality of Scilab's distribution functions toolbox.

Deliverables

The distributions currently included in Scilab are Beta, Exponential, Gamma, LogNormal, Normal and Uniform. The structure of the current toolbox comprises the apifun module, the assert module, the content of the help pages and the content of the tests.

The following distribution functions (their PDFs, CDFs, inverse CDFs and random number generators) have been implemented in Matlab but not in Scilab, such as Binomial, Chi-square, Copula, Hypergeometric, Rayleigh, Weibull, Multinomial, Extreme Value, F, Student's t and Geometric.

My primary aim, though, is to implement the Binomial, Geometric, Hypergeometric, Chi-square, Weibull, F and Student's t distributions. These additions are fully independent of one another: the components for each distribution (macros (.sci), unit tests (.tst) and help pages (.xml)) can be implemented separately. Adding these distributions would improve the functionality of the statistics module.

For each distribution function we will have:

-> the probability distribution function (PDF)

-> the cumulative distribution function (CDF)

-> the inverse CDF

-> the random number generator

-> the statistics (mean and variance)
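
As a rough illustration of these five pieces, here is a minimal Python sketch for the geometric distribution. It is not the distfun code: all function names are hypothetical, and it assumes the convention in which x counts the failures before the first success (x = 0, 1, 2, ...) with a success probability 0 < p < 1.

```python
import math
import random


def _check_p(p):
    # Assumed parameter domain for this sketch: 0 < p < 1.
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")


def geo_pdf(x, p):
    """P(X = x): probability of exactly x failures before the first success."""
    _check_p(p)
    if x < 0 or x != math.floor(x):
        return 0.0
    return p * (1.0 - p) ** x


def geo_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)^(floor(x) + 1)."""
    _check_p(p)
    if x < 0:
        return 0.0
    # expm1/log1p avoid the cancellation in 1 - (1 - p)^k when p is small.
    return -math.expm1((math.floor(x) + 1) * math.log1p(-p))


def geo_inv(u, p):
    """Smallest integer x >= 0 such that geo_cdf(x, p) >= u."""
    _check_p(p)
    if u <= 0.0:
        return 0
    if u >= 1.0:
        return math.inf
    x = max(0, math.ceil(math.log1p(-u) / math.log1p(-p) - 1.0))
    # Correct a possible off-by-one caused by floating-point rounding.
    while x > 0 and geo_cdf(x - 1, p) >= u:
        x -= 1
    while geo_cdf(x, p) < u:
        x += 1
    return x


def geo_rnd(p, n, seed=None):
    """Draw n random numbers by inverting the CDF on uniform samples."""
    rng = random.Random(seed)
    return [geo_inv(rng.random(), p) for _ in range(n)]


def geo_stat(p):
    """Return (mean, variance) of the distribution."""
    _check_p(p)
    return (1.0 - p) / p, (1.0 - p) / p ** 2
```

The CDF is written with expm1 and log1p rather than the direct formula because the direct form loses digits when p is close to zero, which matters if the results are to be checked to many significant digits.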

Timeline

I will adopt a "Test Driven Development" strategy for the implementation:

--> Write a draft of the unit tests.

--> Write a draft of the help.

--> Code (macros (.sci), C sources in src)

--> Update the tests accordingly (the accuracy tests will check that the results are accurate to 13 to 15 significant digits)

--> Update the help accordingly

--> Recode

--> Documentation

The above strategy will be followed for implementation of each distribution function.
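
One way to express an accuracy target of "13 to 15 significant digits" is to convert the relative error between a computed value and a reference value into a number of matching digits. The Python sketch below shows the idea; the names computed_digits and check_accuracy are hypothetical and are not taken from the toolbox, and the reference value in the usage example is computed by hand rather than taken from a benchmark file.

```python
import math

# Largest digit count reported: about 15.95 for IEEE double precision.
MAX_DIGITS = -math.log10(2.0 ** -53)


def computed_digits(computed, expected):
    """Approximate number of significant digits shared by two values."""
    if computed == expected:
        return MAX_DIGITS
    if expected == 0.0:
        # The relative error is undefined here; report no correct digit.
        return 0.0
    rel_err = abs(computed - expected) / abs(expected)
    return min(MAX_DIGITS, max(0.0, -math.log10(rel_err)))


def check_accuracy(computed, expected, min_digits=13):
    """Raise AssertionError if fewer than min_digits digits agree."""
    digits = computed_digits(computed, expected)
    if digits < min_digits:
        raise AssertionError(
            "only %.1f significant digits, expected at least %d"
            % (digits, min_digits)
        )


# Usage: P(X = 3) for a geometric distribution with p = 0.25,
# compared with the hand-computed value 0.25 * 0.75^3 = 0.10546875.
check_accuracy(0.25 * 0.75 ** 3, 0.10546875)
```

In a test-driven workflow, a check of this kind would be written first, against reference values from an independent source, and the implementation would then be revised until it passes.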

Apifun module -- The goal of this toolbox is to check the input arguments of macros (.sci). It checks whether the number of input arguments provided by the user is consistent with the number of expected arguments.
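
The kind of check described here can be sketched in a few lines of Python. This only illustrates the idea and is not the apifun API: check_arg_count and geo_mean are hypothetical names introduced for the example.

```python
def check_arg_count(funname, nargs, allowed):
    """Raise TypeError if nargs is not one of the allowed argument counts."""
    if nargs not in allowed:
        expected = " or ".join(str(n) for n in sorted(allowed))
        raise TypeError(
            "%s: wrong number of input arguments: %d given, %s expected"
            % (funname, nargs, expected)
        )


def geo_mean(*args):
    """Mean of a geometric distribution; accepts exactly one argument, p."""
    # Validate the call before touching the arguments.
    check_arg_count("geo_mean", len(args), {1})
    (p,) = args
    return (1.0 - p) / p
```

Centralising the check in one helper keeps the error messages uniform across all macros, so that every function reports a wrong call in the same way.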

Assert module -- The goal of this toolbox is to provide functions that make testing easier. The functions of the assert module are designed to be used in Scilab unit test files (.tst files).

Structure of the toolbox -- The toolbox has the following components: benchmark, demos, doc, etc, help, macros, sci_gateway, src, tests, builder.sce, changelog.txt, license.txt, readme.txt.

Task List

| Week | Tasks | Description | Status | Results |
|---|---|---|---|---|
| 8th-13th May | ATOMS functioning | Got acquainted with several functionalities of ATOMS while installing and loading several modules | Done | Bug [1] found in distfun module |
| | distfun module on the Forge | Went through the coding style of the existing functions in the distfun module | Done | |
| | Binomial distribution | Strengthened my theoretical knowledge of the binomial distribution | Done | |
| 14th-20th May | Version control system | Installed the SVN client TortoiseSVN | Done | |
| | SVN checkout | Checked out the distfun module into my local directory | Done | |
| | Visual Studio | Installed Visual Studio to build the distfun module from the sources; compiled a debug version of the distfun module | Done | Compiling environment established |
| | Linking Scilab to Visual Studio | To enable debugging functionality | Done | |
| 21st-27th May | Patching Bug [1] / Bug 11127 | Corrected the compilation errors in src/cdflib.c, src/gwsupport.c, src/genrand.c, src/unifrng.c | The builder.sce script ran, but loader.sce still generates an error (Bug 11127) | |
| | Geometric distribution implementation | Implemented the geometric probability density, cumulative distribution, mean and variance, and inverse CDF | Done | |
| 28th May-3rd June | Geometric distribution implementation | Geometric random generator; unit tests implemented and their refs created; dataset written in .csv format | Done | All the tests passed |
| | Documentation | Documentation of the geometric distribution functions | Done | |
| 4th-10th June | Addition of LaTeX and examples | Added LaTeX and some more examples to the documentation of the macros | Done | |
| | Help pages | Created .xml files (help pages) with the help of the help_from_sci() function | Done | |
| | Benchmarks | Created the geom.r file with the R software to check the accuracy of the geometric distribution | Done | |
| | Bug 11127 | Bug fixed. The distfun module now runs smoothly on Linux as well | Done | The error in the loader.sce script is also fixed |
| 11th-17th June | Small updates in geometric distribution | Updated the examples and hence the help pages; updated readme.txt as well | Done | |
| | Added new version | Added version 0.4-1 of the distfun module on ATOMS | Done | Link - http://atoms.scilab.org/toolboxes/distfun |
| | Bug 11127 | In version 0.4-1, bug 11127 does not exist. | Done | Bug 11127 solved |
| | Issue 760 | The inverse beta CDF computes a wrong result on Linux 32 bits. | Not done | Issue 760 - http://forge.scilab.org/index.php/p/distfun/issues/760/ |
| 18th-24th June | Binomial distribution | Added macros and corresponding tests of the binomial distribution functions | Done | |
| | Addition of more tests | Added .csv files and the binom.r benchmark for accuracy and compatibility | Done | |
| | Help pages | Added help pages for the binomial distribution macros | Done | |
| ... | Implementing binomial distribution | Will be updated | Not done | |
| ... | Implementing hypergeometric distribution | Will be updated | Not done | |
| ... | Implementing chi-square distribution | Will be updated | Not done | |
| ... | Implementing binomial distribution | Will be updated | Not done | |
| ... | Implementing Student's distribution | Will be updated | Not done | |
| ... | Implementing F distribution | Will be updated | Not done | |

[1] While loading the distfun module in Scilab on 32-bit Linux, an error appears:

-->atomsLoad('distfun');

Start Distfun
    Load macros
    Load gateways
atomsLoad: An error occurred while loading 'distfun-0.2-1':
    link: The shared archive was not loaded: /home/hp/scilab-5.3.3/share/scilab/contrib/distfun/0.2-1/sci_gateway/c//../../src/c/libdistfun_c.so: cannot open shared object file: No such file or directory
 !--error 10000

at line     337 of function atomsLoad called by : 
atomsLoad('distfun');

This error suggests that the libdistfun_c.so file is missing.

The above bug has now been resolved in the new version 0.4 of the distfun module.

Weekly reports

Source Code of Other Implementations

Implementation proposals

Final Report

http://www.google-melange.com/gsoc/project/google/gsoc2012/papriwalprateek/32002



public: Contributor-stats-GSOC2012 (last edited 2012-11-29 12:04:54 by reverse)