Prateek Papriwal - Developing Accurate Probability Distribution Functions
Description
This is the report of Prateek Papriwal for GSoC 2012 on the project "Distribution functions", detailed at:
Scilab is free, open source software for numerical computation that provides a powerful computing environment for engineering and scientific applications. The list of distribution functions currently implemented in Scilab is much smaller than Matlab's. My proposal is to add more Matlab-like PDFs, CDFs, inverse CDFs and random number generators (RNGs), which would extend the functionality of Scilab's distribution functions toolbox.
Deliverables
The distributions currently included in Scilab are Beta, Exponential, Gamma, LogNormal, Normal and Uniform. The current toolbox comprises the apifun module, the assert module, the content of the help pages and the content of the tests.
The following distributions (their PDFs, CDFs, inverse CDFs and RNGs) have been implemented in Matlab but not in Scilab: Binomial, Chi-square, Copula, Hypergeometric, Rayleigh, Weibull, Multinomial, Extreme Value, F, Student's t and Geometric.
My primary aim, however, is to implement the Binomial, Geometric, Hypergeometric, Chi-square, Weibull, F and Student's t distributions. These additions are completely independent of one another: the modules for each distribution (macros (.sci), unit tests (.tst), help pages (.xml)) can be implemented separately. Adding these distribution functions would improve the functionality of the statistics module.
For each distribution we will have:
> the probability distribution function (PDF)
> the cumulative probability function (CDF)
> the inverse CDF
> the random number generator
> the statistics (mean and variance)
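As an illustration of this five-function layout, here is a minimal Python sketch for the geometric distribution. It is not the distfun code: the function names are hypothetical, and it assumes the Matlab convention in which X counts the failures before the first success, so that P(X = x) = p(1 - p)^x for x = 0, 1, 2, ... The sketch also restricts p to 0 < p < 1.

```python
import math
import random


def _check_p(p):
    # Assumption of this sketch: only 0 < p < 1 is handled (p = 1 is left out).
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")


def geo_pdf(x, p):
    """P(X = x) = p * (1 - p)**x for integer x >= 0, else 0."""
    _check_p(p)
    if x < 0 or x != math.floor(x):
        return 0.0
    return p * (1.0 - p) ** x


def geo_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)**(floor(x) + 1) for x >= 0."""
    _check_p(p)
    if x < 0:
        return 0.0
    # log1p/expm1 avoid cancellation when p is very small.
    return -math.expm1((math.floor(x) + 1) * math.log1p(-p))


def geo_inv(u, p):
    """Smallest integer x >= 0 such that geo_cdf(x, p) >= u."""
    _check_p(p)
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must be in [0, 1]")
    if u == 1.0:
        return math.inf
    # Closed-form estimate, then correct any rounding error with the CDF itself.
    x = max(0, math.ceil(math.log1p(-u) / math.log1p(-p) - 1.0))
    while x > 0 and geo_cdf(x - 1, p) >= u:
        x -= 1
    while geo_cdf(x, p) < u:
        x += 1
    return x


def geo_rnd(p, rng=random):
    """Draw one variate by inverting the CDF at a uniform random number."""
    return geo_inv(rng.random(), p)


def geo_stat(p):
    """Return (mean, variance) = ((1 - p) / p, (1 - p) / p**2)."""
    _check_p(p)
    return (1.0 - p) / p, (1.0 - p) / (p * p)
```

Inverting the CDF is only one way to generate variates; it is used here because it reuses the other functions and keeps the sketch short.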
Timeline
I will adopt the strategy of "Test Driven Development" for the implementation:
> Write a draft of the unit tests.
> Write a draft of the help.
> Code (macros (.sci), C sources in src).
> Update the tests accordingly. (The accuracy tests will check that results are accurate to 13 to 15 significant digits.)
> Update the help accordingly.
> Recode.
> Write the documentation.
The above strategy will be followed for implementation of each distribution function.
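The accuracy target of 13 to 15 significant digits can be made concrete with a small helper that estimates how many digits of a computed value agree with a reference value. The Python sketch below is an assumption about how such a check might look, not the code of the assert module; the names computed_digits and check_digits are hypothetical.

```python
import math

# Upper bound on meaningful decimal digits for IEEE double precision.
MAX_DIGITS = -math.log10(2.0 ** -53)


def computed_digits(computed, expected):
    """Estimate the number of correct significant decimal digits."""
    if computed == expected:
        return MAX_DIGITS
    if expected == 0.0:
        # The relative error is undefined here; report no correct digits.
        return 0.0
    rel_err = abs(computed - expected) / abs(expected)
    return min(MAX_DIGITS, max(0.0, -math.log10(rel_err)))


def check_digits(computed, expected, required=13):
    """Raise AssertionError when fewer than `required` digits are correct."""
    digits = computed_digits(computed, expected)
    if digits < required:
        raise AssertionError(
            "only %.1f correct digits, %d required" % (digits, required)
        )
    return digits
```

For example, the geometric PDF at x = 3 with p = 0.5 is exactly 0.0625, so `check_digits(0.5 * 0.5 ** 3, 0.0625)` passes, while `check_digits(1.0001, 1.0)` fails because only about 4 digits agree.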
Apifun Module: The goal of this toolbox is to check the input arguments of macros (.sci). It checks whether the number of input arguments provided by the user is consistent with the number of expected arguments.
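The kind of check described here can be sketched in Python as follows. This only illustrates the idea and is not the apifun API: the helper check_arg_count and the wrapper geo_pdf_checked are hypothetical names.

```python
def check_arg_count(func_name, n_given, n_min, n_max):
    """Raise TypeError unless n_min <= n_given <= n_max."""
    if not n_min <= n_given <= n_max:
        raise TypeError(
            "%s: expected %d to %d input arguments, got %d"
            % (func_name, n_min, n_max, n_given)
        )


def geo_pdf_checked(*args):
    """Geometric PDF that validates its argument count before computing."""
    check_arg_count("geo_pdf_checked", len(args), 2, 2)
    x, p = args
    if not 0.0 < p <= 1.0:
        raise ValueError("geo_pdf_checked: p must satisfy 0 < p <= 1")
    if x < 0 or x != int(x):
        return 0.0
    # Assumed convention: x counts the failures before the first success.
    return p * (1.0 - p) ** x
```

Calling `geo_pdf_checked(2)` fails immediately with a message naming the function and the expected argument count, instead of failing later inside the computation.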
Assert Module: The goal of this toolbox is to provide functions that make testing easier. The functions of the assert module are designed to be used in Scilab unit test files (.tst files).
Structure of Toolbox: The toolbox has the following components: benchmark, demos, doc, etc, help, macros, sci_gateway, src, tests, builder.sce, changelog.txt, license.txt, readme.txt.
Task List
| Week | Tasks | Description | Status | Results |
|---|---|---|---|---|
| 8th-13th May | ATOMS functioning | Got acquainted with several functionalities of ATOMS while installing and loading several modules | Done | Bug [1] found in distfun module |
| | distfun module on Forge | Went through the coding style of the existing functions in the distfun module | Done | |
| | Binomial distribution | Strengthened my theoretical knowledge of the binomial distribution | Done | |
| 14th-20th May | Version control system | Installed the svn client TortoiseSVN | Done | |
| | svn checkout | Checked out the distfun module into my local directory | Done | |
| | Visual Studio | Installed Visual Studio for building the distfun module from the sources; compiled a debug version of the distfun module | Done | Compiling environment established |
| | Linking Scilab to Visual Studio | To enable debugging functionality | Done | |
| 21st-27th May | Patching Bug [1] / Bug 11127 | Corrected the compile errors in src/cdflib.c, src/gwsupport.c, src/genrand.c, src/unifrng.c | | Script builder.sce ran, but loader.sce still generated an error (Bug 11127) |
| | Geometric distribution implementation | Implemented the geometric probability density, cumulative density, mean and variance, and inverse CDF | Done | |
| 28th May-3rd June | Geometric distribution implementation | Geometric random generator and unit tests implemented and their refs created; dataset written in .csv format | Done | All the tests passed |
| | Documentation | Documentation of the geometric distribution functions | Done | |
| 4th-10th June | Addition of LaTeX and examples | Added LaTeX and some more examples to the documentation of the macros | Done | |
| | Help pages | Created .xml files (help pages) with the help_from_sci() function | Done | |
| | Benchmarks | Created the geom.r file with the R software to check the accuracy of the geometric distribution | Done | |
| | Bug 11127 | Bug fixed. The distfun module now runs smoothly on Linux as well | Done | The error in the loader.sce script is also fixed |
| 11th-17th June | Small updates in geometric distribution | Updated the examples and hence the help pages; updated readme.txt as well | Done | |
| | Added new version | Added version 0.41 of the distfun module on ATOMS | Done | |
| | Bug 11127 | Bug 11127 does not exist in version 0.41 | Done | Bug 11127 solved |
| | Issue 760 | The inverse beta CDF computes a wrong result on 32-bit Linux | Not done | Issue 760: http://forge.scilab.org/index.php/p/distfun/issues/760/ |
| 18th-24th June | Binomial distribution | Added macros and corresponding tests of the binomial distribution functions | Done | |
| | Addition of more tests | Added .csv files and the binom.r benchmark for accuracy and compatibility | Done | |
| | Help pages | Added help pages for the binomial distribution macros | Done | |
| ... | ..... | ..... | .... | .... |
| | Implementing binomial distribution | Will be updated | Not done | |
| | Implementing hypergeometric distribution | Will be updated | Not done | |
| | Implementing chi-square distribution | Will be updated | Not done | |
| | Implementing binomial distribution | Will be updated | Not done | |
| | Implementing Student's distribution | Will be updated | Not done | |
| | Implementing F distribution | Will be updated | Not done | |
Links
[1] While loading the distfun module in Scilab on 32-bit Linux, an error pops up:
>atomsLoad('distfun');
Start Distfun
Load macros
Load gateways
atomsLoad: An error occurred while loading 'distfun0.21':
link: The shared archive was not loaded: /home/hp/scilab5.3.3/share/scilab/contrib/distfun/0.21/sci_gateway/c//../../src/c/libdistfun_c.so: cannot open shared object file: No such file or directory
!error 10000
at line 337 of function atomsLoad called by :
atomsLoad('distfun');
This error suggests that the libdistfun_c.so file is missing.
The above bug has now been resolved in the new version 0.4 of the distfun module.
Weekly reports
Source Code of Other Implementations
Implementation proposals
Final Report
http://www.googlemelange.com/gsoc/project/google/gsoc2012/papriwalprateek/32002