Overview of Probabilities and Statistics in Scilab

This page lists resources for probabilities and statistics in Scilab: documents, tutorials and software tools.

Key features in Scilab

Scilab provides the following features:

u = grand(1000,5,"def")

Y=grand(m,n,"nor",av,sd)

[P,Q]=cdfnor("PQ",X,Mean,Std)

Graphics:

Data fitting and parameter identification:

The following plot comes from the scidemo module (http://forge.scilab.org/index.php/p/scidemo). It compares the normal distribution function with normal random numbers generated by the grand function. Moreover, confidence intervals are computed based on numerical integration of the normal distribution function.

demo_normdist.png
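The comparison described above can be sketched with the grand and cdfnor calls listed earlier. This is a minimal illustration, not the scidemo script itself: the mean, standard deviation, sample size and seed are arbitrary values chosen here, and the interval bounds are obtained by inverting the distribution function with cdfnor rather than by numerical integration.

```scilab
// Sketch: compare normal samples from grand with the normal CDF from cdfnor.
av = 0;      // mean (illustrative value)
sd = 1;      // standard deviation (illustrative value)
n = 10000;   // sample size (illustrative value)

grand("setsd", 0);               // fix the seed so that the run is reproducible
x = grand(n, 1, "nor", av, sd);  // n-by-1 vector of normal random numbers

// Theoretical probability P = Pr(X <= 1) and its complement Q = 1 - P
[P, Q] = cdfnor("PQ", 1, av, sd);

// Empirical proportion of samples lower than or equal to 1
Pemp = length(find(x <= 1)) / n;

// Bounds of the central 95% interval, obtained by inverting the CDF
xlow  = cdfnor("X", av, sd, 0.025, 0.975);
xhigh = cdfnor("X", av, sd, 0.975, 0.025);

mprintf("Pr(X<=1): theory = %f, empirical = %f\n", P, Pemp);
mprintf("Central 95%% interval: [%f, %f]\n", xlow, xhigh);

// Histogram of the samples against the normal density
scf();
histplot(50, x);
t = linspace(av - 4*sd, av + 4*sd, 200);
plot(t, exp(-(t - av).^2 / (2*sd^2)) / (sd * sqrt(2*%pi)), "r-");
xtitle("Normal samples and normal density", "x", "Density");
```

The empirical proportion should approach the value returned by cdfnor as the sample size grows.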

Some of these functions are based on the work by Carlos Kliman on Labostat [1].

Scilab makes use of the Open Source Library Dcdflib, by Barry W. Brown, James Lovato, Kathy Russell. Scilab uses the Fortran version of the Dcdflib. The Dcdflib library is known for its accuracy.

Toolboxes

In this section, we describe toolboxes that provide statistical features for Scilab.

We review the following toolboxes:

Stixbox

Stixbox is a statistics toolbox which provides distribution functions, datasets, statistical tests and plotting facilities.

Stixbox is developed on Scilab's Forge:

http://forge.scilab.org/index.php/p/stixbox/

and is available on ATOMS:

http://atoms.scilab.org/toolboxes/stixbox

Features

Low Discrepancy Toolbox

The goal of this toolbox is to provide a collection of low discrepancy sequences. These quasi-random sequences are designed to replace random numbers in Monte-Carlo simulations. For example, when used in numerical integration, low discrepancy sequences give a higher convergence rate than the Monte-Carlo method based on pseudo-random numbers. The toolbox takes into account the dimension of the problem, i.e. it generates vectors of arbitrary size.
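The use of such a sequence in numerical integration can be sketched without the toolbox. The script below is an illustration only: the radical_inverse function is a hand-written helper (not a function of the lowdisc module) which builds a 2D Halton sequence with bases 2 and 3, and the integrand x1*x2 on the unit square (exact integral 0.25) is an arbitrary test case.

```scilab
// Hypothetical helper: radical inverse of the integer n in base b
// (n-th term of the van der Corput sequence).
function r = radical_inverse(n, b)
    r = 0;
    f = 1 / b;
    while n > 0
        r = r + f * modulo(n, b);  // add the current digit, scaled
        n = floor(n / b);          // move to the next digit
        f = f / b;
    end
endfunction

N = 1000;  // number of points (illustrative value)

// 2D Halton points: base 2 for the first coordinate, base 3 for the second
pts = zeros(N, 2);
for i = 1:N
    pts(i, 1) = radical_inverse(i, 2);
    pts(i, 2) = radical_inverse(i, 3);
end
qmc = mean(pts(:,1) .* pts(:,2));  // quasi-Monte-Carlo estimate

// Same estimate with uniform pseudo-random numbers from grand
grand("setsd", 0);
u = grand(N, 2, "def");
mc = mean(u(:,1) .* u(:,2));       // Monte-Carlo estimate

mprintf("exact = 0.25, quasi-Monte-Carlo = %f, Monte-Carlo = %f\n", qmc, mc);
```

For a smooth integrand in low dimension such as this one, the quasi-Monte-Carlo estimate is usually closer to the exact value for the same number of points.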

The low discrepancy toolbox is developed on Scilab's Forge:

http://forge.scilab.org/index.php/p/lowdisc/

and is available on ATOMS:

http://atoms.scilab.org/toolboxes/lowdisc

Overview of sequences :

Main features :

This module currently provides the following functions:

Provides the following functions to extend the maximum dimension of the Halton and Faure sequences :

Provides the following functions to suggest expert settings for the sequences :

This component currently provides the following sequences:

To install it, type :

atomsInstall('lowdisc')

The following example plots the 2D Faure sequence.

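// Create a Faure sequence object and set its dimension to 2
lds = lowdisc_new("fauref");
lds = lowdisc_configure(lds,"-dimension",2);
lds = lowdisc_startup(lds);
// Generate 100 points (one point per row), then free the object
[lds,computed] = lowdisc_next(lds,100);
lds = lowdisc_destroy(lds);
// Plot the second coordinate against the first one
plot(computed(:,1),computed(:,2),"bo");
xtitle("Faure sequence","X1","X2");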

This produces the following figure.

lowdisc_faure.svg

This module was first reviewed on 17 June 2010: Low Discrepancy Sequences

NIST Dataset

The goal of this toolbox is to provide a collection of datasets distributed by NIST.

The NIST Statistical Reference Datasets (StRD) project is a collection of datasets. Its purpose is to improve the accuracy of statistical software by providing reference datasets with certified computational results, which enable the objective evaluation of statistical software.

The following is a list of functions in this toolbox.

Moreover, the module provides 34 datasets from NIST in the following categories:

Datasets from other categories are provided on the NIST website, which cannot be read by the current toolbox. However, it should be straightforward to extend the current toolbox to read the other categories of files.

The reference website for this project is the Statistical Reference Datasets:

http://www.itl.nist.gov/div898/strd/

The nistdataset toolbox is developed on Scilab's Forge:

http://forge.scilab.org/index.php/p/nistdataset/

and is available on ATOMS:

http://atoms.scilab.org/toolboxes/nistdataset

To install it, type:

atomsInstall('nistdataset')

The nistdataset_read function reads a dataset from NIST. In the following example, we read the Gauss2 dataset in Scilab.

path = nistdataset_getpath();
filename = fullfile(path,"datasets","nls","lower","Gauss2.dat");
data = nistdataset_read(filename)

The previous script produces the following output:

-->data = nistdataset_read(filename)
 data  =
 
NISTDTST Object:
===========

name: Gauss2
category: Nonlinear Least Squares Regression
description:
         The data are two slightly-blended Gaussians on a
         decaying exponential baseline plus normally
         distributed zero-mean noise with variance = 6.25.
reference:
         Rust, B., NIST (1996).
datastring:
         1 Response  (y)
         1 Predictor (x)
         250 Observations
         Lower Level of Difficulty
         Generated Data
model:
         Exponential Class
         8 Parameters (b1 to b8)
         
residualSumOfSquares: 1247.5282
residualStandardDev: 2.270479
degreeFreedom: 242
numberOfObservations: 250
x: 250-by-1 constant matrix
y: 250-by-1 constant matrix
start1: 8-by-1 constant matrix
start2: 8-by-1 constant matrix
parameter: 8-by-1 constant matrix
standarddeviation: 8-by-1 constant matrix
sampleMean: []
sampleSTD: []
sampleAutocorr: []

From there, it is easy to access the x and y fields of the data structure:

-->size(data.x)
 ans  =
    250.    1.  
-->data.x
 ans  =
    1.    
    2.    
    3.    
    4.    
    5.    
[...]
-->size(data.y)
 ans  =
    250.    1.  
-->data.y
 ans  =
    97.587761  
    97.763443  
    96.567047  
    92.52037   
    91.15097   
    95.217278  
[...]

For example, the following script plots the observations:

scf();
plot(data.x,data.y,"bo")

The previous script produces the following figure:

NISTdataset-Gauss2plot.png
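The certified results stored in the data structure can be used to check a computation. The following sketch evaluates the Gauss2 model (an exponential baseline plus two Gaussian peaks, as defined by NIST for this dataset) and compares the residual sum of squares with the stored value. It assumes that the nistdataset toolbox is installed and that the parameter field holds the certified values of b1 to b8; this is an interpretation of the field name, not something stated on this page.

```scilab
// Read the Gauss2 dataset, as in the example above
path = nistdataset_getpath();
filename = fullfile(path,"datasets","nls","lower","Gauss2.dat");
data = nistdataset_read(filename);

b = data.parameter;  // assumed: certified values of b1..b8
x = data.x;

// Gauss2 model: decaying exponential baseline plus two Gaussian peaks
baseline = b(1) * exp(-b(2) * x);
peak1 = b(3) * exp(-(x - b(4)).^2 / b(5)^2);
peak2 = b(6) * exp(-(x - b(7)).^2 / b(8)^2);
yfit = baseline + peak1 + peak2;

// Residual sum of squares, to be compared with the stored value
rss = sum((data.y - yfit).^2);
mprintf("computed RSS = %f, stored RSS = %f\n", rss, data.residualSumOfSquares);

// Observations and model on the same figure
scf();
plot(x, data.y, "bo");
plot(x, yfit, "r-");
xtitle("Gauss2: data and model", "x", "y");
```

If the assumption on the parameter field holds, the two sums of squares should agree to several significant digits.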

Regression Tools

The regtools toolbox provides three functions for performing linear and nonlinear regression analysis.

The regtools module provides the following functions:

This module is developed by Torbjorn Pettersen.

It is available on ATOMS:

http://atoms.scilab.org/toolboxes/regtools

To install it, type:

atomsInstall('regtools')

The following plot is a demo of the Regression toolbox.

regtools_plot2.png
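As a point of comparison, a linear regression can be sketched with Scilab's backslash operator alone. The script below does not use the regtools functions; the data are synthetic, and the coefficients, noise level and seed are arbitrary values chosen for the illustration.

```scilab
// Synthetic data: y = 2 + 3*x plus small normal noise (illustrative values)
grand("setsd", 0);
x = linspace(0, 1, 50)';
y = 2 + 3 * x + grand(50, 1, "nor", 0, 0.1);

// Least squares fit of y = coef(1) + coef(2)*x
A = [ones(x), x];  // design matrix: intercept column and x column
coef = A \ y;      // backslash solves the overdetermined system in the least squares sense

// Fitted values, residuals and coefficient of determination
yfit = A * coef;
r = y - yfit;
R2 = 1 - sum(r.^2) / sum((y - mean(y)).^2);
mprintf("intercept = %f, slope = %f, R2 = %f\n", coef(1), coef(2), R2);

// Data and fitted line
scf();
plot(x, y, "bo");
plot(x, yfit, "r-");
xtitle("Linear least squares fit", "x", "y");
```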

A more complete description of this module is available at:

Regression Tools

NaN-Toolbox

This toolbox is for classification and statistics. It is especially written for data with missing values encoded as NaN. It is a Scilab port of the NaN-toolbox for MATLAB/Octave.
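The need for such routines can be illustrated in plain Scilab, without the toolbox: a single NaN propagates through an ordinary statistic, so the missing values must be filtered out first. The data below are arbitrary values chosen for the illustration.

```scilab
// Data with two missing values encoded as NaN (illustrative values)
x = [1.2; 3.4; %nan; 5.6; %nan; 7.8];

// The ordinary mean is contaminated by the missing values
mall = mean(x);   // evaluates to NaN

// Keep only the entries which are not NaN
ok = ~isnan(x);
nmissing = length(find(isnan(x)));  // number of missing values
m = mean(x(ok));                    // mean of the observed values

mprintf("missing values = %d, mean of observed values = %f\n", nmissing, m);
```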

This toolbox is developed by Holger Nahrstaedt under GPL (2.1).

The classification routines include routines for training a classifier (nan_train_sc, nan_classify, svmtrain, train), routines for testing (nan_test_sc, predict, svmpredict) and routines for visualisation (nan_confusionmat, nan_partest, nan_rocplot).

The NaN toolbox provides the following functions.

The NaN toolbox is available on ATOMS:

http://atoms.scilab.org/toolboxes/nan

To install it, type:

atomsInstall('nan')

The NISP toolbox

This module performs sensitivity analysis, that is, the analysis of the uncertainty in the output of a given model as a function of the uncertainty in its inputs.

The analysis is based on polynomial chaos: the original model is approximated by an expansion on orthogonal polynomials. Once the coefficients of the polynomial chaos expansion are computed, the associated sensitivity indices are straightforward to obtain.

The module provides the following components :

The current toolbox provides an object-oriented interface to the C++ NISP library.

The following list presents the features provided by the NISP toolbox :

log-normal,

Sobol, Latin Hypercube Sampling, various samplings based on Smolyak.

variance, quantile, correlation, etc... Generate the C source code which computes the output of the polynomial chaos expansion.

We additionally provide the nisp_sobolsa function, which implements the Sobol method for sensitivity analysis. It computes the first order sensitivity indices, the total sensitivity indices and all the sensitivity indices.
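The idea behind first order Sobol indices can be sketched in plain Scilab with a Monte-Carlo estimator. The script below does not call nisp_sobolsa, whose calling sequence is not given on this page. It uses a common pick-and-freeze estimator on the Ishigami function with the usual constants a = 7 and b = 0.1 and inputs uniform in [-pi, pi]; the sample size and the seed are arbitrary choices.

```scilab
// Ishigami function; each row of x is one point (x1, x2, x3)
function y = ishigami(x)
    a = 7;
    b = 0.1;
    y = sin(x(:,1)) + a * sin(x(:,2)).^2 + b * x(:,3).^4 .* sin(x(:,1));
endfunction

grand("setsd", 0);
n = 20000;  // sample size (illustrative value)

// Two independent samples of the three inputs, uniform in [-pi, pi]
A = grand(n, 3, "unf", -%pi, %pi);
B = grand(n, 3, "unf", -%pi, %pi);
yA = ishigami(A);
yB = ishigami(B);
v = variance([yA; yB]);  // estimate of the total variance of the output

// First order indices: column i of A is replaced by column i of B
S = zeros(1, 3);
for i = 1:3
    ABi = A;
    ABi(:, i) = B(:, i);
    S(i) = mean(yB .* (ishigami(ABi) - yA)) / v;
    mprintf("S%d = %f\n", i, S(i));
end
```

For these constants, the analytical first order indices are approximately 0.31, 0.44 and 0, so the estimates should be close to these values up to sampling error.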

This module is developed by Michael Baudin (Digiteo) and Jean-Marc Martinez (CEA). The module is provided under the LGPL licence.

The NISP toolbox is developed on Scilab's Forge:

http://forge.scilab.org/index.php/p/nisp/

The NISP toolbox is provided on ATOMS:

http://atoms.scilab.org/toolboxes/NISP

The following figure is the histogram of the output of the Ishigami function, a classical benchmark in sensitivity analysis.

ishigami-histogram.svg

More details on this module are provided on the wiki:

NISP Module

The libsvm toolbox

The libsvm toolbox provides a simple interface to LIBSVM, a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm). It is very easy to use as the usage and the way of specifying parameters are the same as that of LIBSVM.

This toolbox also provides a simple interface to LIBLINEAR, a library for large-scale regularized linear classification (http://www.csie.ntu.edu.tw/~cjlin/liblinear). It is very easy to use as the usage and the way of specifying parameters are the same as that of LIBLINEAR.

The libsvm v1.2.2 toolbox is an update of the libsvm v1.0 toolbox, first distributed on 7 November 2011.

This interface was initially written by Jun-Cheng Chen, Kuan-Jen Peng, Chih-Yuan Yang and Chih-Huai Cheng from the Department of Computer Science, National Taiwan University.

It was converted to Scilab 5.3 by Holger Nahrstaedt from TU Berlin.

This Toolbox is compatible with the NaN-toolbox.

The libsvm toolbox provides the following functions :

The libsvm toolbox is provided under the BSD license.

The toolbox provides the following demos :

The libsvm toolbox is provided on ATOMS :

http://atoms.scilab.org/toolboxes/libsvm

and is developed on Scilab's Forge:

http://forge.scilab.org/index.php/p/libsvm/

To install this toolbox, we type :

atomsInstall('libsvm')

and restart Scilab.

The "linear_demo" demo produces the following figure.

libsvm-lineardemo.png

This review was first published at :

6th of March 2012: libsvm, v1.2.2

Other Toolboxes

There are other significant toolboxes for Scilab which are relevant to this field. There are two toolboxes for Neural Networks:

More toolboxes are available in the Statistics category of ATOMS :

http://atoms.scilab.org/categories/data_analysis_and_statistics

Other modules are available in the former Toolbox Center:

http://www.scilab.org/contrib/index_contrib.php?page=download&category=DATA%20ANALYSIS%20AND%20STATISTICS

For example, a boxplot toolbox is available at:

Documents and tutorials

In this section, we present documents, tutorials and books which show practical uses of Scilab in probabilities and statistics.

In English

In French

Bibliography


2022-09-08 09:27