# Statistical Visualization in Scilab

## Introduction

Scilab provides a few statistical visualization functions, including:

• princomp — Principal components analysis
• show_pca — Visualization of principal components analysis results

Several existing toolboxes provide statistical visualization features, including Stixbox, Nan-Toolbox and Distfun.

The problem is that:

• most functions are not compatible with Matlab
• most functions have fewer features than their Matlab counterparts
• most functions have fewer tests than required
• most functions have fewer help pages than required
• some functions are duplicated in several toolboxes: this splits the development effort into several pieces, instead of focusing it on a small set of high-quality functions

This conclusion was shared by several toolbox authors, including:

• Michael Baudin, author of Distfun and contributor to Stixbox
• Holger Nahrstaedt, author of Nan-Toolbox
• Torbjørn Pettersen, author of regtools

This led us to write up our "Ideal" statistics module.

The set of statistical visualization functions that we arrived at is listed below.

## Proposal

We think that this is a fun project for a GSoC student, and that the result would be extremely useful for engineering and research purposes.

Here is the list of functions that we suggest developing.

• statvis_identify : Identify points on a plot by clicking with the mouse (draft from Stixbox)
• statvis_plotsym : Plot with symbols (draft from Stixbox)
• statvis_qqnorm : Normal probability paper (draft from Stixbox)
• statvis_qqplot : Plot empirical quantiles against empirical quantiles (draft from Stixbox, from Nan-Toolbox)
• statvis_boxplot : Draw a box-and-whiskers plot for data provided as column vectors (draft from Stixbox)
• statvis_cdfplot : Plot the empirical cumulative distribution function - http://www.mathworks.fr/fr/help/stats/cdfplot.html (draft from Stixbox, and nan_cdfplot from Nan-Toolbox)
• statvis_normplot : Produce a normal probability plot for each column of X (draft from Stixbox)
• statvis_plotmatrix : Scatter plot matrix - http://www.mathworks.fr/help/techdoc/ref/plotmatrix.html (draft from Stixbox, from Nan-Toolbox)
• statvis_gscatter : http://www.mathworks.fr/fr/help/stats/gscatter.html (draft = nan_gscatter from Nan-Toolbox)
• statvis_andrewsplot
• statvis_hist : http://www.mathworks.fr/fr/help/matlab/ref/hist.html (draft = histo from Stixbox, and nan_hist from Nan-Toolbox)
• statvis_ecdfhist
• statvis_fscatter3
• statvis_gplotmatrix
• statvis_parallelcoords
• statvis_errorb
• statvis_errorbar
• statvis_nhist
• statvis_bubblechart : Plot a bubble chart
• statvis_bubblematrix : Plot a bubble chart matrix
• statvis_inthisto : Discrete histogram (draft = distfun_inthisto from Distfun)
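
To make a few of these plots concrete, here is a minimal sketch of the numbers that an empirical CDF plot, a normal QQ-plot and a box plot would draw. It is written in Python with only the standard library, purely as an illustration. The helper names (`ecdf_points`, `qqnorm_points`, `box_stats`), the plotting positions (i - 0.5)/n and the quartile rule are assumptions, not the conventions of Stixbox, Nan-Toolbox, Matlab or the proposed statvis functions.

```python
from statistics import NormalDist, median


def ecdf_points(data):
    """Return (x, F) pairs of the empirical CDF, F(x_(i)) = i / n on sorted data."""
    xs = sorted(data)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def qqnorm_points(data):
    """Return (theoretical, sample) quantile pairs for a normal QQ-plot.

    Assumption: plotting positions p_i = (i - 0.5) / n for i = 1..n;
    other conventions exist and give slightly different abscissas.
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # enumerate() is 0-based, so (i + 0.5) / n equals (rank - 0.5) / n.
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]


def box_stats(data):
    """Return (min, q1, median, q3, max) for at least two values.

    Assumption: quartiles are the medians of the lower and upper halves,
    and the overall median is left out of both halves when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower, upper = xs[:half], xs[n - half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0]
print(ecdf_points(sample))    # steps of the empirical CDF
print(qqnorm_points(sample))  # points to plot against the line of a normal fit
print(box_stats(sample))      # the five numbers behind the box and whiskers
```

Each function returns plain coordinates, so the drawing step stays separate from the statistics. That separation is one possible way to keep the computations unit-testable without a graphics window.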

## Examples

These are some examples of statistical graphics.

[Figure: a bubble chart]

[Figure: a matrix of scatter plots]

[Figure: a matrix of QQ-plots]

## Steps

In this section, we gather the steps required to achieve this goal.

• Identify the existing functions in the various Scilab toolboxes.
• Identify the existing functions in Matlab.
• Clarify the required functions in the new "statviz" toolbox: see which functions are
  • required,
  • optional,
  • unnecessary.
• Set priorities for the functions:
  • high,
  • medium,
  • low.
• Create the "statviz" project on Scilab Forge.
• Create a draft of 6 high-priority functions, with
  • Matlab compatibility,
  • argument checking,
  • unit tests,
  • examples,
  • argument description.
• Create a tutorial help page in XML showing how to get started quickly with these functions.
• Create an XML help page with a gallery of graphics.
• Release v0.1 on ATOMS.
• Increase the set of functions from 6 to 12.
• Remove the duplicated functions from Stixbox, Nan-Toolbox, Distfun and other toolboxes.
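
The checklist for each draft function (argument checking, unit tests, examples, argument description) can be sketched on a toy discrete histogram. This is an illustration in Python, not Scilab code: `int_histogram` and `run_unit_tests` are hypothetical names, and the behavior shown is an assumption rather than a port of distfun_inthisto.

```python
from collections import Counter


def int_histogram(data):
    """Count the occurrences of each integer value in data.

    Arguments
    ---------
    data : non-empty list or tuple of integers
        Floats holding integer values (such as 2.0) are accepted.

    Returns
    -------
    (values, counts) : two lists of equal length. values covers every
    integer from min(data) to max(data), so empty classes get a count of 0.

    Example
    -------
    values, counts = int_histogram([1, 3, 3])
    """
    # Argument checking: fail early with an explicit message.
    if not isinstance(data, (list, tuple)):
        raise TypeError("data must be a list or a tuple")
    if len(data) == 0:
        raise ValueError("data must not be empty")
    for x in data:
        is_number = isinstance(x, (int, float)) and not isinstance(x, bool)
        if not is_number or not float(x).is_integer():
            raise ValueError("data must contain integer values, got %r" % (x,))

    counted = Counter(int(x) for x in data)
    values = list(range(min(counted), max(counted) + 1))
    counts = [counted[v] for v in values]  # a Counter returns 0 for missing keys
    return values, counts


def run_unit_tests():
    """Minimal unit tests: nominal case, integer-valued floats, invalid input."""
    assert int_histogram([1, 3, 3]) == ([1, 2, 3], [1, 0, 2])
    assert int_histogram([2.0, 2, 2]) == ([2], [3])
    for bad in ([], [1.5], "abc"):
        try:
            int_histogram(bad)
        except (TypeError, ValueError):
            continue
        raise AssertionError("expected an error for %r" % (bad,))
    return True


print(run_unit_tests())
```

In a Scilab toolbox the same four items would presumably live in the function body, a unit test file and the XML help page. The sketch only shows how little code is needed to cover them for a single function.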

public: Contributor - statvis (last edited 2013-04-11 19:27:44 by michael.baudin@contrib.scilab.org)