In this report, we present the functions included in the DoE_beta toolbox by Yann Collette. The functions cover random number generation, factorial designs, response surface methodology (RSM) and the computation of statistical values.

Random Number Generation

Result = doe_prbs(init,feedback)

This function generates a pseudorandom binary signal.

lhs_matrix = doe_lhs(nb_dims, x_min, x_max, nb_div, nb_iterations, nb_points, random)

This function computes a Latin hypercube sample.
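
The idea behind Latin hypercube sampling can be sketched in a few lines of Python. This is only an illustration of the technique, not the toolbox code: the name latin_hypercube and its arguments are hypothetical, and the nb_div, nb_iterations and random arguments of doe_lhs are not modelled because their meaning is not described here. Each dimension is cut into nb_points equal strata and every stratum is used exactly once.

```python
import random

def latin_hypercube(nb_dims, x_min, x_max, nb_points, rng=None):
    """Return nb_points samples in [x_min, x_max]^nb_dims, one per stratum in every dimension."""
    rng = rng or random.Random()
    columns = []
    for _ in range(nb_dims):
        # One random value inside each of the nb_points equal-width strata of [0, 1).
        strata = [(k + rng.random()) / nb_points for k in range(nb_points)]
        # Shuffle so that the strata are paired at random across dimensions.
        rng.shuffle(strata)
        columns.append([x_min + u * (x_max - x_min) for u in strata])
    # Transpose the columns: one row per sample point.
    return [list(point) for point in zip(*columns)]

sample = latin_hypercube(nb_dims=2, x_min=0.0, x_max=1.0, nb_points=5, rng=random.Random(0))
```

Projecting the sample onto any single axis gives exactly one point in each of the five strata, which is the defining property of the design.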

Quasi-random sequences

Quasi-random sequences are less random but more evenly distributed than pseudo-random sequences: the points are generated deterministically and are highly correlated with one another, so that they fill the space uniformly. Depending on how the points are generated, a quasi-random sequence can be a Hammersley, Halton, Faure or Sobol sequence.

r = doe_hammersley(dim_num, n, step, seed, leap, base)

This function computes a Hammersley data set.

The function works when only the first two parameters, dim_num and n, are supplied; the remaining parameters (step, seed, leap and base) are optional.

r = doe_halton(dim_num,n,step,seed,leap,base)

This function computes a Halton point set.
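
A Halton point set is built from the radical inverse function, which mirrors the digits of an integer index about the radix point in a given base. The Python sketch below illustrates this construction under simple assumptions: the bases are the first prime numbers, the indices start at 1, and the step, seed and leap arguments of doe_halton are not modelled. The function names are hypothetical.

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    value, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * scale
        scale /= base
    return value

def halton(dim_num, n, bases=(2, 3, 5, 7, 11, 13)):
    """First n Halton points in dim_num dimensions (indices 1..n, one prime base per dimension)."""
    if dim_num > len(bases):
        raise ValueError("not enough bases for the requested dimension")
    return [[radical_inverse(i, bases[d]) for d in range(dim_num)]
            for i in range(1, n + 1)]

points = halton(dim_num=2, n=4)
```

In base 2 the first coordinates are 1/2, 1/4, 3/4, 1/8: each new point falls in the largest gap left by the previous ones.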

r = doe_sobol(n)

This function computes a Sobol data set.

The user first initialises the function by calling doe_sobol(-1) and then repeatedly calls doe_sobol(n).

Factorial Design

Factorial designs were analysed in the 2012-05-06 report of DoE.

r = doe_factorial(nb_var)

Yates Algorithm

Yates' algorithm estimates the effects in factorial designs. In this toolbox it is implemented by doe_yates.sci.

[ef,id] = doe_yates(y, sort_eff)
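
As an illustration of the algorithm itself (not of doe_yates, whose sort_eff argument and id output are not reproduced), the following Python sketch applies the classical sum-and-difference passes to the responses of a two-level full factorial design. It assumes that the responses are given in standard (Yates) order, for example (1), a, b, ab for two factors; the function name is hypothetical.

```python
def yates(y):
    """Return [mean, effects...] in standard order for 2**k responses given in standard order."""
    n = len(y)
    k = n.bit_length() - 1
    if n == 0 or 2 ** k != n:
        raise ValueError("the number of responses must be a power of two")
    column = list(y)
    for _ in range(k):
        # One pass: pairwise sums on top, pairwise differences below.
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    # The first entry is the grand total; the others are contrasts.
    return [column[0] / n] + [c / (n / 2) for c in column[1:]]

effects = yates([20, 30, 40, 52])  # responses for (1), a, b, ab
```

For this 2^2 example the result is the mean 35.5 followed by the effects A = 11, B = 21 and AB = 1, which match the differences between the average responses at the high and low levels.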

The reverse Yates algorithm estimates the response given the effects.

[y,id] = doe_ryates(ef)

Hadamard matrix

A Hadamard matrix is a square matrix whose entries are +1 or -1 and whose rows are mutually orthogonal vectors.

H = hadamard(n)
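
The orthogonality property can be checked on a small example. The Python sketch below uses the Sylvester construction, which only covers orders that are powers of two; it is an assumption made for illustration, and hadamard(n) in the toolbox may use a different construction and accept other orders.

```python
def sylvester_hadamard(n):
    """Hadamard matrix of order n (n must be a power of two), built by Sylvester doubling."""
    if n < 1 or n & (n - 1):
        raise ValueError("this sketch only supports orders that are powers of two")
    h = [[1]]
    while len(h) < n:
        # Doubling step: [[H, H], [H, -H]].
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

H = sylvester_hadamard(4)
# Dot product of every pair of rows: n on the diagonal, 0 elsewhere.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in H] for r1 in H]
```

Here gram equals 4 times the identity matrix, which is another way of stating that the rows are mutually orthogonal.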

Response Surface Methodology

In the DoE_beta toolbox, there are functions that produce Box-Behnken and central composite designs.

H = doe_box_benkhen(nb_var,nb_center)

H = doe_composite(nb_var,alpha)

H = doe_star(nb_var)

This function outputs a (2*nb_var)-by-nb_var matrix formed by stacking two nb_var-by-nb_var matrices: one with +1's on the diagonal and one with -1's on the diagonal.
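
A minimal Python sketch of such a star (axial) design is given below. The off-diagonal entries are taken to be zero, as in a standard star design, and the +1 block is placed first; the row order actually returned by doe_star is an assumption, and the function name is hypothetical.

```python
def star_design(nb_var):
    """Stack a +1 diagonal block on top of a -1 diagonal block (2*nb_var rows, nb_var columns)."""
    plus = [[1 if i == j else 0 for j in range(nb_var)] for i in range(nb_var)]
    minus = [[-v for v in row] for row in plus]
    return plus + minus

H = star_design(2)
```

For two variables this gives the four axial points (1, 0), (0, 1), (-1, 0) and (0, -1).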

Computer-aided designs

Computer-aided designs are experimental designs that are generated based on a particular optimality criterion and are generally optimal only for a specified model. The most used criteria are the following:

The A-optimality criterion is implemented by [M_doe,history] = doe_a_opti(M_init, M_cand, doe_size, model, l_bounds, u_bounds, ItMx, p_level, Log, size_tabu_list) and comp_a_opti_crit.sci.

Input Parameters (the input parameters are the same for all the optimal functions)

[M_doe,history] = doe_d_opti(M_init, M_cand, doe_size, model, l_bounds, u_bounds, ItMX, p_level, Log, size_tabu_list)


[M_doe,history] = doe_g_opti(M_init, M_cand, doe_size, model, l_bounds, u_bounds, ItMX, p_level, Log, size_tabu_list)


[M_doe,history] = doe_o_opti(M_init, M_cand, doe_size, model, l_bounds, u_bounds, ItMX, p_level, Log, size_tabu_list)


Super Saturated experiments

When the number of factors exceeds the number of runs, the design is called super saturated. Such designs can be computed by applying the A-optimal, D-optimal, correlation and khi2 (chi-squared) criteria. The following functions implement such designs:

Result = comp_ssd_a_value_crit(M_doe,model)

This function uses the a-optimal criterion mentioned above for the computation of a super saturated design.

These inputs are the same in all ssd functions.

Result = comp_ssd_ave_khi2_crit(M_doe,Model) (Needs more information)

Result = comp_ssd_max_khi2_crit(M_doe,Model) (Needs more information)

Result = comp_ssd_r_value_crit(M_doe,Model) (Needs more information)

This function uses the correlation criterion to compute such a design.

Computing Statistical Values

result = doe_test_mean(x, y, level, operation)

It tests whether the means of two samples x and y are equal.

result = doe_test_var(x, y, level, operation)

It tests whether the variances of two samples x and y are equal.

Input and output parameters are the same as in doe_test_mean().

result = doe_test_significance(param_mean,param_var,size_stat,val_to_comp,level,operation)

retval = skewness(x)

This function measures skewness, the asymmetry of a probability distribution. If the skewness is positive, the mass of the distribution is concentrated on the left and the longer tail is on the right; if it is negative, the mass is concentrated on the right and the longer tail is on the left; a symmetric distribution has zero skewness.

retval = kurtosis(x)

This function measures kurtosis, the degree of peakedness of a distribution.
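
Both shape measures can be written in terms of central moments. The Python sketch below assumes population (biased) moments and a kurtosis that is not reduced by 3; the toolbox functions skewness and kurtosis may use a different normalisation, and the names used here are hypothetical.

```python
def central_moment(x, order):
    """Population central moment of the given order."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)

def sample_skewness(x):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def sample_kurtosis(x):
    """Fourth central moment divided by the squared variance (equals 3 for a normal distribution)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2

right_tailed = sample_skewness([1, 1, 1, 2, 10])   # mass on the left, long right tail
symmetric = sample_skewness([1, 2, 3, 4, 5])       # symmetric data
flat = sample_kurtosis([1, 2, 3, 4, 5])            # evenly spread data
peaked = sample_kurtosis([0, 0, 0, 0, 0, 0, 0, 0, -5, 5])  # sharp peak with two outliers
```

The right-tailed data give a positive skewness, the symmetric data give zero, and the peaked data give a larger kurtosis than the evenly spread data.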


The following functions compute the difference between the empirical cumulative distribution function of a design and the uniform cumulative distribution function, using the Centered-L2 and Wrap-around-L2 discrepancy criteria.

Result = comp_CL2_crit(Data)
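
For reference, the centered L2-discrepancy has a closed-form expression (due to Hickernell) that can be evaluated directly. The Python sketch below computes the squared discrepancy of a set of points assumed to lie in the unit hypercube; whether comp_CL2_crit returns the squared value or its square root, and how it scales its input, is not stated here, so this is an illustration of the criterion rather than of the toolbox function.

```python
def centered_l2_discrepancy(points):
    """Squared centered L2-discrepancy of points in [0, 1]^d."""
    n, d = len(points), len(points[0])
    term1 = (13.0 / 12.0) ** d
    # Single sum over the points.
    term2 = 0.0
    for x in points:
        prod = 1.0
        for xk in x:
            a = abs(xk - 0.5)
            prod *= 1.0 + 0.5 * a - 0.5 * a * a
        term2 += prod
    # Double sum over all pairs of points.
    term3 = 0.0
    for x in points:
        for z in points:
            prod = 1.0
            for xk, zk in zip(x, z):
                prod *= (1.0 + 0.5 * abs(xk - 0.5) + 0.5 * abs(zk - 0.5)
                         - 0.5 * abs(xk - zk))
            term3 += prod
    return term1 - 2.0 * term2 / n + term3 / (n * n)

spread = centered_l2_discrepancy([[0.25], [0.75]])
clustered = centered_l2_discrepancy([[0.1], [0.1]])
```

A lower value indicates a more uniform design: the two well-spread points give a smaller discrepancy than the two coincident points.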

Result = comp_WD2_crit(M_doe,Model)

Input Parameters

Data sets

X_norm = normalize(X_in,replace)

This function normalises a given data set.

X_std = standardize(X_in)

This function normalises and centers the given data set.

H = doe_scramble(H1,N)

This function scrambles a given design of experiments.

H = doe_union(H1,H2)

H = doe_merge(H1,H2)

These functions merge two given designs of experiments, H1 and H2.

H = doe_diff(H1,H2)

This function outputs a vector H containing the common points between two data sets H1 and H2.

[s_opt,b_opt,res_mean,res_std] = crossvalidate(fun,K,steps,X,y,varargin)

Cross-validation estimates how accurately a model will perform in practice. All observations are used for both training and validation, and each observation is used for validation exactly once.
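
The splitting scheme described above is K-fold cross-validation. The Python sketch below shows one way to build the folds and to score a deliberately trivial model (the mean of the training responses); it does not reproduce the fun, steps or varargin arguments of crossvalidate, and all names are hypothetical.

```python
def k_fold_indices(n_obs, K):
    """Yield (train, validation) index lists; every index is validated exactly once."""
    folds = [list(range(k, n_obs, K)) for k in range(K)]
    for k in range(K):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]

def cross_validate_mean_model(y, K):
    """Mean squared validation error of a constant (mean) model over K folds."""
    errors = []
    for train, valid in k_fold_indices(len(y), K):
        prediction = sum(y[i] for i in train) / len(train)
        errors.extend((y[i] - prediction) ** 2 for i in valid)
    return sum(errors) / len(errors)

splits = list(k_fold_indices(6, 3))
score = cross_validate_mean_model([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
```

With 6 observations and K = 3, each fold holds 2 validation indices and the remaining 4 are used for training.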

H_unnorm = unnorm_doe_matrix(H,min_levels,max_levels)

This function translates a design of experiments containing +1's and -1's to maximum and minimum levels.
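
Assuming a linear mapping in which -1 corresponds to the minimum level and +1 to the maximum level of each variable, the conversion can be sketched in Python as follows. The function name is hypothetical and the behaviour of unnorm_doe_matrix for values other than +1 and -1 is not described here.

```python
def unnorm_design(H, min_levels, max_levels):
    """Map coded values in [-1, 1] to physical levels, column by column."""
    return [[lo + (h + 1) / 2 * (hi - lo)
             for h, lo, hi in zip(row, min_levels, max_levels)]
            for row in H]

# Two variables: the first ranges over [10, 20], the second over [0, 5].
H_unnorm = unnorm_design([[-1, 1], [1, -1]], min_levels=[10, 0], max_levels=[20, 5])
```

The coded run (-1, +1) becomes (10, 5) and the coded run (+1, -1) becomes (20, 0).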


R = build_regression_matrix(H,model,build)

This function computes the regression matrix of a given model.
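
A regression matrix has one row per experiment and one column per monomial of the model. The Python sketch below illustrates this with a model encoded as a list of exponent tuples; this encoding is an assumption made for the example and is not necessarily the model format expected by build_regression_matrix, and the build argument is not modelled.

```python
def regression_matrix(H, model):
    """Evaluate every monomial of `model` (a list of exponent tuples) at every design point of H."""
    R = []
    for point in H:
        row = []
        for exponents in model:
            term = 1.0
            for value, power in zip(point, exponents):
                term *= value ** power
            row.append(term)
        R.append(row)
    return R

# 2^2 factorial design and the model 1 + x1 + x2 + x1*x2.
design = [[-1, -1], [1, -1], [-1, 1], [1, 1]]
model = [(0, 0), (1, 0), (0, 1), (1, 1)]
R = regression_matrix(design, model)
```

Each row of R contains the values of 1, x1, x2 and x1*x2 for the corresponding experiment.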

var = var_regression_matrix(H,x,model,sigma)

This function computes the variance of a given model.

model = doe_poly_model(mod_type,nb_var,order)

This function produces a list of monomials that represent a polynomial model.

[model_new,coeff_new] = doe_model_bselect(nb_var,model_old,measures,Log)

This function removes unnecessary monomials from an input model and selects the best subset.

[model_new,coeff_new] = doe_model_fselect(nb_var,model_old,measures,Log)

This function starts with one monomial of a model and progressively adds the best monomials.

The input and output parameters are the same as in doe_model_bselect().

public: Contributor-DoE-GSOC2012/report-2012-05-13 (last edited 2012-05-18 08:12:46 by michael.baudin@contrib.scilab.org)