1. Performance of Scilab on the Moré, Garbow and Hillstrom optimization benchmark
Abstract
In this page, we report benchmarks of unconstrained nonlinear optimization in Scilab. We analyze the performance of various optimization solvers on the Moré, Garbow and Hillstrom collection of test problems for unconstrained optimization and nonlinear least squares problems. More precisely, we tested the fminsearch, optim, lsqrsolve and leastsq nonlinear optimization solvers.
Contents
- Performance of Scilab on the Moré, Garbow and Hillstrom optimization benchmark
  - Introduction
  - The MGH benchmark
  - Numerical results for optim
  - Numerical results for fminsearch
    - Introduction
    - Numerical results
    - Reaching the maximum number of function evaluations
    - Failures of the fminsearch function
      - Failure of fminsearch on the problems #13 "SING" and #17 "OSB1"
      - Failure of fminsearch on the problem #20 "WATSON"
      - Failure of fminsearch on the problem #25 "VARDIM"
      - Failure of fminsearch on the problem #26 "TRIG"
      - Failure of fminsearch on the problem #35 "CHEB"
  - Numerical results for lsqrsolve
  - Numerical results for leastsq
  - Conclusion
  - References
1.1. Introduction
In this document, we analyze the performance of the fminsearch, optim, lsqrsolve and leastsq functions on a collection of test problems. We consider here the Moré, Garbow and Hillstrom collection of test problems for unconstrained optimization and nonlinear least squares problems. This benchmark will be denoted by MGH in the remainder of this document.
1.2. The MGH benchmark
1.2.1. Overview
The Moré, Garbow and Hillstrom collection of test functions is widely used in testing unconstrained optimization software. The code for these problems is available in Fortran from the Netlib software archives.
The references for this benchmark are:
- "Algorithm 566: FORTRAN Subroutines for Testing Unconstrained Optimization Software", ACM Transactions on Mathematical Software (TOMS), Volume 7 , Issue 1, March 1981, Pages: 136 - 140, J. J. Moré, Burton S. Garbow, Kenneth E. Hillstrom
- "HESFCN - A Fortran Package Of Hessian Subroutines For Testing Nonlinear Optimization Software", Victoria Averbukh, Samuel Figueroa and Tamar Schlick, Courant Institute of Mathematical Sciences
Although the original paper "Algorithm 566" provides only 35 test cases, several test problems (#36 to #45) were added to the collection later.
1.2.2. The uncprb toolbox
The goal of this toolbox is to provide unconstrained optimization problems in order to test optimization algorithms.
The port from Fortran to Matlab was done by two undergraduate students at Brooklyn College, Livia Klein and Madhu Lamba, under the supervision of Chaya Gurwitz.
Benoit Hamelin did the port from Matlab to Scilab v4 with m2sci and did some manual tuning of the result.
The features of this toolbox are:
- Provides 35 unconstrained optimization problems.
- Provides the function value, the gradient, the function vector and the Jacobian.
- Provides the Hessian matrix for 18 problems.
- Provides the starting point for each problem.
- Provides the optimum function value and the optimum point x for many problems.
- Provides finite difference routines for the gradient, the Jacobian and the Hessian matrix.
- Macro-based functions: no compiler required.
- All function values, gradients, Jacobians and Hessians are tested.
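As a quick sketch of how the toolbox is used (restricted to the uncprb functions that appear in the sessions of this report), a problem can be queried as follows:

```scilab
// Sketch: query problem #1 (ROSE) with the uncprb toolbox.
nprob = 1;                                    // problem number in the MGH collection
[n, m, x0] = uncprb_getinitf(nprob);          // dimension, residual count, start point
[fstar, xstar] = uncprb_getopt(nprob, n, m);  // known optimum, when available
H0 = uncprb_gethesfd(n, m, x0, nprob);        // finite-difference Hessian at x0
disp(spec(H0))                                // eigenvalues at the starting point
```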
The authors for this toolbox are:
- Scilab v5 port and update: 2010, Michael Baudin
- Scilab port: 2000-2004, Benoit Hamelin, Jean-Pierre Dussault
- Matlab port: Chaya Gurwitz, Livia Klein, and Madhu Lamba
- Optimum points for problem #20: 2000, John Burkardt
- Fortran 77: Jorge More, Burton Garbow and Kenneth Hillstrom
1.2.3. Comparisons with other benchmarks
In this section, we review other technical reports which describe the results of optimization solvers on the MGH benchmark.
1.2.3.1. Solvopt
SolvOpt solves minimization or maximization problems for nonlinear, possibly non-smooth objective functions, and handles nonlinear constraints by the method of exact penalization.
http://www.uni-graz.at/imawww/kuntsevich/solvopt/
This software was developed by Alexei Kuntsevich and Franz Kappel. The "Performance and tests" section:
http://www.uni-graz.at/imawww/kuntsevich/solvopt/results.html
presents the solution of SolvOpt on the MGH benchmark.
Kuntsevich and Kappel present more details on several problems in this collection at:
http://www.uni-graz.at/imawww/kuntsevich/solvopt/results/moreset.html
The results are presented in 12 tables, depending on the information given to the solver. For example:
- Table 1 presents the solution of the Moré set of UNC problems by SolvOpt with user-supplied gradients,
- Table 2 presents the solution of the Moré set of UNC problems by SolvOpt without user-supplied gradients.
1.2.3.2. Neumaier's bound constrained variants
Arnold Neumaier collected some data on the MGH benchmark at
http://www.mat.univie.ac.at/~neum/glopt/bounds.html
The author provides bound-constrained variants of the test problems and their solutions.
1.2.3.3. Matlab 5 / FMINU
Neumaier also provides two tables gathering the results of the FMINU function of Matlab:
http://www.mat.univie.ac.at/~neum/glopt/results/more/moreg.html
The FMINU function in Matlab 5 has been renamed FMINUNC in Matlab 6. This page presents the results for the Moré/Garbow/Hillstrom unconstrained test problems, using function values and gradients, with the MATLAB Optimization Toolbox, adapted from the original source of A. Kuntsevich.
There are two tables, one for the BFGS method, the other for the DFP method.
1.2.3.4. Nelder-Mead: Burmen, Puhan and Tuma, 2006
The following paper also presents the performance of a Nelder-Mead variant on the MGH collection:
Arpad Burmen, Janez Puhan and Tadej Tuma, "Grid Restrained Nelder-Mead Algorithm", Comput. Optim. Appl. 34, 3 (July 2006), 359-375.
This paper presents a converging variant of the Nelder-Mead algorithm.
They compared their algorithm GRNMA to SDNMA, the sufficient descent based algorithm proposed by Price, Coope and Byatt.
1.2.3.5. Nelder-Mead: Burmen and Tuma - 2009
In the paper:
Arpad Burmen, Tadej Tuma, Unconstrained derivative-free optimization by successive approximation, Journal of Computational and Applied Mathematics, Volume 223, Issue 1, 1 January 2009, Pages 62-74
the authors compare several variants of Nelder-Mead's algorithm.
The sufficient descent-based simplex algorithm (SDNM), the GRNM algorithm and Algorithm 3 (SANM) were implemented in MATLAB R14. The Moré, Garbow and Hillstrom set of test functions was used for algorithm evaluation. Besides these functions, the standard quadratic and McKinnon [16] functions were also used. The starting simplex was chosen in the same manner, except for the McKinnon (alt.) function, where McKinnon's initial simplex, which causes the original NM algorithm to fail, was used. The results of the testing are reported in a table, where NF denotes the number of function evaluations. The best (lowest) function value obtained by the algorithms is also listed.
1.3. Numerical results for optim
In this section, we present the results of various Scilab functions on the MGH benchmark. More specifically, we analyzed the following solvers:
- optim/UNC/"qn": a quasi-Newton method with BFGS
- optim/UNC/"gc": a limited memory BFGS method
- optim/BND/"qn": a quasi-Newton method with BFGS and bounds
- optim/BND/"gc": a limited memory BFGS method with bounds
- optim/BND/"nd": a nonsmooth optimization algorithm

where UNC means unconstrained and BND means bound constrained. The solvers used in the optim function are based on the Modulopt library from INRIA, developed by C. Lemaréchal, J.-C. Gilbert and F. Bonnans.
1.3.1. Introduction
Each table presents:
- n: the number of unknowns
- F*: the known global minimum of the function
- F(X): the function value at the computed solution
- norm(G(X)): the norm of the gradient at the computed solution
- Feval: the number of function evaluations
- Status: the status of the optimization. Status=T if the solver succeeded, and Status=F if the solver failed.
Notice that the optim solvers are designed so that, each time the objective function is called, both f and g are computed.
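This coupling is visible in the standard optim cost-function interface, which must return both the objective and its gradient for a given x. The following sketch writes such a cost function for the Rosenbrock problem (#1 "ROSE"); the function body is written inline here for illustration:

```scilab
// Standard optim cost function: returns both the objective f
// and its gradient g at the point x (ind is optim's work flag).
function [f, g, ind] = rosenbrockCost(x, ind)
    f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2
    g = [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); ..
         200*(x(2) - x(1)^2)]
endfunction
```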
We used the tolerances:
rtolF = 1.e-4; atolF = 1.e-10; rtolX = 1.e-4; atolX = 1.e-6;
The status of the optimization is T if:
- the convergence on F is achieved if
abs(fopt-fstar)< rtolF*abs(fstar) + atolF
- the convergence on X is achieved if
norm(xopt-xstar)< rtolX*norm(xstar) + atolX
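The status test can be sketched as follows (the function name is ours; the tables suggest that both criteria must hold, since e.g. problem #26 TRIG has Status F: OK but Status X: FAIL and is counted as a failure):

```scilab
// Sketch of the success test: T only if both F and X converged.
function ok = uncprb_isSuccess(fopt, fstar, xopt, xstar)
    rtolF = 1.e-4;  atolF = 1.e-10;
    rtolX = 1.e-4;  atolX = 1.e-6;
    okF = abs(fopt - fstar) < rtolF*abs(fstar) + atolF
    okX = norm(xopt - xstar) < rtolX*norm(xstar) + atolX
    ok = okF & okX
endfunction
```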
In order to make sure that the solver converged, we used the
"ar",1000,1000
option, which implies that the solver terminates when the convergence criterion is reached, or when 1000 iterations or 1000 function evaluations have been used.
The exact Jacobian matrix is used.
The problems in the MGH benchmark do not have bounds. In order to use the bounded algorithms, we used infinite bounds, that is, we computed bounds with the script:
lb = -%inf*ones(n,1); ub = %inf*ones(n,1);
and used the
"b",lb,ub
option.
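Putting the pieces together, a run with infinite bounds and enlarged limits can be sketched as follows (the ordering of the optional arguments is assumed here; see the optim help page for the exact calling sequence):

```scilab
// Sketch: solve problem #1 with optim/BND/"qn", infinite bounds,
// and at most 1000 iterations and 1000 function evaluations.
nprob = 1;
[n, m, x0] = uncprb_getinitf(nprob);
uncprb_optimfuninit();
lb = -%inf*ones(n,1);  ub = %inf*ones(n,1);
[fopt, xopt, gopt] = optim(list(uncprb_optimfun, nprob, n, m), ..
                           "b", lb, ub, x0, "qn", "ar", 1000, 1000)
```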
The following table compares the global results of the optim solvers on all the benchmarks in this set.
| Solver | Pass | Fail | Feval |
| --- | --- | --- | --- |
| optim UNC "qn" | 31 | 4 | 3770 |
| optim BND "qn" | 31 | 4 | 5011 |
| optim UNC "gc" | 32 | 3 | 5556 |
| optim BND "gc" | 25 | 10 | 17975 |
| optim BND "nd" | 15 | 20 | 4983 |
1.3.2. The optim/UNC/"qn" solver
In this section, we present the performances of the optim function without constraints (UNC) and with the default "qn" solver.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 0 | 0 | 46 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 6.851D-12 | 48 | F |
| #3 | BADSCP | 2 | 0 | 0 | 0 | 204 | T |
| #4 | BADSCB | 2 | 0 | 0 | 0 | 25 | T |
| #5 | BEALE | 2 | 0 | 0 | 0 | 22 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000020 | 43 | T |
| #7 | HELIX | 3 | 0 | 0.3799580 | 10.374489 | 408 | F |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 2.814D-09 | 63 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 1.198D-16 | 13 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 0.0022233 | 497 | T |
| #11 | GULF | 3 | 0 | 9.561D-31 | 6.686D-16 | 61 | T |
| #12 | BOX | 3 | 0 | 1.541D-32 | 2.142D-16 | 39 | T |
| #13 | SING | 4 | 0 | 2.452D-61 | 2.852D-45 | 187 | T |
| #14 | WOOD | 4 | 0 | 0 | 0 | 103 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 5.004D-16 | 30 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0000016 | 59 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 5.619D-09 | 128 | T |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 3.185D-09 | 79 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 1.746D-11 | 70 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 1.355D-09 | 100 | T |
| #21 | ROSEX | 20 | 0 | 1.001D-29 | 1.407D-13 | 127 | T |
| #22 | SINGX | 12 | 0 | 3.956D-32 | 1.407D-23 | 242 | T |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 1.651D-13 | 87 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 2.523D-11 | 706 | T |
| #25 | VARDIM | 10 | 0 | 7.913D-30 | 3.966D-14 | 24 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 3.115D-11 | 88 | F |
| #27 | ALMOST | 10 | 0 | 1.233D-31 | 7.657D-15 | 25 | T |
| #28 | BV | 8 | 0 | 3.865D-33 | 4.474D-16 | 24 | T |
| #29 | IE | 8 | 0 | 0 | 0 | 33 | T |
| #30 | TRID | 12 | 0 | 1.800D-30 | 1.915D-14 | 39 | T |
| #31 | BAND | 15 | 0 | 1.729D-30 | 1.929D-14 | 53 | T |
| #32 | LIN | 10 | 5 | 5 | 8.028D-15 | 8 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 3.901D-11 | 7 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 1.362D-11 | 5 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 3.911D-10 | 77 | T |
The abstract is the following:
Time: 9 (s). Total number of problems: 35. Number of successes: 31. Number of failures: 4. Total Feval: 3770.
We notice that the number of function F evaluations is always equal to the number of gradient G evaluations. This can be explained by the fact that the solvers need the function and the gradient at each intermediate point x.
We notice that the time required to run all the tests is quite short. This reflects the general experience that the algorithm usually converges quickly.
In the next sections, we analyze the failures of the optim/UNC/"qn" solver:
- Problem #2 FROTH: optim converges to a correct local minimum, but not to the global minimum,
- Problem #7 HELIX: optim exits without a local minimum,
- Problem #18 BIGGS: optim exits without a local minimum (the point is a saddle point),
- Problem #26 TRIG: optim converges to a correct local minimum, but not to the global minimum.
1.3.2.1. Reaching the default maximum number of function evaluations
We see that many cases require more than 100 function evaluations, which is the default maximum. If we do not configure the "ar" option, the maximum number of function evaluations may be reached, which silently stops the algorithm before actual convergence.
For example, in the problem #3, the number of function evaluations required by the algorithm is 204. If we do not configure the "ar" option, we can still get the computed optimum, without a warning:
-->nprob=3;
-->[n,m,x0] = uncprb_getinitf(nprob);
-->uncprb_optimfuninit();
-->[fopt,xopt,gopt]=optim(list(uncprb_optimfun,nprob,n,m),x0)
 gopt  =
  - 9.3720597
  - 0.0000150
 xopt  =
    0.0000126
    7.9602092
 fopt  =
    5.941D-08
We see that the gradient is far from being zero. In order to see what happens, we may configure the "imp" option, which prints more messages.
-->[fopt,xopt,gopt]=optim(list(uncprb_optimfun,nprob,n,m),x0,imp=1);
***** enters -qn code- (without bound cstr)
dimension= 2, epsq= 0.2220446049250313E-15, verbosity level: imp= 1
max number of iterations allowed: iter= 100
max number of calls to costf allowed: nap= 100
------------------------------------------------
iter num 72, nb calls= 100, f=0.5941E-07
***** leaves -qn code-, gradient norm= 0.9372059667329404E+01
Optim stops: maximum number of calls to f is reached.
We can then see that there is something wrong, which can be fixed by increasing the number of iterations and number of F calls.
-->[fopt,xopt,gopt]=optim(list(uncprb_optimfun,nprob,n,m),x0,"ar",1000,1000)
 gopt  =
    0.
    0.
 xopt  =
    0.0000110
    9.1061467
 fopt  =
    0.
This is a known bug of optim, which has been reported on Bugzilla:
optim can fail to converge but does not produce a warning.
We see that optim/UNC/"qn" fails on 4 problems. In the next sections, we comment on each failure case and try to see what happens.
1.3.2.2. Failure of optim/UNC/"qn" on Problem #2 FROTH
The optim function seems to fail on Problem #2 "FROTH", where the optimum is F*=0, while optim produces F(X)=48.984254. Actually, optim converges in this case to a local optimum instead of the global optimum. First, we see that the gradient is very small at this point, as shown in the following session.
-->nprob = 2;
-->[n,m,x0] = uncprb_getinitf(nprob);
-->uncprb_optimfuninit();
-->[fopt,xopt,gopt]=optim(list(uncprb_optimfun,nprob,n,m),x0,"ar",1000,1000)
 gopt  =
    10^(-11) *
    0.6036061
  - 0.3240075
 xopt  =
    11.412779
  - 0.8968053
 fopt  =
    48.984254
Moreover, we can check that the eigenvalues are positive at this point.
-->H=uncprb_gethesfd(n,m,xopt,nprob);
-->disp(spec(H))
    0.8207178
    905.06704
Hence, this is a true local minimum. This local minimum is known for the problem #2 (see Kuntsevich).
This means that optim succeeds in finding a correct local minimum for problem #2, but not the global minimum. This behavior is to be expected from this type of algorithm, which only provides local optima.
1.3.2.3. Failure of optim/UNC/"qn" on Problem #7 HELIX
On the problem #7, "HELIX", optim fails to find the minimum and finishes with a gradient which is very far from being zero, since ||G(X)||~10.3.
-->nprob = 7;
-->[fopt,xopt,gopt]=optim(list(uncprb_optimfun,nprob,n,m),x0,"ar",1000,1000,imp=1)
***** enters -qn code- (without bound cstr)
dimension= 3, epsq= 0.2220446049250313E-15, verbosity level: imp= 1
max number of iterations allowed: iter= 1000
max number of calls to costf allowed: nap= 1000
------------------------------------------------
iter num 1, nb calls= 1, f= 2500.
iter num 2, nb calls= 3, f= 1557.
iter num 3, nb calls= 4, f= 54.72
iter num 4, nb calls= 5, f= 25.24
iter num 5, nb calls= 6, f= 5.105
iter num 6, nb calls= 7, f= 3.709
iter num 7, nb calls= 8, f= 3.441
iter num 8, nb calls= 9, f= 3.426
iter num 9, nb calls= 10, f= 3.406
iter num 10, nb calls= 12, f= 3.309
iter num 11, nb calls= 16, f= 3.004
iter num 12, nb calls= 18, f=0.9863
iter num 13, nb calls= 20, f=0.5610
iter num 14, nb calls= 22, f=0.4856
iter num 15, nb calls= 23, f=0.4118
iter num 16, nb calls= 27, f=0.3880
iter num 17, nb calls= 57, f=0.3822
iter num 18, nb calls= 113, f=0.3800
iter num 19, nb calls= 137, f=0.3800
iter num 20, nb calls= 161, f=0.3800
iter num 21, nb calls= 190, f=0.3800
iter num 22, nb calls= 216, f=0.3800
iter num 23, nb calls= 240, f=0.3800
iter num 24, nb calls= 263, f=0.3800
iter num 25, nb calls= 289, f=0.3800
iter num 26, nb calls= 313, f=0.3800
iter num 27, nb calls= 339, f=0.3800
iter num 28, nb calls= 362, f=0.3800
iter num 29, nb calls= 385, f=0.3800
iter num 29, nb calls= 408, f=0.3800
***** leaves -qn code-, gradient norm= 0.1037448869040164E+02
End of optimization.
 gopt  =
    4.8517091
  - 6.1989262
    6.7575327
 xopt  =
    0.9636073
    0.3139571
    0.5297775
 fopt  =
    0.3799580
Apparently, the solver is unable to improve the function value and stops. The variables are not badly scaled.
At this point, the eigenvalues of the Hessian matrix are all positive. The condition number is not extreme in this case.
-->H=uncprb_gethesfd(n,m,xopt,nprob);
-->disp(spec(H))
    2.0758694
    200.0016
    695.81466
We do not have a clear explanation of this failure.
1.3.2.4. Failure of optim/UNC/"qn" on Problem #18 BIGGS
The optim function apparently fails to solve the problem #18 BIGGS.
F eval: 79
G eval: 79
Status X: FAIL
Status F: FAIL
f(X)=0.0056556 (f*=0)
||G(X)||= 3.185D-09
||X-X*||=8.2589785
|f(x)-f(x*)|=0.0056556
We notice that the norm of the gradient is very small at this point. Moreover, the Hessian matrix has negative eigenvalues:
-->H=uncprb_gethesfd(n,m,xopt,nprob);
-->disp(spec(H))
  - 0.0098031
    2.256D-10
    0.0001624
    0.0229040
    0.7469579
    11.249998
This indicates that the current point is a saddle point and is not a local minimum.
The global minimum is at f=0:
-->[fstar,xstar] = uncprb_getopt(nprob,n,m)
 xstar  =
    1.    10.    1.    5.    4.    3.
 fstar  =
    0.
We see that the computed X is far from the optimum.
-->xopt
 xopt  =
    1.711416
    17.683198
    1.1631437
    5.1865615
    1.711416
    1.1631437
We notice that other solvers, such as FMINU-BFGS for example, also converge to the same saddle point.
1.3.2.5. Failure of optim/UNC/"qn" for problem #26 TRIG
The optim function without constraints and the "qn" solver fails to converge to the global minimum for the problem #26 Trigonometric.
F eval: 148
G eval: 148
Status X: FAIL
Status F: OK
f(X)=0.0000280 (f*=0)
||G(X)||= 1.929D-10
||X-X*||=0.1849980
|f(x)-f(x*)|=0.0000280
We notice that the gradient at this point is small. Moreover, the eigenvalues of the Hessian matrix are positive:
-->H=uncprb_gethesfcn(n,m,xopt,nprob);
-->disp(spec(H))
    0.0626007
    0.1212685
    0.2688956
    0.4904942
    0.8265467
    1.0504036
    1.2715562
    1.4935737
    1.7199128
    2.5935528
Hence, this point is a correct local minimum. This local minimum is known for the problem #26 (see Kuntsevich).
1.3.3. The optim/BND/"qn" solver
In this section, we present the performances of the optim function with infinite bound constraints (BND) and with the default "qn" solver.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 7.400D-25 | 3.562D-11 | 184 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 1.801D-08 | 39 | F |
| #3 | BADSCP | 2 | 0 | 8.376D-22 | 0.0000049 | 308 | T |
| #4 | BADSCB | 2 | 0 | 2.174D-27 | 9.326D-08 | 24 | T |
| #5 | BEALE | 2 | 0 | 4.524D-28 | 1.907D-13 | 43 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 5.283D-08 | 43 | T |
| #7 | HELIX | 3 | 0 | 1.843D-25 | 2.460D-12 | 163 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 7.966D-12 | 47 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 7.949D-11 | 48 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 0.0051886 | 573 | T |
| #11 | GULF | 3 | 0 | 3.448D-20 | 1.504D-09 | 257 | T |
| #12 | BOX | 3 | 0 | 1.363D-23 | 1.075D-11 | 125 | T |
| #13 | SING | 4 | 0 | 8.833D-19 | 1.552D-09 | 191 | F |
| #14 | WOOD | 4 | 0 | 3.723D-22 | 6.727D-10 | 260 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 3.114D-10 | 102 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0000217 | 76 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 4.181D-08 | 159 | T |
| #18 | BIGGS | 6 | 0 | 4.329D-19 | 1.035D-09 | 262 | T |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 4.995D-09 | 134 | T |
| #20 | WATSON | 9 | 0.0000014 | 3.2700708 | 17.141036 | 9 | F |
| #21 | ROSEX | 20 | 0 | 3.651D-21 | 2.148D-09 | 221 | T |
| #22 | SINGX | 12 | 0 | 1.949D-13 | 5.788D-08 | 183 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 1.154D-09 | 231 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 0.0000002 | 286 | T |
| #25 | VARDIM | 10 | 0 | 1.929D-26 | 1.134D-12 | 85 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 1.646D-10 | 53 | F |
| #27 | ALMOST | 10 | 0 | 1.258D-23 | 2.681D-11 | 102 | T |
| #28 | BV | 8 | 0 | 3.873D-20 | 1.015D-09 | 223 | T |
| #29 | IE | 8 | 0 | 1.176D-19 | 7.290D-10 | 94 | T |
| #30 | TRID | 12 | 0 | 1.162D-18 | 1.306D-08 | 102 | T |
| #31 | BAND | 15 | 0 | 2.737D-21 | 7.509D-10 | 165 | T |
| #32 | LIN | 10 | 5 | 5 | 1.094D-15 | 22 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 2.823D-11 | 24 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 0.0000026 | 63 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 1.912D-09 | 110 | T |
The abstract is the following:
Time: 13 (s). Total number of problems: 35. Number of successes: 31. Number of failures: 4. Total Feval: 5011.
The optim/BND/"qn" solver has the same global behavior as the optim/UNC/"qn" solver.
1.3.4. The optim/UNC/"gc" solver
In this section, we present the performances of the optim function without constraints (UNC) and with the "gc" solver.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 0 | 0 | 53 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0000001 | 80 | F |
| #3 | BADSCP | 2 | 0 | 0 | 0 | 206 | T |
| #4 | BADSCB | 2 | 0 | 8.487D-26 | 0.0000438 | 34 | T |
| #5 | BEALE | 2 | 0 | 0 | 0 | 22 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000042 | 175 | T |
| #7 | HELIX | 3 | 0 | 4.405D-40 | 5.659D-19 | 37 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 1.321D-09 | 111 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 1.198D-16 | 17 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 1.1830935 | 581 | T |
| #11 | GULF | 3 | 0 | 1.047D-30 | 1.075D-14 | 77 | T |
| #12 | BOX | 3 | 0 | 1.541D-32 | 1.948D-16 | 48 | T |
| #13 | SING | 4 | 0 | 1.310D-32 | 3.494D-17 | 124 | T |
| #14 | WOOD | 4 | 0 | 2.835D-30 | 1.023D-13 | 118 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 1.261D-10 | 198 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0000404 | 197 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 4.670D-11 | 341 | T |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 3.311D-09 | 93 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 3.539D-08 | 108 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 2.957D-08 | 858 | T |
| #21 | ROSEX | 20 | 0 | 0 | 0 | 53 | T |
| #22 | SINGX | 12 | 0 | 9.742D-32 | 1.591D-16 | 135 | T |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 1.808D-11 | 327 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 1.081D-09 | 834 | T |
| #25 | VARDIM | 10 | 0 | 0 | 0 | 30 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 2.960D-09 | 75 | F |
| #27 | ALMOST | 10 | 0 | 3.994D-30 | 4.454D-14 | 26 | T |
| #28 | BV | 8 | 0 | 2.998D-33 | 3.693D-16 | 75 | T |
| #29 | IE | 8 | 0 | 3.467D-33 | 1.205D-16 | 15 | T |
| #30 | TRID | 12 | 0 | 1.726D-30 | 5.370D-14 | 55 | T |
| #31 | BAND | 15 | 0 | 1.670D-30 | 4.998D-14 | 28 | T |
| #32 | LIN | 10 | 5 | 5 | 1.690D-09 | 202 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 0.0000016 | 51 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 0.0000018 | 117 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 7.170D-09 | 55 | T |
The abstract is the following:
Time: 15 (s). Total number of problems: 35. Number of successes: 32. Number of failures: 3. Total Feval: 5556.
The optim/UNC/"gc" solver has globally the same behavior as optim/UNC/"qn". Still, the Problem #7 HELIX is correctly solved by optim/UNC/"gc", which finds the global minimum for this problem.
1.3.5. The optim/BND/"gc" solver
In this section, we present the performances of the optim function with infinite bound constraints (BND) and with the "gc" solver.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 2.665D-18 | 2.421D-09 | 1001 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0000048 | 34 | F |
| #3 | BADSCP | 2 | 0 | 0.0001093 | 0.0047767 | 1001 | F |
| #4 | BADSCB | 2 | 0 | 0.0155721 | 0.2531464 | 252 | F |
| #5 | BEALE | 2 | 0 | 3.668D-29 | 6.569D-15 | 434 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000005 | 42 | T |
| #7 | HELIX | 3 | 0 | 1.894D-27 | 9.728D-14 | 608 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 6.801D-09 | 98 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 9.396D-11 | 43 | T |
| #10 | MEYER | 3 | 87.9458 | 105484.05 | 281.8806 | 1001 | F |
| #11 | GULF | 3 | 0 | 2.507D-11 | 7.013D-08 | 1002 | T |
| #12 | BOX | 3 | 0 | 1.247D-13 | 2.425D-08 | 1001 | T |
| #13 | SING | 4 | 0 | 1.055D-12 | 7.256D-09 | 1001 | F |
| #14 | WOOD | 4 | 0 | 8.499D-16 | 4.297D-08 | 1001 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 2.550D-09 | 196 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0001098 | 152 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000769 | 0.0035369 | 1001 | F |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 2.282D-08 | 387 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 3.296D-08 | 706 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000476 | 0.0003292 | 1002 | F |
| #21 | ROSEX | 20 | 0 | 4.471D-24 | 2.570D-12 | 1001 | T |
| #22 | SINGX | 12 | 0 | 5.600D-12 | 1.741D-08 | 1001 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 8.768D-11 | 231 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 0.0000135 | 1002 | T |
| #25 | VARDIM | 10 | 0 | 1.763D-30 | 4.806D-14 | 41 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 3.965D-09 | 125 | F |
| #27 | ALMOST | 10 | 0 | 6.789D-20 | 5.937D-11 | 1001 | T |
| #28 | BV | 8 | 0 | 1.352D-20 | 5.878D-11 | 1001 | T |
| #29 | IE | 8 | 0 | 2.552D-33 | 1.224D-16 | 73 | T |
| #30 | TRID | 12 | 0 | 1.238D-29 | 3.389D-14 | 135 | T |
| #31 | BAND | 15 | 0 | 3.736D-30 | 2.593D-14 | 117 | T |
| #32 | LIN | 10 | 5 | 5 | 7.809D-09 | 14 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 0.0000003 | 9 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 0.0000170 | 25 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 2.045D-08 | 236 | T |
The abstract is the following:
Time: 48 (s). Total number of problems: 35. Number of successes: 25. Number of failures: 10. Total Feval: 17975.
The optim/BND/"gc" solver seems less robust than the other solvers and produces more failures. For example, optim/BND/"gc" fails to solve the Problem #3 BADSCP, which is solved by optim/UNC/"gc".
Here is the detailed information for the Problem #3 BADSCP with the optim/BND/"gc" algorithm.
F eval: 1001
G eval: 1001
Status X: FAIL
Status F: FAIL
f(X)=0.0001093 (f*=0)
||G(X)||= 0.0047767
||X-X*||=4.5568734
|f(x)-f(x*)|=0.0001093
We see that the gradient is far from being zero and that the maximum number of function evaluations has been reached. We also see that the variables are badly scaled:
-->xopt
 xopt  =
    0.0000220
    4.5492734
Finally, the condition number of the Hessian matrix is large, because the eigenvalues have very different magnitudes.
-->disp(spec(H))
    0.0004445
    4.139D+09
-->cond(H)
 ans  =
    9.312D+12
1.3.6. The optim/BND/"nd" solver
In this section, we present the performances of the optim function with infinite bound constraints (BND) and with the "nd" solver.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 2.045D-12 | 0.0000019 | 104 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0016364 | 50 | F |
| #3 | BADSCP | 2 | 0 | 0.0000277 | 0.0000566 | 105 | F |
| #4 | BADSCB | 2 | 0 | 1.431D-15 | 7.567D-08 | 99 | T |
| #5 | BEALE | 2 | 0 | 3.847D-17 | 3.745D-08 | 94 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000007 | 62 | T |
| #7 | HELIX | 3 | 0 | 0.6522108 | 1.0260011 | 115 | F |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 0.0000728 | 56 | F |
| #9 | GAUSS | 3 | 1.128D-08 | 0.0000039 | 3.532D-14 | 12 | F |
| #10 | MEYER | 3 | 87.9458 | 84937.468 | 148.56332 | 297 | F |
| #11 | GULF | 3 | 0 | 6.6203499 | 0.4287628 | 20 | F |
| #12 | BOX | 3 | 0 | 1.722D-08 | 0.0000023 | 100 | F |
| #13 | SING | 4 | 0 | 3.046D-10 | 0.0000041 | 125 | F |
| #14 | WOOD | 4 | 0 | 5.095D-12 | 0.0002007 | 102 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000018 | 109 | F |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0003883 | 142 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000551 | 0.0000088 | 354 | F |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 0.0000020 | 158 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 0.0000001 | 766 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000045 | 0.0000212 | 1000 | F |
| #21 | ROSEX | 20 | 0 | 5.051D-09 | 0.0000226 | 136 | F |
| #22 | SINGX | 12 | 0 | 1.074D-08 | 0.0000633 | 77 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000929 | 1.970D-13 | 45 | F |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002938 | 0.0001626 | 147 | F |
| #25 | VARDIM | 10 | 0 | 5.577D-12 | 8.857D-16 | 34 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0000001 | 95 | F |
| #27 | ALMOST | 10 | 0 | 0.0000008 | 6.811D-14 | 35 | F |
| #28 | BV | 8 | 0 | 1.115D-15 | 2.090D-08 | 234 | T |
| #29 | IE | 8 | 0 | 1.395D-15 | 0.0000001 | 36 | T |
| #30 | TRID | 12 | 0 | 3.393D-17 | 4.554D-08 | 64 | T |
| #31 | BAND | 15 | 0 | 2.256D-16 | 0.0000002 | 77 | T |
| #32 | LIN | 10 | 5 | 5 | 5.163D-16 | 15 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 4.737D-11 | 9 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 1.363D-12 | 17 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065082 | 0.0018140 | 92 | F |
The abstract is the following:
Time: 13 (s). Total number of problems: 35. Number of successes: 15. Number of failures: 20. Total Feval: 4983.
1.4. Numerical results for fminsearch
In this section, we present the results of the fminsearch function in Scilab on the MGH benchmark.
1.4.1. Introduction
Each table presents:
- n: the number of unknowns
- F*: the known global minimum of the function
- F(X): the function value at the computed solution
- norm(G(X)): the norm of the gradient at the computed solution
- Feval: the number of function evaluations
- Status: the status of the optimization. Status=T if the solver succeeded, and Status=F if the solver failed.
We used the tolerances:
rtolF = 1.e-4; atolF = 1.e-4; rtolX = 1.e-4; atolX = 1.e-6;
The status of the optimization is T if:
- the convergence on F is achieved if
abs(fopt-fstar)< rtolF*abs(fstar) + atolF
- the convergence on X is achieved if
norm(xopt-xstar)< rtolX*norm(xstar) + atolX
The fminsearch function is used with its default parameters, except that we disable the warnings produced by the fminsearch function when convergence is not reached. Indeed, in this benchmark, we do not need messages such as:
fminsearch: Exiting: Maximum number of function evaluations has been exceeded
- increase MaxFunEvals option.
Current function value: 0.0000309
This type of message is extremely useful during a single optimization, but here we run 35 optimizations, so the messages are not useful. This is why we set the "Display" option to "off":
fmsopt = optimset ("Display","off");
The default convergence criteria are used for fminsearch:
- options.TolFun: the absolute tolerance on the function value. The default value is 1.e-4.
- options.TolX: the absolute tolerance on the simplex size. The default value is 1.e-4.
Moreover, we require that convergence is achieved at the optimum. The default maximum number of iterations is 200 * n, where n is the number of variables, and the default maximum number of evaluations of the cost function is also 200 * n. There are cases where these default settings prevent the convergence of the algorithm. This is why we increase the number of function evaluations and the number of iterations.
fmsopt = optimset ("Display","off", "MaxFunEvals", 5000, "MaxIter", 5000 );
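As an illustration, here is a sketch of a single fminsearch run with these options, on the Rosenbrock function written inline (the MGH wrapper used for the benchmark itself is not reproduced here):

```scilab
// Sketch: fminsearch on Rosenbrock (problem #1 "ROSE") with the
// options used in this benchmark.
function f = rosenbrockF(x)
    f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2
endfunction
fmsopt = optimset("Display", "off", "MaxFunEvals", 5000, "MaxIter", 5000);
x0 = [-1.2; 1];   // the standard MGH starting point for ROSE
[xopt, fopt] = fminsearch(rosenbrockF, x0, fmsopt)
```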
1.4.2. Numerical results
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | ROSE | 2 | 0 | 8.178D-10 | 0.0008555 | 159 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0010169 | 120 | F |
| #3 | BADSCP | 2 | 0 | 3.143D-17 | 0.0010209 | 686 | T |
| #4 | BADSCB | 2 | 0 | 1.186D-09 | 24.842629 | 277 | T |
| #5 | BEALE | 2 | 0 | 1.393D-10 | 0.0000612 | 107 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.6436706 | 72 | T |
| #7 | HELIX | 3 | 0 | 6.707D-10 | 0.0005217 | 205 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 0.0000187 | 226 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.157D-08 | 0.0000393 | 67 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 91.523916 | 1774 | T |
| #11 | GULF | 3 | 0 | 2.023D-13 | 0.0000014 | 578 | T |
| #12 | BOX | 3 | 0 | 7.279D-13 | 0.0000009 | 355 | T |
| #13 | SING | 4 | 0 | 7.458D-15 | 0.0000005 | 458 | F |
| #14 | WOOD | 4 | 0 | 1.945D-09 | 0.0013649 | 535 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000067 | 264 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.1733833 | 333 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 0.0001660 | 915 | F |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 0.0000023 | 943 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 0.0001506 | 3827 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0001053 | 0.0006689 | 1290 | F |
| #21 | ROSEX | 20 | 0 | 27.018231 | 8.785632 | 5000 | F |
| #22 | SINGX | 12 | 0 | 0.1562408 | 1.4288539 | 5001 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000757 | 0.0000456 | 3923 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002979 | 0.0000254 | 4027 | T |
| #25 | VARDIM | 10 | 0 | 1.984599 | 3.5076609 | 2920 | F |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0001171 | 2307 | F |
| #27 | ALMOST | 10 | 0 | 4.090D-10 | 0.0000966 | 4206 | T |
| #28 | BV | 8 | 0 | 5.988D-10 | 0.0001456 | 483 | T |
| #29 | IE | 8 | 0 | 3.050D-09 | 0.0001188 | 579 | T |
| #30 | TRID | 12 | 0 | 0.0000002 | 0.0043313 | 1356 | T |
| #31 | BAND | 15 | 0 | 0.0000014 | 0.0171835 | 3183 | T |
| #32 | LIN | 10 | 5 | 5 | 0.0001311 | 2014 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3871289 | 7.8359291 | 383 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8889214 | 5.5039349 | 365 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0047824 | 0.0083048 | 1151 | F |
Here is the summary of the benchmark for the fminsearch function:
Time: 273 (s) Total number of problems:35 Number of successes:25 Number of failures:10 Total Feval:50089
We notice that the solver typically needs from 100 to 1000 function evaluations when the number of unknowns is from 2 to 5, and may require more than 2000 when the number of unknowns is greater than 10.
We notice that the time required to run all the tests is quite long. This reflects the general experience that the algorithm converges slowly.
1.4.3. Reaching the maximum number of function evaluations
It may happen that the solver requires more function evaluations than the default maximum. For example, in the problem #3 "BADSCP", the algorithm requires 686 function evaluations to converge to the X and F tolerances. The default number of function evaluations is 200*n, where n is the number of unknowns. In the case of the problem #3, we have n=2, which leads to 400 function evaluations by default. The following session shows what happens when we solve the case with the default number of function evaluations.
The following script uses fminsearch to optimize the problem #3. We use the optimplotfval function, which plots the function value against the iteration number.
nprob = 3;
[n,m,x0] = uncprb_getinitf(nprob);
[...]
fmsopt = optimset ("PlotFcns" , optimplotfval);
[xopt,fopt,exitflag,output]=fminsearch(objfun,x0,fmsopt);
h = gcf();
h.children.log_flags="nln";
The previous script produces the following output.
--> [xopt,fopt,exitflag,output]=fminsearch(objfun,x0,fmsopt);
fminsearch: Exiting: Maximum number of function evaluations has been exceeded - increase MaxFunEvals option.
Current function value: 1.886D-08
The warning is clear about the fact that the maximum number of function evaluations has been exceeded. The previous script produces the following plot. We see that the function is still decreasing at the optimum, so that we can hope to achieve the F and X tolerances if we increase the number of function evaluations.
In the following script, we increase the maximum number of function evaluations.
-->fmsopt = optimset ("MaxFunEvals", 5000, "MaxIter", 5000);
-->[xopt,fopt,exitflag,output]=fminsearch(objfun,x0,fmsopt);
The warning is not produced anymore: the tolerance criteria on X and F have been reached.
1.4.4. Failures of the fminsearch function
In this section, we analyze the reasons of the failures of the fminsearch algorithm.
The cases where the fminsearch algorithm fails are presented below.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
|---|---|---|---|---|---|---|---|
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0010169 | 120 | F |
| #13 | SING | 4 | 0 | 7.458D-15 | 0.0000005 | 458 | F |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 0.0001660 | 915 | F |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 0.0000023 | 943 | F |
| #20 | WATSON | 9 | 0.0000014 | 0.0001053 | 0.0006689 | 1290 | F |
| #21 | ROSEX | 20 | 0 | 27.018231 | 8.785632 | 5000 | F |
| #22 | SINGX | 12 | 0 | 0.1562408 | 1.4288539 | 5001 | F |
| #25 | VARDIM | 10 | 0 | 1.984599 | 3.5076609 | 2920 | F |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0001171 | 2307 | F |
| #35 | CHEB | 10 | 0.0065039 | 0.0047824 | 0.0083048 | 1151 | F |
| Problem | Name | Cause of failure |
|---|---|---|
| #2 | FROTH | Local minimum found, but not global minimum |
| #13 | SING | Local minimum found, but tolerance on X not achieved |
| #17 | OSB1 | Local minimum found, but tolerance on X not achieved |
| #18 | BIGGS | Local minimum not found (saddle point) |
| #20 | WATSON | Local minimum not found |
| #21 | ROSEX | Maximum number of F eval. reached |
| #22 | SINGX | Maximum number of F eval. reached |
| #25 | VARDIM | Local minimum not found |
| #26 | TRIG | Local minimum not found |
| #35 | CHEB | Local minimum not found |
1.4.5. Failure of fminsearch on the problems #13 "SING" and #17 "OSB1"
In the case of the problem #13, the algorithm has found the correct minimum, but the problem is identified as not being solved accurately.
-->nprob = 13;
[...]
--> [fstar,xstar] = uncprb_getopt(nprob,n,m)
 xstar = 0. 0. 0. 0.
 fstar = 0.
--> success = uncprb_printsummary("fminsearch", nprob, fopt, xopt, gopt, iter, funeval, geval, ..
--> rtolF, atolF, rtolX, atolX, "detailed");
fminsearch: Iterations: 280 F eval: 458 Status X: F Status F: T
f(X)=7.458D-15 (f*=0) ||G(X)||= 0.0000005 ||X-X*||=0.0002160 |f(x)-f(x*)|=7.458D-15
-->xopt
 xopt = - 0.0001821 0.0000182 - 0.0000811 - 0.0000811
-->norm(xopt-xstar)
 ans = 0.0002160
We see that the distance to the optimum is equal to 2.e-4, which is slightly greater than 1.e-4. Hence, the simplex size has become smaller than the tolerance 1.e-4, but this does not imply that the accuracy on X is better than 1.e-4. Hence, the fminsearch function has found the minimum of the problem #13, but has not found an X accurate to the specified tolerance.
The same problem happens for the problem #17 "OSB1": the correct minimum is found, but the tolerance on X is not achieved as specified.
1.4.6. Failure of fminsearch on the problem #20 "WATSON"
In the problem #20 "WATSON", the fminsearch function fails to compute an accurate value of X or F.
fminsearch: Iterations: 887 F eval: 1290 Status X: F Status F: F
f(X)=0.0001053 (f*=0.0000014) Rel. Err. on F: 74.232488
||G(X)||= 0.0006689 ||X-X*||=5.9686783 Rel. Err. on X: 66.524072
|f(x)-f(x*)|=0.0001039
-->xstar
 xstar = - 0.0000153 0.9997897 0.0147640 0.1463423 1.0008211 - 2.6177311 4.1044032 - 3.1436123 1.0526264
-->xopt
 xopt = - 0.0010336 1.0011595 0.0006476 0.2494037 0.2675769 - 0.0923142 - 0.0570717 0.0910339 0.0944453
The gradient seems to be relatively small at the computed X.
We tried to increase the number of function evaluations and iterations and to reduce the tolerances in order to reach the optimum. More precisely, we configured the following options:
fmsopt = optimset ("PlotFcns" , optimplotfval, ..
    "Display", "iter", ..
    "TolX", 1.e-7, "TolFun", 1.e-7, ..
    "MaxFunEvals", 5000, "MaxIter", 5000);
The following plot presents the function value depending on the iteration number.
It seems that fminsearch reaches several plateaux, which appear to be difficult to escape from.
We notice that Burmen and Tuma (2009) used more than 5000 function evaluations to reach the minimum with their various Nelder-Mead algorithms.
1.4.7. Failure of fminsearch on the problem #25 "VARDIM"
In the problem #25 "VARDIM", the fminsearch function fails to compute an accurate value of X or F.
fminsearch: Iterations: 2081 F eval: 2920 Status X: F Status F: F
f(X)=1.984599 (f*=0)
||G(X)||= 3.5076608 ||X-X*||=1.4077625 Rel. Err. on X: 0.9795623
|f(x)-f(x*)|=1.984599
The gradient is far from being zero and the function value is relatively large.
The following plot presents the function value depending on the iteration number. The function value seems to stagnate at the end of the optimization.
1.4.8. Failure of fminsearch on the problem #26 "TRIG"
In the problem #26 "TRIG", the fminsearch function fails to compute an accurate value of X.
fminsearch: Iterations: 1657 F eval: 2307 Status X: F Status F: T
f(X)=0.0000280 (f*=0)
||G(X)||= 0.0001171 ||X-X*||=0.1849874 Rel. Err. on X: 3.0630751
|f(x)-f(x*)|=0.0000280
-->xopt
 xopt = 0.0551739 0.0568700 0.0587352 0.0610559 0.0636009 0.0668881 0.2081812 0.1642841 0.0851624 0.0915378
-->xstar
 xstar = 0.0429646 0.0439763 0.0450934 0.0463389 0.0477444 0.0493547 0.0512373 0.1952095 0.1649777 0.0601486
The gradient seems to be relatively small at the computed X.
We notice that Burmen and Tuma (2009) used from 1500 to 2500 function evaluations to reach the minimum with their various Nelder-Mead algorithms.
1.4.9. Failure of fminsearch on the problem #35 "CHEB"
In the problem #35 "CHEB", the fminsearch function fails to compute an accurate value of F.
fminsearch: Iterations: 811 F eval: 1151 Status X: T Status F: F
f(X)=0.0047824 (f*=0.0065039) Rel. Err. on F: 0.2646956
||G(X)||= 0.0083048
|f(x)-f(x*)|=0.0017216
The computed optimum X cannot be evaluated, since the expected minimum X is not known for this problem.
-->xopt
 xopt = 0.0741528 0.1699734 0.2852316 0.3582438 0.4691236 0.6144995 0.6172949 0.7991077 0.8448565 0.9668736
-->xstar
 xstar = []
The gradient seems to be relatively small at the computed X.
1.5. Numerical results for lsqrsolve
In this section, we present the results of the lsqrsolve function on the MGH benchmark. We performed two different tests:
- lsqrsolve with function values only (and Jacobian computed by lsqrsolve with finite differences),
- lsqrsolve with exact Jacobian.
1.5.1. Introduction
The lsqrsolve function solves nonlinear least squares problems with a Levenberg-Marquardt method. This function is dedicated to nonlinear least squares problems, so that it should perform very well on this type of test cases. The lsqrsolve function is connected to the Minpack routines:
- LMDIF, when the Jacobian is not provided (it is computed with finite differences),
- LMDER, when the Jacobian is provided by the user.
Notice that there are many different ways to implement the Levenberg-Marquardt method. The implementation in Minpack was created by Garbow, Hillstrom and Moré, the same authors as the MGH benchmark we are using here. Therefore, we expect that the solver should perform well on this benchmark.
Reference papers for the Minpack implementation are presented in:
The Levenberg-Marquardt algorithm: Implementation and theory, Lecture Notes in Mathematics, 1978, Volume 630/1978, 105-116, Jorge J. Moré
User Guide for MINPACK-1, J. J. Moré, B. S. Garbow, and K. E. Hillstrom, Argonne National Laboratory Report ANL-80-74, Argonne, Ill., 1980.
Some details are also given on the Minpack page of Wikipedia: http://en.wikipedia.org/wiki/MINPACK
Each table presents :
- n: the number of unknowns
- F*: the known function global minimum
- F(X) : the function value at computed solution
- norm(G(X)) : the norm of the gradient at computed solution
- Feval: the number of function evaluations
- Jeval: the number of Jacobian evaluations
- Status: the status of optimization. Status=T if the solver succeeded, and status=F if the solver failed.
We used the tolerances:
rtolF = 1.e-4; atolF = 1.e-10; rtolX = 1.e-4; atolX = 1.e-6;
The status of the optimization is T if:
- the convergence on F is achieved if
abs(fopt-fstar)< rtolF*abs(fstar) + atolF
- the convergence on X is achieved if
norm(xopt-xstar)< rtolX*norm(xstar) + atolX
The following table summarizes the results of the lsqrsolve function with or without the exact Jacobian matrix.
| Solver | Pass | Fail | Feval | Jeval |
|---|---|---|---|---|
| lsqrsolve without J | 30 | 5 | 6107 | - |
| lsqrsolve with J | 28 | 7 | 975 | 826 |
1.5.2. lsqrsolve without J
In this section, we present the results of lsqrsolve with function values only (and Jacobian computed by lsqrsolve with finite differences).
We used lsqrsolve with the following script:
objfun = list(uncprb_lsqrsolvefun,nprob,n);
[xopt, ropt,info]=lsqrsolve(x0,objfun,m);
where uncprb_lsqrsolvefun is a function which returns the function value. Hence, the Jacobian matrix is computed by lsqrsolve with order 1 finite differences.
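As an illustration, here is a minimal, self-contained sketch of this calling sequence on the Rosenbrock problem #1 "ROSE". The residual function rosef below is our own definition, not part of the benchmark module:

```scilab
// Rosenbrock residuals: f1 = 10*(x2 - x1^2), f2 = 1 - x1, so that
// F(x) = f1^2 + f2^2 has its minimum at x = [1; 1].
function f = rosef(x, m)
    f = [10*(x(2) - x(1)^2); 1 - x(1)];
endfunction
x0 = [-1.2; 1];
// m = 2 residuals; the Jacobian is approximated by finite differences.
[xopt, ropt, info] = lsqrsolve(x0, rosef, 2);
```

At the solution, xopt should be close to [1; 1] and norm(ropt) close to zero.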
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
|---|---|---|---|---|---|---|---|
| #1 | ROSE | 2 | 0 | 0 | 0 | 54 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0046737 | 30 | F |
| #3 | BADSCP | 2 | 0 | 0 | 0 | 53 | T |
| #4 | BADSCB | 2 | 0 | 1.940D-24 | 0.0000028 | 46 | T |
| #5 | BEALE | 2 | 0 | 0 | 0 | 23 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0595399 | 45 | T |
| #7 | HELIX | 3 | 0 | 1.235D-32 | 4.177D-15 | 35 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 5.369D-09 | 21 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 1.188D-12 | 12 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 6.4542802 | 473 | T |
| #11 | GULF | 3 | 0 | 1.477D-30 | 1.032D-14 | 77 | T |
| #12 | BOX | 3 | 0 | 2.465D-32 | 2.759D-16 | 25 | T |
| #13 | SING | 4 | 0 | 1.250D-40 | 1.140D-29 | 222 | T |
| #14 | WOOD | 4 | 0 | 7.8769659 | 0.0012210 | 35 | F |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000003 | 81 | F |
| #16 | BD | 4 | 85822.2 | 85822.695 | 57.256383 | 1001 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 2.210D-08 | 93 | T |
| #18 | BIGGS | 6 | 0 | 4.006D-31 | 2.324D-15 | 221 | T |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 0.0000036 | 159 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 0.0000307 | 70 | T |
| #21 | ROSEX | 20 | 0 | 0 | 0 | 342 | T |
| #22 | SINGX | 12 | 0 | 4.333D-46 | 5.717D-34 | 731 | T |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 3.930D-08 | 754 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 0.0000003 | 700 | T |
| #25 | VARDIM | 10 | 0 | 0 | 0 | 111 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0000002 | 188 | F |
| #27 | ALMOST | 10 | 0 | 1 | 5.684D-09 | 57 | F |
| #28 | BV | 8 | 0 | 2.420D-33 | 3.391D-16 | 37 | T |
| #29 | IE | 8 | 0 | 2.407D-34 | 3.264D-17 | 37 | T |
| #30 | TRID | 12 | 0 | 3.846D-30 | 2.295D-14 | 66 | T |
| #31 | BAND | 15 | 0 | 2.012D-30 | 2.157D-14 | 97 | T |
| #32 | LIN | 10 | 5 | 5 | 0.0000021 | 22 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 1.535D-08 | 22 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 6.974D-09 | 22 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 0.0000091 | 145 | T |
Time: 9 (s) Total=35 Passed=30 Failed=5 Total Feval:6107
1.5.2.1. Failure of lsqrsolve without J
The following problems fail with lsqrsolve.

| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status | Cause |
|---|---|---|---|---|---|---|---|---|
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0046737 | 30 | F | Local minimum (but not global) |
| #14 | WOOD | 4 | 0 | 7.8769659 | 0.0012210 | 35 | F | Default tolerance too large on norm(G) |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000003 | 81 | F | Default tolerance too large on norm(G) |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0000002 | 188 | F | Local minimum (but not global) |
| #27 | ALMOST | 10 | 0 | 1 | 5.684D-09 | 57 | F | Failure |
In the following list, we analyze the causes of the failures of lsqrsolve in these cases.
- Problem #2: the local minimum, associated with F = 48.984254, has been correctly found. This is a spurious failure: lsqrsolve works well here.
- Problem #14: the default tolerance on the gradient of lsqrsolve is too large, which causes lsqrsolve to stop too early in the iterations:
lsqrsolve: F eval: 35 G eval: 0 Status X: F Status F: F
f(X)=7.8769659 (f*=0)
||G(X)||= 0.0012210 ||X-X*||=2.7851541 Rel. Err. on X: 1.9694891
|f(x)-f(x*)|=7.8769659
- This is obvious from the value of the gradient, which is low at this point. Setting the stop variable to
stop = [1.d-8,1.d-8,1.d-6,1000,0,100];
[xopt, ropt,info]=lsqrsolve(x0,objfun,m,stop);
- that is, reducing the default value of the tolerance on the gradient from 1.e-5 to 1.e-6 makes lsqrsolve pass this test:
lsqrsolve: F eval: 326 G eval: 0 Status X: T Status F: T
f(X)=0 (f*=0) ||G(X)||= 0 ||X-X*||=0 Rel. Err. on X: 0 |f(x)-f(x*)|=0
- We can even further reduce the tolerance on the gradient:
stop = [1.d-8,1.d-8,sqrt(%eps),1000,0,100];
but this produces the same result. Remember that sqrt(%eps) corresponds to approximately half of the significant digits:
-->sqrt(%eps)
 ans = 1.490D-08
- Problem #15: the same issue occurs for the problem #15, which converges too early in the iterations.
lsqrsolve: F eval: 81 G eval: 0 Status X: F Status F: T
f(X)=0.0003075 (f*=0.0003075) Rel. Err. on F: 0.0000020
||G(X)||= 0.0000003 ||X-X*||=0.0000357 Rel. Err. on X: 0.0001659
|f(x)-f(x*)|=6.057D-10
The norm of the gradient is low, which indicates that lsqrsolve is close to an optimum, but has not reached it. We reduce the tolerance on the gradient to
stop = [1.d-8,1.d-8,sqrt(%eps),1000,0,100];
and get :
lsqrsolve: F eval: 82 G eval: 0 Status X: T Status F: T
f(X)=0.0003075 (f*=0.0003075) Rel. Err. on F: 0.0000020
||G(X)||= 0.0000002 ||X-X*||=0.0000214 Rel. Err. on X: 0.0001008
|f(x)-f(x*)|=6.046D-10
- We see that just 1 more iteration suffices to make the test pass.
- Problem #26: lsqrsolve seems to stop rather close to the minimum:
lsqrsolve: F eval: 189 G eval: 0 Status X: F Status F: F
f(X)=0.0000280 (f*=0)
||G(X)||= 7.442D-08 ||X-X*||=0.1849978 Rel. Err. on X: 3.062689
|f(x)-f(x*)|=0.0000280
- The gradient is small. The eigenvalues of the Hessian matrix are positive and the Hessian matrix is not ill-conditioned:
-->disp(spec(H))
 0.0626004 0.1212673 0.2688954 0.4904908 0.8265437 1.0504011 1.2715541 1.493572 1.7199117 2.5935519
-->cond(H)
 ans = 41.430287
We can see that lsqrsolve converges to the known local minimum: f=2.79506e-5 at x=[ 0.0551509; 0.0568408; 0.0587639; 0.0609906; 0.0636263; 0.0668432; 0.208162; 0.164363; 0.0850068; 0.0914314]. This is the same issue as in Problem #2.
- Problem #27: lsqrsolve seems to converge close to a minimum, since the norm of the gradient is low.
lsqrsolve: F eval: 57 G eval: 0 Status X: F Status F: F
f(X)=1 (f*=0)
||G(X)||= 5.193D-09 ||X-X*||=11.0113 Rel. Err. on X: 10.546913
|f(x)-f(x*)|=1
The parameters seem to be far from the optimum (x*=[1,1,...,1]):
-->xopt
 xopt = - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 - 0.0546913 11.546913
- Reducing the tolerances does not seem to help. We notice that the Hessian matrix is relatively ill-conditioned:
-->cond(Hfd)
 ans = 6.800D+09
- From the results presented in the next section, we see that lsqrsolve works much better when the exact Jacobian is provided. It may happen that the numerical derivatives based on finite differences are too approximate, but this claim should be supported by more numerical experiments.
In the problem #16 "BD", lsqrsolve has a weird behavior: the value of X and F are correct, but the maximum number of iterations has been reached. Moreover, the norm of G is large: norm(G(X))=57.25. The detailed report for this case is the following:
lsqrsolve: F eval: 266 G eval: 247 Status X: T Status F: T
f(X)=85822.206 (f*=85822.2) Rel. Err. on F: 6.537D-08
||G(X)||= 5.0316826
|f(x)-f(x*)|=0.0056098
In fact, the optimum X is not known for this case: only the minimum function value is known. The following session shows that, even if the gradient is large at optimum, it has been reduced by 5 orders of magnitude from x0 to xopt.
-->gopt = uncprb_getgrdfcn ( n , m , xopt , nprob )
 gopt = 3.1754682 - 3.9023233 - 0.0705361 - 0.0336054
-->g0 = uncprb_getgrdfcn ( n , m , x0 , nprob )
 g0 = 1149322.8 1779291.7 - 254579.59 - 173400.43
-->norm(gopt)/norm(g0)
 ans = 0.0000024
1.5.3. lsqrsolve with exact J
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Jeval | Status |
|---|---|---|---|---|---|---|---|---|
| #1 | ROSE | 2 | 0 | 0 | 0 | 21 | 16 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0046737 | 14 | 8 | F |
| #3 | BADSCP | 2 | 0 | 0 | 0 | 19 | 17 | T |
| #4 | BADSCB | 2 | 0 | 0 | 0 | 16 | 15 | T |
| #5 | BEALE | 2 | 0 | 4.930D-32 | 1.351D-15 | 9 | 7 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0595297 | 21 | 12 | T |
| #7 | HELIX | 3 | 0 | 6.013D-64 | 9.218D-31 | 10 | 9 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 6.072D-09 | 6 | 5 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 1.186D-12 | 3 | 3 | T |
| #10 | MEYER | 3 | 87.9458 | 87.945855 | 8.2633313 | 125 | 116 | T |
| #11 | GULF | 3 | 0 | 1.139D-30 | 5.263D-15 | 23 | 18 | T |
| #12 | BOX | 3 | 0 | 7.704D-32 | 6.083D-16 | 7 | 6 | T |
| #13 | SING | 4 | 0 | 8.324D-24 | 4.907D-17 | 22 | 22 | F |
| #14 | WOOD | 4 | 0 | 7.8769659 | 0.0012210 | 7 | 7 | F |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000003 | 17 | 16 | F |
| #16 | BD | 4 | 85822.2 | 85822.206 | 5.2185761 | 265 | 246 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 3.917D-08 | 18 | 15 | T |
| #18 | BIGGS | 6 | 0 | 8.567D-31 | 3.696D-15 | 40 | 29 | T |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 0.0000036 | 16 | 13 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 5.858D-11 | 7 | 7 | T |
| #21 | ROSEX | 20 | 0 | 0 | 0 | 21 | 16 | T |
| #22 | SINGX | 12 | 0 | 3.995D-22 | 6.800D-16 | 21 | 21 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 3.388D-08 | 79 | 64 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002937 | 0.0000003 | 80 | 62 | T |
| #25 | VARDIM | 10 | 0 | 0 | 0 | 11 | 10 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0000002 | 28 | 16 | F |
| #27 | ALMOST | 10 | 0 | 3.081D-30 | 3.830D-14 | 15 | 13 | F |
| #28 | BV | 8 | 0 | 3.537D-33 | 3.677D-16 | 5 | 4 | T |
| #29 | IE | 8 | 0 | 3.130D-33 | 1.158D-16 | 5 | 4 | T |
| #30 | TRID | 12 | 0 | 2.934D-30 | 2.853D-14 | 6 | 5 | T |
| #31 | BAND | 15 | 0 | 1.917D-30 | 2.053D-14 | 7 | 6 | T |
| #32 | LIN | 10 | 5 | 5 | 4.747D-15 | 2 | 2 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 1.535D-08 | 2 | 2 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 6.974D-09 | 2 | 2 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 0.0000092 | 25 | 12 | T |
Time: 3 (s) Total=35 Passed=28 Failed=7 Total Feval:975 Total Jeval:826
1.5.3.1. Failure of lsqrsolve with exact J
The following tests do not pass with lsqrsolve with exact J.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Jeval | Status | Cause |
|---|---|---|---|---|---|---|---|---|---|
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0046737 | 14 | 8 | F | Local minimum (but not global) |
| #13 | SING | 4 | 0 | 8.324D-24 | 4.907D-17 | 22 | 22 | F | Default tolerance too large on norm(G) |
| #14 | WOOD | 4 | 0 | 7.8769659 | 0.0012210 | 7 | 7 | F | Default tolerance too large on norm(G) |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 0.0000003 | 17 | 16 | F | Default tolerance too large on norm(G) |
| #22 | SINGX | 12 | 0 | 3.995D-22 | 6.800D-16 | 21 | 21 | F | Default tolerance too large on norm(G) |
| #26 | TRIG | 10 | 0 | 0.0000280 | 0.0000002 | 28 | 16 | F | Local minimum (but not global) |
| #27 | ALMOST | 10 | 0 | 3.081D-30 | 3.830D-14 | 15 | 13 | F | Failure on accuracy of X |
Most failures can be solved:
- Problems #2 and #26 are associated with a local minimum, which is correctly found by lsqrsolve.
- Problems #13, #14, #15, #22 are associated with a too large default tolerance on norm(G).
For the problem #27, lsqrsolve with exact J fails to achieve a good accuracy on X:
lsqrsolve: F eval: 15 G eval: 13 Status X: F Status F: T
f(X)=1.109D-31 (f*=0)
||G(X)||= 6.955D-15 ||X-X*||=0.2147539 Rel. Err. on X: 0.2056970
|f(x)-f(x*)|=1.109D-31
The exact X is [1,1,...,1], but xopt seems to be far from this :
-->xopt
 xopt = 0.9794303 0.9794303 0.9794303 0.9794303 0.9794303 0.9794303 0.9794303 0.9794303 0.9794303 1.205697
Reducing the tolerances does not yield a more accurate X. We see that the status of the optimization is
-->info
 info = 2.
According to the help page, this corresponds to "number of calls to fcn reached". This message is certainly wrong, since the function was called only 15 times. With our own Levenberg-Marquardt implementation, we were able to solve this problem with 23 function evaluations and 23 Jacobian evaluations.
1.6. Numerical results for leastsq
In this section, we present the results of the leastsq function on the MGH benchmark. We performed two different tests:
- leastsq with function values only (and Jacobian computed by leastsq with finite differences),
- leastsq with exact Jacobian.
1.6.1. Introduction
The leastsq function solves nonlinear least squares problems with a quasi-Newton method (BFGS), an L-BFGS method or a nonsmooth method. This function is dedicated to nonlinear least squares problems, so that it should perform very well on this type of test cases.
The leastsq function is connected to the optim function. When the exact Jacobian matrix is not provided by the user, leastsq calls the numdiff function to compute the approximate Jacobian matrix with finite differences of order 1 and an optimal step size.
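A minimal sketch of this use of leastsq, again on the Rosenbrock residuals. The function rosef below is our own illustrative definition, and the extra arguments packed in the list after the function are passed to it:

```scilab
// Rosenbrock residuals, minimum at x = [1; 1].
function f = rosef(x, m)
    f = [10*(x(2) - x(1)^2); 1 - x(1)];
endfunction
x0 = [-1.2; 1];
// No Jacobian is provided: leastsq relies on numdiff internally.
[fopt, xopt, gopt] = leastsq(list(rosef, 2), x0);
```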
Each table presents :
- n: the number of unknowns
- F*: the known function global minimum
- F(X) : the function value at computed solution
- norm(G(X)) : the norm of the gradient at computed solution
- Feval: the number of function evaluations
- Jeval: the number of Jacobian evaluations
- Status: the status of optimization. Status=T if the solver succeeded, and status=F if the solver failed.
We used the tolerances:
rtolF = 1.e-4; atolF = 1.e-10; rtolX = 1.e-4; atolX = 1.e-6;
The status of the optimization is T if:
- the convergence on F is achieved if
abs(fopt-fstar)< rtolF*abs(fstar) + atolF
- the convergence on X is achieved if
norm(xopt-xstar)< rtolX*norm(xstar) + atolX
The following table summarizes the results of the leastsq function with or without the exact Jacobian matrix.
| Solver | Pass | Fail | Feval | Jeval |
|---|---|---|---|---|
| leastsq without J | 27 | 8 | 19569 | - |
| leastsq with exact J | 27 | 8 | 2138 | 2138 |
1.6.2. leastsq without J
In this section, we present the results of the leastsq function when the Jacobian matrix is not provided.
More precisely, the calling sequence is the following.
objfun = list(uncprb_leastsqfun, m, nprob, n);
[fopt,xopt,gopt]=leastsq(objfun,x0);
In this case, leastsq calls the numdiff function to compute the approximate Jacobian matrix with finite differences of order 1 and an optimal step size.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
|---|---|---|---|---|---|---|---|
| #1 | ROSE | 2 | 0 | 0 | 0 | 184 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0000025 | 148 | F |
| #3 | BADSCP | 2 | 0 | 3.679D-08 | 3.1357641 | 400 | F |
| #4 | BADSCB | 2 | 0 | 0 | 0 | 96 | T |
| #5 | BEALE | 2 | 0 | 0 | 0 | 88 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000058 | 156 | T |
| #7 | HELIX | 3 | 0 | 4.455D-40 | 7.075D-20 | 170 | T |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 4.833D-09 | 310 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 7.320D-13 | 120 | T |
| #10 | MEYER | 3 | 87.9458 | 63312.18 | 44142117 | 500 | F |
| #11 | GULF | 3 | 0 | 9.763D-31 | 4.198D-16 | 305 | T |
| #12 | BOX | 3 | 0 | 1.233D-32 | 1.212D-16 | 190 | T |
| #13 | SING | 4 | 0 | 2.423D-25 | 2.643D-17 | 492 | T |
| #14 | WOOD | 4 | 0 | 8.522D-27 | 6.786D-13 | 600 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 8.891D-11 | 228 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0012600 | 438 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 2.507D-08 | 700 | T |
| #18 | BIGGS | 6 | 0 | 0.0000030 | 0.0019911 | 800 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 1.792D-08 | 923 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 7.422D-09 | 1100 | T |
| #21 | ROSEX | 20 | 0 | 0.0020960 | 0.1170615 | 2200 | F |
| #22 | SINGX | 12 | 0 | 3.827D-15 | 1.508D-09 | 1400 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 1.183D-11 | 1200 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002948 | 0.0003697 | 1200 | F |
| #25 | VARDIM | 10 | 0 | 1.060D-30 | 3.932D-14 | 384 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 5.881D-09 | 588 | F |
| #27 | ALMOST | 10 | 0 | 2.219D-31 | 7.682D-15 | 312 | T |
| #28 | BV | 8 | 0 | 3.865D-33 | 4.474D-16 | 240 | T |
| #29 | IE | 8 | 0 | 0 | 0 | 330 | T |
| #30 | TRID | 12 | 0 | 1.208D-30 | 1.572D-14 | 574 | T |
| #31 | BAND | 15 | 0 | 2.041D-30 | 2.082D-14 | 901 | T |
| #32 | LIN | 10 | 5 | 5 | 0.0000002 | 996 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 4.743D-11 | 84 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 2.313D-11 | 108 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 1.418D-09 | 1104 | T |
Time: 30 (s) Total number of problems:35 Number of successes:27 Number of failures:8 Total Feval:19569
1.6.2.1. Failure of leastsq without J
The following table presents the problems for which leastsq without J fails.
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Status |
|---|---|---|---|---|---|---|---|
| #2 | FROTH | 2 | 0 | 48.984254 | 0.0000025 | 148 | F |
| #3 | BADSCP | 2 | 0 | 3.679D-08 | 3.1357641 | 400 | F |
| #10 | MEYER | 3 | 87.9458 | 63312.18 | 44142117 | 500 | F |
| #18 | BIGGS | 6 | 0 | 0.0000030 | 0.0019911 | 800 | F |
| #21 | ROSEX | 20 | 0 | 0.0020960 | 0.1170615 | 2200 | F |
| #22 | SINGX | 12 | 0 | 3.827D-15 | 1.508D-09 | 1400 | F |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002948 | 0.0003697 | 1200 | F |
| #26 | TRIG | 10 | 0 | 0.0000280 | 5.881D-09 | 588 | F |
1.6.3. leastsq with exact J
In this section, we present the results of the leastsq function when the exact Jacobian matrix is provided.
More precisely, the calling sequence is the following.
objfun = list(uncprb_leastsqfun, m, nprob, n);
[fopt,xopt,gopt]=leastsq(objfun,uncprb_leastsqjac,x0);
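The same sketch with an exact Jacobian, using our own illustrative Rosenbrock residuals rosef and their Jacobian rosej:

```scilab
function f = rosef(x, m)
    f = [10*(x(2) - x(1)^2); 1 - x(1)];
endfunction
// Exact Jacobian of the residuals with respect to x.
function J = rosej(x, m)
    J = [-20*x(1), 10; -1, 0];
endfunction
x0 = [-1.2; 1];
[fopt, xopt, gopt] = leastsq(list(rosef, 2), rosej, x0);
```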
| Problem | Name | n | F* | F(X) | norm(G(X)) | Feval | Jeval | Status |
|---|---|---|---|---|---|---|---|---|
| #1 | ROSE | 2 | 0 | 0 | 0 | 46 | 46 | T |
| #2 | FROTH | 2 | 0 | 48.984254 | 6.851D-12 | 48 | 48 | F |
| #3 | BADSCP | 2 | 0 | 5.941D-08 | 9.3720597 | 100 | 100 | F |
| #4 | BADSCB | 2 | 0 | 0 | 0 | 25 | 25 | T |
| #5 | BEALE | 2 | 0 | 0 | 0 | 22 | 22 | T |
| #6 | JENSAM | 2 | 124.362 | 124.36218 | 0.0000027 | 50 | 50 | T |
| #7 | HELIX | 3 | 0 | 0.3799609 | 10.322509 | 100 | 100 | F |
| #8 | BARD | 3 | 0.0082149 | 0.0082149 | 7.432D-11 | 33 | 33 | T |
| #9 | GAUSS | 3 | 1.128D-08 | 1.128D-08 | 1.198D-16 | 13 | 13 | T |
| #10 | MEYER | 3 | 87.9458 | 60503.084 | 32259509 | 100 | 100 | F |
| #11 | GULF | 3 | 0 | 1.042D-30 | 4.052D-15 | 60 | 60 | T |
| #12 | BOX | 3 | 0 | 0 | 0 | 36 | 36 | T |
| #13 | SING | 4 | 0 | 1.514D-33 | 1.620D-23 | 100 | 100 | T |
| #14 | WOOD | 4 | 0 | 1.274D-21 | 4.366D-10 | 100 | 100 | T |
| #15 | KOWOSB | 4 | 0.0003075 | 0.0003075 | 3.938D-11 | 48 | 48 | T |
| #16 | BD | 4 | 85822.2 | 85822.202 | 0.0001132 | 100 | 100 | T |
| #17 | OSB1 | 5 | 0.0000546 | 0.0000546 | 0.0000001 | 100 | 100 | T |
| #18 | BIGGS | 6 | 0 | 0.0056556 | 3.184D-09 | 82 | 82 | F |
| #19 | OSB2 | 11 | 0.0401377 | 0.0401377 | 6.181D-09 | 100 | 100 | T |
| #20 | WATSON | 9 | 0.0000014 | 0.0000014 | 4.978D-12 | 100 | 100 | T |
| #21 | ROSEX | 20 | 0 | 3.977D-13 | 0.0000010 | 100 | 100 | T |
| #22 | SINGX | 12 | 0 | 3.796D-22 | 1.101D-13 | 100 | 100 | F |
| #23 | PEN1 | 10 | 0.0000709 | 0.0000709 | 1.826D-13 | 93 | 93 | T |
| #24 | PEN2 | 10 | 0.0002937 | 0.0002949 | 0.0012526 | 100 | 100 | F |
| #25 | VARDIM | 10 | 0 | 7.913D-30 | 3.966D-14 | 24 | 24 | T |
| #26 | TRIG | 10 | 0 | 0.0000280 | 3.114D-11 | 99 | 99 | F |
| #27 | ALMOST | 10 | 0 | 0 | 0 | 23 | 23 | T |
| #28 | BV | 8 | 0 | 3.865D-33 | 4.474D-16 | 24 | 24 | T |
| #29 | IE | 8 | 0 | 0 | 0 | 33 | 33 | T |
| #30 | TRID | 12 | 0 | 2.034D-30 | 1.996D-14 | 39 | 39 | T |
| #31 | BAND | 15 | 0 | 1.729D-30 | 1.929D-14 | 53 | 53 | T |
| #32 | LIN | 10 | 5 | 5 | 8.028D-15 | 8 | 8 | T |
| #33 | LIN1 | 10 | 3.3870968 | 3.3870968 | 8.221D-11 | 7 | 7 | T |
| #34 | LIN0 | 10 | 4.8888889 | 4.8888889 | 1.362D-11 | 5 | 5 | T |
| #35 | CHEB | 10 | 0.0065039 | 0.0065040 | 9.780D-10 | 67 | 67 | T |
Time: 6 (s) Total number of problems:35 Number of successes:27 Number of failures:8 Total Feval:2138 Total Jeval:2138
1.7. Conclusion
In this section, we analyze the performance of the various solvers we tested in this benchmark.
The following table presents the global performances of the solvers on the MGH benchmark.
Solver | Pass | Fail | Feval | Jeval | Time (s)
fminsearch | 25 | 10 | 50089 | - | 273
optim UNC "qn" | 31 | 4 | 3770 | - | 9
optim BND "qn" | 31 | 4 | 5011 | - | 13
optim UNC "gc" | 32 | 3 | 5556 | - | 15
optim BND "gc" | 25 | 10 | 17975 | - | 48
optim UNC "nd" | 15 | 25 | 4983 | - | 13
lsqrsolve without J | 30 | 5 | 6107 | - | 9
lsqrsolve with J | 28 | 7 | 975 | 826 | 3
leastsq without J | 27 | 8 | 19569 | - | 30
leastsq with exact J | 27 | 8 | 2138 | 2138 | 6
The previous experiments suggest the following conclusions with respect to the problems.
- The solvers can produce a local minimum which is not a global minimum. This is the expected behavior for the solvers that we tested, which are only local minimization solvers. For example, for the problem #2 FROTH, all solvers find the local minimum F*=48.984254 instead of the global minimum F*=0.
- However, where one solver finds only a local minimum, another solver may find the global minimum, or may not converge at all. For example, consider the problem #18 BIGGS: this problem is not solved by optim/UNC/"qn", while optim/BND/"qn" solves it and fminsearch finds a local minimum, but not the global one.
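Since all the solvers tested here are local, a simple remedy is to restart the optimization from several initial points and keep the best result. The following Scilab sketch illustrates this with optim on a made-up objective (the cost function and the number of restarts are illustrative, not taken from the benchmark); the cost function follows optim's calling sequence [f,g,ind]=costf(x,ind).

```scilab
// Hedged sketch: multi-start strategy with a local solver.
// The objective and gradient below are illustrative (Rosenbrock).
function [f, g, ind] = costf(x, ind)
    f = (x(1)-1)^2 + 100*(x(2)-x(1)^2)^2
    g = [2*(x(1)-1) - 400*x(1)*(x(2)-x(1)^2); 200*(x(2)-x(1)^2)]
endfunction
fbest = %inf;
for k = 1:5
    x0 = 10*(rand(2,1) - 0.5);     // random starting point in [-5,5]^2
    [f, x] = optim(costf, x0);     // local minimization from x0
    if f < fbest then
        fbest = f; xbest = x;
    end
end
```

This does not guarantee that the global minimum is found, but it would have detected, for example, that F*=48.984254 on the problem #2 FROTH is only a local minimum.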
The previous experiments suggest the following conclusions with respect to the solvers.
- The fminsearch function requires many more function evaluations than the other solvers to achieve the same accuracy, which leads to optimizations requiring more CPU time. The solver is relatively robust: it can solve most problems and, when it fails, the failure is usually indicated by the number of function evaluations reaching its maximum value. On the other hand, fminsearch can also fail to produce an accurate optimum X or F without signaling its non-convergence. This is the case, for example, for the problem #25 "VARDIM", for which fminsearch produces values of X and F which are not accurate, with a large gradient. In this case, fminsearch satisfies its termination rule and does not reach the maximum number of function evaluations or iterations, so it does not warn that the problem is unsolved.
- The optim solver is very robust when used with the "qn" algorithm, since it solves most problems and finds, at worst, a local minimum. Still, there are two problems (#7 "HELIX" and #18 "BIGGS") for which optim/UNC/"qn" does not produce a local minimum. On the problem #7 "HELIX", the large gradient (norm(G) is close to 10.37) is a strong indication that the problem is unsolved. On the other hand, on the problem #18 "BIGGS", the gradient is small and there is no clear indication that the problem is unsolved.
- The lsqrsolve function is very robust with or without the exact Jacobian, since it solves most problems. It may find local optima, as any other local solver; this was the case for lsqrsolve on the problems #2 FROTH and #26 TRIG. The lsqrsolve function may also produce an inaccurate value of X or F because the default tolerance on the norm of the gradient is set to a too large value (1.e-5). In this case, reducing the tolerance, for example down to 1.e-8, may allow the problem to be solved; this was the case for lsqrsolve on the problems #14 WOOD and #15 KOWOSB. We found that the problem #27 "ALMOST" produced a failure of lsqrsolve, where the solver returns a small gradient but an inaccurate X. We were not able to solve this problem with lsqrsolve, although other implementations of the same algorithm were able to solve it. Providing the exact Jacobian matrix reduces the number of function evaluations.
- The leastsq function was found to be relatively reliable, but required more function evaluations in these tests. As with lsqrsolve, providing the exact Jacobian matrix drastically reduces the number of function evaluations.
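The tolerance tuning described above for lsqrsolve can be sketched as follows. The stop vector ordering assumed here ([ftol, xtol, gtol, maxfev, epsfcn, factor]) follows the underlying MINPACK lmdif routine; it should be checked against the lsqrsolve help page, and the residual function is illustrative.

```scilab
// Hedged sketch: solving a small least squares problem with lsqrsolve and
// a tightened gradient tolerance. The stop vector ordering is assumed to
// follow MINPACK's lmdif: [ftol, xtol, gtol, maxfev, epsfcn, factor].
function y = fct(x, m)
    y = [10*(x(2)-x(1)^2); 1-x(1)]   // Rosenbrock residuals, for illustration
endfunction
x0 = [-1.2; 1];
m = 2;                                    // number of residuals
stop = [1e-8, 1e-8, 1e-8, 1000, 0, 100];  // gtol lowered from 1e-5 to 1e-8
[x, v, info] = lsqrsolve(x0, fct, m, stop);
```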
Overall, it seems that the lsqrsolve function is, in general, the best nonlinear least squares solver in Scilab. It is both fast and reliable, producing the solution of the problem most of the time. Moreover, in order to check the optimality of the output, we should check that the optimality conditions are satisfied, that is, that the gradient is zero and that the eigenvalues of the Hessian matrix (which may be computed with finite differences and the "derivative" function) are positive. Still, the lsqrsolve function may require tuning the tolerances, especially the tolerance on the norm of the gradient, for which the default value (1.e-5) may be too large in some cases. Another good choice is the leastsq function, although the optim/"qn" solver produces almost the same results. In many cases, the fminsearch function converges, but requires more function evaluations.
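The optimality check described above can be sketched in Scilab as follows; the objective, the candidate point and the reshaping of the Hessian output are illustrative assumptions, so the derivative help page should be consulted for the exact output format.

```scilab
// Hedged sketch of the optimality check: at a candidate optimum x, the
// gradient should be numerically zero and the eigenvalues of the Hessian
// should be positive. Both are approximated by finite differences with the
// derivative function; spec computes the eigenvalues.
function f = cost(x)
    f = (x(1)-1)^2 + 100*(x(2)-x(1)^2)^2   // illustrative objective
endfunction
x = [1; 1];                      // candidate optimum to be checked
[g, H] = derivative(cost, x);    // g: 1 x n gradient, H: 1 x n^2 by default
H = matrix(H, 2, 2);             // reshape the Hessian to n x n
if norm(g) < 1e-5 & min(spec(H)) > 0 then
    mprintf("The point satisfies the optimality conditions.\n");
end
```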
In a future benchmark, we may consider the nonlinear least squares problems of the Statistical Reference Datasets from NIST. This work may be based on the nistdataset toolbox:
http://atoms.scilab.org/toolboxes/nistdataset
This work was the opportunity to detect and analyze a number of problems in the Scilab functions. The following bugs were reported during this analysis:
- The leastsq function poorly documents the mem option: http://bugzilla.scilab.org/show_bug.cgi?id=9830
- The value ind=1 of optim is not supported by all algorithms: http://bugzilla.scilab.org/show_bug.cgi?id=9822
- fminsearch may produce a warning, but output.message is wrong: http://bugzilla.scilab.org/show_bug.cgi?id=9811
- optim can fail to converge but does not produce a warning: http://bugzilla.scilab.org/show_bug.cgi?id=9789
- neldermead may fail to converge without producing a warning: http://bugzilla.scilab.org/show_bug.cgi?id=9788
- The optim function does not report the status of the optimization: http://bugzilla.scilab.org/show_bug.cgi?id=9837
- The help of lsqrsolve is wrong for the info argument: http://bugzilla.scilab.org/show_bug.cgi?id=9657
- The Least Squares functions should be gathered into a help sub-section: http://bugzilla.scilab.org/show_bug.cgi?id=9305
- The default gtol parameter of lsqrsolve might be too large: http://bugzilla.scilab.org/show_bug.cgi?id=9840
This work was also the opportunity to analyze the consequences of several known bugs in Scilab, notably the following:
- The optim function reports neither niter nor nevalf: http://bugzilla.scilab.org/show_bug.cgi?id=9208
- The "numdiff" and "derivative" functions are duplicates: http://bugzilla.scilab.org/show_bug.cgi?id=4083
1.8. References
[1] "Algorithm 566: FORTRAN Subroutines for Testing Unconstrained Optimization Software", J. J. Moré, Burton S. Garbow, Kenneth E. Hillstrom, ACM Transactions on Mathematical Software (TOMS), Volume 7, Issue 1, March 1981, Pages 136-140
[2] "HESFCN - A Fortran Package of Hessian Subroutines for Testing Nonlinear Optimization Software", Victoria Averbukh, Samuel Figueroa, and Tamar Schlick, Courant Institute of Mathematical Sciences
[5] "Grid Restrained Nelder-Mead Algorithm", Arpad Burmen, Janez Puhan, and Tadej Tuma, 2006, Comput. Optim. Appl. 34, 3 (July 2006), 359-375
[6] "Unconstrained derivative-free optimization by successive approximation", Arpad Burmen, Tadej Tuma, Journal of Computational and Applied Mathematics, Volume 223, Issue 1, 1 January 2009, Pages 62-74