Friday, December 25, 2009

GAMS: very large listing file

I managed to create a ridiculously large listing file where the IDE has troubles displaying it. The size is 3,509,018,994 bytes, and more than 40 million lines. The GAMS IDE is quite smart in displaying large listing files: it does not store the whole file in memory. However in this case there is probably a problem in addressing file offsets. The IDE is not the only program that has problems with this. Also the tail command does not work:

C:\projects\xxx>dir lsterr.lst
Volume in drive C is HP
Volume Serial Number is 8AA3-C6D2

Directory of C:\projects\xxx

12/23/2009  06:50 PM     3,509,018,994 lsterr.lst
               1 File(s)  3,509,018,994 bytes
               0 Dir(s)  341,544,075,264 bytes free

C:\projects\xxx>tail lsterr.lst
tail: lsterr.lst: Invalid argument

In general I prefer to generate verbose listing files so I have a good chance I can find some relevant messages in case something goes wrong without rerunning the model. In this case the series of models just generated too much output. With option solprint=off; the listing file was small enough to be handled by the IDE without problems.

Tuesday, December 22, 2009

Presolve

Not every day I see something like:

--- Generating MIP model xxxxx
--- Untitled_18.gms(477) 2726 Mb
--- 11,865,559 rows 11,868,588 columns 25,453,085 non-zeroes
--- Untitled_18.gms(477) 2726 Mb

Starting Cplex...
Presolve has eliminated 10143846 rows and 10138198 columns...
Presolve has eliminated 11845896 rows and 11840248 columns...
Aggregator has done 1 substitutions...
Presolve has eliminated 11845896 rows and 11840314 columns...
Aggregator has done 18762 substitutions...
Tried aggregator 1 time.
LP Presolve eliminated 11846440 rows and 11846708 columns.
Aggregator did 18762 substitutions.
Reduced LP has 357 rows, 3118 columns, and 3726 nonzeros.

I.e. GAMS generates a huge model (it is actually an LP – all integers are relaxed) that is reduced to almost nothing by the presolver in Cplex. Turns out there are many accounting rows in the model. Nowadays these are better formulated outside the model with assignment statements instead of using model equations.

GAMS actually generates this model quite quickly and efficiently. Opposed to what many people think, the larger the LP model, the better the relative performance of a modeling system. They often can create the model in (hopefully) linear time (linear in the number of nonzero elements), while even the best LP solvers don’t have this linear time behavior.

Sunday, December 20, 2009

Remote desktop

This works very well for me at home: http://www.teamviewer.com

Fortran GDX support

Although there are a few Fortran support files in the \apifiles subdirectory of recent GAMS systems, the actual usage is not always completely straightforward as I found out by some experiences of a client. This is probably foremost a documentation issue. Here are my notes.

  1. The example example1.f90 in apifiles\examples is for Visual Fortran (some versions) and Microsoft C compiler. The example works ok with with my versions:
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86
    Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update C)

    Y:\doe\ftn\ftn>type compile.cmd
    cl -DAPIWRAP_LCASE_NODECOR -c gdxf9glu.c
    f90 -c gamsglobals_mod.f90
    f90 -c gdxf9def.f90
    f90 test.f90 gdxf9def.obj gdxf9glu.obj

    Y:\doe\ftn\ftn>compile

    Y:\doe\ftn\ftn>cl -DAPIWRAP_LCASE_NODECOR -c gdxf9glu.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86
    Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

    gdxf9glu.c

    Y:\doe\ftn\ftn>f90 -c gamsglobals_mod.f90
    Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update C)
    Copyright 2003 Compaq Computer Corp. All rights reserved.

    gamsglobals_mod.f90

    Y:\doe\ftn\ftn>f90 -c gdxf9def.f90
    Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update C)
    Copyright 2003 Compaq Computer Corp. All rights reserved.

    gdxf9def.f90

    Y:\doe\ftn\ftn>f90 test.f90 gdxf9def.obj gdxf9glu.obj
    Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update C)
    Copyright 2003 Compaq Computer Corp. All rights reserved.

    test.f90
    Microsoft (R) Incremental Linker Version 6.00.8447
    Copyright (C) Microsoft Corp 1992-1998. All rights reserved.

    /subsystem:console
    /entry:mainCRTStartup
    /debugtype:cv
    /pdb:none
    C:\DOCUME~1\HP_Owner\LOCALS~1\Temp\objEC.tmp
    gdxf9def.obj
    gdxf9glu.obj
    dfor.lib
    libc.lib
    dfconsol.lib
    dfport.lib
    kernel32.lib
    /out:test.exe

    Y:\doe\ftn\ftn>

  2. Visual fortran 6.0 does not work (no support for integer(kind=8); that could be implemented differently if needed).
  3. It would be easier to have some 100% fortran interfaces instead of only this Fortran+C thing. It would be simpler to use and could be a little bit faster (no extra indirection).
  4. There is some Lahey code (100% fortran code). More direct support for other compilers would be good.
  5. It can be made to work with Intel Fortran + MS Visual C by added libc.lib to the link step.
  6. It can be made to work with Mingw gcc+g95 as follows:

    Install MINGW from http://sourceforge.net/projects/mingw/files/Automated%20MinGW%20Installer/
    Install g95 from http://ftp.g95.org/g95-MinGW.exe

    Compile as:

    C:\projects\doe\ftn\ftn2>type compile.cmd
    gcc -DAPIWRAP_LCASE_NODECOR -c gdxf9glu.c
    g95 -c -fno-underscoring gamsglobals_mod.f90
    g95 -c -fno-underscoring -w gdxf9def.f90
    g95 -w -fno-underscoring test.f90 gdxf9def.o gdxf9glu.o -o test.exe -Xlinker --enable-stdcall-fixup

    C:\projects\doe\ftn\ftn2>compile

    C:\projects\doe\ftn\ftn2>gcc -DAPIWRAP_LCASE_NODECOR -c gdxf9glu.c

    C:\projects\doe\ftn\ftn2>g95 -c -fno-underscoring gamsglobals_mod.f90

    C:\projects\doe\ftn\ftn2>g95 -c -fno-underscoring -w gdxf9def.f90

    C:\projects\doe\ftn\ftn2>g95 -w -fno-underscoring test.f90 gdxf9def.o gdxf9glu.o -o test.exe -Xlinker --enable-stdcall-fixup

    C:\projects\doe\ftn\ftn2>test
    done

    C:\projects\doe\ftn\ftn2>




    The –fno_underscoring flag may cause problems with precompiled fortran libraries. Better support for this compiler would be good so we don’t need these compilation flags.


  7. The interface code could benefit from some comments.


  8. Some documentation about what compilers are supported and how would be good.


  9. May be some more explicit support+documentation for the (currently) leading fortran compilers. E.g. Visual Fortran is no longer available, so that is not a very good platform for the examples. Probably on Windows the following are the most popular compilers: Lahey/Fujitsu, Intel, g95, gfortran. I also suspect g77 is still used a lot.

LS solver 2.1

LS Solver version 2.1 is available from http://www.amsterdamoptimization.com/statistics.html. It adds calculating standardized and studentized residuals so ls.gdx. Docs are here: http://www.amsterdamoptimization.com/pdf/regression.pdf.

Friday, December 18, 2009

Nonlinear least squares help

Sometimes it can help to have different solvers to debug a non-linear least squares problem. The R function nls did not return a solution, but CONOPT clearly showed the regression model was not very good. An improved model is proposed.

Afterwards an even better model was contributed by another poster.

$ontext

Problem:

I was trying to estimate the weibull model using nls after putting OLS
values as the initial inputs to NLS.
I tried multiple times but still i m getting the same error of Error in
nlsModel(formula, mf, start, wts) :
 
singular gradient matrix at initial parameter estimates.

The Program is as below

> vel <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14)
> df <- data.frame(conc, vel)
> df
     
conc vel
1    0.077   1
2    0.328   2
3    0.882   3
4    1.195   4
5    1.884   5
6    3.577   6
7    6.549   7
8   13.000   8
9   33.690   9
10  52.220  10
11  90.140  11
12 166.050  12
13 233.620  13
14 346.890  14
> plot(df$vel, df$conc)
> para0.st <- c(K=450,
+       alpha=0.054,beta=3.398 )
>  fit0 <- nls(
+      conc~ K-(K*exp(-(vel/alpha)^beta)), df, start= para0.st,trace=T)
Error in nlsModel(formula, mf, start, wts) :
 
singular gradient matrix at initial parameter estimates




Indeed: this model does not seem not to be suited for the data. WITH CONOPT I can solve the
least squares problem but the results are:


----    102 VARIABLE r.L  residuals

i1  -67.787,    i2  -67.536,    i3  -66.982,    i4  -66.669,    i5  -65.980,    i6  -64.287,    i7  -61.315
i8  -54.864,    i9  -34.174,    i10 -15.644,    i11  22.276,    i12  98.186,    i13 165.756,    i14 279.026


----    102 VARIABLE sse.L                 =   150223.167  sum of squared errors
           
VARIABLE K.L                   =       67.864  to be estimated
           
VARIABLE alpha.L               =        0.054  to be estimated
           
VARIABLE beta.L                =        3.398  to be estimated


A better fit results with the model   conc = K + alpha*vel^beta

----    125 VARIABLE r.L  residuals

i1   0.833,    i2   1.073,    i3   1.541,    i4   1.503,    i5   1.174,    i6   0.469,    i7  -1.466,    i8  -4.083
i9   1.077,    i10 -5.453,    i11 -6.091,    i12 12.766,    i13 -1.379,    i14 -1.964


----    125 VARIABLE sse.L                 =      263.633  sum of squared errors
           
VARIABLE K.L                   =       -0.756  to be estimated
           
VARIABLE alpha.L               =  2.816336E-4  to be estimated
           
VARIABLE beta.L                =        5.317  to be estimated


Regression results for this model are:

        
Parameter      Estimate    Std. Error       t value      Pr(>|t|)
                
K  -7.55855E-01   1.88999E+00  -3.99926E-01   6.96867E-01
            
alpha   2.81634E-04   1.11730E-04   2.52066E+00   2.84420E-02 *
             
beta   5.31695E+00   1.51652E-01   3.50602E+01   1.22025E-12 ***


In R we can do now:

> vel <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14)
> conc <- c(0.077,0.328,0.882,1.195,1.884,3.577,6.549,13.000,33.690,52.220,90.140,166.050,233.620,346.890)
> df<-data.frame(conc,vel)
> nls(conc~K+alpha*vel^beta,df,start<-c(K=-0.7,alpha=0.0003,beta=5.3),trace=T)
349.1687 :  -0.7000  0.0003  5.3000
264.4474 :  -0.7507993362  0.0002808334  5.3172851705
263.6331 :  -0.7559514801  0.0002816438  5.3169303333
263.6331 :  -0.7558506044  0.0002816331  5.3169456595
Nonlinear regression model
 
model:  conc ~ K + alpha * vel^beta
  
data:  df
        
K      alpha       beta
-0.7558506  0.0002816  5.3169457
 
residual sum-of-squares: 263.6

Number of iterations to convergence: 3
Achieved convergence tolerance: 1.599e-06


An even better model was suggested: conc = alpha*exp[beta/vel]:

        
Parameter      Estimate    Std. Error       t value      Pr(>|t|)
            
alpha   3.57740E+04   4.18811E+03   8.54180E+00   1.91040E-06 ***
             
beta  -6.50221E+01   1.54097E+00  -4.21955E+01   2.03681E-14 ***

with sse: 249.6362

See: http://permalink.gmane.org/gmane.comp.lang.r.general/173730

$offtext

set i /i1*i14/;

table data(i,*)
      
conc vel
i1    0.077   1
i2    0.328   2
i3    0.882   3
i4    1.195   4
i5    1.884   5
i6    3.577   6
i7    6.549   7
i8   13.000   8
i9   33.690   9
i10  52.220  10
i11  90.140  11
i12 166.050  12
i13 233.620  13
i14 346.890  14
;

parameters conc(i),vel(i);
conc(i) = data(i,
'conc');
vel(i) = data(i,
'vel');

variable
   sse  
'sum of squared errors'
   r(i) 
'residuals'
   K    
'to be estimated'
   alpha
'to be estimated'
   beta 
'to be estimated'
;

equations
   calcsse
   nlfit(i)
'non-linear fit'
;

K.L=450;
alpha.L=0.054;
beta.L=3.398;


calcsse.. sse=e=
sum(i,sqr(r(i)));
nlfit(i).. conc(i) =e= K-(K*exp(-[vel(i)/alpha]**beta)) + r(i);


model m /calcsse,nlfit/;
option nlp=conopt;
solve m minimizing sse using nlp;
display r.L,SSE.L,K.L,alpha.L,beta.L;

*
* the next model shows a better fit
*

equation nlfit2(i) 'non-linear fit';

nlfit2(i).. conc(i) =e= K+alpha*[vel(i)**beta] + r(i);

model m2 /calcsse,nlfit2/;
solve m2 minimizing sse using nlp;
display r.L,SSE.L,K.L,alpha.L,beta.L;

option nlp=nls;
solve m2 minimizing sse using nlp;

parameter r1(i); r1(i) = r.l(i);
scalars a1,b1; a1=alpha.l; b1=beta.l;

*
* even better (suggested by http://permalink.gmane.org/gmane.comp.lang.r.general/173730)
*

equation nlfit3(i) 'non-linear fit';
nlfit3(i).. conc(i) =e= alpha*exp[beta/vel(i)] + r(i);

model m3 /calcsse,nlfit3/;
option nlp=conopt;
solve m3 minimizing sse using nlp;
display r.L,SSE.L,K.L,alpha.L,beta.L;

option nlp=nls;
solve m3 minimizing sse using nlp;

parameter r2(i); r2(i) = r.l(i);
scalars a2,b2; a2=alpha.l; b2=beta.l;

*
* plot results
*
file pltdat /exp.dat/;
loop(i,
put pltdat vel(i):17:5,conc(i):17:5,r1(i):17:5,r2(i):17:5/;
);
putclose;
file plt /exp.plt/;
put plt;
put "K=",K.l:0:16/;
put "a1=",a1:0:16/;
put "b1=",b1:0:16/;
put "f1(x)=K+a1*x**b1"/;
put "a2=",a2:0:16/;
put "b2=",b2:0:16/;
put "f2(x)=a2*exp(b2/x)"/;
putclose 'set term png size 600,800'/
        
'set output "exp.png"'/
        
'set key left'/
        
'set multiplot layout 2,1'/
        
'plot "exp.dat" using 1:2 title "data",f1(x) title "f1(x)=k+a*x**b",f2(x) title "f2(x)=a*exp(b/x)"'/
        
'plot "exp.dat" using 1:3 title "residuals f1(x)","exp.dat" using 1:4 title "residuals f2(x)"'/
        
'unset multiplot'/;

execute '=wgnuplot.exe exp.plt';
execute '=shellexecute exp.png';


Tuesday, December 15, 2009

Nice Mathematica intro

http://www.wolfram.com/broadcast/screencasts/numericalandsymbolicsolutions/

Two small examples, one being a tiny optimization problem. Nevertheless, a nice introduction, and I was impressed with the ease of adding GUI elements to the Mathematica interface.

Podcast about Solver Foundation

Nice intro for programmers.

http://www.hanselman.com/blog/HanselminutesPodcast191SolverFoundationAndOptimizationWithNathanBrixius.aspx

The intended audience: .Net programmers, so not very much emphasis on modeling.

Monday, December 14, 2009

GAMS/Mosek feedback

GAMS/Mosek seems to have some issues with a few (very non-critical) test problems: http://amsterdamoptimization.com/models/blog/mst.gms and http://amsterdamoptimization.com/models/blog/mst51.gms.

Some suggestions/observations:

  1. The listing file does not give a hint about why the run failed. I just see:

                   S O L V E      S U M M A R Y

         MODEL   m                   OBJECTIVE  z
         TYPE    RMIP                DIRECTION  MINIMIZE
         SOLVER  MOSEK               FROM LINE  134

    **** SOLVER STATUS     4 Terminated By Solver     
    **** MODEL STATUS      6 Intermediate Infeasible  
    **** OBJECTIVE VALUE                0.0000

    RESOURCE USAGE, LIMIT          5.053      1000.000
    ITERATION COUNT, LIMIT         0        100000

    MOSEK Link       Nov  1, 2009 23.3.2 WEX 13908.14598 WEI x86_64/MS Windows

    M O S E K       version 6.0.0.53 (Build date: 2009-11-11 07:16:11)
    Copyright (C)   MOSEK ApS, Fruebjergvej 3, Box 16
                     DK-2100 Copenhagen, Denmark
                    
    http://www.mosek.com

    It would be nice to see some error message about what happened.
  2. The last message on the screen is Return code - 1051 [MSK_RES_ERR_SPACE]. This message is not explained in the GAMS/Mosek documentation. I would prefer a message in readable English so I don't even have to check the manual. 
  3. I am guessing that the message has to do with running out of memory. However this is a 64 bit system and the models are not extremely large. So may be I stumbled on a bug here. I would guess that Mosek would normally be a top-contender on this kind of model.

PS. Later MOSEK versions solve these models ok (see comments).

PS 2: Unfortunately, I still see the same problems with:

               S O L V E      S U M M A R Y

     MODEL   m                   OBJECTIVE  z
     TYPE    RMIP                DIRECTION  MINIMIZE
     SOLVER  MOSEK               FROM LINE  133

**** SOLVER STATUS     4 Terminated By Solver     
**** MODEL STATUS      6 Intermediate Infeasible  
**** OBJECTIVE VALUE                0.0000

RESOURCE USAGE, LIMIT          5.272      1000.000
ITERATION COUNT, LIMIT         0        100000

MOSEK Link       Nov  1, 2009 23.3.3 WEX 14734.15043 WEI x86_64/MS Windows

M O S E K       version 6.0.0.55 (Build date: 2009-11-24 09:29:07)
Copyright (C)   MOSEK ApS, Fruebjergvej 3, Box 16
                 DK-2100 Copenhagen, Denmark
                
http://www.mosek.com

MSF Enterprise Pricing

> I want info on pricing of Microsoft Solver Foundation Enterprise Edition.


The enterprise version of Solver Foundation is offered exclusively through Gurobi Optimization. Please contact them directly at: info@gurobi.com or http://www.gurobi.com/.

Monday, December 7, 2009

Want some difficult MIP models

I need to convince my boss to approve budget for a commercial MIP solver compared to a free one. I just started to do some modeling with GAMS so my models are still small. You have some bigger ones I can use to show some hopefully significant performance improvement?

Here are some simple but difficult to solve MIP models: http://www.amsterdamoptimization.com/benchmarkmodels.html.

Saturday, December 5, 2009

LS Solver and single precision

I was looking to add some facilities to the LS solver. In the process I was shown how the solver was built for inclusion in the GAMS distribution. Turns out parts of it are incorrectly compiled using single precision. I just about fell off my chair with astonishment when I saw this. After recovering somewhat, I investigated a bit what the damage could be. Luckily the main linear algebra routines are compiled correctly, so the main results (estimates) are fine. The affected pieces involve the p-values and related statistics. Comparing the results with the correctly compiled version from http://www.amsterdamoptimization.com/statistics.html, the differences seem minimal. Of course at this point it is better to use this version instead of the version included in the GAMS distribution. Let’s hope this will be fixed soon.

The background is some older code from TOMS (Transaction on Mathematical Software), algorithm TOMS708 for calculating the incomplete Beta Integral. This code was originally developed for the CDC 6600 architecture, which stores single precision numbers in 60 bits. For current IEEE machines that means this should be compiled as double precision. Similar issues always popped up for code developed on Cray super computers (the old vector machines had 64 bit single precision numbers). Luckily all fortran compilers have command line switches to make it easy to compile such code with double precision.

I encountered this brochure for the Control Data Cyber 170-750: the machine I used to work on (mainly Pascal and Fortran; the machine was running an operating system called NOS/BE, see http://en.wikipedia.org/wiki/CDC_Cyber).