Yet Another Math Programming Consultant: 2008

Wednesday, December 31, 2008

GAMS/Cplex out-of-memory issue

This model runs out of memory in the branch-and-bound.

Elapsed real time = 15952.23 sec. (tree size = 1929.62 MB, solutions = 5)
Nodefile size = 1802.18 MB (732.60 MB after compression)
 2237100 2069567        2.0773    84        2.1636        2.0742 53354065    4.13%
 2237200 2069665        2.0742    86        2.1636        2.0742 53356034    4.13%
 2237300 2069763        2.0773    87        2.1636        2.0742 53358318    4.13%
 2237400 2069857        2.0758    84        2.1636        2.0742 53360030    4.13%
 2237500 2069955        2.0758    84        2.1636        2.0742 53362567    4.13%
 2237600 2070055        2.0742    86        2.1636        2.0742 53364576    4.13%
 
Warning: MIP starts not constructed because of out-of-memory status.
 
Consider using CPLEX node files to reduce memory usage.
CPLEX Error  1001: Out of memory.

It would be useful if this case would be handled more gracefully. Now it just stops and no solution is returned:

**** SOLVER STATUS     4 TERMINATED BY SOLVER
**** MODEL STATUS      14 NO SOLUTION RETURNED
**** OBJECTIVE VALUE                0.0000
 
 RESOURCE USAGE, LIMIT      15955.664     21600.000
 ITERATION COUNT, LIMIT         0         10000

It should be possible to allocate memory before starting the branch-and-bound so that always the best integer solution so far can be reported. Note also that the iteration count reported is wrong.

Monday, December 29, 2008

Another difficult MIP

Some MIP models are small but very difficult. An example is described here (this model has 252 binary variables). Now I am working on a very small scheduling application with only 217 binary variables. Cplex finds good solutions very quickly, but it is not able to close the gap in any significant way in 6 hours.

Some fragments of the log:


       Nodes                                         Cuts/
 Node  Left     Objective  IInf  Best Integer     Best Node    ItCnt     Gap
 
*   0+    0                           16.1636                      0     ---
    0     0        0.1000   195       16.1636        0.1000     1406   99.38%
*   0+    0                           14.1636        0.1000     1406   99.29%
    0     0        0.1000   189       14.1636      Fract: 5     2050   99.29%
*   0+    0                           13.1636        0.1000     2050   99.24%
    0     2        0.1000   189       13.1636        0.1000     2050   99.24%
*  10+    4                            3.1636        0.1000    14281   96.84%
*  40+   32                            2.1636        0.1000    42176   95.38%
  100    94        1.3142    63        2.1636        0.1000    86823   95.38%
  200   171        0.3685   127        2.1636        0.1000   129897   95.38%
  300   259        0.3646   109        2.1636        0.1000   187075   95.38%
  400   346        0.2071   106        2.1636        0.1000   228382   95.38%
  500   428        0.1500   118        2.1636        0.1000   266899   95.38%
  600   167        1.3995    65        2.1636        0.1000   331430   95.38%
  700   208        2.1535    46        2.1636        0.1000   378191   95.38%
  800   247        0.8702    74        2.1636        0.1000   443716   95.38%
  900   323        0.1286   135        2.1636        0.1000   482839   95.38%
 1000   409        1.4927    48        2.1636        0.1000   518560   95.38%
Elapsed real time = 399.27 sec. (tree size =  0.39 MB, solutions = 5)
.....
61100 52481        1.1364    98        2.1636        0.1358 30494218   93.72%
61200 52569        1.2453    56        2.1636        0.1359 30547758   93.72%
61300 52657        0.9057   121        2.1636        0.1360 30590947   93.71%
61400 52749        0.7052   114        2.1636        0.1361 30643445   93.71%
61500 52837        2.1447    58        2.1636        0.1364 30695673   93.70%
61600 52921        0.8114   105        2.1636        0.1364 30735655   93.70%
61700 53005        2.1333    89        2.1636        0.1364 30790214   93.70%
61800 53087        cutoff              2.1636        0.1364 30824849   93.70%
61900 53174        0.1500   118        2.1636        0.1364 30878206   93.70%
62000 53262        0.6538   111        2.1636        0.1364 30942477   93.70%
Elapsed real time = 21434.12 sec. (tree size = 51.02 MB, solutions = 7)
62100 53348        1.1516    75        2.1636        0.1364 31008777   93.70%
62200 53434        1.1409   113        2.1636        0.1364 31064891   93.70%
62300 53527        1.2003    90        2.1636        0.1364 31107576   93.70%
62400 53610        0.9534    91        2.1636        0.1364 31154949   93.70%
62500 53696        cutoff              2.1636        0.1364 31192217   93.70%
...
Resource limit exceeded.
  
MIP Solution:            2.163636    (31215703 iterations, 62517 nodes)
Final Solve:             2.163636    (0 iterations)
  
Best possible:           0.136364
Absolute gap:            2.027273
Relative gap:            0.936975

An integer solution with obj=2.1636 is quickly found. Two other integer solutions are found down the road, but they don't improve on this significantly. The best relaxation bound is moving very slowly, and as a result the closing of the gap is sluggish.

Wednesday, December 17, 2008

Mona Lisa assignment in GAMS

This is the assignment problem described in Donald E. Knuth, "The Stanford GraphBase: A Platform for Combinatorial Computing", ACM Press, 1993. The gray scale data was generated by lisa(360,250,255,0,360,0,250,0,22950000) and is used to color (small) Excel cells.

The GAMS model is as follows:

$ontext
  
   Assignment problem for lisa(360,250,255,0,360,0,250,0,22950000)
  
   Erwin Kalvelagen, March 2002
  
   Reference: Donald E. Knuth, "The Stanford GraphBase:
        A Platform for Combinatorial Computing", ACM Press, 1993.
  
  
$offtext
  
  
set i 'rows' /0*359/;
set j 'columns' /0*249/;
  
$if exist input.gdx $call 'rm -f input.gdx'
$if exist output.xls $call 'rm -f output.xls'
 
 
$call '=gdxxrw.exe i=input.xls par=c rng=Data!A1 rdim=1 cdim=1'
 
parameter c(i,j);
$gdxin input.gdx
$load c
  
equations
    objective 'maximize highlighted parts'
    column(j) 'column sum'
    row(i)    'row sum'
;
 
variables
    z         'objective variable'
    x(i,j)    'pixels'
;
 
binary variable x;
 
objective..      z =e= sum((i,j),c(i,j)*x(i,j));
row(i)..         sum(j, x(i,j)) =l= 1;
column(j)..      sum(i, x(i,j)) =e= 1;
 
model m /all/;
solve m using rmip maximizing z;
 
set solution(i,j);
solution(i,j)$(x.l(i,j)>0.5) = yes;
execute_unload 'output.gdx',solution;
 
execute '=gdxxrw.exe i=output.gdx o=output.xls set=solution rng=Solution!A1 rdim=2 cdim=0'

This model is stored inside the Excel spreadsheet. The Excel front-end will export the model and the input data to a temp directory. Then GAMS is called, and finally the resulting solution data is read. To make it possible to use gdxxrw inside the GAMS model, we use the following algorithm:

Write GAMS model to %TEMP%\MonaLisa.gms
Write Data sheet to %TEMP%\input.xls
Call gams to execute %TEMP%\MonaLisa.gms
Copy %TEMP%\output.xls to Solution sheet
Draw points in Solution sheet with different color

Note that GDXXRW cannot read/write directly to the spreadsheet that has VBA code running, neither can we make it shareable. This algorithm bypasses this problem by copying input and output data to external workbooks.

The model is a large but easy RMIP: 600 equations and 90,000 variables. The free LP solver CBC can solve this quickly: 0.6 seconds. Note: the model is too large to solve with MS Solver Foundation Express and Standard Editions (see: note). Using the Check button in the MS Solver Foundation Excel plug-in seems to indicate that the Excel binding is somewhat slow. In addition the plug-in does not provide a progress window (see second picture below) which is very useful for long running MIP models. Although the GAMS/EXCEL/VBA route requires more programming, for some classes of models this approach seems more flexible than the MS Solver Foundation Excel plug-in solution. Of course for ultimate flexibility at the cost of even more programming effort you can use the MS Solver Foundation .NET API's.

Coin OR's CBC performance

COIN OR's CBC solver is often showing excellent performance. Here is an example of a somewhat difficult MIP:

LP solve: 4 hours (as reported by user)
Cplex: 135 seconds
CBC: 183 seconds

(all using defaults).

If you have the funds to buy Cplex: running Cplex on 4 cores and more aggressive cut generation reduced the solution time to 33 seconds.

Sunday, December 14, 2008

Max Water Retaining Magic Squares

This is new to me: http://www.knechtmagicsquare.paulscomputing.com/topographical.html.

Is there more information on this? Can this formulated as a MIP or CSP model?

Thursday, December 11, 2008

MSF CSP: Domain Over Weighted Degree

The setting Domain Over Weighted Degree seems to be useful for a number of CSP models in the Microsoft Solver Foundation.

Examples:

GAMS $macro

GAMS 22.9 has some macro facilities. I don't completely understand it. The following model:

$macro a b
scalar a/1/;
display a;

gives:

   1  $macro a b
   2  scalar a/1/;
****         $195,409
**** 195  Symbol redefined with a different type
**** 409  Unrecognizable item - skip to find a new statement
****        looking for a ';' or a key word to get started again
   3  display b;
****          $140
**** 140  Unknown symbol

**** 3 ERROR(S)   0 WARNING(S)

Apparently it is skipping some substitutions. This is not like what I am used to in macro preprocessors for languages like C and Delphi.

I would expect this model to behave like:

scalar b/1/;
display b;

Tuesday, December 9, 2008

How to generate a PNG/GIF file with multiple plots?

Gnuplot allows to generate bitmaps containing multiple plots. See below. The code can be found in rabbitplot.gms. If you want to do more with Gnuplot you may want to keep an eye on this upcoming book.

MSFCli.exe

Some issues I raised wrt MSFCLi.exe will be fixed in the next release of MS Solver Foundation.

Wrapper around INVERT.EXE

Bruce McCarl built some wrapper around INVERT.

Rebuttal: I have used this tool for many years in many different projects without ever having the need to invert a square matrix over different sets. I would conjecture that the model is in all likelihood incorrectly organized if you need such functionality. Besides, it was my thought that if needed I can always use a simple mapping to overcome this limitation (again: I never had to do this). But of course it is an excellent idea to extend the functionality of INVERT using the proposed wrapper.

Solution of large NLP model with Mosek

This report shows some good results with Mosek on a very large convex NLP model.

Monday, December 8, 2008

LS advert in mccarl guide

3.4.3.3.27 LS

LS is a solver for estimating linear regression models in GAMS. It solves the normal equations (X'X)b=X'y to introduce numerical instability. It was originally developed by Erwin Kalvelagen is the original author [sic] and further information can be found in the solver manual or at the Amsterdam Optimization Modeling Group's web site.

This statement is not completely correct. I surely don't want to solve the normal equations to introduce numerical instability. Actually I don't form the normal equations at all (for good reasons). The algorithm is based on a stable QR decomposition (see section 5 of LS solver documentation). The code has been verified to solve some very numerically challenging problems. Some users have used LS to solve regression problems with hundreds of coefficients to estimate.

Sunday, December 7, 2008

How to express a normal distribution function in GAMS.

> How can I express a Normal distribution function (CDF)

> in GAMS?

errorf((X-mu)/sigma)

See also: http://amsterdamoptimization.com/pdf/specfun.pdf for more information.

Saturday, December 6, 2008

Redeclarations in GAMS

> Erwin:
> To my surprise GAMS allows
> variable x,x;
> Any reason for this?

GAMS allows redeclarations of symbols unless there is a contradiction. I.e. one often sees:

variable x(i),y,z;

positive variable x;

This is perfectly legal GAMS. Not allowed is:

positive variable x;

negative variable x;

Your example is allowed as there is no contradiction.

GAMS/GDX workshop at USDA/ERS

I have been busy these weeks. Taught a GAMS/GDX workshop at USDA/ERS:

Invited Talk at "Scheduling in the Process Industry", Bad Honnef

Urged by Josef Kallrath, I presented at the meeting Scheduling in the Process Industry this talk.

This was a great venue.

Wednesday, November 19, 2008

MS Solver Foundation: Excel/OML issue

I was sometimes loosing the model information when exiting Excel. Turns out that when you save the spreadsheet after closing it (Excel will prompt you if you want to save your changes) the OML model gets lost. (Somewhat a warped way of saving the workbook, but that is how my brain is wired.) If you manually save the model before closing Excel, the model will be saved without a problem. Microsoft could reproduce it and it is quickly fixed in the next version. (See issue report).

I had hoped to solve the assignment problem suggested by Knuth. It is supposed to find "interesting" spots on the Mona Lisa. Unfortunately, the standard edition of MS Solver Foundation is "throttled" too tightly to be able to solve this large but easy model. I am sure this is not a good business case to purchase the Enterprise version! (See limits).

Sunday, November 16, 2008

Powerpoint of GAMS presentations

Below are some of the powerpoint presentations on a GAMS workshop taught at Univ of Tennessee (the sheets of the presentation on some UTK application specific issues are not repreduced).

Friday, November 14, 2008

Microsoft Solver Foundation v1

The Express edition can be downloaded from
http://code.msdn.microsoft.com/solverfoundation/Release/ProjectReleases.aspx?ReleaseId=1799

Tuesday, November 11, 2008

Start with optimal solution as initial point

Large NLP's are difficult, not in the least because the chance increases that numerics become more unstable. Here is an example of such a problem with a square system (having 224k equations and variables):

solve m minimizing dummy using nlp;
x.fx(ivar,t) = x.l(ivar,t);
solve m minimizing dummy using nlp;

We fix the variables to their optimal values and solve again. This gives for the second solve:

               S O L V E      S U M M A R Y
 
     MODEL   m                   OBJECTIVE  dummy
     TYPE    NLP                 DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  554557
 
**** SOLVER STATUS     1 NORMAL COMPLETION         
**** MODEL STATUS      4 INFEASIBLE                
**** OBJECTIVE VALUE                0.0000

We get some good advice how to work around this by loosening some tolerance:

 ** An equation is inconsistent with other equations in the
    pre-triangular part of the model.
  
    Residual=           3.78931873E-08
    Tolerance (Rtnwtr)= 2.00000000E-08
  
    The pre-triangular feasibility tolerance may be relaxed with
    option:
  
    Rtnwtr=x.xx
  
    E_CRIMSONSTEOQEN(2017): Inconsistency in pre-triangular part of model.
 
    The solution order of the critical equations and
    variables is:
 
    All variables in equation E_CRIMSONSTEOQEN(2017) are now fixed
    and the equation is infeasible. Residual =-3.7893187255E-08

Monday, November 10, 2008

Sudoku: Constraint Programming vs MIP

The Sudoku puzzle can be coded easily using constraint programming thanks to built-in constructs that handle the all-different constraint. Here is an example using the Microsoft Solver Foundation CSP solver:

(click to enlarge)

The only difficulty was to model: for all i,j such that d[i,j]>0 require that x[i,j]=d[i,j]. I used the following formulation: Foreach[{i,I},{j,J},x[i,j]==d[i,j] | d[i,j]==0] where | is an "or" operator.

The complete model looks like:


Model[
 
  Parameters[Sets,I,J],
  Parameters[Integers,d[I,J]],
 
  Decisions[Integers[1,9],x[I,J]],
 
  Constraints[
 
    // Alternative 1: (does not work in the beta)
    //FilteredForeach[{i,I},{j,J},d[i,j]>0,x[i,j]==d[i,j]],
    // Alternative 2: 
    Foreach[{i,I},{j,J},x[i,j]==d[i,j] | d[i,j]==0],
 
    Foreach[{i,I},Unequal[Foreach[{j,J},x[i,j]]]],
    Foreach[{j,J},Unequal[Foreach[{i,I},x[i,j]]]],  
    Foreach[{ib,3},
        Foreach[{jb,3},
            Unequal[Foreach[{i,ib*3,ib*3+3},{j,jb*3,jb*3+3},x[i,j]]]
         ]
    ]
  ]
 
]

The above sudoku can also be coded in GAMS:

$ontext
  Sudoku solver
  September 2005
 
  Code written by:
     Paul van der Eijk
     GAMS Development Corporation
     1217 Potomac Street NW
     Washington DC, 20007
     Tel: (202) 342-0180 Fax: (202) 342-0181
     Email: paul@gams.com
     WWW: http://www.gams.com
 
  Based on Latin Square example in GAMS model library.
 
  Adapted to solve a Sudoku problem by:
     Sherman Robinson
     Economics Department
     University of Sussex
     Brighton BN1 9SN
     UK
     email: sherman.robinson@sussex.ac.uk
 
 
$offtext
 
sets
 r  rows    / r1 * r9/
 c  columns / c1 * c9/
 v  values  / v1 * v9/
 sc sub-col /sc1 * sc3/
 sr sub-row /sr1 * sr3/
 scs(sc, c) /sc1.(c1*c3), sc2.(c4*c6), sc3.(c7*c9)/
 srs(sr, r) /sr1.(r1*r3), sr2.(r4*r6), sr3.(r7*r9)/
;
 
table problem(r,c)
      c1  c2  c3  c4  c5  c6  c7  c8  c9
r1     5   3           7
r2     6           1   9   5
r3         9   8                   6
r4     8               6               3
r5     4           8       3           1
r6     7               2               6
r7         6                   2   8
r8                 4   1   9           5
r9                     8           7   9
;
display problem;
 
Parameter  value(v) "Values";
value(v) = ord(v) ;
 
binary variable x(r,c,v);
variable w;

equations
  eq1(r,c)        "exactly one value for each cell"
  eq2(c, v)       "columns entries have to be unique"
  eq3(r, v)       "row entries have to be unique"
  eq4(sr, sc, v)  "sub-squares have to be unique"
  nobj            "definition of objective - anything"
;
 
eq1(r, c)..   sum(v, x(r, c, v)) =E= 1;
eq2(c, v)..   sum(r, x(r, c, v)) =E= 1;
eq3(r, v)..   sum(c, x(r, c, v)) =E= 1;
eq4(sr, sc, v).. sum((c,r)$(scs(sc, c) and srs(sr, r)), x(r, c, v)) =E= 1;
 
nobj..      w =E= sum((r, c, v), x(r, c, v));
 
*SR Fix values in the problem
x.l(r,c,v)$(problem(r,c) = value(v)) = 1 ;
x.fx(r,c,v)$(x.l(r,c,v) = 1)         = 1 ;
 
option limrow=1000, limcol=0;
option mip=cplex;
model sudoku /eq1, eq2, eq3, eq4, nobj/;
solve sudoku minimizing w using mip;
 
parameter  solution(r,c)   "Display of results";
loop(v,
  solution(r, c)$(x.l(r, c, v)) = Ord(v);
);
option solution:0:1:1;
display solution;

I wrote an Excel based GUI for this model as an example, see http://www.amsterdamoptimization.com/packaging.html.

Friday, November 7, 2008

Delphi Usenet Groups

are no longer the place to be. Go instead to: https://forums.codegear.com/

The best one is of course Delphi Non-technical: https://forums.codegear.com/forum.jspa?forumID=67 (see how developers digest the new unicode strings or complain about the help functionality).

Thursday, November 6, 2008

Infeasible initial point: reporting

This model parses a listing file and produces a sorted list of the infeasibilities of the initial point. Note that you need to set OPTION LIMROW to a large number for this to work.

No need to set LIMCOL: all variables are always feasible.


$ontext
 
If a calibration is not correct, the benchmark solve
with LIMROW set to a large number may produce many
infeasibilities. This script will produce a sorted
list of these infeasibilities so it is easy to find the
largest ones.
 
Erwin Kalvelagen, nov 2005
Updated, oct 2007
 
$offtext
 
$set LISTING china.lst
$set UNSORTED unsorted.txt
$set SORTED sorted.txt
$set SCRIPT infes.awk
 
 
$onecho > %SCRIPT%
 
/^Equation\ Listing/ {
 eqlisting=1
}
 
/^.*\.\./ {
 if (eqlisting==1) {
    equation = $1
    gsub(/\./,"",equation)
 }
}
 
/INFES\ =\ [0-9A-Z.]*\ \*\*\*/ {
 if (eqlisting==0) {
   next
 }
 
 match($0,"(INFES = )([0-9A-Z.]*)",arr)
 print equation " " arr[2]
 print equation " " arr[2] >> "%UNSORTED%"
}
 
 
/^Model Statistics/ {
 exit
}
 
 
$offecho
 
execute '=rm -f %UNSORTED%';
execute '=rm -f %SORTED%';
execute '=awk -f %SCRIPT% %LISTING%';
execute '=gsort -n -k 2 -r -o %SORTED% %UNSORTED%';
execute 'echo inspect %SORTED%';

Wednesday, November 5, 2008

Microsoft Solver Foundation: OML

I have now been playing a bit with the Microsoft Modeling Language. It is really amazingly simple, and the integration with Excel is very interesting.

Of course we miss a number of things like data manipulation, loops, checks etc. Also there are still a few bugs, but this is a beta. Some syntax is somewhat surprising. E.g. {i,1,4} means i=1,2,3. It is better to make all arrays zero based. In that case one can use e.g. Sum[{i,N},x[i]] as the default lower limit in a sum is zero (i.e. {i,N} is the same as {i,0,N}, or i=0,..,N-1).

The next screen shot illustrates all these points.

With a little UI:

Saturday, November 1, 2008

Office 2007: Saveas pdf

There is an Office add-in that allows to save documents as PDF files:

http://www.microsoft.com/downloads/details.aspx?FamilyId=4D951911-3E7E-4AE6-B059-A2E79ED87041&displaylang=en

Another tool is CutePDF from http://www.cutepdf.com

The Microsoft utility gives better pdf rendering in documents where I have embedded screenshots.

Friday, October 31, 2008

Gdxls: call only if needed

Here is a trick to call gdxls from GAMS only if the source file input.xls is newer than the destination file input.gdx:

$call test input.xls -nt input.gdx && gdxls -propertyfile properties.txt

This will cause that gdxls will only be called if input.xls has been changed since the last time this conversion was performed.

Thursday, October 30, 2008

Lp2gams update

Added semi-continuous variables so we can translate lp-solve models with such variables.

See: http://yetanothermathprogrammingconsultant.blogspot.com/2008/06/lpsolve-to-gams-conversion.html

Have not checked if fixed semi-continuous variables work identical at both sides. There is confusion about the meaning of this.

GAMS obscure error message

When we read using GAMS 22.7 a GAMS restart file produced by GAMS 22.8 we get a nasty error message. Looks like the versioning mechanism on workfiles does not completely work correctly here:

C:\Program Files\GAMS22.7>gams x r=x
--- Job x Start 10/30/08 16:56:41
*** Workfile synchronization problem in section BAS-END
*** Generated under GAMS Ver = WEX228-228
*** Processed under GAMS Ver = WEX227-227
*** Status: Terminated due to systems error in WF Section error: Expected = BAS-END, Found =
***         Inspect listing file for more information
***
*** GAMS Base Module May  1, 2008 22.7.2 WEX 4648.4799 WEI x86_64/MS Windows
***
*** GAMS Development Corporation
*** 1217 Potomac Street, NW
*** Washington, DC 20007, USA
*** 202-342-0180, 202-342-0181 fax
*** support@gams.com, www.gams.com
***
*** opentext   W=0 FN="225a\gamsnext.cmd"
--- Job x.gms Stop 10/30/08 16:56:41 elapsed 0:00:00.020

***
*** Gams Exit Msg = WF Section error: Expected = BAS-END, Found =
***

C:\Program Files\GAMS22.7>

I was stumped when a client of mine reported this error. I expected an error message about version mismatch in such a case.

GAMS Function and Derivative Evaluation

I have two different formulations of a linearly constrained NLP (i.e. the objective is nonlinear). The difference is how we formulate the objective. In the next fragments x is a variable and c and e are parameters.

Original Version

cost1 ..        z  =e=  sum((i,j,k), x(i,j,k)*log(x(i,j,k)+e)) -
                     sum((i,j,k), x(i,j,k)*log(c(i,j)+e));
 
Total CPU secs in IPOPT (w/o function evaluations)   =      6.033
Total CPU secs in NLP function evaluations           =     33.971

Fast Version


cost2 ..        z  =e=  sum((i,j,k), x(i,j,k)*log((x(i,j,k)+e)/(c(i,j)+e)));
 
Total CPU secs in IPOPT (w/o function evaluations)   =      5.905
Total CPU secs in NLP function evaluations           =      0.541

There is probably something wrong how GAMS deals with the derivatives for the first version. I passed this on to GAMS so they can investigate.

The real model is much larger (> half a million variables), and then this difference becomes really significant. One version can solve the model quickly to near optimality (with Mosek) and the other version is not able to finish in reasonable time.

Langford's problem

The problem is to find integer sequences with special characteristics. See http://www.lclark.edu/~miller/langford.html. An example of a sequence is

I.e. we have 4 sets of 2 numbers 1,2,3,4 where between number k we have k other numbers.

For a MIP formulation see: http://www.amsterdamoptimization.com/pdf/langford.pdf

With Microsoft Solver Foundation it is easy to create a constraint programming version that solves much quicker and in addition it is easy to add a simple user interface.

Note that there are improved CP formulations. See e.g. http://4c.ucc.ie/~bms/Papers/ResearchReports/2000_18.ps

Tuesday, October 28, 2008

Grooml modeling language

This looks interesting for linear models:

Monday, October 27, 2008

gdxls: write xls files on Linux

The gdxls tool is a Java based alternative for GDXXRW on Linux and other non-Windows operating systems.

Besides converting xls→gdx (as described in http://yetanothermathprogrammingconsultant.blogspot.com/2008/10/gdxls-read-xls-files-on-linux.html) we can also convert gdx→xls.

As simple example model is shown below:

$onecho > trnsport2.properties
o=trnsport2.xls
i=trnsport2.gdx
  
parameter=c
rng=a1
rdim=1
cdim=1
 
parameter.2=c
rng.2=f1
rdim.2=2
cdim.2=0
  
parameter.3=c
rng.3=sheet1!A10
rdim.3=0
cdim.3=2

$offecho
 
$call gamslib trnsport
$call gams trnsport lo=2 gdx=trnsport2
  
$call ./gdxls -propertyfile trnsport2.properties
 
$call oocalc trnsport2.xls

We run the trnsport model from the Model Library and save the results in a gdx file. We then extract the c(i,j) parameter from the gdx file and write it to the spreadsheet in different formats.

Note that we use a numbered suffix in the properties file to indicate which options belong to the same item. No suffix is identical to suffix ".1". So we have three items to process and they are numbered 1,2,3.

The result rendered by Open Office Calc looks like the picture shown below (click to enlarge).

Saturday, October 25, 2008

gdxls: read xls files on Linux

The GAMS tool GDXXRW is not available under Linux, Unix and OS X. As I needed to have this functionality in some projects, I developed a Java based alternative. Technobabble: The tool is using GCJ,GCC and CNI so it should be easy to port to any platform where the GNU compilers GCC and GCJ are available.

The tool gdxls can read xls files on Linux (and other Unix-based operating systems) to generate gdx files. Similar to GDXXRW it works as described below.

Consider the following spreadsheet (it can be an xls file from Excel or as shown from Open Office):

The GAMS model to import this file via a gdx file is:

$onecho > trnsport.properties
i=trnsport.xls
o=trnsport.gdx
 
parameter=c
rng=b4
rdim=1
cdim=1
$offecho
 
$call gdxls -propertyfile trnsport.properties
 
sets
  i 'canning plants' / seattle, san-diego /
  j 'markets'        / new-york, chicago, topeka /
;
 
parameter c(i,j);
$gdxin trnsport.gdx
$load c
 
display c;

The log will look like:

erwin@erwin-desktop:~/workspace/gdxls$ gams test1.gms
--- Job test1.gms Start 10/25/08 09:42:14
GAMS Rev 228  Copyright (C) 1987-2008 GAMS Development. All rights reserved
Licensee: GAMS Development Corporation, Washington, DC   G871201/0000CA-ANY
         Free Demo,  202-342-0180,  sales@gams.com,  www.gams.com   DC0000
--- Starting compilation
--- test1.gms(12) 2 Mb
--- call gdxls -propertyfile trnsport.properties
GDXLS V 0.1, Amsterdam Optimization (c) 2008
xls2gdx,input=trnsport.xls,output=trnsport.gdx
Parameter;name=c;range=B4:E6;rdim=1;cdim=1
xls2gdx done
--- test1.gms(20) 3 Mb
--- GDXin=/home/erwin/workspace/gdxls/trnsport.gdx
--- test1.gms(23) 3 Mb
--- Starting execution: elapsed 0:00:00.186
--- test1.gms(23) 4 Mb
*** Status: Normal completion
--- Job test1.gms Stop 10/25/08 09:42:14 elapsed 0:00:00.188
erwin@erwin-desktop:~/workspace/gdxls$

The result is:

----     23 PARAMETER c  gdxls range:B4:E6
 
            new-york     chicago      topeka
 
seattle         2.500       1.700       1.800
san-diego       2.500       1.800       1.400

Another posting will describe gdx to xls conversion.

Thursday, October 23, 2008

Magic squares in GAMS

In http://yetanothermathprogrammingconsultant.blogspot.com/2008/10/magic-squares-with-ms-csp-solver.html we describe a CSP formulation using the Microsoft Solver Foundation tools. Here are some GAMS formulations using a MIP.

Instead of x(i,j) we use a single index: x(i). That makes it easier to model the "all-different" constraints, at the expense of making the row, column and diagonal sums more difficult. The latter complication can be alleviated by using a mapping set that maps between the two representations.

This version uses a permutation matrix paradigm to model the all-different constraint:

$ontext
Magic Squares
 
Take consecutive numbers from 1 to n^2 and
arrange them in a n x n array such that
every row, every column and the two diagonals
all add up to the same total.
 
See also: http://forum.swarthmore.edu/alejandre/magic.square.html
 
Erwin Kalvelagen, nov 1999
$offtext
  
set k "board size" /k1*k5/;
alias (k,kk);
set i "numbering of cells" /i1*i25/;
alias(i,j);
 
abort$(sqr(card(k)) <> card(i)) "card(i) should be card(k)^k";
 
$ontext
  cells are numbered:
 
      1  2  3  4
      5  6  7  8
      9 10 11 12
     13 14 15 16
 
$offtext
 
 
 
variable x(i)          "value of board cell i";
binary variable y(i,j) "permutation matrix";
variable s             "sum";
 
 
parameter p(i);
p(i)=ord(i);
 
 
equations
   unique(i) "make sure each x(i) is unique"
   yrow(i)   "row sum of y"
   ycol(j)   "col sum of y"
   row(k)    "row of board"
   col(k)    "col of board"
   diag1     "first diagonal"
   diag2     "second diagonal"
;
 
 
 
set map(k,kk,i);
scalar cnt /0/;
loop((k,kk),
    cnt = cnt + 1;
    map(k,kk,i)$(ord(i)=cnt)=yes;
);
display map;
 
set revdiag(i);
loop(map(k,kk,i)$((ord(k)+ord(kk))=card(k)+1),
    revdiag(i) = yes;
);
 
display revdiag;
 
unique(i)..x(i)=e=sum(j,y(i,j)*p(j));
yrow(i)..  sum(j,y(i,j))=e=1;
ycol(j)..  sum(i,y(i,j))=e=1;
 
row(k).. sum(map(k,kk,i), x(i)) =e= s;
col(k).. sum(map(kk,k,i), x(i)) =e= s;
diag1..  sum(map(k,k,i), x(i)) =e= s;
diag2..  sum(revdiag, x(revdiag)) =e= s;
 
 
equations
   extra1    "extra constraint for performance"
   extra2    "another redundant constraint for performance"
 ;
 
* additional constraints below
scalar n  "size of set i";
n = card(i);
extra1.. sum(i,x(i)) =e= n*(n+1)/2;
extra2.. sum((i,j),y(i,j)) =e= n;
 
* fix s with its known value
scalar magicsum;
magicsum = card(k)*(n+1)/2;
s.fx = magicsum;
 
 
model m1 /all/;
solve m1 using mip minimizing s;
 
display x.l;
 
parameter board(k,kk);
board(k,kk) = sum(map(k,kk,i), x.l(i));
display board;

This version is using or-constraints to model the all-different condition.

$ontext
Magic Squares
 
Take consecutive numbers from 1 to n^2 and
arrange them in a n x n array such that
every row, every column and the two diagonals
all add up to the same total.
 
See also: http://forum.swarthmore.edu/alejandre/magic.square.html
 
Erwin Kalvelagen, nov 1999
$offtext
 
set k "board size" /k1*k5/;
alias (k,kk);
set i "numbering of cells" /i1*i25/;
alias(i,j);
 
abort$(sqr(card(k)) <> card(i)) "card(i) should be card(k)^k";
 
$ontext
  cells are numbered:
 
      1  2  3  4
      5  6  7  8
      9 10 11 12
     13 14 15 16
 
$offtext
 
 
 
variable x(i)          "value of board cell i";
binary variable y(i,j) "permutation matrix";
variable s             "sum";
 
 
parameter p(i);
p(i)=ord(i);
 
 
equations
   unique(i) "make sure each x(i) is unique"
   yrow(i)   "row sum of y"
   ycol(j)   "col sum of y"
   row(k)    "row of board"
   col(k)    "col of board"
   diag1     "first diagonal"
   diag2     "second diagonal"
;
 
 
 
set map(k,kk,i);
scalar cnt /0/;
loop((k,kk),
    cnt = cnt + 1;
    map(k,kk,i)$(ord(i)=cnt)=yes;
);
display map;
 
set revdiag(i);
loop(map(k,kk,i)$((ord(k)+ord(kk))=card(k)+1),
    revdiag(i) = yes;
);
 
display revdiag;
 
unique(i)..x(i)=e=sum(j,y(i,j)*p(j));
yrow(i)..  sum(j,y(i,j))=e=1;
ycol(j)..  sum(i,y(i,j))=e=1;
 
row(k).. sum(map(k,kk,i), x(i)) =e= s;
col(k).. sum(map(kk,k,i), x(i)) =e= s;
diag1..  sum(map(k,k,i), x(i)) =e= s;
diag2..  sum(revdiag, x(revdiag)) =e= s;
 
 
equations
   extra1    "extra constraint for performance"
   extra2    "another redundant constraint for performance"
 ;
 
* additional constraints below
scalar n  "size of set i";
n = card(i);
extra1.. sum(i,x(i)) =e= n*(n+1)/2;
extra2.. sum((i,j),y(i,j)) =e= n;
 
* fix s with its known value
scalar magicsum;
magicsum = card(k)*(n+1)/2;
s.fx = magicsum;
 
 
model m1 /all/;
solve m1 using mip minimizing s;
 
display x.l;
 
parameter board(k,kk);
board(k,kk) = sum(map(k,kk,i), x.l(i));
display board;

The mapping set looks like:

----     60 SET map  
 
               i1          i2          i3          i4          i5          i6          i7          i8          i9
 
k1.k1         YES
k1.k2                     YES
k1.k3                                 YES
k1.k4                                             YES
k1.k5                                                         YES
k2.k1                                                                     YES
k2.k2                                                                                 YES
k2.k3                                                                                             YES
k2.k4                                                                                                         YES

    +         i10         i11         i12         i13         i14         i15         i16         i17         i18

k2.k5         YES
k3.k1                     YES
k3.k2                                 YES
k3.k3                                             YES
k3.k4                                                         YES
k3.k5                                                                     YES
k4.k1                                                                                 YES
k4.k2                                                                                             YES
k4.k3                                                                                                         YES
 
    +         i19         i20         i21         i22         i23         i24         i25
 
k4.k4         YES
k4.k5                     YES
k5.k1                                 YES
k5.k2                                             YES
k5.k3                                                         YES
k5.k4                                                                     YES
k5.k5                                                                                 YES

The solution is displayed as:

----     99 VARIABLE x.L  value of board cell i
 
i1   6.000,    i2  22.000,    i3   7.000,    i4  13.000,    i5  17.000,    i6  20.000,    i7  24.000,    i8   2.000
i9  18.000,    i10  1.000,    i11 19.000,    i12 12.000,    i13 16.000,    i14  8.000,    i15 10.000,    i16  9.000
i17  3.000,    i18 25.000,    i19  5.000,    i20 23.000,    i21 11.000,    i22  4.000,    i23 15.000,    i24 21.000
i25 14.000
 
 
----    103 PARAMETER board  
 
            k1          k2          k3          k4          k5
 
k1       6.000      22.000       7.000      13.000      17.000
k2      20.000      24.000       2.000      18.000       1.000
k3      19.000      12.000      16.000       8.000      10.000
k4       9.000       3.000      25.000       5.000      23.000
k5      11.000       4.000      15.000      21.000      14.000

The N=5 problem is already very difficult to solve.

Reading Excel XLS files on Linux

Using the code from http://poi.apache.org/ I can now read and process Excel XLS files on Linux (and Mac).

Note: this code implements the Compound Document Format, so does not depend on Windows.

Sunday, October 19, 2008

Magic Squares with MS CSP solver

This model generates Magic Squares (http://en.wikipedia.org/wiki/Magic_square) using the Microsoft Solver Foundation Excel tool.

As we have no objective, this can be formulated as a constraint programming problem. This makes it easy to use the alldifferent constraint (using the unequal construct in OML).

This remains a difficult problem: only quick solutions up to N=5.

Update: I tightened the symmetry breaking constraints and used a different CP heuristic (Domain over Weighted Degree). Choosing the correct heuristic was not done by deep insight but rather trial-and-error. This allowed me to solve even up to N=10 quickly.

It will be difficult to get good performance from a MIP solver on this. The MIP formulations for the all-different constraint are very demanding for a MIP solver. See e.g. H.P.Williams, Hong Yan, "Representations of the all-different Predicate of Constraint Satisfaction in Integer Programming," INFORMS Journal on Computing, Vol. 13 (2001) 96-103,
http://www.lse.ac.uk/collections/operationalResearch/pdf/PWilliams_2001_IJOC.pdf.

Thursday, October 16, 2008

C#: sending email through gmail

I am a big fan of gmail. So this may be useful:

http://code.msdn.microsoft.com/CSharpGmail

Java: CNI vs JNI

Old discussion:

http://gcc.gnu.org/ml/java/2004-03/msg00110.html

Short summary article:

http://en.wikipedia.org/wiki/Compiled_Native_Interface

For one of my current projects CNI looks quite attractive.

Update: used GCJ/CNI to implement GDXLS (see: http://yetanothermathprogrammingconsultant.blogspot.com/2008/10/gdxls-read-xls-files-on-linux.html and http://yetanothermathprogrammingconsultant.blogspot.com/2008/10/gdxls-write-xls-files-on-linux.html). Much more fun than JNI!

Some links with background info:
http://per.bothner.com/papers/UsenixJVM01/CNI01.pdf
http://people.redhat.com/lockhart/.gcc2004/MasterGCC-2side.pdf#page=169

Monday, October 13, 2008

More solvers on the way

After Microsoft entering the Math Programming Software market, there also will be more solver competition at the high-end: http://www.gurobi.com/index.html. These guys were main developers of Cplex, so I would guess they aim towards a similarly positioned product. This is of course a very welcome development for us modelers.

Convex Linearly Constrained Model

I was given a very large convex linearly constrained model (exceeding half a million variables). The "standard" NLP solvers (Conopt,Minos,Snopt) were very slow on this model, not even being close to optimality after hours of working. Mosek solved the model very fast, but not to optimality: Solution status : NEAR_OPTIMAL.

Unfortunately this near optimal point could not be improved quickly by other solvers. In the end we solved a few QP approximations of the form:

min g(x)=f(x⁰) + ∇ f(x⁰) (x-x⁰) + 0.5 (x-x⁰)^T ∇ ² f(x⁰) (x-x⁰)
s.t. Ax=b

Here x⁰ is the current value of x.

Basically a Taylor series approximation of the objective. This actually converged rapidly to an optimal solution we could verify with Conopt. (Conopt complained about some reduced gradients being too large, but could not improve the solution further). What this algorithm lacks in refinement it gains in simplicity: just a loop around a solve statement.

Implementing such a problem specific algorithm is easy to do in GAMS using its procedural features (the complete code in this case is less than 20 lines), and for the most difficult problems this often can help. We further benefitted from having access to multiple NLP solvers (in this case Mosek and Conopt).

Saturday, October 11, 2008

Cannot start GAMS IDE

> I cannot start the GAMS IDE when I am not
> connected to the network. I get a message
> like:
> Problem with system file
> \ecomfp01\user$\My Documents\gamsdir\gamside.ini
> Error 1231: The network location cannot be reached.

The GAMS IDE uses the "My Documents" directory to store an initialization file. It will refuse to run if this directory can not be accessed. Usually this directory is on the local c: drive, but some system administrators may choose to have this directory on a network drive. The result is that the GAMS IDE will not start if your laptop is not connected to the network, even if GAMS itself is installed on the local c: drive.

The work-around is to create a shortcut on the desktop with target:

"C:\Program Files\GAMS22.8\gamside.exe" c:\projects\gams\gamside.ini

When you click on this shortcut the GAMS IDE will use the c:\projects\gams directory to store the initialization file. Make sure this directory exist!

The IDE Help has some notes on this (not being able to start the IDE, you would need to click on gamside.chm in the GAMS system directory).

Friday, October 10, 2008

Drunken noodle log

The GAMS/Ipopt log always somehow reminds me of drunken noodles because of the misaligned headers.

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  30 4.2974620e+002 3.78e+001 3.39e+001  -1.9 5.15e+002    -  8.47e-002 1.96e-001f  1
  31 4.2874611e+002 2.87e+001 2.01e+002  -1.9 5.03e+002    -  1.02e-001 2.40e-001f  1
  32 4.2792912e+002 2.28e+001 1.65e+002  -2.0 4.90e+002    -  2.02e-001 2.07e-001f  1

I would guess the formats are a little bit too tight for some platforms.

Thursday, October 9, 2008

GAMS/Mosek link bug

GAMS/Mosek stops but GAMS keeps it a secret why this happened. A few extra lines with some text explaining why the solve was aborted would not be unwelcome:

               S O L V E      S U M M A R Y
 
     MODEL   LANDentSN2          OBJECTIVE  ENTROPY
     TYPE    NLP                 DIRECTION  MINIMIZE
     SOLVER  MOSEK               FROM LINE  249734
 
**** SOLVER STATUS     4 TERMINATED BY SOLVER    
**** MODEL STATUS      7 INTERMEDIATE NONOPTIMAL 
**** OBJECTIVE VALUE              217.6762
 
 RESOURCE USAGE, LIMIT        539.256    900000.000
 ITERATION COUNT, LIMIT         0        900000
 EVALUATION ERRORS              0             0
 
 MOSEK Link       Aug  1, 2008 22.8.1 WEX 5438.6015 WEI x86_64/MS Windows
 
 M O S E K       version 5.0.0.90 (Build date: Jun  6 2008 14:57:22)
 Copyright (C)   MOSEK ApS, Fruebjergvej 3, Box 16
                 DK-2100 Copenhagen, Denmark
                 http://www.mosek.com

The log file may contain additional information. Unfortunately if you run GAMS from the command line (e.g. by Linux users) the log is not saved. (The IDE under Windows will save a copy).

Tuesday, October 7, 2008

MLE estimation in GAMS

This Max Likelihood Estimation example includes forming of confidence intervals. Standard errors are found by inverting the Hessian and normal quantiles are calculated using a minimal CNS model.

http://amsterdamoptimization.com/models/statistics/weibull.gms

This kind of analysis is only possible with new features in GAMS. However, it remains kind of klunky compared to Matlab, Gauss or SAS. The GAMS approach may be useful nevertheless, e.g. when using as part of a larger GAMS model (in which case calling a different system may be cumbersome), when the problem is very difficult (in which case stronger NLP solvers and automatic differentiation can help) or when you don't have access to these other systems.

Sunday, October 5, 2008

Microsoft Solver Foundation and indus89

Guess I was the first one to report an issue:
http://code.msdn.microsoft.com/solverfoundation/WorkItem/List.aspx

The Microsoft people react very promptly!

Thursday, October 2, 2008

Bug in GAMS/Xpress link

The GAMS/Xpress link can terminate without reason:

**** SOLVER STATUS 4 TERMINATED BY SOLVER
**** MODEL STATUS 8 INTEGER SOLUTION
**** OBJECTIVE VALUE 134400.0000

RESOURCE USAGE, LIMIT 30.379 1000.000
ITERATION COUNT, LIMIT 5515 10000

The listing files does not convey any messages indicating the reason for stopping. We have observed this before, but this bug has not been fixed as of now. If a solver stops it is important to say why. Just simple-mindedly reporting "terminated by solver" makes it very difficult to fix whatever the reason is for the termination of the job.

Note: the log file did not help me either with information why the solve was stopped.

trnsport in Microsoft Solver Foundation

These screendumps show how the trnsport.gms model from the GAMS model library can be expressed in OML. Note that all data manipulation is done in Excel.

OML does not allow for any data manipulation. That means that data preparation (model calibration, balancing, aggregation, set manipulation etc) needs to be done outside the modeling language. The same for data manipulation during report writing or when developing heuristic algorithms instead of a single solve.

Most of my GAMS models include a large part doing data manipulation. Often more than 80%. It would not be as convenient to be forced to move all data manipulation out of the modeling language and into a database, or spreadsheet.

I have not figured out yet if there is a mechanism to handle exceptions in OML like the $ such that operator in GAMS.

Wednesday, October 1, 2008

Tool to compare text files

> I am looking for a good visual tool to compare gams source files.

I highly recommend Beyond Compare from http://www.scootersoftware.com/

Handling data sources in GAMS

> I have data sitting in different databases and spreadsheets. How to I
> handle this most conveniently in GAMS.

I usually handle this is a separate GAMS model that imports from all these data sources using tools like MDB2GMS, SQL2GMS and store the data in a GDX file. Then your main GAMS model can read this GDX file. This has several advantages: your main model can focus on the optimization model itself, and will not be cluttered by all data handling and conversion code. In addition the intermediate GDX file is convenient as it now is a single self-contained file. This makes it easy to share with others, to use in other related models, to be inspected in the IDE etc. An additional advantage is that the GDX file functions as a snapshot. Even if later on databases are changing, you have a (hopefully consistent) snapshot of all input data, that allows you to run the model against a stable data set.

Attached are sheets illustrating this concept for some very large models.

Tuesday, September 30, 2008

Example MSFCli.exe

Example of using the Microsoft solver with an mps file

C:\projects\test>"c:\Program Files (x86)\Microsoft Solver Foundation\1.0.6.4960\Bin\MSFCli.exe" +type mps +verbose +log indus89.mps
===== Processing .\indus89.mps =====
Algorithm: Primal
CostingExact: SteepestEdge
CostingDouble: SteepestEdge
Variables: 6570 -> 6366 + 2363
Rows: 2727 -> 2564
NonZeros: 39490
Eliminated Slack Variables: 201
Phase 1 double Pivots: 4639
Phase 2 double Pivots: 1864
double Factorings: 138
Rational Factorings: 1
Degenerate pivots: 1367 (21.02 %)
{
x1 -> 114873.65553186,
x2 -> 1522.85610775999,
x3 -> 3460.00034188404,
x4 -> 479.152767198711,
x5 -> 22532.3511432455,
x6 -> 2490.55460455224,
....
x6530 -> 1,
x6550 -> 1,
x6570 -> 1
}
Zero variables: 3949
Positive variables: 2621
Negative variables: 0
Objective value: 'obj' Min Optimal -114873.65553186

===== Finished .\indus89.mps: 00:00:05.8776890 Optimal =====

The solver seems to feature some kind of rational arithmetic.

Not sure if this is related, but the Microsoft solver finds a correct solution to the Neumaier model http://www.gams.com/modlib/libhtml/badmip.htm.

Microsoft Solver Foundation

.NET based solvers:

Simplex solver (LP,MIP)
Interior Point solver (LP,QP)
Constraint solver (CSP)
Unconstrained non-linear solver (NLP)

Can be used as library via any .NET language, or input can be specified as MPS and a simple equation oriented format called OML. In addition, Excel bindings are available.

http://code.msdn.microsoft.com/solverfoundation

After downloading, documentation and samples are available in My Documents\Microsoft Solver Foundation.

Monday, September 29, 2008

GBD feasibility cut

>Dear Mr. Kalvelagen,
>
>I am trying to do a comparison between the performance
>of GBD and OA in the case of convex and nonconvex minlp
>problems.
>
>I am using the GBD-code from the script for MINLP methods
>that you have written and posted on your page, and comparing
>it with DICOPT-OA. My application is on transportation
>network design problem. The theory as highlighted in Floudas
>1995 about the difference between the two methods seems to
>match for the most part in my problem. However, I am facing
>infeasibilities in the sub-problem.
>
>Your GBD code on the portfolio optimization generates support
>functions only from the feasible primal. Can you give me an
>idea (if that is not much of a hassle for you) how a
>feasibility primal subproblem can be implemented in your
>GBD code to generate additional supports (from the infeasible
>subproblems), or if you have provided an example of such a
>case somewhere else I’d be grateful if you can direct me to that?
>
>I do find your webpage and the relevant files very useful when
>testing theoretical concepts through implementation in GAMS.
>
>Many thanks in advance for your time!

You can add feasibility cuts by forming a feasibility subproblem.

I do something like that for the dual subproblem in http://amsterdamoptimization.com/pdf/benders.pdf

A straightforward discussion can be found in Ullmann's Modeling and Simulation By Fritz Ullmann page 113, and 116.

Of course it may be possibly to bypass the problem by formulating an elastic version such that the subproblems are always feasible.

Conditional generation

> OK thanx for this information.
> Did you have time to look at my "corrected" constraints? I'm really sorry for
> having written an error ...
> The correct conditions are now written:
>------------------------------------------
> if s=1 then t > b
> if s=-1 then t < -b
> if s=0 then -b <= t <= b
> where:
> variables: t real in [-1,1] b real in [0,1]
> parameter: s integer in {-1,0,1}
> ------------------------------------------
>I just can't see how to write these conditions as linear constraints in AMPL ...
>Thanx again for your help or any advice which could guide me towards the solution ...

param s = 1;
var t >= -1, <= 1;
var b >= 0, <= 1;

minimize obj: b; subject to
e1{1..1:s=1}: t >= b;
e2{1..1:s=-1}: t <= -b;
e3{1..1:s=0}: -b <= t;
e4{1..1:s=0}: t <= b;

Conditional generation of equations is an important feature in formulating real-world models. You may want to consider hiring a consultant with experience in AMPL modeling. That would speed up the implementation considerably.

Wednesday, September 24, 2008

GAMS Command Prompt

Many Windows programs that have a command line interface install in the Start menu a Console Prompt that have path and environment automatically setup correctly. Examples are Lahey and the Microsoft SDK's. This would be useful for GAMS also for users that use the command line.

Thursday, September 18, 2008

$load x=x.l not allowed (GAMS bug)

When a variable x is stored in a GDX file we can read it as a parameter as follows:

parameter x(k,i,t);
execute_load 'merged.gdx',x=x.l;

Unfortunately at compile time we don't have the same facility:

parameter x(k,i,t);
$gdxin merged.gdx
$load x=x.l

yields

5923 parameter x(k,i,t);
GDXIN C:\projects\china\input\merged.gdx
5925 $load x=x.l
**** $490
**** LINE 19 INPUT C:\projects\china\input\report.gms
**** 490 Cannot load/unload this suffix

Tuesday, September 16, 2008

On solving a large system of nonlinear equation

When trying to solve a very large system of nonlinear F(x)=0 I encountered a number of problems. Large meant here about 224k variables and equations. Solving as CNS gave "pivot too small". An NLP formulation:

min dummy

F(x)=0

caused lack of progress and stopped with an infeasible solution. In the end I could solve the model reliably using a min absolute deviations formulation where explicit slacks are omitted for linear equations, i.e.:

min ∑ s_i

Ax−b=0

G(x)+s = 0

A standard variable splitting technique is used to model the absolute value:

min ∑ v_i+w_i

Ax−b=0

G(x)+v−w=0

v,w≥0

Monday, September 15, 2008

sub models

For large models it is often convenient to build models in separate modules. The module can exchange data via save/restart files or gdx files. A driver module will call the individual sum modules.

However this does not work well with the IDE. For instance the log of a submodel if called by

$call =gams submodel.gms

will be invisible and lost. There are few thing we can do to improve the situation. The settings I use are as follows:

* flags to run submodels better under the IDE
$setglobal ide "ide=%gams.ide% lo=%gams.lo% errorlog=%gams.errorlog% errmsg=1"
 
$call =gams submodel.gms %ide%
$if errorlevel 1 $abort submodel.gms has issues

This will solve most problems of calling gams via $call when using the IDE, and also works good when not running under the IDE.

The one remaining and somewhat annoying problem (bug) is that clicking on black lines of the submodel log, will cause the IDE to go to listing file of the main model. I.e. the wrong listing file is loaded. Clicking on the red lines work fine: the IDE will go the correct source file.

I think it would be an improvement if GAMS has better built-in support for executing submodels (compile time and execution time) where we don't have to worry about all these parameters and with better interaction with the IDE.

Friday, September 12, 2008

Restarts

> I am dealing with a class of large scale LPs. The class has nice
> structure though, let's say it has 1000 constraints. The objective
> function and the first 999 constraints are identical among all
> instances of the problems in this class. So everytime that user wants
> to solve a new problem it is only different from other problems in the
> 1000th constraint.
>
> I was wondering if this feature can be exploited to improve the speed
> of solving problems in this class. Is it possible that something be
> pre-computed and stored about the problem from the fixed part
> (objective and 999 constraints) and then everytime that user inserts
> the 1000'th constraint, that knowledge be used efficiently?

Absolutely. 1000 constraints is very small, but in general simplex based LP solvers have excellent restart facilities using the concept of a basis. Using the GAMS modeling language this even completely automated. I.e. the fragment:

set scenario /s1*s5/;
 
parameter growthq_s(scenario)   /
 s1 2
 s2 2.2
 s3 2.3
 s4 2.5
 s5 2.7
/;
 
parameter report(scenario,*);
 
loop(scenario,
   growthq = growthq_s(scenario);
   solve wsisn maximizing cps using lp;
   report(scenario,'time') = wsisn.resusd;
   report(scenario,'iterations') = wsisn.iterusd;
);
 
display report;

solves a series of related LP's: one coefficient in the model
changes here. Iteration count and cpu time are recorded and
they are displayed as:

 ----   3642 PARAMETER report
 
          time  iterations
 
 s1       0.630    3704.000
 s2       0.108      76.000
 s3       0.070      76.000
 s4       0.066      76.000
 s5       0.069      76.000

so indeed the first solve is the most expensive one.

Bug in gams

The fragment below demonstrates a bug in GAMS. From early analysis it looks like this is fairly difficult to encounter in a real model, so it may be less alarming than it looks like at first sight.


   1   SET NUM /1*2/;
   2   VARIABLE X(NUM);
   3   X.L(NUM) = ORD(NUM);
   4
   5   X.FX('2') $(X.LO('2') EQ X.UP('2')) = 234;
   6   DISPLAY X.LO,X.L,X.UP;
 
COMPILATION TIME     =        0.000 SECONDS      3 Mb  WEX228-228 Jul 26, 2008
 
----      6 VARIABLE X.Lo
 2 234.000
----      6 VARIABLE X.L
 1   1.000,    2 234.000
----      6 VARIABLE X.Up
 2 234.000

GAMS/Xpress documentation bug

loadmipsol in xpress.pdf is documented as having a default of 1. That is hard to believe. Indeed the def file indicates this is wrong.

Thursday, September 4, 2008

Smooth approximations

Some examples of using the logistic function in smooth approximations for ifthen and max/min:

ifthen(x1>x2, y, z) = z + (y-z)*0.5*(1+tanh(k*(x1-x2)))
ifthen(x1<x2, y, z) = y + (z-y)*0.5*(1+tanh(k*(x1-x2)))
max(x,y) = ifthen(x>y,x,y) = y + (x-y)*0.5*(1+tanh(k*(x-y)))
min(x,y) = ifthen(x<y,x,y) = x + (y-x)*0.5*(1+tanh(k*(x-y)))

Some graphs with k=100:

Wednesday, September 3, 2008

Bug in ifthen (GAMS)

Consider the models:

   1  variable x;
  2  equation e1,e2;
  3
  4  scalar s /1/;
  5
  6  e1.. x$(s>0) =e= 0;
  7  e2.. ifthen(s>0,x,0) =e= 0;
  8
  9  model m1 /e1/;
 10  solve m1 using cns;
 11
 12  model m2 /e2/;
 13  solve m2 using cns;

The models are the same. The ifthen is exogenous but GAMS does not recognize this. As a result the fist model solves fine and the second model leads to numeric problems:

** Error in Square System: Pivot too small.

Turns out the gradients are not correct for ifthen:

---- e2  =E= 
e2..  (0)*x =E= 0 ; (LHS = 0)

Advice: don't use ifthen in equations as the derivatives of this function are wrong. If possible use $ conditions (exogenous if) or a smooth approximation (endogenous if). Smooth approximations for ifthen are discussed in this blog.

Arithmetic on indices

>Is it possible to write an objective function as
>"obj.. PROFIT =e= sum((l,m,dig) $(ord(l) le p and ord(m) le p and ord(dig) le digmax ), x('(1+p*(ord(l)-1))','(1+p*(ord(m)-1))',dig)*ord(dig)); "
>if I only want to add particular x's to my objective function (indexed as 1+p*(ord(l)-1) and 1+p*(ord(m)-1) where p is a constant )?
>In this case, gams gives me domain violation error for first and second index and i couldn't find a way to solve it
>(the domain is not violated mathematically for example for p=3 and digmax= 9).
>Is there a way to do arithmetic operations in indexes?

Indices are always strings in GAMS. But you can do what you want as follows:

alias (l,i), (m,j); 
obj.. PROFIT =e=
  sum((l,m,dig,i,j) $(ord(l) le p and ord(m) le p and ord(dig) le digmax and ord(i)=(1+p*(ord(l)-1)) and ord(j)=(1+p*(ord(m)-1))),
       x(i,j,dig)*ord(dig));

This is somewhat messy, but we can do this better. You can simplify this by introducing extra sets:

alias (l,i), (m,j);  
set l2(l);     l2(l)$(ord(l)<=p) = yes;
set m2(m);     m2(m)$(ord(m)<=p) = yes;
set dig2(dig); dig2(dig)$(ord(dig)<=digmax) = yes;
set mapl(l,i); mapl(l2(l),i)$(ord(i)=(1+p*(ord(l)-1))) = yes;
set mapm(m,j); mapm(m2(m),j)$(ord(j)=(1+p*(ord(m)-1))) = yes;
display mapl,mapm;
...
equation obj;
obj.. PROFIT =e= sum((mapl(l,i),mapm(m,j),dig2(dig)) , x(i,j,dig)*ord(dig));

This way you can display the intermediate sets and debug the model more easily. The equation becomes much simpler.

For large, complex models this strategy is very helpful: move complexities from the equations to set operations. Sets can be debugged easily using display or using the gdx viewer. In addition they can be debugged before the complete model is ready to be generated.

Tuesday, September 2, 2008

Privacy.pdf not found

> You told me to look at http://www.gams.com/docs/privacy.pdf but I get
> 403 Forbidden. Please help!!!!

Sorry. GAMS moved their web pages and this caused problems here and there. I'll email a copy.

Monday, September 1, 2008

GAMS bug in redeclaration

GAMS allows redeclarations of symbols as long as they don't cause a contradiction. I.e. the following is valid GAMS:

set i;
set i /1,2/;
set i(*);

The GAMS compiler gets confused if i is used as domain:

   1  set i;
   2  parameter p(i);
   3  set i /1,2/;
   4  set i(*);
****         $184
**** 184  Domain list redefined

Work-around: make sure line 1 and 4 use both i or i(*).

Wednesday, August 27, 2008

Smooth approximation for ifthen()

The construct ifthen( x1 > x2, y, z) can be approximated by the smooth function:

z + (y-z)*0.5*(1+tanh(k*(x1-x2)))

where k is a large constant (e.g. k=1000). This will allow you to use NLP and CNS model types instead of DNLP.

See also: http://en.wikipedia.org/wiki/Heaviside_step_function

The logistic function (http://en.wikipedia.org/wiki/Logistic_function) can be expressed in two ways:

f(x,k) = (1+tanh(k·x))/2
f(x,k) = 1/(1+exp(-2k·x))

The exp() version can overflow more easily. Newer versions of GAMS implement a version of exp() that does not overflow (by truncation) so either choice should work.

This approximation can be used also for max(x,y) and min(x,y) as we have:

max(x,y) = ifthen(x>y, x, y)
min(x,y) = ifthen( y>x, x, y)

Tuesday, August 26, 2008

Invert.exe and Eigenvector.exe

> I have used these programs for years with release 22.1. Upgrading to
> 22.6, I cannot make them work.

Yes, GAMS has changed the GDX API causing a lot of compatibility problems. Invert.exe is now part of 22.8. Eigenvector.exe somehow is not. If you are a client of Amsterdam Optimization contact info@amsterdamoptimization.com and they can give you access to 22.6 versions of both.

Sunday, August 24, 2008

Minimum downtime

>Is there a chance to increase an index in a loop in gams?
>To illustrate, a(t) variable
> loop(t,
> if(a(t) eq 0,
> a(t+1) =0;
> t=t+2;);
> I want a(t) makes a(t+1) zero but dont want a(t+1) makes a(t+2) zero. Thus,I should skip t+1 step in the loop.

This approach is doomed. Remember for a MIP you need to formulate linear constraints. You need to introduce binary variables to model a construct like this. The following may work:

The general idea is to use binary variables say
x(t) = 0 if a(t) =0,
x(t) = 1 otherwise
(this is fairly simple to do) and then to forbid the x pattern
... 1 0 1 ....
e.g. by a cut of the form x(t) - x(t+1) + x(t+2) ≤ 1;

This type of construct is often used in unit commitment modeling in power planning: you don't want to turn on and off generators all the time. A minimum downtime restriction is often used to prevent this from happening.

You may want to consider hiring a consultant to speed up your modeling efforts. Someone with experience can implement these constructs quickly and reliably, while it can be a long struggle for someone who has never done this before, as you have experienced with your efforts.

Note: I suggested to use binary variables in an earlier email exchange, but based on advise from people on the GAMS-L mailing list the user stayed with the loop construct. Largely because the questions posed were too much focused on narrow loop syntax issues, nobody said: this is wrong, you need to change course. If the user would have provided more background info, he would probably have been steered to more promising and time-tested approaches.

Note 2: Don't use ifthen() to model x. That causes the model to become an MINLP with discontinuous relaxations, which is really bad! This can be done in a linear fashion. Assuming a(t)≥0 we can write:
x(t) ≤ a(t)*M;
x(t) ≥ a(t)/a.up(t);