The company says that Swift apps are significantly faster than Objective-C apps, outperforming them by over 93x.Wow. That is some programming language...
I am a full-time consultant and provide services related to the design, implementation and deployment of mathematical programming, optimization and data-science applications. I also teach courses and workshops. Usually I cannot blog about projects I am doing, but there are many technical notes I'd like to share. Not in the least so I have an easy way to search and find them again myself. You can reach me at erwin@amsterdamoptimization.com.
Tuesday, June 3, 2014
Analytics for journalists
Generating Larger Pivot tables in Excel from GAMS
I was working on generating some very large pivot tables to view model results. Some issues came up. Users tend to use things differently than expected: they may download and install 64 bit versions of Excel and they may run with different language settings (decimal point vs. decimal comma). And they may generate huge output: data tables with more than a million rows can not be stored inside Excel, so we need to store these outside (external data source). The reason for the large data is that we have a multi dimensional structure: results(impactparameter,scenario,commodity,region,productiontype,year). Indeed looking at the CSV file I see for a result set with just two scenarios:
or for the Windows aficionados:
i.e. 4 million records.
Here are some notes.
Management GUI in Excel
Generated Excel worksheet
Friday, May 23, 2014
Reporting infeasible LP solutions in GAMS
x.fx(i,j) = solutionx(i,j);
solve m using lp minimizing z;
The infeasibilities are reported in the solution report in the listing file.
Unfortunately not all solvers support this. Here is a small sample:
- MINOS: ok
- BDMLP: ok
- Cplex: ok
- Gurobi: no solution reported so nothing to look at
- Mosek: not a normal solution reported
- Lindo: no solution reported
- Xpress: no solution reported
- CBC: no solution reported
Sunday, April 20, 2014
How do they make a bug like this?
Unfortunately (for me) this did not work and I see:Call Solver.SolverAdd(CellRef:=sumcell, Relation:=2, FormulaText:=1)
After much experimentation I used instead:
Call Solver.SolverAdd(CellRef:=sumcell, Relation:=2, FormulaText:=0.999999999999999)Now I see:
How in the world can such a bug be introduced? Some bugs make sense, but the logic of this one is difficult to fathom. I cannot but think there must be a statement somewhere:
if rhs=1 and (not interactive) and (user=’Erwin.Kalvelagen’) thenSolver works correctly with a =1 equation when used from the GUI.
call delete_constraint_to_confuse_user
Well, found some more reports on this strange bug:
- http://www.pcreview.co.uk/forums/problem-solveradd-can-anyone-help-t961905.html
- http://excelforum.com/excel-programming-vba-macros/492647-solved-missing-condition-after-solver-called-by-macro.html
- http://stackoverflow.com/questions/15620177/excel-solver-ignoring-constraint-in-vba
Sunday, April 13, 2014
Playing with job shop problem ft10 (5)
In http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-ft10-job-shop-1.html we saw Gurobi was very successful in solving ft10.gms to optimality. How would the open source solver CBC do? Actually it could not close the gap before hitting an imposed time limit of 1 hour.
The picture does not give us much hope that by extending the time limit we will gain much.
This performance is substantially worse than what we saw with Gurobi.
Playing with job shop problem ft10 (4)
We could not solve my job shop formulations in http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-job-shop-problem-ft10-2.html but actually Or-tools has a pretty good C++ based implementation: https://code.google.com/p/or-tools/source/browse/trunk/examples/cpp/jobshop.cc. This can solve the ft10 problem in reasonable time:
| c:\Users\erwin\tmp>jobshop --data_file=ft10 c:\Users\erwin\tmp> |
The problem is solved to a proven optimal solution in less than 5 minutes.
Some obvious questions arise:
- So how is this implementation different from the Minizinc models I did before? One difference is that it is using interval variables. These variables have a start and a duration, and are often used in scheduling models. I am not sure how to use these variable in a Minizinc model.
- Can we create a Minizinc model with similar performance?
- The optimization is probably done by:
- Find solution
- If infeasible stop
- Else add the constraint obj < obj_found to the model and go to step 1
- When invoked without a data file name, the program crashes:
Looks like this is caused by an abort() statement (replace by exit(-1) to stop more gracefully). - The performance over time looks vey much like a MIP solver:
i.e. quick progress at the beginning and slowing down later on.
Playing with job shop problem ft10 (3)
To see how the two different formulations mentioned in http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-job-shop-problem-ft10-2.html compare we use a real small example: ft06. Here are the models:
Original formulation with OR constraints: ft06.mzn
| int: numjobs = 6; set of int: Jobs = 1..numjobs; array[Jobs,Stages] of int: ProcTimes = array[Jobs,Stages] of int: Machine = int: horizon = sum([ ProcTimes[j,s] | j in Jobs, s in Stages]); array[Jobs,Stages] of var 0..horizon: Start; % order % no overlap constraint solve minimize makespan; output ["makespan:",show(makespan),"\n",show(Start)] |
Formulation with cumulative constraint: ft06c.mzn
| include "globals.mzn"; int: numjobs = 6; set of int: Jobs = 1..numjobs; array[Jobs,Stages] of int: ProcTimes = array[Jobs,Stages] of Machines: Machine = int: horizon = sum([ ProcTimes[j,s] | j in Jobs, s in Stages]); array[Jobs,Stages] of var 0..horizon: Start; % order % no overlap solve minimize makespan; output ["makespan:",show(makespan),"\n",show(Start)] |
Comparison
Solution time in seconds
| Solver | or ft06.mzn | cumulative ft06c.mnz | Command line |
| g12fd | 0.3 | 0.2 | mzn-g12fd –s ft06.mzn |
| g12lazy | 0.2 | 6.5 | mzn-g12lazy -s ft06.mzn |
| gecode | 0.1 | 0.1 | mzn-gecode -s ft06.mzn |
| or-tools | 147 | 0.1 | minizinc -G or-tools -f \tmp\fz.exe ft06.mzn |
Gecode and or-tools show some statistics for these case:
Playing with job shop problem ft10 (2)
How would a Constraint Programming formulation do on this problem: http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-ft10-job-shop-1.html. A straightforward translation of the MIP formulation into Minizinc can look like:
| int: numjobs = 10; set of int: Jobs = 1..numjobs; array[Jobs,Stages] of int: ProcTimes = array[Jobs,Stages] of int: MachineNumbers = int: horizon = sum([ ProcTimes[j,s] | j in Jobs, s in Stages]); array[Jobs,Stages] of var 0..horizon: Start; % positive variables % order % no overlap constraint solve minimize makespan; |
This looks actually quite nice. We can drop the binary variables that deal with the overlap, and the objective is more intuitive. As most CP solvers don’t deal very well or not at all with continuous variables, we used integer variables. This is ok here as we deal with integer valued processing times. In the real world you may need to rescale and/or approximate these numbers. Unfortunately this model does not solve. I am looking at the cursor and don’t see any feedback about any progress:
There is an alternative formulation that is often much better: replace the no overlap constraints by a global constraint called cumulative. Not totally intuitive but we can translate the model quite easily. In addition we added some bounds on all variables:
| include "globals.mzn"; int: numjobs = 10; set of int: Jobs = 1..numjobs; array[Jobs,Stages] of int: ProcTimes = array[Jobs,Stages] of int: Machine = int: last_end = sum([ ProcTimes[j,s] | j in Jobs, s in Stages]); array[Jobs,Stages] of var 0..last_start: Start; % order % no overlap constraint % helper |
Unfortunately still just a blinking cursor…
Any better Minizinc formulation available? Here are some alternative formulations which I have not tried out:
- https://github.com/MiniZinc/minizinc-benchmarks/blob/master/jobshop/jobshop.mzn
- https://github.com/hakank/hakank/blob/master/minizinc/jobshop.mzn
Some questions:
- No log when solving the models? One would like to see some progress log.
- How to debug models? Something like a limrow/limcol in GAMS or expand in AMPL? One could look at the “flatzinc” output, but that is difficult to digest.
- Can we set a time limit and then get the best solution so far?
Playing with job shop problem ft10 (1)
As discussed in http://yetanothermathprogrammingconsultant.blogspot.com/2009/09/job-shop-scheduling.html a MIP solver like Gurobi can solve this difficult problem quite easily.
Here we show the performance with 1 and 4 threads:
The big advantage of using a MIP solver is that any time (after finding the first integer solution) we have both a best found (upper bound) and best possible or best bound (lower bound) available. This not only gives us a proven optimal solution at the end but a sense of the quality of the solution during the solution process. We can interrupt before the end using this information (e.g. by stopping on a small gap).
There are a few interesting observations:
- The 4 thread integer solutions are not uniformly better than their 1 thread counterparts: quite often the red line (top part of the graph) is below the blue line
- But in the end 4 threads wins big
- Most progress is achieved in the beginning w.r.t. to finding good integer solutions. So when in a hurry we can stop early.
- This also means that building a heuristic that finds a good solution quickly is of limited value: the MIP solver finds good solutions quickly on its own. I.e. such a heuristic helps where the MIP solver does not need much help.
- The plot was produced with R and ggplot. I think it looks quite nice.
GAMS model
| $ontext |
R Code to create the plot
library(ggplot2) Created by Pretty R at inside-R.org |
Wednesday, April 9, 2014
Cleaning up the GAMS log
After cleaning this up it looks a little bit more frugal, but actually gives a better idea of where you are in the solution progress: