Tuesday, May 14, 2013

Scalar models: CP models in AMPL

AMPL has since a long time facilities to write Constraint Programming models (see http://users.iems.northwestern.edu/~4er/WRITINGS/cp.pdf), but only recently this became available for the general public. In http://www.hakank.org/ampl/ there is a nice collection of CP models written in AMPL. Very worthwhile to have a look if you want to write CP models in AMPL.
Of course I always have a few minor beefs with how models are written. Here is the first model:
 ```/* "A puzzle" in AMPL+CP. From "God plays dice" "A puzzle" http://gottwurfelt.wordpress.com/2012/02/22/a-puzzle/ And the sequel "Answer to a puzzle" http://gottwurfelt.wordpress.com/2012/02/24/an-answer-to-a-puzzle/ This problem instance was taken from the latter blog post. """ 8809 = 6 7111 = 0 2172 = 0 6666 = 4 1111 = 0 3213 = 0 7662 = 2 9312 = 1 0000 = 4 2222 = 0 3333 = 0 5555 = 0 8193 = 3 8096 = 5 7777 = 0 9999 = 4 7756 = 1 6855 = 3 9881 = 5 5531 = 0 2581 = ? """ Note: This model yields 10 solutions, since x4 is not restricted in the constraints. All solutions has x assigned to the correct result. The problem stated in "A puzzle" http://gottwurfelt.wordpress.com/2012/02/22/a-puzzle/ is """ 8809 = 6 7662 = 2 9312 = 1 8193 = 3 8096 = 5 7756 = 1 6855 = 3 9881 = 5 2581 = ? """ This problem instance - using the same principle - yields two different solutions of x, one is the same (correct) as for the above problem instance, and one is not. This is because here both x4 and x1 are underdefined. Note: This problem has another non-algebraic and - let us say - topological approach which yield the same solution as the first problem and one of the solutions of the second problem. This AMPL model was created by Hakan Kjellerstrand, hakank@gmail.com See also my AMPL page: http://www.hakank.org/ampl/ */ param n; # decision variables var x0 >= 0 <= 9 integer; var x1 >= 0 <= 9 integer; var x2 >= 0 <= 9 integer; var x3 >= 0 <= 9 integer; var x4 >= 0 <= 9 integer; var x5 >= 0 <= 9 integer; var x6 >= 0 <= 9 integer; var x7 >= 0 <= 9 integer; var x8 >= 0 <= 9 integer; var x9 >= 0 <= 9 integer; var x >= 0 <= 9 integer; # # constraints # s.t. c1: x8+x8+x0+x9 = 6 and x7+x1+x1+x1 = 0 and x2+x1+x7+x2 = 0 and x6+x6+x6+x6 = 4 and x1+x1+x1+x1 = 0 and x3+x2+x1+x3 = 0 and x7+x6+x6+x2 = 2 and x9+x3+x1+x2 = 1 and x0+x0+x0+x0 = 4 and x2+x2+x2+x2 = 0 and x3+x3+x3+x3 = 0 and x5+x5+x5+x5 = 0 and x8+x1+x9+x3 = 3 and x8+x0+x9+x6 = 5 and x7+x7+x7+x7 = 0 and x9+x9+x9+x9 = 4 and x7+x7+x5+x6 = 1 and x6+x8+x5+x5 = 3 and x9+x8+x8+x1 = 5 and x5+x5+x3+x1 = 0 and x2+x5+x8+x1 = x ; # This smaller variant yields two different values for x. # s.t. c2: # x8+x8+x0+x9 = 6 and # x7+x6+x6+x2 = 2 and # x9+x3+x1+x2 = 1 and # x8+x1+x9+x3 = 3 and # x8+x0+x9+x6 = 5 and # x7+x7+x5+x6 = 1 and # x6+x8+x5+x5 = 3 and # x9+x8+x8+x1 = 5 and # x2+x5+x8+x1 = x # ; data; # option presolve 0; option show_stats 2; option solver gecode; option gecode_options "var_branching=size_min val_branching=min outlev=1 outfreq=1"; # option solver ilogcp; # option ilogcp_options "optimizer=cp alldiffinferencelevel=4 debugexpr=0 logperiod=10 logverbosity=0"; solve; display x; display x0,x1,x2,x3,x4,x5,x6,x7,x8,x9;```

The first thing we notice is that the model is written using scalar variables and equations. We are not using the indexing facilities of AMPL although variable names like x0,x1,x2, suggest we could exploit indexing.

The one big equation is really a set of equations. If we write a set of simultaneous equations this is really like we use the AND operator in between the equations.

The different data sets are formulated as different models. It make sense to ask: “can we write this as one and the same model with just different data?”. I.e. we want to make the model data driven. Of course we want to keep the model structure. If our only goal was to make the model 100% data drive we just could formulate Ax=b and we are done. But such a model is not at the correct abstraction level, and moves all the complexity to the correct organization of the data. If you ever used the Matlab solvers from the Optimization Toolbox you know what I am talking about. We definitely want to convey the structure and purpose of the model inside the AMPL model (that is why modeling systems have been developed). Here is my attempt:

 Model ``` set R; # number of records in training set set P; # data record param raw {R,P}; # raw data set I = 0..9; # x{I} variables set Px := P diff {'y'}; param datax {r in R, i in I} = sum{p in Px: raw[r,p]=i} 1; var x{I} >=0 <=9 integer; var y >=0 <=9 integer; s.t. e{r in R diff {'r0'}}: sum{i in I} datax[r,i] * x[i] = raw[r,'y']; s.t. ydef: y = sum{i in I} datax['r0',i] * x[i];``` Data set 1 ``` set R := r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20; set P := c1 c2 c3 c4 y; param raw: c1 c2 c3 c4 y := r0 2 5 8 1 . r1 8 8 0 9 6 r2 7 1 1 1 0 r3 2 1 7 2 0 r4 6 6 6 6 4 r5 1 1 1 1 0 r6 3 2 1 3 0 r7 7 6 6 2 2 r8 9 3 1 2 1 r9 0 0 0 0 4 r10 2 2 2 2 0 r11 3 3 3 3 0 r12 5 5 5 5 0 r13 8 1 9 3 3 r14 8 0 9 6 5 r15 7 7 7 7 0 r16 9 9 9 9 4 r17 7 7 5 6 1 r18 6 8 5 5 3 r19 9 8 8 1 5 r20 5 5 3 1 0 ;``` Data set 2 ``` set R := r0 r1 r2 r3 r4 r5 r6 r7 r8; set P := c1 c2 c3 c4 y; param raw: c1 c2 c3 c4 y := r0 2 5 8 1 . r1 8 8 0 9 6 r2 7 6 6 2 2 r3 9 3 1 2 1 r4 8 1 9 3 3 r5 8 0 9 6 5 r6 7 7 5 6 1 r7 6 8 5 5 3 r8 9 8 8 1 5 ;```

This can now be solved with any CP and MIP solver. As we can see the model is now a little bit more complex but more compact compared to the scalar model.

Note: In many cases it is possible to let AMPL extract the driving sets from a parameter in a data file. I believe this is not possible in this case where we use a tabular notation. That means we have to form the sets R and P explicitly. It would have been helpful if we could have used something like set R := r0..r8; but that is not something that is supported by AMPL.

1 comment:

1. Thanks for your pointer to my AMPL+CP page, Erwin.

You are - of course - right about my "A Puzzle" model, it's quite sloppy modeling. Your version is much neater and more general...

/Hakan