Tuesday, May 14, 2013

Scalar models: CP models in AMPL

AMPL has since a long time facilities to write Constraint Programming models (see http://users.iems.northwestern.edu/~4er/WRITINGS/cp.pdf), but only recently this became available for the general public. In http://www.hakank.org/ampl/ there is a nice collection of CP models written in AMPL. Very worthwhile to have a look if you want to write CP models in AMPL.
Of course I always have a few minor beefs with how models are written. Here is the first model:
/*

  "A puzzle" in AMPL+CP.

  From "God plays dice"
  "A puzzle"
  http://gottwurfelt.wordpress.com/2012/02/22/a-puzzle/
  And the sequel "Answer to a puzzle"
  http://gottwurfelt.wordpress.com/2012/02/24/an-answer-to-a-puzzle/

  This problem instance was taken from the latter blog post.
  """
  8809 = 6
  7111 = 0
  2172 = 0
  6666 = 4
  1111 = 0
  3213 = 0
  7662 = 2
  9312 = 1
  0000 = 4
  2222 = 0
  3333 = 0
  5555 = 0
  8193 = 3
  8096 = 5
  7777 = 0
  9999 = 4
  7756 = 1
  6855 = 3
  9881 = 5
  5531 = 0

  2581 = ?
  """

  Note: 
  This model yields 10 solutions, since x4 is not 
  restricted in the constraints. 
  All solutions has x assigned to the correct result. 
  

  The problem stated in "A puzzle"
  http://gottwurfelt.wordpress.com/2012/02/22/a-puzzle/
  is
  """
  8809 = 6
  7662 = 2
  9312 = 1
  8193 = 3
  8096 = 5
  7756 = 1
  6855 = 3
  9881 = 5

  2581 = ?
  """
  This problem instance - using the same principle - yields 
  two different solutions of x, one is the same (correct) as 
  for the above problem instance, and one is not.
  This is because here both x4 and x1 are underdefined.
  

  Note: 
  This problem has another non-algebraic and - let us say - topological
  approach which yield the same solution as the first problem and one
  of the solutions of the second problem.


  This AMPL model was created by Hakan Kjellerstrand, hakank@gmail.com
  See also my AMPL page: http://www.hakank.org/ampl/

*/

param n;

# decision variables
var x0 >= 0 <= 9 integer;
var x1 >= 0 <= 9 integer;
var x2 >= 0 <= 9 integer;
var x3 >= 0 <= 9 integer;
var x4 >= 0 <= 9 integer;
var x5 >= 0 <= 9 integer;
var x6 >= 0 <= 9 integer;
var x7 >= 0 <= 9 integer;
var x8 >= 0 <= 9 integer;
var x9 >= 0 <= 9 integer;

var x >= 0 <= 9 integer;

#
# constraints
#
s.t. c1:
  x8+x8+x0+x9 = 6 and
  x7+x1+x1+x1 = 0 and
  x2+x1+x7+x2 = 0 and
  x6+x6+x6+x6 = 4 and
  x1+x1+x1+x1 = 0 and
  x3+x2+x1+x3 = 0 and
  x7+x6+x6+x2 = 2 and
  x9+x3+x1+x2 = 1 and
  x0+x0+x0+x0 = 4 and
  x2+x2+x2+x2 = 0 and
  x3+x3+x3+x3 = 0 and
  x5+x5+x5+x5 = 0 and
  x8+x1+x9+x3 = 3 and
  x8+x0+x9+x6 = 5 and
  x7+x7+x7+x7 = 0 and
  x9+x9+x9+x9 = 4 and
  x7+x7+x5+x6 = 1 and
  x6+x8+x5+x5 = 3 and
  x9+x8+x8+x1 = 5 and
  x5+x5+x3+x1 = 0 and

  x2+x5+x8+x1 = x
;

# This smaller variant yields two different values for x.
# s.t. c2:
#  x8+x8+x0+x9 = 6 and
#  x7+x6+x6+x2 = 2 and
#  x9+x3+x1+x2 = 1 and
#  x8+x1+x9+x3 = 3 and
#  x8+x0+x9+x6 = 5 and
#  x7+x7+x5+x6 = 1 and
#  x6+x8+x5+x5 = 3 and
#  x9+x8+x8+x1 = 5 and

#  x2+x5+x8+x1 = x
# ;

data;


# option presolve 0;
option show_stats 2;

option solver gecode;
option gecode_options "var_branching=size_min val_branching=min outlev=1 outfreq=1";

# option solver ilogcp;
# option ilogcp_options "optimizer=cp alldiffinferencelevel=4 debugexpr=0 logperiod=10 logverbosity=0";

solve;

display x;
display x0,x1,x2,x3,x4,x5,x6,x7,x8,x9;

The first thing we notice is that the model is written using scalar variables and equations. We are not using the indexing facilities of AMPL although variable names like x0,x1,x2, suggest we could exploit indexing.

The one big equation is really a set of equations. If we write a set of simultaneous equations this is really like we use the AND operator in between the equations.

The different data sets are formulated as different models. It make sense to ask: “can we write this as one and the same model with just different data?”. I.e. we want to make the model data driven. Of course we want to keep the model structure. If our only goal was to make the model 100% data drive we just could formulate Ax=b and we are done. But such a model is not at the correct abstraction level, and moves all the complexity to the correct organization of the data. If you ever used the Matlab solvers from the Optimization Toolbox you know what I am talking about. We definitely want to convey the structure and purpose of the model inside the AMPL model (that is why modeling systems have been developed). Here is my attempt:

Model
 
set R;   # number of records in training set
set P;   # data record

param raw {R,P};  # raw data

set I = 0..9;   # x{I} variables

set Px := P diff {'y'};
param datax {r in R, i in I} = sum{p in Px: raw[r,p]=i} 1;


var x{I} >=0 <=9 integer;
var y >=0 <=9 integer;

s.t. e{r in R diff {'r0'}}:
    sum{i in I} datax[r,i] * x[i] = raw[r,'y'];

s.t. ydef:
     y = sum{i in I} datax['r0',i] * x[i];
Data set 1

set R := r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20;
set P := c1 c2 c3 c4 y;
param raw: c1 c2 c3 c4 y :=
  r0  2  5  8  1  .
  r1  8  8  0  9  6
  r2  7  1  1  1  0
  r3  2  1  7  2  0
  r4  6  6  6  6  4
  r5  1  1  1  1  0
  r6  3  2  1  3  0
  r7  7  6  6  2  2
  r8  9  3  1  2  1
  r9  0  0  0  0  4
  r10 2  2  2  2  0
  r11 3  3  3  3  0
  r12 5  5  5  5  0
  r13 8  1  9  3  3
  r14 8  0  9  6  5
  r15 7  7  7  7  0
  r16 9  9  9  9  4
  r17 7  7  5  6  1
  r18 6  8  5  5  3
  r19 9  8  8  1  5
  r20 5  5  3  1  0
;
Data set 2

set R := r0 r1 r2 r3 r4 r5 r6 r7 r8;
set P := c1 c2 c3 c4 y;
param raw: c1 c2 c3 c4 y :=
  r0  2  5  8  1  .
  r1  8  8  0  9  6
  r2  7  6  6  2  2
  r3  9  3  1  2  1
  r4  8  1  9  3  3
  r5  8  0  9  6  5
  r6  7  7  5  6  1
  r7  6  8  5  5  3
  r8  9  8  8  1  5
;

This can now be solved with any CP and MIP solver. As we can see the model is now a little bit more complex but more compact compared to the scalar model.

Note: In many cases it is possible to let AMPL extract the driving sets from a parameter in a data file. I believe this is not possible in this case where we use a tabular notation. That means we have to form the sets R and P explicitly. It would have been helpful if we could have used something like set R := r0..r8; but that is not something that is supported by AMPL.