Tuesday, November 22, 2022

A strange series

In [1] a question is posed about a somewhat strange series. Let \(i \in \{ 0,\dots,n\}\). Then we require:

  • \(\color{darkred}x_i \in \{0,1,2,\dots\}\),
  • \(\color{darkred}x_i\) is equal to the number of times the value \(i\) occurs in the series. In mathematical terms, one could say: \[\color{darkred}x_i = |\{j:\color{darkred}x_j=i\}|\] (here \(|S|\) is the cardinality of the set \(S\)).

An example for \(n=4\) is: 

index  0 1 2 3 4 
value  2 1 2 0 0

Note that we can tighten the bounds on \(\color{darkred}x_i\) a bit. We know that \(\color{darkred}x_i \in \{0,1,2,\dots,n\}\) (note that \(\color{darkred}x_i = n+1\) can't happen).

The question is: how can we, for a given \(n\), generate such a series? This problem can be conveniently expressed as a constraint programming (CP) problem. The constraint in Docplex is simply:

for i in I:
    mdl.add(x[i] == sum(x[j] == i for j in I))

But what about a MIP formulation? The CP formulation is not a linear constraint so we cannot use that directly. One general modeling rule is: 

In a MIP model, when we need to do some counting, use binary variables 

We therefore introduce binary variables \(\color{darkred}y_{i,j}\in \{0,1\}\), defined by \[\color{darkred}y_{i,j}=1 \iff\color{darkred}x_i=j\] We can do this by: \[\begin{align}&\color{darkred}x_i = \sum_j j\cdot\color{darkred}y_{i,j} \\ & \sum_j \color{darkred}y_{i,j} = 1\end{align}\] The counting of occurrences can now be stated as: \[\color{darkred}x_i = \sum_j \color{darkred}y_{j,i}\] These three linear constraints now form our MIP model. Note that the variables \(\color{darkred}x_i\) can be substituted out (they can be recovered afterward from the optimal levels of  \(\color{darkred}y_{i,j}\)). So, the remaining linear model is:

Linear MIP constraints
\[\begin{align}& \sum_j j\cdot\color{darkred}y_{i,j} = \sum_j \color{darkred}y_{j,i} && \forall i \\ & \sum_j \color{darkred}y_{i,j} = 1 && \forall i \\ & \color{darkred}y_{i,j}\in \{0,1\} \end{align}\]

Just add a dummy objective and you are ready to go.

The solution looks like:

----     30 SET i  size of problem

i0,    i1,    i2,    i3,    i4

----     30 VARIABLE y.L  

            i0          i1          i2

i0                               1.000
i1                   1.000
i2                               1.000
i3       1.000
i4       1.000

----     30 PARAMETER x  

i0 2.000,    i1 1.000,    i2 2.000

  • Not all \(n\) yield a feasible solution. E.g. \(n=0,1,2,5\) are infeasible. 
  • But a feasible solution for any \(n\ge 6\) is:

      index  0   1  2  n-3
      value n-3  2  1   1

    The zero values are not printed here. We see that the solution is very sparse. Of course, this general solution makes the problem slightly less interesting.
  • The solutions seem unique. A no-good cut that forbids a previously found solution \(\color{darkblue}y^*_{i,j}\) is \[\sum_{i,j} \color{darkblue}y^*_{i,j}\cdot \color{darkred}y_{i,j} \le \color{darkblue}n\] When running with this additional constraint, the model becomes infeasible, indicating there is no other solution.


It is not uncommon that a CP (Constraint Programming) formulation can use integer variables, whereas the corresponding MIP formulation needs binary variables. This small model is a good example of this.

Update: I looked at the AMPL and Minizinc code provided in the comments a bit. Of course, that yielded some more thoughts. My notes are in [2].


Appendix: GAMS model

'size of problem' /i0*i4/
alias (i,j);

parameter v(i) 'values';
v(i) =

binary variable y(i,j);
variable z 'dummy objective';

'substituted out x'
'one value for each x(i)'
'dummy objective'

sum(j,v(j)*y(i,j)) =e= sum(j,y(j,i));
sum(j, y(i,j)) =e= 1;
obj.. z =e= 0;

model m /all/;
solve m minimizing z using mip;

abort$(m.modelstat<>%ModelStat.Optimal% and m.modelstat<>%ModelStat.Integer Solution%) "No feasible solution";

* recover x
parameter x(i);
x(i) =

display i,y.l,x;


* optional:
* check if solution is unique
equation nogood;
sum((i,j),y.l(i,j)*y(i,j)) =l= card(i)-1;

model m2/all/;
solve m2 minimizing z using mip;

abort$(m2.modelstat<>%ModelStat.Infeasible% and m2.modelstat<>%ModelStat.Integer Infeasible%) "Multiple solutions exist";


  1. The problem can be formulated in MiniZinc as follows

    int: n = 100;
    set of int: N = 0..n;

    array[N] of var N: x;
    array[N] of N: index = array1d(N, [i | i in N]);

    % Constraint variant 1: Individual sums for each value
    constraint forall (i in N) (
    x[i] = sum(j in N) (x[j] = i)

    %Constraint variant 2: Global cardinality constraint
    include "globals.mzn";
    constraint global_cardinality(x, index, x);

    solve satisfy;

    output [show(x)];

    Running in the latest MiniZinc IDE with Gecode 6.3.0 with all solutions search takes 1.8s and 74011 failures for an array of length 100 using constraint variant 1 only that uses the same formulation as in the Docplex model.

    The second constraint variant instead uses the global cardinality constraint (https://www.minizinc.org/doc-2.6.4/en/lib-globals-counting.html#mzn-ref-globals-counting-global-cardinality) to specify the counting structure. For the same data as above, it takes just 0.118s and 197 failures to find all 1 solutions.

    1. Here's a formulation in AMPL,

      param n integer > 0;
      var x {0..n} in 0..n;
      subject to con {i in 0..n}:
      x[i] = count {j in 0..n} (x[j] = i);

      solved by Gurobi 10:

      ampl: let n := 4;
      ampl: option solver gurobi;
      ampl: solve;
      Gurobi 10.0.0: optimal solution
      ampl: display x;
      x [*] :=
      0 2
      1 1
      2 2
      3 0
      4 0

      ampl: let n := 10;
      ampl: solve;
      Gurobi 10.0.0: optimal solution
      ampl: option omit_zero_rows 1;
      ampl: display x;
      x [*] :=
      0 7
      1 2
      2 1
      7 1

      To look for a different solution, this constraint can be added,

      subject to Different:
      atleast 1 {i in 0..n} (x[i] != x[i].val);

      but then the problem is infeasible:

      ampl: solve;
      Gurobi 10.0.0: infeasible problem

      Of course, a conversion to an equivalent MIP is being made before the problem is sent to the solver.
      So this only works with solvers that have the new "MP" interface, currently Gurobi 10, COPT, and HiGHS.