Wednesday, February 14, 2018

Van der Waerden Puzzle

In [1] a puzzle is mentioned:

Find a binary sequence \(x_1, \dots , x_8\) that has no three equally spaced \(0\)s and no three equally spaced \(1\)s

I.e. we want to forbid patterns like \(111\),\(1x1x1\), \(1xx1xx1\) and \(000\),\(0x0x0\), \(0xx0xx0\). It is solved as a SAT problem in [1], but we can also write this down as a MIP. Instead of just 0s and 1s we make it a bit more general: any color from a set \(c\) can be used (this allows us to use more colors than just two and we can make nice pictures of the results). A MIP formulation can look like the two-liner:

\[\begin{align}& x_{i,c} + x_{i+k,c} + x_{i+2k,c} \le 2 && \forall i,k,c  \text{ such that } i+2k \le n\\ & \sum_c x_{i,c}=1 && \forall i\end{align} \]

where \(x_{i,c} \in \{0,1\}\) indicating whether color \(c\) is selected for position \(i\). For \(n=8\) and two colors we find six solutions:


For \(n=9\) we can not find any solution. This is sometimes denoted as the Van der Waerden number \(W(3,3)=9\).

For small problems we can enumerate these by adding cuts and resolving. For slightly larger problems the solution pool technique in solvers like Cplex and Gurobi can help.

For three colors \(W(3,3,3)=27\) i.e. for \(n=27\) we cannot find a solution without three equally spaced colors. For \(n=26\) we can enumerate all 48 solutions:


Only making this problem a little bit bigger is bringing our model to a screeching halt. E.g. \(W(3,3,3,3)=76\). I could not find solutions for \(n=75\) within 1,000 seconds, let alone enumerating them all.

MIP vs CP formulation


In the MIP formulation we used a binary representation of the decision variables \(x_{i,c}\). If we would use a Constraint Programming solver (or an SMT Solver), we can can use integer variables \(x_i\) directly.

Binary MIP formulationInteger CP formulation
\[\large{\begin{align}\min\>&0\\& x_{i,c} + x_{i+k,c} + x_{i+2k,c} \le 2 &&  \forall  i+2k \le n\>\forall c \\ & \sum_c x_{i,c}=1 && \forall i\\ & x_{i,c} \in \{0,1\} \end{align}} \] \[ \large{\begin{align} &x_{i} \ne x_{i+k} \vee x_{i+k} \ne x_{i+2k} && \forall  i+2k \le n\\ & x_i \in \{1,\dots,nc\} \end{align} }\]


Python/Z3 version


This is to illustrate the integer CP formulation. Indexing in Python starts at 0 so we have to slightly adapt the model for this:

from z3 import *

n = 26
nc = 3

# integer variables X[i]
X = [ Int('x%s' % i ) for i in range(n) ]

# lower bounds X[i] >= 1
Xlo = And([X[i] >= 1 for i in range(n)])
# upper bounds X[i] <= nc
Xup = And([X[i] <= nc for i in range(n)])
# forbid equally spaced colors 
Xforbid = And( [ Or(X[i] != X[i+k+1], X[i+k+1] != X[i+2*(k+1)] ) for i in range(n) \
               for k in range(n) if i+2*(k+1) < n])
# combine all constraints
Cons = [Xlo,Xup,Xforbid]
       
# find all solutions    
s = Solver()
s.add(Cons)
k = 0
res = []
while s.check() == sat:
    m = s.model()
    k = k + 1
    sol = [ m.evaluate(X[i]) for i in range(n)]
    res = res + [(k,sol)]
    forbid = Or([X[i] != sol[i] for i in range(n) ])
    s.add(forbid)     
print(res)

Here we use andor and != to model our Xforbid constraint. We could have used the same binary variable approach as done in the integer programming model. I tested that also: it turned out to be slower than using integers directly.

Performance


Here are some timings for generating these 48 solutions for the above example:

Model/Solver Statistics Solution Time
Binary formulation with Cplex (solution pool) 494 rows
78 binary variables
51092 simplex iterations
3721 nodes
1.6 seconds
Integer formulation with Z3 26 integer variables
693 boolean variables (first model)
5158 clauses (first model)
871 boolean variables (last model)
13990 clauses (last model)
5.5 seconds
Binary formulation with Z3 78 binary variables
860 boolean variables (first model)
2974 clauses (first model)
1015 boolean variables (last model)
6703 clauses (last model)
137 seconds


The binary MIP formulation is not doing poorly at all when solved with a state-of-the-art MIP solver. Z3 really does not like the same binary formulation: we need to use the integer formulation instead to get good performance. Different formulations of the same problem can make a substantial difference in performance.


Van der Waerden (1903-1996)


References


  1. Donald E. Knuth, The Art of Computer Programming, Volume 4, Fascicle 6, Satisfiability, 2015
  2. Van der Waerden Number, https://en.wikipedia.org/wiki/Van_der_Waerden_number


Saturday, February 10, 2018

Singular system of nonlinear equations

For a project I am faced with solving large, sparse systems of nonlinear equations\[F(x)=0\] In some cases I get a a singular system: the Jacobian at the feasible point is not invertable. These models are also solved using a simple iterative scheme, which tolerates such a condition. How would standard solvers handle this?

I generated a small test problem as follows:

where \(F(x)=0\) has no issues, but \(G(y)=0\) is singular. I simply used:\[\begin{align}&y_1+y_2=1\\&y_1+y_2=1\end{align}\] Let's see what feedback we get:

Solver Modeltype Model Status Feedback/Notes
Conopt CNS Locally Infeasible
** Error in Square System: Pivot too small.

**** ERRORS/WARNINGS IN EQUATION g(k1)
     1 error(s): Pivot too small.

**** ERRORS/WARNINGS IN VARIABLE y(k1)
     1 error(s): Pivot too small.
Ipopt CNS Solved Feasible solution
Knitro CNS Solved Feasible solution
Miles MCP Intermediate Infeasible
Failure to converge
Minos CNS Unbounded
EXIT - The problem is unbounded (or badly scaled).
Minos CNS+basis Solved Feasible solution
Path CNS Solved Feasible solution
Path MCP Optimal Feasible solution
Snopt CNS Solved Feasible solution

Some solvers will report a feasible solution (without a further hint). Conopt does not, but gives a good indication where things go wrong. We are not given the set of dependent equations, but Conopt at least points us clearly to one culprit.

Not related to my problem, but another question is: what if singular but no solution exists? E.g. use: \[\begin{align}&y_1+y_2=1\\&y_1+y_2=2\end{align}\] Conopt gives the same results as above. The other solvers will typically report "infeasible". In most cases a solution corresponding to a phase I "min sum of infeasibilities" objective is provided where \(x\) is feasible and \(y\) is infeasible. Some solvers just give back a solution with many infeasibilities.

Monday, February 5, 2018

Fun with a quadratic model


In [1] a type of assignment problem is described:

I'd like to match n ~= 100 people to m ~= 5 teams, subject to a couple of constraints. For each of the n people, I have a vector of k numbers that correspond to various skill ratings. Each person also has a preference rating for each of the 5 teams, as well as a list of people they'd like to work with and a list of people they would like not to work with. 
Given some criteria for a 'good' team (for example, some linear combination of skill levels of all its team members, everyone is happy with the team they're on (given their preferences), etc.) that can be used for evaluation, how should I design an algorithm to match the people to teams?
The foundation for an optimization model for this problem can be an assignment structure with:

\[x_{p,t} = \begin{cases} 1 & \text{ if person $p$ is assigned to team $t$}\\  0 & \text{ otherwise}\end{cases}\]

The assignment constraints can look like: \[\begin{align} &\sum_t x_{p,t} \le 1 & \forall p\\&\sum_p x_{p,t} = \mathit{demand}_{t}& \forall t \end{align} \] The problem can be viewed as a multi-criteria optimization problem with objectives: \[\begin{align} &\max\> f_{skill} = \sum_{p,s,t} \mathit{skill}_{p,s} x_{p,t} && \text{skill sets} \\ &\max\> f_{\mathit{preft}} = \sum_{p,t} \mathit{prefteam}_{p,t} x_{p,t} && \text{team preferences}\\&\max\>f_{\mathit{prefw}} = \sum_{p,p',t|p' >p} \mathit{prefworker}_{p,p'} x_{p,t} x_{p',t} && \text{worker preferences}\end{align}\] where \(s\) is the set of different skills. A value \( \mathit{prefworker}_{p,p'}\lt 0 \) indicates workers \(p\) and \(p'\) dislike each other. Note that in the quadratic expression we sum only over \(p' \gt p\) to avoid double counting. A typical way to handle competing objectives is to form a weighted sum: \[\max\> z = \sum_k w_k f_k\] So our first model M1 can look like: \[\bbox[lightcyan,10px,border:3px solid darkblue]{ \begin{align} \max\> &z = \sum_k w_k f_k\\ & f_{skill} = \sum_{p,s,t} \mathit{skill}_{p,s} x_{p,t}\\ & f_{\mathit{preft}} = \sum_{p,t} \mathit{prefteam}_{p,t} x_{p,t} \\ & f_{\mathit{prefw}} = \sum_{p,p',t|p' >p} \mathit{prefworker}_{p,p'} x_{p,t} x_{p',t} \\ &\sum_t x_{p,t} \le 1 \\ &\sum_p x_{p,t} = \mathit{demand}_{t}\\ &x_{p,t} \in \{0,1\} \end{align}}\]


  • M1: Not many solvers can handle this problem as stated: Cplex is the major exception.
  • M2: We can convince Gurobi and Xpress to solve this by changing the quadratic equality constraint into a \(\le\) constraint. (Interestingly Cplex has much more of a tough time with this version than with M1).
  • M3: This model is formed by substituting out the \(f\) variables. We end up with an MIQP model.
  • M4: We can formulate this problem as a straight linear MIP by introducing binary variables \(y_{p,p',t}=x_{p,t} x_{p',t}\). This multiplication can be written as: \[\begin{align}&y_{p,p',t}\le x_{p,t}\\ &y_{p,p',t}\le x_{p',t}\\ &y_{p,p',t}\ge x_{p,t} + x_{p',t} -1\\ & f_{\mathit{prefw}} =  \sum_{p,p',t|p' >p} \mathit{prefworker}_{p,p'} y_{p,p',t}\\ & y_{p,p',t} \in \{0,1\} \end{align}\] This formulation removes any non-convexity from the problem.
  • M5: As suggested by Bob Fourer below in the comments, we can simplify the quadratic term in the objective \[\sum_{p,p',t|p' >p} \mathit{prefworker}_{p,p'} x_{p,t} x_{p',t} \] by \[\begin{align} & \sum_{p,t} x_{p,t} y_{p,t}\\& y_{p,t} = \sum_{p' > p} \mathit{prefworker}_{p,p'} x_{p',t}\\& y_{p,t} \text{ free}   \end{align}\] Note that this is a MIQP model but with a simpler Q matrix than M3.

In the table below I show how different solvers react on these different models. If you want your model to solve with as many solvers as possible, it is a good idea to reformulate this as a MIP model.

Model Solver Results
M1 (MIQCP)CplexOK
GurobiGurobi can only solve QCP with =L= or =G= quadratic rows. Could not load problem into Gurobi database (1)
MosekNonconvex model detected. Constraint 'objprefworker'(2) is a nonlinear equality. (*1290*)
XpressQuadratic equality constraint not supported: objprefworker. ERROR: Could not load problem into XPRESS optimizer.
M2 (MIQCP)CplexOK
GurobiOK
MosekConstraint 'objprefworker'(2) is not convex. Q should be positive semidefinite in a constraint with finite upper bound. (*1293*)
XpressOK with warning:
?900 Warning: The quadratic part of row 'R3' defines a nonconvex region. Please check your model or use Xpress-SLP. Could not solve final fixed LP. We report the unfixed solution.
M3 (MIQP)CplexOK
GurobiOK
MosekThe quadratic coefficient matrix in the objective is not negative semidefinite as expected for a maximization problem. (*1296*)
XpressOK
M4 (MIP)CplexOK
GurobiOK
MosekOK
XpressOK
M5 (MIQP)CplexOK
GurobiOK
MosekThe quadratic coefficient matrix in the objective is not negative semidefinite as expected for a maximization problem. (*1296*)
Xpress?899 Warning: The quadratic objective is not convex
Please check model or use Xpress-SLP
Problem is integer infeasible

Some of the messages are a bit over the top. If an LP and Q matrix  is called a 'database', what should we call a real database?

Often I make the argument that a MIP can perform better than a MIQP/MIQCP model. Here is another reason why such a reformulation may make sense: different solvers have different rules about what type of MIQCP/MIQP models they can solve. A formulation that works for one solver is no guarantee it will work for another solver. With a MIP formulation this is a little bit more predictable.

Performance


To show the unexpectedly large variation in performance of the models M1 through M4 we reproduce the Cplex results on a small randomly generated data set.

Model Rows Columns (discrete) Iterations Nodes Seconds Objective value
M1 109 504 (500) 3420 0 3.16 3178
M2 109 504 (500) time interrupt
M3 106 501 (500) 2977 0 4.13 3178
M4 74,359 25,254 (25,250) 19983 53 65.92 3178
M5 606 1,001 (500) 25203 1504 6 3178

Model M2 takes a very long time.

References


  1. Matching people with skills and preferences to groups, https://stackoverflow.com/questions/48482550/matching-people-with-skills-and-preferences-to-groups

Friday, February 2, 2018

Solving Sparse Systems of Equations with an optimizer: a basis trick


Sparse Systems of Linear Equations


Solving a sparse system of linear equations\[Ax=b\] is a task an LP solver should be quite well-equipped to do. A typical Simplex method will contain a rather sophisticated sparse LU factorization routine. Formulating this as a standard LP with a dummy objective leads to somewhat disappointing performance (see the table below). Here we try a very sparse system of \(n=9,801\) variables and equations.

One reason a simplex method is slow here it that the initial basis is chosen rather poorly. We can help the solver with this. We know that the basis status of the final solution is as follows:

  • All rows are nonbasic
  • All columns are basic
This suggests we can construct an advanced basis that reflects this prediction, and tell the LP solver to use this. In GAMS it is very easy to setup such a basis. A non-zero marginal indicates the corresponding row or column is non-basic. We can just use an assignment statement:

   equation e(i,j);
   e.m(i,j) = 1;

The variables just have their default marginal value of zero. The results are not something to sneeze at:


Solver Model Iterations Seconds
cplex LP 4358 27
cplex LP+basis 0 6.5

When we pass on an advanced basis the solver needs no simplex iterations (i.e. basis changes): it just factorizes the basis matrix and it is done.

Sparse Systems of Nonlinear Equations


When dealing with a large system of nonlinear equations \[F(x)=0\] we can do a similar thing. It helps only for algorithms that are simplex based such as MINOS. Here we solve a standard benchmark problem due to Broyden:

Broyden's tridiagonal problem

This is a quadratic problem with \(n=10,000\) equations and variables. Solvers like Cplex or Gurobi are not able to solve it. The model type CNS stands for Constrained Nonlinear System: a square system of non-linear equations (with some possible bounds) without an objective. With MINOS we see:

Solver Model Iterations Seconds
minos CNS The problem is unbounded (or badly scaled)
minos CNS+basis 0 0.2

We see again that a good starting basis can help: in this case it makes the difference between being able to solve the problem or not. Using an advanced basis, the solver does not need to do any simplex-type iterations and just can use some Newton algorithm which will converges extremely fast.

References


  1. Erwin Kalvelagen, Solving Systems of Linear Equations with GAMS, http://amsterdamoptimization.com/pdf/lineq.pdf
  2. C. G. Broyden, A Class of Methods for Solving Nonlinear Simultaneous Equations,  Mathematics of Computation, Vol. 19, No. 92 (Oct., 1965), pp. 577-593