Tuesday, March 23, 2021

SOS1 sets: when to use them

Short answer: almost never.


Beginners in optimization are always very excited when they read about SOS sets in the documentation. This looks like a very efficient tool to use for constructs like pick (at most) 1 out of \(n\) (SOS1) or for interpolation (SOS2). However, experienced modelers use SOS sets very sparingly. Why? Well, solvers like binary variables much better than SOS sets. Binary variables yield great bounding and cuts (there has been enormous progress in this area), while SOS constructs are really just about branching. 


Illustration


Consider the assignment problem: 


Assignment problem A
\[\begin{align}\max& \sum_{i,j} \color{darkblue}c_{i,j} \cdot \color{darkred}x_{i,j} \\ & \sum_j \color{darkred}x_{i,j} \le 1 && \forall i \\ & \sum_i \color{darkred}x_{i,j} \le 1 && \forall j \\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align}\]


This MIP model can actually be solved as an LP (the variables are integer-valued automatically). A SOS1 version of this model can look like:

Assignment problem B, SOS1 version
\[\begin{align}\max& \sum_{i,j} \color{darkblue}c_{i,j} \cdot \color{darkred}x_{i,j} \\ & \color{darkred}x_{i,1},\color{darkred}x_{i,2},\dots,\color{darkred}x_{i,n}\in \mathbf{SOS1} && \forall i \\ & \color{darkred}x_{1,j},\color{darkred}x_{2,j},\dots,\color{darkred}x_{n,j}\in \mathbf{SOS1} && \forall j \\ & \color{darkred}x_{i,j} \in [0,1] \end{align}\]


Solver logs


Let's study some solver logs:

Model A (binary variables)Model B (SOS1 variables)
Found incumbent of value 0.000000 after 0.00 sec. (0.00 ticks)
Tried aggregator 1 time.
Reduced MIP has 20 rows, 100 columns, and 200 nonzeros.
Reduced MIP has 100 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (0.12 ticks)
Probing time = 0.00 sec. (0.08 ticks)
Tried aggregator 1 time.
Reduced MIP has 20 rows, 100 columns, and 200 nonzeros.
Reduced MIP has 100 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (0.13 ticks)
Probing time = 0.00 sec. (0.08 ticks)
Clique table members: 20.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 4 threads.
Root relaxation solution time = 0.00 sec. (0.09 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0+    0                            0.0000      431.7088              ---
*     0+    0                           38.0450      431.7088              ---
*     0     0      integral     0       81.6087       81.6087       27    0.00%
Elapsed time = 0.00 sec. (0.64 ticks, tree = 0.00 MB, solutions = 3)

Root node processing (before b&c):
  Real time             =    0.00 sec. (0.64 ticks)
Parallel b&c, 4 threads:
  Real time             =    0.00 sec. (0.00 ticks)
  Sync time (average)   =    0.00 sec.
  Wait time (average)   =    0.00 sec.
                          ------------
Total (root+branch&cut) =    0.00 sec. (0.64 ticks)

Solution pool: 3 solutions saved.

MIP - Integer optimal solution:  Objective =  8.1608732210e+01
Solution time =    0.00 sec.  Iterations = 27  Nodes = 0
Deterministic time = 0.64 ticks  (162.23 ticks/sec)    
Found incumbent of value 0.000000 after 0.00 sec. (0.00 ticks)
Tried aggregator 1 time.
MIP Presolve eliminated 160 rows and 0 columns.
MIP Presolve added 320 rows and 80 columns.
Reduced MIP has 160 rows, 180 columns, and 960 nonzeros.
Reduced MIP has 80 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (0.16 ticks)
Probing time = 0.00 sec. (0.16 ticks)
Tried aggregator 1 time.
Detecting symmetries...
Reduced MIP has 160 rows, 180 columns, and 960 nonzeros.
Reduced MIP has 80 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (0.40 ticks)
Probing time = 0.00 sec. (0.16 ticks)
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 4 threads.
Root relaxation solution time = 0.00 sec. (0.36 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0+    0                            0.0000      431.7088              ---
*     0+    0                            1.7175      431.7088              ---
*     0     0      integral     0       81.6087       81.6087       27    0.00%
Elapsed time = 0.00 sec. (1.58 ticks, tree = 0.00 MB, solutions = 3)

Root node processing (before b&c):
  Real time             =    0.00 sec. (1.59 ticks)
Parallel b&c, 4 threads:
  Real time             =    0.00 sec. (0.00 ticks)
  Sync time (average)   =    0.00 sec.
  Wait time (average)   =    0.00 sec.
                          ------------
Total (root+branch&cut) =    0.00 sec. (1.59 ticks)

Solution pool: 3 solutions saved.

MIP - Integer optimal solution:  Objective =  8.1608732210e+01
Solution time =    0.01 sec.  Iterations = 27  Nodes = 0
Deterministic time = 1.59 ticks  (301.88 ticks/sec)    

Optimize a model with 20 rows, 100 columns and 200 nonzeros
Model fingerprint: 0xc82507f1
Variable types: 0 continuous, 100 integer (100 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [5e-02, 1e+01]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective 38.0450051
Presolve time: 0.00s
Presolved: 20 rows, 100 columns, 200 nonzeros
Variable types: 0 continuous, 100 integer (100 binary)

Root relaxation: objective 8.160873e+01, 11 iterations, 0.00 seconds

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0      81.6087322   81.60873  0.00%     -    0s

Explored 0 nodes (11 simplex iterations) in 0.00 seconds
Thread count was 4 (of 64 available processors)

Solution count 2: 81.6087 38.045

Optimal solution found (tolerance 1.00e-04)
Best objective 8.160873221000e+01, best bound 8.160873221000e+01, gap 0.0000%    
    

Optimize a model with 0 rows, 100 columns and 0 nonzeros
Model fingerprint: 0xfd55cf8e
Model has 20 SOS constraints
Variable types: 100 continuous, 0 integer (0 binary)
Coefficient statistics:
  Matrix range     [0e+00, 0e+00]
  Objective range  [5e-02, 1e+01]
  Bounds range     [1e+00, 1e+00]
  RHS range        [0e+00, 0e+00]
Found heuristic solution: objective -0.0000000
Presolve added 38 rows and 18 columns
Presolve time: 0.00s
Presolved: 38 rows, 118 columns, 236 nonzeros
Found heuristic solution: objective 73.9615015
Variable types: 0 continuous, 118 integer (118 binary)

Root relaxation: objective 8.160873e+01, 26 iterations, 0.00 seconds

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0      81.6087322   81.60873  0.00%     -    0s

Explored 0 nodes (26 simplex iterations) in 0.00 seconds
Thread count was 4 (of 64 available processors)

Solution count 3: 81.6087 73.9615 -0

Optimal solution found (tolerance 1.00e-04)
Best objective 8.160873221000e+01, best bound 8.160873221000e+01, gap 0.0000%    
    

We see in the logs of the SOS1 formulation, that the SOS1 variables are replaced by binary variables.

In this case, SOS1 is not really helpful. There are cases when they are. For instance, consider the complementarity constraint \[\begin{align}&\color{darkred}x \cdot \color{darkred}y = 0\\ & \color{darkred}x\ge 0,\color{darkred}y \ge 0\end{align}\] We can write this as: \[\begin{align}& \color{darkred}x,\color{darkred}y \in \mathbf{SOS1}\\ &\color{darkred}x\ge 0,\color{darkred}y \ge 0\end{align}\] This can be much better than  \[\begin{align} & \color{darkred}x \le \color{darkblue}M \cdot \color{darkred}\delta \\ &\color{darkred}y \le \color{darkblue}M \cdot (1-\color{darkred}\delta) \\ &\color{darkred}x\ge 0,\color{darkred}y \ge 0\\ & \color{darkred}\delta\in\{0,1\}\end{align}\] as it prevents the need for big-M constants.  This goal can also be achieved with indicator constraints: \[\begin{align} & \color{darkred} \delta=0 \Rightarrow \color{darkred}x=0 \\ & \color{darkred} \delta=1 \Rightarrow \color{darkred}y=0 \\ &\color{darkred}x\ge 0,\color{darkred}y \ge 0\\ & \color{darkred}\delta\in\{0,1\}\end{align}\]

SOS2 sets simplify the modeling of piecewise linear functions. However, often better approaches can be implemented using binary variables. Some of the newer techniques for this, use a logarithmic number of binary variables.


References


3 comments:

  1. Another annoying detail about SOS constraints (and indicator constraints) is that, at least with Gurobi, using them prevents you from asking the solver for multiple different good solutions when you have continuous variables. Normally, when you ask for that, it considers solutions with the same integer values but different continuous values the same, but apparently not when you have SOS constraints in the model.

    ReplyDelete
    Replies
    1. That may be related to the introduction of binary variables behind the scenes (like was demonstrated in my example).

      Delete
  2. SOS1 stands for special ordered set of type 1, so if you see SOS1 being advocated for a situation where there's no ordering of any set involved, you have reason to be suspicious. I wrote about the origins and uses of SOS1 in http://bob4er.blogspot.com/2013/04/special-sos-part-1.html.

    ReplyDelete