Sunday, June 5, 2022

Maximizing Standard Deviation

In [1] a question was posed: how to maximize the standard deviation of a vector \(\color{darkred}x_i\), subject to side constraints using an LP/MIP formulation?

The standard deviation is usually defined as: \[\mathbf{sd}(x) = \sqrt{\frac{1}{n-1}\sum_i \left(x_i-\mu\right)^2}\] where \[\mu = \frac{\sum_i x_i}{n}\] and \(n\) is the number of elements in \(x\).

The first thing to note is that we can simplify this a bit. The solution will not change, if we replace \[\min/\max\> \mathbf{sd}(x)\] by \[\min/\max \>\sum_i \left(x_i-\mu\right)^2\]

To understand a bit what a solution with a high standard deviation looks like, let's have a look at the following model: \[\begin{align}\max & \sum_i \left(\color{darkred}x_i-\color{darkred}\mu\right)^2 \\ & \color{darkred}\mu = \frac{\sum_i \color{darkred}x_i}{\color{darkblue}n} \\ & \color{darkred}x_i \ge \color{darkred}x_{i-1}\\ & \color{darkred}x_i \in [1,4]\end{align}\] The solution is:

----     27 VARIABLE x.L  

i1 1.000,    i2 1.000,    i3 1.000,    i4 1.000,    i5 4.000,    i6 4.000,    i7 4.000,    i8 4.000


----     27 VARIABLE z.L                   =       18.000  objective
            PARAMETER stdev                =        1.604  standard deviation


We see two clusters: one at 1.0 and one at 4.0. Obviously, if your goal is to evenly distribute values, maximizing the standard deviation is not the way to go.

Note that this is a non-convex quadratic programming problem. You need a global solver for this problem. Possible solvers are Cplex, Gurobi, and general-purpose MINLP solvers like Baron and Antigone.

Back to the original model. We assume that we want to maximize the standard deviation, subject to a linear side-constraint and to overall bounds \([\color{darkblue}\ell,\color{darkblue}u]\). Here are some possible formulations and approximations.

Model 1: Nonconvex QP


The first formulation for the problem stated in [1] is:

Nonconvex QP Model
\[\begin{align}\max & \sum_i (\color{darkred}x_i-\color{darkred}\mu)^2 \\ & \color{darkred}\mu = \frac{\sum_i \color{darkred}x_i}{\color{darkblue}n} \\ & \sum_i \color{darkblue}a_i\cdot\color{darkred}x_i = \color{darkblue}b \\ & \color{darkred}x_i \in [\color{darkblue}\ell,\color{darkblue}u] \end{align}\]

Note that we have a single linear side constraint in this model.

Model 2: SOS1 variables


Here we try to replace the quadratic terms with absolute values: \[\max\>\sum_i |\color{darkred}x_i-\color{darkred}\mu|\] This will allow us to use a linear MIP solver. We use a simple variable splitting approach: \[s_i^+ - s_i^- = \color{darkred}x_i-\color{darkred}\mu\] where \(s_i^+, s_i^- \ge 0\). As we are maximizing the absolute values we need to make sure not both \(s_i^+, s_i^-\) are positive, leading to an unbounded problem. One way to enforce this is to form \(n\) SOS1 sets with two members \(s_i^+, s_i^-\). In a SOS1 set, only one member can be nonzero. So we have:

SOS1 Approximation Model
\[\begin{align}\max & \sum_i \left(\color{darkred}s_i^+ + \color{darkred}s_i^- \right) \\ & \color{darkred}s_i^+ - \color{darkred}s_i^- = \color{darkred}x_i-\color{darkred}\mu \\ & \color{darkred}\mu = \frac{\sum_i \color{darkred}x_i}{\color{darkblue}n} \\ & \sum_i \color{darkblue}a_i\cdot\color{darkred}x_i = \color{darkblue}b \\ & \color{darkred}x_i \in [\color{darkblue}\ell,\color{darkblue}u] \\ & \color{darkred}s_i^+, \color{darkred}s_i^- \ge 0 \\ & \{\color{darkred}s_i^+, \color{darkred}s_i^-\} \in {SOS1} \end{align}\]


As we will see later, there are problems with this formulation.

Model 3: Binary variables


Instead of using SOS1 sets to choose a single branch, we can also use binary variables. This is quite similar to the previous model. 

Binary Approximation Model
\[\begin{align}\max & \sum_i \left(\color{darkred}s_i^+ + \color{darkred}s_i^- \right) \\ & \color{darkred}s_i^+ - \color{darkred}s_i^- = \color{darkred}x_i-\color{darkred}\mu \\ & \color{darkred}s_i^+ \le \color{darkred}\delta_i \cdot (\color{darkblue}u-\color{darkblue}\ell)\\ & \color{darkred}s_i^- \le (1-\color{darkred}\delta_i) \cdot (\color{darkblue}u-\color{darkblue}\ell) \\ & \color{darkred}\mu = \frac{\sum_i \color{darkred}x_i}{\color{darkblue}n} \\ & \sum_i \color{darkblue}a_i\cdot\color{darkred}x_i = \color{darkblue}b \\ & \color{darkred}x_i \in [\color{darkblue}\ell,\color{darkblue}u] \\ & \color{darkred}s_i^+, \color{darkred}s_i^- \ge 0 \\ & \color{darkred}\delta_i \in \{0,1\} \end{align}\]


As all the variables \(\color{darkred}x_i\) live inside \([\color{darkblue}\ell,\color{darkblue}u]\), the mean will also be inside this interval. This means \(\color{darkblue}u-\color{darkblue}\ell\) is a valid upper bound on the variable \(\color{darkred}s_i^+, \color{darkred}s_i^-\).

Results


----    134 PARAMETER results  

              GlobalQP        SOS1    SOS1/bnd      Binary

i1               3.000       3.000       3.000       3.000
i2               0.500       0.500       0.500       0.500
i3               0.500       0.500       0.500       0.500
i4               3.000       3.000       3.000       3.000
i5               3.000       3.000       3.000       3.000
i6               3.000       3.000       3.000       3.000
i7               3.000       3.000       3.000       3.000
i8               0.500       0.500       0.500       0.500
i9               3.000       3.000       3.000       3.000
i10              0.500       0.500       0.500       0.500
i11              0.500       0.500       0.500       0.500
i12              0.500       0.500       0.500       0.500
i13              0.500       0.500       0.500       0.500
i14              0.500       0.500       0.500       0.500
i15              3.000       3.000       3.000       3.000
i16              0.500       0.500       0.500       0.500
i17              3.000       3.000       3.000       3.000
i18              3.000       3.000       3.000       3.000
i19              0.500       0.500       0.500       0.500
i20              0.500       0.500       0.500       0.500
i21              2.987       2.987       2.987       2.987
i22              3.000       3.000       3.000       3.000
i23              3.000       3.000       3.000       3.000
i24              3.000       3.000       3.000       3.000
i25              0.500       0.500       0.500       0.500
i26              0.500       0.500       0.500       0.500
i27              3.000       3.000       3.000       3.000
i28              0.500       0.500       0.500       0.500
i29              0.500       0.500       0.500       0.500
i30              0.500       0.500       0.500       0.500
status         Optimal     Optimal     Optimal     Optimal
obj             46.633      37.320      37.320      37.320
mu               1.666       1.666       1.666       1.666
stdev            1.268       1.268       1.268       1.268
time             0.125   12399.812       0.750       0.719
nodes          613.000 -2.14747E+9   14163.000   14307.000
iterations   10691.000 -1.28591E+8   69873.000   84000.000


We see some interesting things.

The SOS1 model is just taking forever. Largely because many of the nodes are unbounded. A small fragment of the log looks like:
 
Elapsed time = 35.09 sec. (12722.88 ticks, tree = 1.78 MB)
 5380775  6966        cutoff             36.6404               10482581    0.00%
 5763069  7166     unbounded             36.6404               11394958         
 6151718  7286     unbounded             36.6404               12102053         
 6533752  7353        cutoff             36.6404               12491050    0.00%
 6917221  7456        cutoff             36.6404               12955237    0.00%
 7294233  7603        cutoff             36.6404               14146645    0.00%
 7679187  7705        cutoff             36.6404               14930905    0.00%
 8062790  7714     unbounded             36.6404               15158021         
 8443859  7813     unbounded             36.6404               15935203         
 8829248  7941        cutoff             36.6404               16958187    0.00%

Note that the reported gaps of \(0\%\) are wrong. There is no valid upper bound, so there is no meaningful gap to report. A zero gap would mean: we are almost done. I can assure you this is not the case. I would classify this as a Clex bug. B.t.w., Gurobi is not doing any better on this model. 

The reported iteration and node counts for the SOS1 column are negative. This is typically the symptom of an integer overflow somewhere. That is likely a GAMS bug.  

We can remedy this disastrous performance by adding a bound on the objective. I used: \[\color{darkred}z \le \color{darkblue}n\cdot(\color{darkblue}u-\color{darkblue}\ell)\] This model corresponds to the column SOS1/bnd. This obviously gives a huge performance improvement. An alternative would be to impose bounds: \[\color{darkred}s_i^+, \color{darkred}s_i^- \le \color{darkblue}u-\color{darkblue}\ell\]

The linear approximations find the same optimal solution as the QP model. This is not guaranteed of course.

Conclusions


This is a small non-convex QP model that can be approximated by replacing a 2-norm penalty with a 1-norm. This exercise reveals a big problem in the SOS1 formulation: it just performs extremely poorly unless we add an upper bound to the objective. Along the way, we observed some bugs.

It looks like the most intuitive approach (a QP model) is also the best performer.

References


Appendix: GAMS model



*------------------------------------------------------------------
* data
*------------------------------------------------------------------

set i /i1*i30/;

scalars
   lo
/ 0.5 /
   up
/ 3 /
   range
;
range = up-lo;

parameter a(i) 'coefficients of constraint';
a(i) = uniform(0,2);


*------------------------------------------------------------------
* reporting macros
*------------------------------------------------------------------

parameter results(*,*);

acronym Optimal;

* macros for reporting
$macro stdev      sqrt(sum(i,sqr(x.l(i)-mu.l))/(card(i)-1))
$macro report(m,label)  \
    results(i,label) = x.l(i); \
    results(
'status',label)= m.solvestat; \
    results(
'status',label)$(m.solvestat=1) = Optimal; \
    results(
'obj',label) = z.l; \
    results(
'mu',label) = mu.l; \
    results(
'stdev',label) = stdev; \
    results(
'time',label) = m.resusd; \
    results(
'nodes',label) = m.nodusd; \
    results(
'iterations',label) = m.iterusd; \
   
display results;


*------------------------------------------------------------------
* non-convex QP model
*------------------------------------------------------------------


positive variable x(i);
x.lo(i) = lo;
x.up(i) = up;

variable
  z
'obj'
  mu
'mean'
;

Equations
   e    
'side constraint'
   obj  
'objective'
   mean 
'average'
;

e..
sum(i,a(i)*x(i)) =e= card(i);
mean.. mu =e=
sum(i,x(i))/card(i);

obj.. z =e=
sum(i, sqr(x(i)-mu));

model m1 /all/;
option qcp=cplex,threads=16;
m1.optfile=1;
solve m1 maximizing z using qcp;


$onecho > cplex.opt
OptimalityTarget=3
$offecho


report(m1,
'GlobalQP')


*------------------------------------------------------------------
* SOS1 based model
*------------------------------------------------------------------

set pm /'+','-'/;
sos1 variable s(i,pm);

equations
   split
   obj2
;

split(i).. s(i,
'+')-s(i,'-') =e= x(i)-mu;
obj2.. z =e=
sum((i,pm),s(i,pm));

model m2 /split,obj2,e,mean/;
solve m2 maximizing z using mip;

report(m2,
'SOS1')

*------------------------------------------------------------------
* SOS1 with bound on the objective variable
*------------------------------------------------------------------

z.up = range*
card(i);
solve m2 maximizing z using mip;

report(m2,
'SOS1/bnd')

z.up =
INF;

*------------------------------------------------------------------
* model using binary variables
*------------------------------------------------------------------

positive variable s2(i,pm);
binary variable b(i);

Equations
   split2a
   split2b
   split2c
   obj3
   ;

split2a(i).. s2(i,
'+')-s2(i,'-') =e= x(i)-mu;
split2b(i).. s2(i,
'+') =l= b(i)*range;
split2c(i).. s2(i,
'-') =l= (1-b(i))*range;
obj3.. z =e=
sum((i,pm),s2(i,pm));

model m3 /split2a,split2b,split2c,obj3,e,mean/;
solve m3 maximizing z using mip;

report(m3,
'Binary')

1 comment:

  1. Very interesting
    I love seeing how problems can be formulated into MILP

    ReplyDelete