## Monday, May 16, 2022

### Ranking using numpy.argsort

I needed to find a ranking of a large data set. Using Python, it makes sense to look at the numpy library for this.

Numpy has the function argsort, which returns index positions [1]. One would think these are exactly the ranks we are after. Unfortunately, this is not the case.

>>> import numpy as np
>>> a = [3.0, 1.0, 5.0, 2.0]
>>> indx = np.argsort(a)
>>> indx
array([1, 3, 0, 2], dtype=int64)


I would expect:

indx = [2, 0, 3, 1]

Actually, the reported indices are for a reversed mapping: from the (unknown) sorted vector to the original unsorted vector.

On the left is what I was after, and on the right is what argsort returns.
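To see the difference concretely: indexing the original array with indx yields the sorted array. So indx maps positions in the sorted vector back to positions in the original vector, which is the opposite direction of a ranking:

```python
import numpy as np

a = np.array([3.0, 1.0, 5.0, 2.0])
indx = np.argsort(a)

# indx[p] is the original position of the element that lands at
# sorted position p, so a[indx] is simply the sorted array.
print(a[indx])   # [1. 2. 3. 5.]
```

The rank of element i is the sorted position where a[i] ends up: the inverse of this mapping.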

It is not very difficult to get the inverse mapping. Here are a few ways to do this:
1. Using loops. Use the simple fact that for a permutation $$p$$ and its inverse $$p'$$ we have: $p_i=j \iff p'_j=i$
>>> rank = np.empty_like(indx)
>>> for i,j in enumerate(indx):
...     rank[j]=i
...
>>> rank
array([2, 0, 3, 1], dtype=int64)

Looping is in general quite slow in Python. So we may want to look into some alternatives.
2. Fancy indexing [2]. Here we use the array indx to permute the values [0,1,2,3].
>>> rank = np.empty_like(indx)
>>> rank[indx] = np.arange(len(indx))
>>> rank
array([2, 0, 3, 1], dtype=int64)

3. Applying argsort twice [2]. This is a bit of a surprise, but it does exactly what we want.
>>> rank = np.argsort(indx)
>>> rank
array([2, 0, 3, 1], dtype=int64)

This one is the most intriguing of course: argsort(argsort(a)) gives the ranking.
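Out of curiosity, we can also time the three approaches. A rough sketch using timeit (the array size is arbitrary and the printed timings are machine-dependent; the only safe expectation is that the explicit Python loop is slowest):

```python
import timeit

import numpy as np

n = 100_000
a = np.random.rand(n)
indx = np.argsort(a)

def rank_loop():
    # explicit loop over the index array
    rank = np.empty_like(indx)
    for i, j in enumerate(indx):
        rank[j] = i
    return rank

def rank_fancy():
    # fancy indexing: scatter 0..n-1 into the inverse positions
    rank = np.empty_like(indx)
    rank[indx] = np.arange(len(indx))
    return rank

def rank_double():
    # argsort applied twice
    return np.argsort(indx)

# all three produce the same ranking
assert (rank_loop() == rank_fancy()).all()
assert (rank_fancy() == rank_double()).all()

for f in (rank_loop, rank_fancy, rank_double):
    print(f.__name__, timeit.timeit(f, number=10))
```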

An alternative is to use scipy.stats.rankdata. The above ranking can be replicated with:

>>> import scipy.stats as stats
>>> rank = stats.rankdata(a,method='ordinal')-1
>>> rank
array([2, 0, 3, 1], dtype=int64)
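One caveat, not an issue in the example above: with tied values, the double-argsort trick always hands out distinct (ordinal) ranks. If we want ties broken by original position, np.argsort should be told to use a stable sort. A small sketch:

```python
import numpy as np

a = np.array([3.0, 1.0, 3.0])

# kind='stable' keeps equal values in their original order, so the
# two 3.0's receive the distinct ordinal ranks 1 and 2.
rank = np.argsort(np.argsort(a, kind='stable'), kind='stable')
print(rank)   # [1 0 2]

# scipy.stats.rankdata offers other tie policies; for instance
# method='average' would give [2.5, 1.0, 2.5] here.
```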


## Wednesday, May 4, 2022

### Evil Sudoku

On sudoku.com[1] there is now an "evil" difficulty level. I found these very difficult to solve by hand (I usually give up).

Of course, a good MIP presolver/preprocessor devours these in no time. It should solve models like this in less than 0.1 seconds using zero Simplex iterations and zero B&B nodes.

Version identifier: 20.1.0.1 | 2021-04-07 | 3a818710c
CPXPARAM_MIP_Display                             4
CPXPARAM_MIP_Pool_Capacity                       0
CPXPARAM_MIP_Tolerances_AbsMIPGap                0
Generic callback                                 0x50
Tried aggregator 2 times.
MIP Presolve eliminated 168 rows and 580 columns.
Aggregator did 28 substitutions.
Reduced MIP has 128 rows, 122 columns, and 501 nonzeros.
Reduced MIP has 122 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (0.93 ticks)
Found incumbent of value 0.000000 after 0.06 sec. (1.66 ticks)

Root node processing (before b&c):
Real time             =    0.06 sec. (1.67 ticks)
Sequential b&c:
Real time             =    0.00 sec. (0.00 ticks)
------------
Total (root+branch&cut) =    0.06 sec. (1.67 ticks)

--- MIP status (101): integer optimal solution.
--- Cplex Time: 0.06sec (det. 1.67 ticks)

Proven optimal solution
MIP Solution:            0.000000    (0 iterations, 0 nodes)
Final Solve:             0.000000    (0 iterations)

Best possible:           0.000000
Absolute gap:            0.000000
Relative gap:            0.000000


CBC is a bit terser (not always the case):

COIN-OR CBC      38.2.1 96226ea8 Feb 19, 2022          WEI x86 64bit/MS Window

COIN-OR Branch and Cut (CBC Library 2.10)
written by J. Forrest

Calling CBC main solution routine...
No integer variables - nothing to do

Solved to optimality (within gap tolerances optca and optcr).
MIP solution:    0.000000e+00   (0 nodes, 0.019 seconds)

Best possible:   0.000000e+00
Absolute gap:    0.000000e+00   (absolute tolerance optca: 0)
Relative gap:    0.000000e+00   (relative tolerance optcr: 0.0001)


#### References

1. https://sudoku.com/evil/

#### Appendix: GAMS model

$ontext

   Evil Sudoku from sudoku.com

   References:
      https://yetanothermathprogrammingconsultant.blogspot.com/2022/05/evil-sudoku.html

$offtext

set
   a        'all areas (rows,columns,squares)' /r1*r9,c1*c9,s1*s9/
   i(a)     'rows'    /r1*r9/
   j(a)     'columns' /c1*c9/
   s(a)     'squares' /s1*s9/
   u(a,i,j) 'areas with unique values, a=area name, (i,j) are the cells'
   k        'values' /1*9/
;

*
* populate u(a,i,j)
*
u(i,i,j) = yes;
u(j,i,j) = yes;
u(s,i,j)$(ord(s)=3*ceil(ord(i)/3)+ceil(ord(j)/3)-3) = yes;
display u;

*
* given values
*
table v0(i,j)
      c1  c2  c3  c4  c5  c6  c7  c8  c9
   r1                      9       5
   r2  7                           1
   r3          8   2           7       9
   r4  6                   4   9       8
   r5      4       1
   r6                                  5
   r7          2       3
   r8  8                   1   6       4
   r9                              7
;

*
* MIP model
*
binary variable x(i,j,k);
x.fx(i,j,k)$(v0(i,j)=k.val) = 1;

variable z 'dummy objective';

equations
   dummy       'objective'
   unique(a,k) 'all-different'
   values(i,j) 'one value per cell'
;

dummy..       z =e= 0;
unique(a,k).. sum(u(a,i,j), x(i,j,k)) =e= 1;
values(i,j).. sum(k, x(i,j,k)) =e= 1;

model sudoku /all/;

*
* solve
*
solve sudoku minimizing z using mip;

*
* display solution
*
parameter v(i,j) 'solution';
v(i,j) = sum(k, k.val*x.l(i,j,k));
option v:0;
display v;

Most models have separate constraints for rows, columns, and squares. In this formulation, I generalize this to "areas". The multidimensional set $$u(a,i,j)$$ contains for each area $$a$$ the corresponding cells $$(i,j)$$.  You can display this set to view it. Once we have this set $$u$$ properly populated, the uniqueness constraints become exceedingly simple. The idea is that constraints are more difficult to debug than sets/parameters. So I am inclined to spend more effort in creating somewhat complex sets, expecting some pay-off in constraints becoming so simple that few things can go wrong.

[Figure: the set u(a,i,j) for different values of a]
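The square index used when populating $$u$$, i.e. $$3\lceil \mathit{ord}(i)/3\rceil+\lceil \mathit{ord}(j)/3\rceil-3$$, maps each cell $$(i,j)$$ to one of the nine 3x3 squares. The same formula in Python (the helper `square` is just for illustration, not part of the GAMS model):

```python
import math

def square(i: int, j: int) -> int:
    """Return the 1-based 3x3 square index for 1-based cell (i,j)."""
    return 3 * math.ceil(i / 3) + math.ceil(j / 3) - 3

# cells (1,1) and (2,3) share square 1; (5,8) is in square 6; (9,9) in 9
print(square(1, 1), square(2, 3), square(5, 8), square(9, 9))   # 1 1 6 9
```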

As usual, we have $\color{darkred}x_{i,j,k} = \begin{cases} 1 & \text{if cell $(i,j)$ has value $k$} \\ 0 & \text{otherwise}\end{cases}$ To recover the Sudoku solution, we form $\color{darkblue}v_{i,j} = \sum_k \color{darkblue}k\cdot \color{darkred}x^*_{i,j,k}$

The final solution is:

----     73 PARAMETER v  solution

c1          c2          c3          c4          c5          c6          c7          c8          c9

r1           3           2           4           7           1           9           8           5           6
r2           7           5           9           8           6           3           4           1           2
r3           1           6           8           2           4           5           7           3           9
r4           6           7           1           3           5           4           9           2           8
r5           9           4           5           1           8           2           3           6           7
r6           2           8           3           9           7           6           1           4           5
r7           4           9           2           6           3           7           5           8           1
r8           8           3           7           5           2           1           6           9           4
r9           5           1           6           4           9           8           2           7           3
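As a sanity check, a few lines of Python (outside GAMS, of course, and assuming the table above was transcribed correctly) confirm that every row, column, and 3x3 square of this solution contains the digits 1 through 9:

```python
v = [
    [3, 2, 4, 7, 1, 9, 8, 5, 6],
    [7, 5, 9, 8, 6, 3, 4, 1, 2],
    [1, 6, 8, 2, 4, 5, 7, 3, 9],
    [6, 7, 1, 3, 5, 4, 9, 2, 8],
    [9, 4, 5, 1, 8, 2, 3, 6, 7],
    [2, 8, 3, 9, 7, 6, 1, 4, 5],
    [4, 9, 2, 6, 3, 7, 5, 8, 1],
    [8, 3, 7, 5, 2, 1, 6, 9, 4],
    [5, 1, 6, 4, 9, 8, 2, 7, 3],
]

digits = set(range(1, 10))
rows_ok = all(set(row) == digits for row in v)
cols_ok = all({v[i][j] for i in range(9)} == digits for j in range(9))
sqrs_ok = all({v[3*bi + di][3*bj + dj] for di in range(3) for dj in range(3)} == digits
              for bi in range(3) for bj in range(3))
print(rows_ok, cols_ok, sqrs_ok)   # True True True
```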


## Sunday, May 1, 2022

### Getting rid of non-linearities

In [1], an attempt was made to implement the following non-linear constraint: $\frac{\displaystyle\sum_t \max\left\{\sum_i CF_{i,t}\cdot q_i -L_t, 0 \right\}}{\displaystyle\sum_t L_t} \le 0.015$ Obviously code like:

cfm = quicksum( max(quicksum(cf[i][t] * q[i] - L[t] for i in range(I)), 0) for t in range(T) ) / quicksum(L[t] for t in range(T)) <= 0.015


is really not what we want to see. It is best not to start up your editor right away and write Python code like this. Rather, get a piece of paper and focus on the math first.

There are two non-linearities: a division and a $$\max()$$ function. Note that I am not sure which symbols are variables and which are parameters. I'll try to explain some possible cases below.

#### Division

Exogenous divisions of the form $\frac{\color{darkred}x}{\color{darkblue}a}$ where $$\color{darkblue}a$$ is a constant, are not an issue. This is really just a linear coefficient $$1/\color{darkblue}a$$. If you have $\frac{\color{darkred}x}{\sum_i \color{darkblue}a_i}$ in the constraints, I would advise calculating $\color{darkblue}\alpha :=\sum_i \color{darkblue}a_i$ in advance, so we can simplify things to $\frac{\color{darkred}x}{\color{darkblue}\alpha}$. It is a good habit to keep constraints as simple as possible, as they are more difficult to debug than parameter assignments.

If we have something like $\frac{\color{darkred}x}{\color{darkred}y}$ (endogenous division) we need to be careful and worry about division by zero (or, more generally, that the expression can assume very large values if $$\color{darkred}y \approx 0$$). If $$\color{darkred}y$$ is a positive variable, we can protect things a bit by using a lower bound, say $$\color{darkred}y\ge 0.0001$$. This is difficult to do if $$\color{darkred}y$$ is a free variable (we want a hole in the middle). If we have $\frac{\color{darkred}x}{\sum_i\color{darkred}y_i}$ it may be better to write this as: \begin{align}&\frac{\color{darkred}x}{\color{darkred}z}\\ & \color{darkred}z= \sum_i \color{darkred}y_i \\ &\color{darkred}z\ge 0.0001 \end{align} at the expense of an extra variable and constraint. It has the additional advantage of making the model less non-linear (fewer non-linear variables, smaller non-linear part of the Jacobian). Note that we assumed here that $$\sum_i\color{darkred}y_i \ge 0$$. If this expression can assume both negative and positive values, this will not work.

If the division is not embedded in other non-linear expressions, we often can multiply things out. E.g. if we have the constraint: $\frac{\color{darkred}x}{\sum_i\color{darkred}y_i} \le 0.015$ then we can write: $\color{darkred}x \le 0.015 \cdot \sum_i\color{darkred}y_i$ This is now linear. This is what we can apply to our original constraint: $\sum_t \max\left\{\sum_i CF_{i,t}\cdot q_i -L_t, 0 \right\} \le 0.015\cdot \sum_t L_t$
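The multiplied-out form agrees with the ratio form whenever the denominator is strictly positive. A quick numeric sanity check on made-up data (all names below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(123)
agree = True
for _ in range(1000):
    x = rng.normal()
    y = rng.uniform(0.1, 1.0, size=5)      # keep the denominator positive
    ratio_ok = x / y.sum() <= 0.015        # original ratio form
    linear_ok = x <= 0.015 * y.sum()       # multiplied-out form
    agree = agree and (ratio_ok == linear_ok)
print(agree)
```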

#### Max function

We can feed $$\max\{\color{darkred}x,0\}$$ to a general-purpose NLP solver. However, this is dangerous. Although this function is continuous, it is non-differentiable at $$\color{darkred}x=0$$.

In the special case $\max\{\color{darkred}x,0\} \le \color{darkred}y$ we can write: \begin{align} \color{darkred}x &\le \color{darkred}y \\ 0 & \le \color{darkred}y\end{align} If $$\color{darkred}y$$ represents an expression, you may want to use an extra variable to prevent duplication.

If we have $\sum_i \max\{\color{darkred}x_i,0\} \le \color{darkred}y$ we can do: \begin{align} & \color{darkred}z_i \ge 0 \\ & \color{darkred}z_i \ge \color{darkred}x_i \\ &\sum_i \color{darkred}z_i \le \color{darkred}y \end{align} Note that the variables $$\color{darkred}z_i$$ are a bit difficult to interpret. They can be larger than $$\max\{\color{darkred}x_i,0\}$$ if the inequality is not tight. The reason I use $$\color{darkred}z_i \ge \max\{\color{darkred}x_i,0\}$$ is that it is sufficient for our purpose and that $$\color{darkred}z_i = \max\{\color{darkred}x_i,0\}$$ is just way more difficult to represent.

This is the reformulation we can apply to our original problem: \begin{align}&z_t \ge \sum_i CF_{i,t}\cdot q_i -L_t \\ & z_t \ge 0 \\ & \sum_t z_t \le 0.015\cdot \sum_t L_t \end{align} This is now completely linear: we have removed the division and the $$\max(.,0)$$ function.
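The equivalence can be checked numerically on made-up data (the dimensions and values of $$CF$$, $$L$$, and $$q$$ below are arbitrary): setting each $$z_t$$ to its smallest feasible value $$\max\{\sum_i CF_{i,t}q_i-L_t,0\}$$, the linear constraint holds exactly when the original nonlinear one does:

```python
import numpy as np

rng = np.random.default_rng(42)
T, I = 12, 4
CF = rng.normal(size=(I, T))
L = rng.uniform(0.5, 1.5, size=T)
q = rng.uniform(0.0, 1.0, size=I)

# original nonlinear constraint
excess = CF.T @ q - L                       # sum_i CF[i,t]*q[i] - L[t]
nonlinear_ok = np.maximum(excess, 0).sum() / L.sum() <= 0.015

# linear reformulation with the smallest feasible z_t:
# z_t >= excess_t and z_t >= 0
z = np.maximum(excess, 0)
linear_ok = z.sum() <= 0.015 * L.sum()

print(nonlinear_ok, linear_ok)              # the two forms agree
```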