Friday, February 5, 2021

Non-differentiable NLP vs LP

In [1] the following simple problem is posed:

Given data 

p = [50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61]

try to track these data as closely as possible by minimizing the largest deviation, while obeying the restrictions:

\[\begin{align}&\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]
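Before modeling, it is worth verifying that exact tracking is impossible: the data themselves jump by more than 1.5 in places. A quick check (a sketch, not from the original post):

```python
import numpy as np

# the data to be tracked
p = np.array([50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61])

# consecutive increases; some exceed the limit of 1.5,
# so x cannot follow p exactly
print(np.diff(p))
print(np.diff(p).max())  # 3.0 (the jump from 57 to 60)
```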

Non-differentiable NLP model

An obvious way to formulate this is: \[\begin{align} \min&\max_i |\color{darkred}x_i-\color{darkblue}p_i| \\ &\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]

This is a non-differentiable formulation, and we are asking for trouble when trying to solve it with a standard NLP solver. For instance, when solving with CONOPT, we may see:

               S O L V E      S U M M A R Y

     MODEL   m1                  OBJECTIVE  z
     TYPE    DNLP                DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  25

**** SOLVER STATUS     4 Terminated By Solver      
**** MODEL STATUS      7 Feasible Solution         
**** OBJECTIVE VALUE                1.2500

 RESOURCE USAGE, LIMIT          0.062 10000000000.000
 ITERATION COUNT, LIMIT        31    2147483647
 EVALUATION ERRORS              0             0
CONOPT 3         33.2.0 r4f23b21 Released Dec 01, 2020 WEI x86 64bit/MS Window
    C O N O P T 3   version 3.17L
    Copyright (C)   ARKI Consulting and Development A/S
                    Bagsvaerdvej 246 A
                    DK-2880 Bagsvaerd, Denmark
    The model has 11 variables and 10 constraints
    with 29 Jacobian elements, 10 of which are nonlinear.
    The Hessian of the Lagrangian has 10 elements on the diagonal,
    45 elements below the diagonal, and 10 nonlinear variables.
                   Pre-triangular equations:   0
                   Post-triangular equations:  10
 ** Feasible solution. Convergence too slow. The change in objective
    has been less than 3.7500E-12 for 20 consecutive iterations

CONOPT realizes it is in trouble and declares the solution just feasible instead of optimal. Indeed, we can do better than an objective of 1.25. Other NLP solvers may not even know they are in trouble and just report a solution, declaring it (locally) optimal.

There is a big problem with this model. Most NLP solvers assume the objective and non-linear constraints are smooth (i.e., differentiable). That assumption is violated here, and such lack of smoothness is often a recipe for disaster, as this example nicely illustrates.

I often complain about non-differentiability in models I see on forums. Often the reaction is: "Who cares?". Indeed, NLP solvers will often deliver some solution without any warning. This post is about a small problem where we really can do much better.

Note that the initial point also plays a role here. With a different starting point, you may find different solutions.
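The same pitfall is easy to reproduce outside GAMS. A minimal sketch using SciPy (my construction, not from the original post; SLSQP stands in for a generic local solver that assumes smoothness):

```python
import numpy as np
from scipy.optimize import minimize

p = np.array([50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61])

# non-differentiable objective: max_i |x_i - p_i|
def maxdev(x):
    return np.max(np.abs(x - p))

cons = [
    {"type": "eq",   "fun": lambda x: x[0] - p[0]},       # x1 = p1
    {"type": "ineq", "fun": lambda x: 1.5 - np.diff(x)},  # x(i) - x(i-1) <= 1.5
]

# SLSQP relies on (finite-difference) derivatives, which max/abs do not
# provide in a well-behaved way
res = minimize(maxdev, x0=np.full(len(p), 50.0), constraints=cons)
print(res.fun)  # compare against the true optimum of 0.75
```

Depending on the starting point, the reported objective may be well above the true optimum of 0.75, and the solver gives little indication that anything went wrong.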

A global NLP solver may help for this model. Some of the ones I tried don't support the max() function, so we need to reformulate the model. When doing that, we can actually do a bit better and form a linear programming problem.

LP model

The model is easily reformulated into an LP. This is almost always a good idea: an LP can be solved quickly and reliably to proven optimality. Here is the LP model: \[\begin{align}\min\>&\color{darkred}z\\ & \color{darkred}z \ge \color{darkred}x_i-\color{darkblue}p_i\\ &\color{darkred}z \ge \color{darkblue}p_i- \color{darkred}x_i\\ & \color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]

This is a much more reliable model. Any LP solver will solve this very quickly and deliver:

----     40 VARIABLE x.L  

i1  50.000,    i2  49.750,    i3  51.250,    i4  52.750,    i5  54.250,    i6  54.750,    i7  56.250,    i8  57.750
i9  59.250,    i10 60.250

----     40 VARIABLE z.L                   =        0.750  
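For comparison outside GAMS, the same LP can be sketched with SciPy's linprog (my construction, not from the original post):

```python
import numpy as np
from scipy.optimize import linprog

p = np.array([50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61])
n = len(p)

# variables: x(1..n) followed by z; minimize z
c = np.zeros(n + 1)
c[-1] = 1.0

A_ub, b_ub = [], []
for i in range(n):
    # z >= x(i) - p(i)   <=>   x(i) - z <= p(i)
    row = np.zeros(n + 1); row[i] = 1.0; row[-1] = -1.0
    A_ub.append(row); b_ub.append(p[i])
    # z >= p(i) - x(i)   <=>   -x(i) - z <= -p(i)
    row = np.zeros(n + 1); row[i] = -1.0; row[-1] = -1.0
    A_ub.append(row); b_ub.append(-p[i])
for i in range(1, n):
    # x(i) - x(i-1) <= 1.5
    row = np.zeros(n + 1); row[i] = 1.0; row[i - 1] = -1.0
    A_ub.append(row); b_ub.append(1.5)

# x(1) = p(1)
A_eq = np.zeros((1, n + 1)); A_eq[0, 0] = 1.0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[p[0]],
              bounds=[(None, None)] * (n + 1))
print(res.fun)  # 0.75, matching the GAMS output above
```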

Even an NLP solver like CONOPT will do a good job on this LP:

               S O L V E      S U M M A R Y

     MODEL   m2                  OBJECTIVE  z
     TYPE    LP                  DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  63

**** SOLVER STATUS     1 Normal Completion         
**** MODEL STATUS      1 Optimal                   
**** OBJECTIVE VALUE                0.7500

 RESOURCE USAGE, LIMIT          0.062 10000000000.000
 ITERATION COUNT, LIMIT         4    2147483647
CONOPT 3         33.2.0 r4f23b21 Released Dec 01, 2020 WEI x86 64bit/MS Window
    C O N O P T 3   version 3.17L
    Copyright (C)   ARKI Consulting and Development A/S
                    Bagsvaerdvej 246 A
                    DK-2880 Bagsvaerd, Denmark
                   Pre-triangular equations:   0
                   Post-triangular equations:  0
 ** Optimal solution. There are no superbasic variables.

Note: CONOPT "knows" this is a completely linear model (GAMS provides information that tells whether a Jacobian element is linear or non-linear).

This model is not a particularly good tool to fit data. As we only minimize the maximum deviation, there is no incentive to stay really close to the data points.

The orange line (\(\color{darkred}x\)) stays within a distance of 0.75 of the blue line (\(\color{darkblue}p\)). But we could track the blue line much better than this.


Discontinuities and non-differentiable functions can cause major headaches for many NLP solvers. It is always a good idea to see if we can reformulate the model so it behaves better. I see many NLP models that suffer from non-differentiable or even discontinuous functions. Often the modeler is not even aware of this, or underestimates the problems it can cause. I always try to find a workaround, which is often not very difficult. This is a nice demonstration of the problem and how we can address it.


References

1. Minimizing the peak difference of elements of two lists given some constraints. Demonstration of the same problem using different tools and solvers.

Appendix A: GAMS model

GAMS requires the DNLP model type when functions like abs() and max() are used in the model. This is meant as a warning to the modeler: you are in dangerous territory. Here we see that this is a useful hint.

* data

set i /i1*i10/;

parameter p(i) 'data' /
i1 50,   i2 50.5, i3 52, i4 53, i5  55,
i6 55.5, i7 56,   i8 57, i9 60, i10 61
/;
display p;

* NLP model

variable x(i);
variable z;

equation
   obj     'non-linear, non-differentiable'
   diff(i) 'limit increase'
;

x.fx('i1') = p('i1');

obj.. z =e= smax(i, abs(x(i)-p(i)));
diff(i+1).. x(i+1)-x(i) =l= 1.5;

* initial point
x.l(i) = 50;

model m1 /obj,diff/;
option dnlp=conopt;
solve m1 using dnlp minimizing z;

display x.l,z.l;

* LP model

equation
   zmax1(i) 'z >= x(i)-p(i)'
   zmax2(i) 'z >= p(i)-x(i)'
;

zmax1(i).. z =g= x(i)-p(i);
zmax2(i).. z =g= p(i)-x(i);

model m2 /zmax1,zmax2,diff/;
solve m2 minimizing z using lp;
display x.l,z.l;

Appendix B: CVXPY model

Expressing the model in CVXPY is interesting. CVXPY will apply the necessary transformations to generate an LP from the ostensibly non-linear model.

import cvxpy

list1 = [50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61]
n = len(list1)

x = cvxpy.Variable(n)
prob = cvxpy.Problem(cvxpy.Minimize(cvxpy.max(cvxpy.abs(x - list1))),
                     [x[0] == list1[0],
                      cvxpy.diff(x) <= 1.5])
result = prob.solve()
print(result)   # minimized max deviation
print(x.value)  # fitted values
