Well, there is no universally best way. But one approach I like to suggest is: use (or construct) a known feasible solution and fix decision variables. This approach is not often mentioned. Most algorithm suppliers say: just use IIS [1]. In my opinion (and based on experience), fixing is often more reliable: it will pin-point infeasible constraints. In addition, fixing is conceptually much simpler than looking at IIS output. The results of fixing does not require much explanation. Sometimes lo-tech is better than hi-tech [2].
Example: a transportation problem
Let's use a simple transportation problem as an example: \[\bbox[lightcyan,10px,border:3px solid darkblue]{\begin{align}\min&\sum_{i,j} c_{i,j} x_{i,j}\\ & \sum_i x_{i,j} \ge d_j& \forall j&\text{ (demand)}\\ & \sum_j x_{i,j} \le s_i& \forall i&\text{ (supply)}\\ &x_{i,j} \ge 0\end{align}}\]
The data looks like:---- 45 PARAMETER c transport cost new-york chicago topeka seattle 0.225 0.153 0.162 san-diego 0.225 0.162 0.126 ---- 45 PARAMETER d demand at market j new-york 325.000, chicago 300.000, topeka 275.000 ---- 45 PARAMETER s capacity of plant i seattle 350.000, san-diego 600.000
When we deal with infeasibilities, we can ignore the objective: this will never have a role. So in the data above, we don't need to worry about parameter \(c\).
A feasible solution is easily constructed:
Feasible solution |
--- 70 PARAMETER s capacity of plant i seattle 280.000, san-diego 480.000
The model is now infeasible. The reason is the total supply is too small to meet total demand.
The GAMS equation listing will show the individual equations (before solving):
---- demand =G= satisfy demand at market j demand(new-york).. x(seattle,new-york) + x(san-diego,new-york) =G= 325 ; (LHS = 0, INFES = 325 ****) demand(chicago).. x(seattle,chicago) + x(san-diego,chicago) =G= 300 ; (LHS = 0, INFES = 300 ****) demand(topeka).. x(seattle,topeka) + x(san-diego,topeka) =G= 275 ; (LHS = 0, INFES = 275 ****) ---- supply =L= observe supply limit at plant i supply(seattle).. x(seattle,new-york) + x(seattle,chicago) + x(seattle,topeka) =L= 280 ; (LHS = 0) supply(san-diego).. x(san-diego,new-york) + x(san-diego,chicago) + x(san-diego,topeka) =L= 480 ; (LHS = 0)
This is a useful listing to look at. The initial values for the variables are zero here, so the demand equations are infeasible at the initial point. The supply constraints are feasible at the initial point. It is noted that the variables are always feasible with respect to their bounds. GAMS makes sure the variable levels are always within their bounds at this initial point.
One good way to prevent even to pass the model on to the solver is to add a data check:
abort$(sum(i,s(i)) < sum(j,d(j)) - 0.01) "Supply too small for demand";
- Solve as is, and see how different solver report a solution for this infeasible model. We'll see there is little uniformity.
- Run with IIS (Irreducible Infeasible Set). This is often advised as good tool to inspect and diagnose infeasibilities. I am usually less successful when using this approach.
- Fix decision variables to a known feasible solution, and see what is reported. This is often my preferred approach.
Solve and look at the solution
The best solution will minimize the sum of the infeasibilities (this sum is 140) and properly mark the infeasible rows and columns. Let's see how the solver do on this.
Cplex
---- EQU demand satisfy demand at market j LOWER LEVEL UPPER MARGINAL new-york 325.0000 325.0000 +INF 0.2250 chicago 300.0000 300.0000 +INF 0.1530 topeka 275.0000 275.0000 +INF 0.1260 ---- EQU supply observe supply limit at plant i LOWER LEVEL UPPER MARGINAL seattle -INF 280.0000 280.0000 EPS san-diego -INF 620.0000 480.0000 . INFES ---- VAR x shipment quantities in cases LOWER LEVEL UPPER MARGINAL seattle .new-york . -20.0000 +INF . INFES seattle .chicago . 300.0000 +INF . seattle .topeka . . +INF 0.0360 san-diego.new-york . 345.0000 +INF . san-diego.chicago . . +INF 0.0090 san-diego.topeka . 275.0000 +INF . **** REPORT SUMMARY : 0 NONOPT 2 INFEASIBLE (INFES) SUM 160.0000 MAX 140.0000 MEAN 80.0000 0 UNBOUNDED
We see there are two reported infeasibilities. One of them is an infeasible variable. Interestingly, this is not the solution with the minimum phase 1 objective (i.e. minimum sum of infeasibilities). Here we have a total infeasibility of 160 while the optimal phase 1 objective is 140.
MINOS, CONOPT
A proper minimum phase 1 solution with a total infeasibility of 140 is reported by MINOS and Conopt:
---- EQU demand satisfy demand at market j LOWER LEVEL UPPER MARGINAL new-york 325.0000 325.0000 +INF 0.0039 chicago 300.0000 300.0000 +INF 0.0039 topeka 275.0000 135.0000 +INF 0.0039 INFES ---- EQU supply observe supply limit at plant i LOWER LEVEL UPPER MARGINAL seattle -INF 280.0000 280.0000 -0.0039 san-diego -INF 480.0000 480.0000 -0.0039 ---- VAR x shipment quantities in cases LOWER LEVEL UPPER MARGINAL seattle .new-york . . +INF EPS seattle .chicago . 145.0000 +INF . seattle .topeka . 135.0000 +INF . san-diego.new-york . 325.0000 +INF . san-diego.chicago . 155.0000 +INF . san-diego.topeka . . +INF EPS **** REPORT SUMMARY : 0 NONOPT 1 INFEASIBLE (INFES) SUM 140.0000 MAX 140.0000 MEAN 140.0000 0 UNBOUNDED 0 ERRORS
In both cases the solver distributes infeasibilities in a way that is unrelated to the original problem: we see a demand equation is marked (while both supply equations are the real culprit here).
Gurobi
---- EQU demand satisfy demand at market j LOWER LEVEL UPPER MARGINAL new-york 325.0000 . +INF -1.0000 chicago 300.0000 . +INF -1.0000 topeka 275.0000 . +INF -1.0000 ---- EQU supply observe supply limit at plant i LOWER LEVEL UPPER MARGINAL seattle -INF . 280.0000 1.0000 san-diego -INF . 480.0000 1.0000 ---- VAR x shipment quantities in cases LOWER LEVEL UPPER MARGINAL seattle .new-york . . +INF . seattle .chicago . . +INF . seattle .topeka . . +INF . san-diego.new-york . . +INF . san-diego.chicago . . +INF . san-diego.topeka . . +INF . **** REPORT SUMMARY : 0 NONOPT 0 INFEASIBLE 0 UNBOUNDED
This is just a bogus solution: all levels are zero. Infeasibilities are not properly marked. The number and sum of infeasibilities is also wrong.
The conclusion must be: many solvers do not do a good job of returning a proper phase 1 solution that minimizes the sum of the infeasibilities. I always find this somewhat disappointing. I like to know what the minimum sum of infeasibilities is: the size can give a clue about the source of the infeasibilities.
IIS: Irreducable Infeasible Set
Many modern solvers have a facility to use an IIS (Irreducable Infeasible Set) algorithm [3] to return groups of equations such that when one equation is removed thit set of equations becomes feasible.
Here is the IIS reported by Gurobi:
LP status(3): Model was proven to be infeasible. Computing IIS... An IIS is a set equations and variables (ie a submodel) which is infeasible but becomes feasible if any one equation or variable bound is dropped. A problem may contain several independent IISs but only one will be found per run. Number of equations in the IIS: 5 demand(new-york) > 325 demand(chicago) > 300 demand(topeka) > 275 supply(seattle) < 280 supply(san-diego) < 480 Number of variables in the IIS: 0
Cplex gives the same thing:
Minimal conflict found. A conflict is a set equations and variables (ie a submodel) which is infeasible but becomes feasible if any one equation or variable bound is dropped. A problem may contain several independent conflicts but only one will be found per run. Number of equations in the conflict: 5. lower: demand(new-york) > 325 lower: demand(chicago) > 300 lower: demand(topeka) > 275 upper: supply(seattle) < 280 upper: supply(san-diego) < 480 Number of variables in the conflict: 0.
The IIS set has all constraints of the model. This actually means: if we drop any of the constraints of our model, the resulting model will be feasible. Well, that is interesting, but it does not bring me closer to the conclusion that the supply capacities are wrong.
Fixing: the best approach for this problem
The best approach is to fix the variable to our feasible solution. In GAMS, we can do this with:
table solx(i,j)
new-york chicago topeka
seattle 325
san-diego 300 275
;
x.fx(i,j) = solx(i,j);
The equation listing tells us exactly what is wrong:
---- demand =G= satisfy demand at market j demand(new-york).. x(seattle,new-york) + x(san-diego,new-york) =G= 325 ; (LHS = 325) demand(chicago).. x(seattle,chicago) + x(san-diego,chicago) =G= 300 ; (LHS = 300) demand(topeka).. x(seattle,topeka) + x(san-diego,topeka) =G= 275 ; (LHS = 275) ---- supply =L= observe supply limit at plant i supply(seattle).. x(seattle,new-york) + x(seattle,chicago) + x(seattle,topeka) =L= 280 ; (LHS = 325, INFES = 45 ****) supply(san-diego).. x(san-diego,new-york) + x(san-diego,chicago) + x(san-diego,topeka) =L= 480 ; (LHS = 575, INFES = 95 ****)
The initial point with the fixed levels for the variables, yields infeasibilities for the supply equation. The solver will also pinpoint a single infeasibility in the log:
Infeasibility row 'supply(seattle)': 0 <= -45.Only one infeasibility is reported here, This is because the presolver handled this. When a presolver finds the model is infeasible most often it can produce a good message, but it will only report one infeasibility. The infeasiblity is reported as 0 <= -45. This is a bit less clear than we want. Basically the left-hand side of the inequality x(seattle,new-york) + x(seattle,chicago) + x(seattle,topeka) \(\le\) 280 has become empty as all variables are replaced by their fixed values. The constants are moved to the right-hand side. The constraint is therefore re-interpreted as \(0 \le 280 - 325 = -45\). This is clearly infeasible.
Note:
However, the presolver can also lead to the dreaded:
Model was proven to be either infeasible or unbounded.
The solver puts premium on minimization of its time. One could say, the real problem a solver solves is: minimize the time spent on solving the problem. In case of this message, the problem to find out whether the model is really unbounded or infeasible is off-loaded to the user. In my opinion, the users time is more valuable, so the solver should spent a little bit of extra time to figure out if the problem is actually infeasible or unbounded.GAMS can do a more strict test when we add the model setting holdfixed=1. This will make all fixed variables constants, just like parameters. When we do:
transport.holdfixed=1;
solve transport using lp minimizing z;
we'll see:
**** Exec Error at line 62: Equation infeasible due to rhs value **** INFEASIBLE EQUATIONS ... ---- supply =L= observe supply limit at plant i supply(seattle).. 0 =L= -45 ; (LHS = 0, INFES = 45 ****) supply(san-diego).. 0 =L= -95 ; (LHS = 0, INFES = 95 ****)
In this case the model is not even passed on to the solver.
Some notes on fixing:
- Often you don't need to fix all variables in a model. Just fixing the central variables may be enough: the other variables can be calculated from them in an unambiguous way. If you only fix a few central variables, you need to rely on the solver messages to pinpoint the infeasibility. As the initial point is not complete, we cannot rely on the GAMS equation listing.
- Make sure you don't destroy any bounds on the variables.
- Bounds can play an important role in the model being infeasible.
- This method also works when the problem is integer infeasible (but LP feasible).
An elastic formulation
If you want to protect your models against infeasibilities, it may help to formulate an elastic version. This means that you allow to violate constraints, but at a cost. In other words: convert hard constraints into soft ones. For example, a manpower constraint can be made elastic by allowing to hire temporary workers (at a higher cost).
An elastic formulation would make the problem always feasible. We can make the supply equations elastic by allowing to oversupply beyond our capacity. E.g. by outsourcing and supplying from other suppliers at a high cost. Such a formulation can look like: \[\begin{align}\min&\sum_{i,j} c_{i,j} x_{i,j} + C_e \sum_ i e_i\\ & \sum_i x_{i,j} \ge d_j& \forall j&\text{ (demand)}\\ & \sum_j x_{i,j} \le s_i+e_i& \forall i&\text{ (supply)}\\ &x_{i,j} \ge 0, e_i \ge 0\end{align}\] Note that we only add slacks to the supply equation. This is enough to make the model always feasible. We don't need to make the demand equation also elastic. When we solve this elastic model we see:
LOWER LEVEL UPPER MARGINAL ---- EQU cost2 . . . 1.0000 cost2 new objective: add cost extra supply ---- EQU demand satisfy demand at market j LOWER LEVEL UPPER MARGINAL new-york 325.0000 325.0000 +INF 999.2250 chicago 300.0000 300.0000 +INF 999.1530 topeka 275.0000 275.0000 +INF 999.1260 ---- EQU supply2 allow for extra supply LOWER LEVEL UPPER MARGINAL seattle -INF 280.0000 280.0000 -999.0000 san-diego -INF 480.0000 480.0000 -999.0000 ---- VAR x shipment quantities in cases LOWER LEVEL UPPER MARGINAL seattle .new-york . 120.0000 +INF . seattle .chicago . 300.0000 +INF . seattle .topeka . . +INF 0.0360 san-diego.new-york . 205.0000 +INF . san-diego.chicago . . +INF 0.0090 san-diego.topeka . 275.0000 +INF . ---- VAR e extra supply LOWER LEVEL UPPER MARGINAL seattle . 140.0000 +INF . san-diego . . +INF EPS LOWER LEVEL UPPER MARGINAL ---- VAR tc -INF 140013.6750 +INF . tc total cost **** REPORT SUMMARY : 0 NONOPT 0 INFEASIBLE 0 UNBOUNDED
I used a unit cost of \(C_e=999\) for this extra supply. Observe that we added 140 units of expensive, extra supply. Obviously this number is the same as the optimal phase 1 objective reported earlier by Minos and Conopt.
Conclusion
If you know (or can construct) a feasible point, fixing variables to this point is an extremely useful tool to find out what caused the problem to be infeasible. Of course, constructing a feasible solution is not always easy, so this method is not suited for all cases. For other models it may be surprisingly easy to generate a feasible solution. For instance, for a machine scheduling model, just order jobs by their release date on a single machine.
The GAMS output for different solvers is unfortunately difficult to interpret: some of the output is contradictory, and output is spread around in log and listing files. Solvers are not very consistent in what they report. Depending on whether the presolver or the solver itself detects infeasibilities, the reporting may be totally differently. This is all somewhat messy.
Some solvers like to report a Farkas certificate of infeasibility [4]. There are very enthusiastic proponents of this approach. I think it is fair to say that not all users are quite comfortable with this type of reporting. When a feasible solution is available, fixing, to me, still looks simpler and more accurate.
Finally I want to mention that unbounded models are easier to debug. Just put a bound on the objective, e.g.: \[\begin{align}\min\>&z \\ & z = c^Tx \\ & z \ge -1000000\\ &Ax=b \\ & \ell \le x \le u\end{align}\] and inspect the solution to for large (negative) values.
References
- encountering INFEASIBLE status in Gurobi Matlab interface, https://stackoverflow.com/questions/51743892/encountering-infeasible-status-in-gurobi-matlab-interface
- K best solutions for an assignment problem, https://yetanothermathprogrammingconsultant.blogspot.com/2018/04/k-best-solutions-for-assignment-problem.html
- O. Guieu and J.W. Chinneck (1999), "Analyzing Infeasible Mixed-Integer and Integer Linear Programs", INFORMS Journal on Computing, vol. 11, no. 1, pp. 63-77.
- Erling D. Andersen, How to use Farkas’ lemma to say something important about infeasible linear problems, https://docs.mosek.com/whitepapers/infeas.pdf
Thank you so much for this!! It was very easy to follow and it is such a relief to finally find what has been causing the infeasibility in my model! I am however, still to find out how to rectify it but it is relieving nonetheless.. :)
ReplyDelete