## Tuesday, March 21, 2017

### Joins in GAMS, SQL or R

To produce the charts in (1) I needed to do a multiple join on some data. From the GAMS model in (2) we extract the the following data:

 sets  j 'tasks'  /job1*job10/  t 'stages' /t1*t10/;parameters   proctime(j,t) 'processing times for each stage'   machine(j,t) 'machine numbers for each stage (zero based)';variable x(j,t) 'start time of subtask';

What we want to arrive in R is the following:

I used “approach 1” (pure SQL, described next) in (1), but there are other ways to attack the problem. I’ll describe a few.

##### Approach 1: pure SQL

Writing this into a SQLite database is very simple:

 execute_unload "ft10";execute "gdx2sqlite -i ft10.gdx -o ft10.db";

The database will contain a copy of all the data in the model, including the symbols we are interested in.

When loading the data we need to do two things:

1. Join proctime, machine and x on (j,t)
2. GAMS only stores non-zero elements, which we need to “repair”
3. Add the end column.

The second issue can be illustrated by looking at the table machine:

We miss the records that have a zero value. We can reintroduce them as follows. First we can generate the Carthesian product of J × T using a simple SQL subquery:

Note that table j has a single column j and table t has a single column t. As we have 10 jobs and 10 stages, this Carthesian product yields a table of 100 rows. Adding the machine number to this table is easy with a left join where we join on (j,t):

The left join added the machine column but the missing values are represented by NA in R (or NULL in SQL). This is easily fixed:

We now add the other columns and also calculate the end column:

##### Approach 2: GAMS + SQL

It is easy in GAMS to do the join. In addition we can use a trick to keep the zeroes:

 alias(*,job,stage,attribute);parameter report(job,stage,attribute);report(j,t,'machine') = EPS + machine(j,t);report(j,t,'start') = EPS + x.l(j,t);report(j,t,'duration') = EPS + proctime(j,t);report(j,t,'end') = EPS + x.l(j,t) + proctime(j,t);execute_unload "ft10";execute "gdx2sqlite -i ft10.gdx -o ft10.db";

The EPS is a placeholder for the zero values The GAMS parameter report looks like:

The GDX2SQLITE tool will map the EPS values into zero before storing them in the database, so on the R side we see:

We need to pivot on the attribute, value columns. Unfortunately this is not completely trivial in SQL. A standard way to do this is:

##### Approach 3: GAMS + R

We keep the GAMS part the same:

 alias(*,job,stage,attribute);parameter report(job,stage,attribute);report(j,t,'machine') = EPS + machine(j,t);report(j,t,'start') = EPS + x.l(j,t);report(j,t,'duration') = EPS + proctime(j,t);report(j,t,'end') = EPS + x.l(j,t) + proctime(j,t);execute_unload "ft10";execute "gdx2sqlite -i ft10.gdx -o ft10.db";

On the R side we can read the parameter report as is and pivot using spread from the tidyr package:

Arguably this is the cleanest solution.

##### References
1. Gantt Chart with R/Plotly, http://yetanothermathprogrammingconsultant.blogspot.com/2017/03/gantt-chart-with-rplotly.html
2. Playing with job shop problem ft10 (1), http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-ft10-job-shop-1.html

## Sunday, March 19, 2017

### Gantt Chart with R/Plotly

Plotly (1) is a great graphics library for producing dynamic plots. It also has an R interface. A simple example of a Gantt chart with R/plotly is shown in (2). Here is a more complicated version: a job shop problem with 10 jobs and 10 stages (3).

The colors correspond to the machine used in each stage. Machines are numbered 0…9.

These are screen dumps. I have not figured out yet how to use R/plotly dynamic plots in blogger.

##### References
1. Plotly: https://plot.ly/
2. https://www.r-bloggers.com/gantt-charts-in-r-using-plotly/
3. Playing with job shop problem ft10 (1), http://yetanothermathprogrammingconsultant.blogspot.com/2014/04/playing-with-ft10-job-shop-1.html

## Friday, March 17, 2017

### Binomial

In a nonlinear model I needed to use the following expression in a model equation:

 $y = \binom{x}{k}p^k(1-p)^{(x-k)}$

Here $$x$$ is the only decision variable. The integer version of “n choose k” is well known:

 $\binom{n}{k} = \frac{n!}{k!(n-k)!}$

For fractional values we can generalize this to:

 $\binom{x}{y} = \frac{\Gamma(x+1)}{\Gamma(y+1)\Gamma(x-y+1)}$

where $$\Gamma(.)$$ indicates the gamma function, defined by

 $\Gamma(z) = \int_0^{\infty}x^{z-1}e^{-x} dx$

In GAMS we have the binomial(x,y) function for this. The problem is that $$x$$ can become rather large, while $$k$$ is a small integer, say $$k=0,\dots,4$$. For some values of $$x$$ this leads to very inaccurate results and even overflows in subsequent calculations.

E.g.:

 scalars  x /112.8/  k /2/;parameter y;y("binomial") = binomial(x,k);y("log") = exp(loggamma(x+1)-loggamma(k+1)-loggamma(x-k+1));display y;

yields the wrong result:

 ----      8 PARAMETER y  binomial 1.0000E+299,    log         6305.520

The logarithmic version is correct.

Instead of using the logarithmic version:

 $\binom{x}{k} = \exp\left(\log\Gamma(x+1)-\log\Gamma(k+1)-\log\Gamma(x-k+1)\right)$

we can also exploit that $$k$$ is a small integer. From:

 $\Gamma(z+1)=z\Gamma(z)$

we have:

 \begin{align}\binom{x}{k}&=\frac{x(x-1)\cdots(x-k+1)\Gamma(x-k+1)}{k!\Gamma(x-k+1)}\\&=\frac{x(x-1)\cdots(x-k+1)}{k!}\\&=\frac{\displaystyle\prod_{i=1}^k (x-i+1)}{k!}\end{align}

In GAMS:

 scalars  x /112.8/  k /2/;parameter y;y("binomial") = binomial(x,k);y("log") = exp(loggamma(x+1)-loggamma(k+1)-loggamma(x-k+1));set i /1*2/;y("prod") = prod(i,x-i.val+1)/fact(k);display y;

with results:

 ----     10 PARAMETER y  binomial 1.0000E+299,    log         6305.520,    prod        6305.520

I used this last “product” version in the model.

Note that the overflow in the binomial function does not happen for all (or even most) large values:

 scalars  x /1112.8/  k /2/;parameter y;y("binomial") = binomial(x,k);y("log") = exp(loggamma(x+1)-loggamma(k+1)-loggamma(x-k+1));set i /1*2/;y("prod") = prod(i,x-i.val+1)/fact(k);display y;

gives:

 ----     10 PARAMETER y  binomial 618605.520,    log      618605.520,    prod     618605.520

## Monday, March 13, 2017

### Dynamic Documents and their importance

When a model is ran say every few months and a report has to be produced based on the model results, a copy-paste like modus operandi can easily cause synchronization errors: the wrong text is inserted not referring to the current results but to an older simulation result. I recently had a discussion with someone from a government agency who experienced exactly this problem.

A similar thing can and does happen when documenting a piece of software: examples, code fragments or screen dumps refer to older versions of the program.

R and Rstudio provide tools to create dynamic documents from R using a nice Rstudio environment. We could have said in the document something like:

A cost savings of r round(100*(oldcost-cost)/oldcost,1)% has been achieved.

which would generate:

A cost savings of 25.4% has been achieved.

Here we evaluated an R expression using data from the R session associated with the R markdown document.

##### Literate Programming

This is the grand-daddy of R Markdown. LP (Literate Programming) was meant to document source code, or rather have a single design document that could spit out either TeX for typesetting the document or Pascal (and later C) source code (4,5). Actually in some respects this tool was very flexible: things could be documented out-of-order. The name web was used for this. Knuth emphasized the usefulness of this: the order of thinking and writing about code is different than the order of generated source code. R Markdown prefers code chunks to be in order.

R users will appreciate the assignment symbol in LP:

##### R Markdown

The markup language to write documents in R is simple and somewhat restricted (6). Actually that may be a good thing. Too many ways to change the lay out of a document often leads to a somewhat less clean, more convoluted final product. Writing documentation while writing code is a big step forward I think. It makes it more pleasant to use and also has a beneficial impact on the code quality (explaining what you try to do helps to focus the mind and gets things better organized). I have heard people advancing the argument that documentation should be written before the code is written, but that looks to me not always a feasible or easy approach.

Recently better support for large documents (e.g. books) has been added in the form of being able to work sub-documents (sections or chapters) individually. Some books produced this way are shown at https://bookdown.org/.

##### Output formats

The main output formats of R Markdown are HTML, LaTeX, and MS Word. MS Word is especially important in non-academic institutions as Word and more general Office is a main work horse there. Click on the figures to enlarge.

Here is HTML, LaTeX and Word output of the same small example document.

##### Tables

Outputting tables is somewhat of a challenge. The easiest is to use the function kable. The LaTex and Word output is reasonable, but the HTML output needs some adjustment:

##### References
1. Yihui Xie, Dynamic Documents with R and knitr, 2nd ediition, Chapmand and Hall/CRC, 2015.
2. Yihui Xie, bookdown, Authoring Books and Technical Documents with R Markdown, Chapman and Hall/CRC, 2016. Also available through: https://bookdown.org/yihui/bookdown/
3. Christopher Gandrud, Reproducible Research with R and RStudio, Chapman and Hall/CRC, 2015
4. Donald Knuth, Literate Programming, http://www.literateprogramming.com/knuthweb.pdf, paper on LP (LP means Literate Programming in this context and not Linear Programming).
5. Donald Knuth, Literate Programming, CSLI publications, 1992 (the book)
6. https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf a short document summarizing R Markdown.

## Saturday, March 11, 2017

### Employee Scheduling IV: direct optimization

In (1) a small employee scheduling problem is proposed. It is solved by first generating all possible shifts for each employee, and then solving a very simple MIP (Mixed Integer Programming) model to select the optimal combination of employees and shifts.

The optimization model is organized around a binary variable $$x_s \in \{0,1\}$$ defined by:

 $x_s = \begin{cases}1 & \text{if shift s is selected}\\ 0 & \text{otherwise} \end{cases}$

and is very straightforward:

 \begin{align}\min\>&\sum_s c_s x_s\\ &\sum_{s|E(e,s)} x_s \le 1\>\forall e\\ &\sum_{s|S(s,t)} x_s \ge R_t \> \forall t\\ &x_s \in\{0,1\}\end{align}

Here $$E(e,s)$$ indicates if shift $$s$$ is done by employee $$e$$, $$S(s,t)$$ indicates if time period $$t$$ is covered by shift $$s$$ and $$R_t$$ is the required staffing during period $$t$$. The first constraint makes sure every employee is doing at most one shift in the 24-hour planning period. The second constraint establishes that we have enough staffing in each (hourly) period. Of course we try to minimize total cost.

In (2) we replicated the model in GAMS, using GAMS code to generate all possible shifts. In (4) we used a different technique to generate all shifts: the Cplex solution pool. In (3) we don’t generate all shifts in advance, but use a delayed column generation scheme to generate attractive shifts on the fly.

Here I try to show that we can attack the problem also in a very different way: directly optimize the selection of employees to use and the hours they work. A basic model to generate a single feasible shift for this problem is shown in (4) and (3). In those (sub-)models a binary variable $$w_t \in \{0,1\}$$  is used to indicate if an employee is working at period $$t$$. For our “direct” model we extend this to:

 $w_{e,t} = \begin{cases}1 & \text{if employee e works during period t}\\ 0 & \text{otherwise} \end{cases}$

Again we first calculate the set canwork(e,t) indicating the time slots employee e can work. Furthermore we have parameters minlen(e) and maxlen(e) that tell us the minimum and maximum length of a shift for a given employee. Also the problem states that a shift is contiguous, i.e. no holes in the shift.

A MIP model can look like:

Notes:

• We fix $$w_{e,t}=0$$ when canwork(e,t) is not true.
• The variable shiftlen(e) is a semi-continuous variable. It can be zero, or between minlen(e) and maxlen(e).
• If you don’t want to use semi-continuous variables, we can use binary variables $$\delta_e$$ instead, and apply the constraint:
 \begin{align}&minlen_e \cdot \delta_e \le shiftlen_e \le maxlen_e \cdot\delta_e\\&\delta_e \in \{0,1\}\\& shiftlen \ge 0\end{align}
where shiftlen is now a positive variable. In GAMS we need to split this into two inequalities.
• The binary variable start(e,t) indicates when a shift starts. We observe this when w(e,t) switches from zero to one. A standard formulation is $$start_{e,t} \ge w_{e,t} – w_{e,t-1}$$. As we need to wrap around hour 24: hour 1 minus 1 is hour 24, we use the GAMS $$—$$ operator: $$start_{e,t} \ge w_{e,t} – w_{e,t--1}$$.
• We only allow up to one start to enforce that we have no breaks in a shift: shifts are contiguous in this model.

We rely on the presolver of the MIP solver to remove all fixed variables from the model. I am old-fashioned and I prefer to not even generate the variables that are not needed. This requires a little bit of attention. The new model can look like:

 binary variables   w(e,t) 'work at time t'   start(e,t) 'start of shift';semicont variables   shiftlen(e)  'length of shift';shiftlen.lo(e) = minlen(e);shiftlen.up(e) = maxlen(e);variable cost 'objective';equations   startShift(e,t)     'start of shift'   oneShiftStart(e)    'at most one start of shift'   calcLen(e)          'calculate length of shift'   coverPeriod(t)      'capacity requirements'   calcCost            'total cost';startShift(canwork(e,t))..   start(e,t) =g= w(e,t) - w(e,t--1)$canwork(e,t--1);oneShiftStart(e).. sum(canwork(e,t),start(e,t)) =l= 1;calcLen(e).. shiftLen(e) =e= sum(canwork(e,t), w(e,t));coverPeriod(t).. sum(canwork(e,t), w(e,t)) =g= requirements(t);calcCost.. cost =e= sum(canwork(e,t), wage(e)*w(e,t));model m /all/;option optcr=0;solve m minimizing cost using mip; In this model I “protected” the use of $$w_{e,t}$$ by canwork(e,t) in all cases. Note that in equation startShift the expression w(e,t--1)$canwork(e,t--1) evaluates to zero when $$w_{e,t—1}$$ does not correspond to canwork(e,t--1) being true. That is exactly what we want: if canwork(e,t) if false, implicitly we want the corresponding $$w_{e,t}=0$$. We can verify the correct behavior by inspecting the equation listing in GAMS:

 ---- startShift  =G=  start of shift startShift(SMITH,t7)..  - w(SMITH,t7) + start(SMITH,t7) =G= 0 ; (LHS = 0)     startShift(SMITH,t8)..  w(SMITH,t7) - w(SMITH,t8) + start(SMITH,t8) =G= 0 ; (LHS = 0)     startShift(SMITH,t9)..  w(SMITH,t8) - w(SMITH,t9) + start(SMITH,t9) =G= 0 ; (LHS = 0)     REMAINING 415 ENTRIES SKIPPED

The first version of the model has the following statistics:

 MODEL STATISTICS BLOCKS OF EQUATIONS           5     SINGLE EQUATIONS          545BLOCKS OF VARIABLES           4     SINGLE VARIABLES          981NON ZERO ELEMENTS         3,381     DISCRETE VARIABLES        918

while the second version shows:

 MODEL STATISTICS BLOCKS OF EQUATIONS           5     SINGLE EQUATIONS          483BLOCKS OF VARIABLES           4     SINGLE VARIABLES          857NON ZERO ELEMENTS         2,942     DISCRETE VARIABLES        856
##### Results

This model solves quickly (it is small), and we find an optimal solution with a cost of 4670. The schedule looks like:

We see one shift wrapping at t24: Employee Davis. The optimal solution is not unique. In (1) and (2) different solutions are obtained (with the same total cost). The optimal solution has no slack in staffing: we exactly match the staffing requirements:

##### References
1. Loren Shure, Generating an Optimal Employee Work Schedule Using Integer Linear Programming, January 6, 2016, http://blogs.mathworks.com/loren/2016/01/06/generating-an-optimal-employee-work-schedule-using-integer-linear-programming/ describes the problem with a Matlab implementation.
2. Employee Scheduling I: Matlab vs GAMS, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-i-matlab-vs-gams.html has a GAMS version of the model in (1).
3. Employee Scheduling II : Column Generation, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-ii-column-generation.html uses a Column Generation approach to solve this problem.
4. Employee Scheduling III: generating all shifts using the Cplex solution pool, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-iii-generating-all.html uses the Cplex Solution Pool technology to generate all possible shifts. This is an alternative approach to the method shown in this post.

### More meta-heuristics by combining them

Arguably the number of different meta-heuristics $$N$$ is already too large. One simple way to to make this number quickly $$N^2$$ is to combine them. Here this is done with Leaping Frogs and Artificial Bees.

#### Abstract

In order to obtain better generalization abilities and mitigate the impacts of the best and worst individuals during the process of optimization, this paper suggests Bee and Frog Co-Evolution Algorithm(abbreviation for BFCEA), which combines Mnemonic Shuffled Frog Leaping algorithm With Cooperation and Mutation(abbreviation for MSFLACM) with improved Artificial Bee Colony(abbreviation for ABC). The contrast experimental study about different iteratively updating strategies was acted in BFCEA, including strategy of integrating with ABC, regeneration of the worst frog and its leaping step. The key techniques focus on the first 10 and the last 10 frogs evolving ABC in BFCEA, namely, the synchronous renewal strategy for those winner and loser should be applied, after certain G times’ MSFLACM-running, so as to avoid trapping local optimum in later stage. The ABC evolution process will be called between all memes’ completing inner iteration and all frogs’ outer shuffling, the crossover operation is removed from MSFLACM for its little effect on time-consuming and convergence in this novel algorithm. Besides, in ABC, the scout bee is generated by Cauchy mutating instead at random. The performance of proposed approach is examined by well-known 16 numerical benchmark functions, and obtained results are compared with basic Shuffled Frog Leaping algorithm(abbreviation for SFLA), ABC and four other variants. The experimental results and related application in cloud resource scheduling show that the proposed algorithm is effective and outperforms other variants, in terms of solution quality and convergence, and the improved variants can obtain a lower degree of unbalanced load and relatively stable scheduling strategy of resources in complicated cloud computing environment.

## Friday, March 10, 2017

### ifthen adventures in GAMS (and R)

GAMS has a function ifthen which can be used in equations. For example:

e.. y =e= ifthen(x<1,(1+x)/2,sqrt(x));

This function implements:

 $y=\begin{cases}\displaystyle\frac{1+x}{2}&\text{if x<1}\\ \sqrt{x}&\text{if x\ge 1}\end{cases}$

This is a perfectly smooth function: it is continuous and differentiable.

However when we use it in GAMS we are in for a surprise:

 variable x,y;equation e;e.. y =e= ifthen(x<1,(1+x)/2,sqrt(x));x.fx = -1;model m /all/;solve m minimizing x using dnlp;

Here we fix $$x=-1$$ so we just want to evaluate at that point. We see:

**** Exec Error at line 4: sqrt: FUNC DOMAIN: x < 0

The underlying reason is that GAMS always evaluates both branches. Obviously we would much prefer it would not do that.

If we replace fixing the variable by:

 x.lo = -1;x.l = 2;

i.e. start at $$x=2$$ and move to the optimal value $$x=-1$$, we see:

 C O N O P T 3   version 3.17A   Copyright (C)   ARKI Consulting and Development A/S                   Bagsvaerdvej 246 A                   DK-2880 Bagsvaerd, Denmark   Iter Phase Ninf   Infeasibility   RGmax    NSB   Step InItr MX OK     0   0        1.4142135624E+00 (Input point)                   Pre-triangular equations:   0                  Post-triangular equations:  1      1   0        0.0000000000E+00 (After pre-processing)     2   0        0.0000000000E+00 (After scaling) ** Feasible solution. Value of objective =    2.00000000000   Iter Phase Ninf     Objective     RGmax    NSB   Step InItr MX OK    11   3        6.5613264487E-03 1.0E+00     1 3.9E-03      F  F    21   3        1.6608134823E-07 1.0E+00     1 6.0E-08      F  F    31   3        3.2205059005E-13 1.0E+00     1 1.5E-11      F  T    41   3        8.5912908646E-17 1.0E+00     1 2.2E-16      F  T    51   3        5.0871607672E-23 1.0E+00     1 3.4E-21      F  T    52   3        5.0871607672E-23 1.0E+00     1 ** Feasible solution. Convergence too slow. The change in objective   has been less than 3.0000E-12 for 20 consecutive iterations

The solver gets stuck at $$x=0$$. This is obviously incorrect. The only hint of the source of the problem is listed here:

 RESOURCE USAGE, LIMIT          0.062      1000.000ITERATION COUNT, LIMIT        52    2000000000EVALUATION ERRORS            803             0

This behavior is really a bug: we should see good feedback where the problem with domain errors originates from. Note that setting the limit of allowed evaluation errors (using option domlim) does not help in this case.

##### R

Interestingly R has something similar. Besides the standard if statement it has a vectorized ifelse function. The standard if is not vectorized which we can see from:

 > curve(if (x<1) (1+x)/2 else sqrt(x),-5,5) Warning message: In if (x < 1) (1 + x)/2 else sqrt(x) :   the condition has length > 1 and only the first element will be used

This only plots $$f(x)=\displaystyle\frac{1+x}{2}$$:

Basically this if statement is interpreted as:

if (x[1]<1) (1+x)/2 else sqrt(x)

With ifelse we get better results:

 > curve(ifelse(x<1,(1+x)/2,sqrt(x)),-5,5)Warning message:In sqrt(x) : NaNs produced

The plot is correct, but from the warning message we see that ifelse is also evaluating the $$\sqrt{x}$$ for all $$x$$. The ifelse function evaluates both branches.

To be precise this is only for vector arguments with some but not all $$x_i<1$$. For scalar arguments (vectors of length one) or more general if all elements of the vector are either less than one or all are $$\ge 1$$ things are ok:

Basically this is due to lazy but vector “all-or-nothing” evaluation.

##### Workaround

Of course the workaround is easy in this particular case. We can use something like:

e.. y =e= ifthen(x<1,(1+x)/2,sqrt(abs(x)));

Now the solver has no problems with this model, and can solve it very quickly (again with starting point $$x=2$$ and optimal solution $$x=-1$$):

 C O N O P T 3   version 3.17A   Copyright (C)   ARKI Consulting and Development A/S                   Bagsvaerdvej 246 A                   DK-2880 Bagsvaerd, Denmark   Iter Phase Ninf   Infeasibility   RGmax    NSB   Step InItr MX OK     0   0        1.4142135624E+00 (Input point)                   Pre-triangular equations:   0                  Post-triangular equations:  1      1   0        0.0000000000E+00 (After pre-processing)     2   0        0.0000000000E+00 (After scaling) ** Feasible solution. Value of objective =    2.00000000000   Iter Phase Ninf     Objective     RGmax    NSB   Step InItr MX OK     4   3       -1.0000000000E+00 0.0E+00     0 ** Optimal solution. There are no superbasic variables.

The same trick can be used in the R code:

curve(ifelse(x<1,(1+x)/2,sqrt(abs(x))),-5,5)

does not give any errors or warnings.

## Wednesday, March 8, 2017

### Multi-objective model with millions of objectives?

From (1):

I have a multi-objective optimization problem [] where I have about 100 millions of lines to maximize or minimize (objective functions).

When I searched about which algorithm should I use, everybody said it depends on the problem and the number of objective functions. The most indicated one was NSGAII, but I read somewhere that NSGAII was not so good for a high number of objectives. Is it true? Which algorithm is the best on my case?

This sounds really crazy.

Here is a plot of the Pareto optimal points of a (discrete) design problem with just 3 objectives:.

I don’t want to guess how this looks with a million of objectives.

##### References
1. Stackoverflow original post: http://stackoverflow.com/questions/42658957/which-is-the-best-algorithm-to-use-for-multi-objctive-optimization-with-millions
2. Visualization of large multi-criteria result sets with plot.ly, http://yetanothermathprogrammingconsultant.blogspot.com/2015/12/visualization-of-large-multi-criteria.html