Thursday, July 28, 2011

Scheduling of TV Advertisement: Theory vs Practice

I am working on a model for scheduling TV advertisements. Looking at the paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.292&rep=rep1&type=pdf this is easy to model (but difficult to solve). My model seems at least 10 times as complicated than what this academic study suggests. We needed more complexity to deal with:

  1. The number of spots to broadcast is not known in advance (depends on expected ratings of a break)
  2. There are complicated issues with spread over time
  3. There is a rather complex issue with the quality of the remaining unused space (this needs to be valued with respect to expected ratings and optimized)
  4. Different contracts have different requirements, and these requirements are not always easily modeled
  5. The model is huge (> 150K binary variables) for a monthly schedule
  6. The input data (from a database) is large and complex (this is sometimes called “data-intensive”)
  7. Numerous other things

Thursday, July 21, 2011

Pyomo

For a thoughtful comment on the design of Python based modeling system see: http://yetanothermathprogrammingconsultant.blogspot.com/2011/04/performance-of-python-based-modeling.html?showComment=1310240113570#c3299569615459362578.

My assessment was clearly not convincing. To summarize: essentially the modeling framework Pyomo is a scalar system. This compares somewhat to the difference between AMPL and GLPK (http://www.gnu.org/software/glpk/glpk.html). Both systems are written in C so we have taken out the language issue (i.e. Python is slow compared to C). The modeling system in GLPK (called Mathprog) is dense (except for the final generated matrix which is of course sparse). This actually work in many cases just fine: many sub-matrices used in the model are often small enough that dense handling is not really a problem. I have seen this in my practice: the majority of the models implemented in GLPK/Mathprog just work fine. Of course there are some models (the big sparse suckers) that bring GLPK/Mathprog on its knees while AMPL will happily generate the model very quickly. If I see this I generally suggest some reformulation or more often just give the advise: buy AMPL.

I suspect this can also happen with Pyomo. For some models a simple dense/scalar structure where we iterate over the carthesian product of indices will not perform. Of course with a system like Pyomo the modeler has more possibilities to implement better data-structures so we can skip zeros quickly and efficiently. But that means we have moved the burden from modeling system to the modeler.

Patent Application

One of my clients is applying for a patent on a model I helped them with. I am mentioned as co-inventor. That is mostly just an ego-booster I suspect, as this was work-for-hire. From what I understand the patent application is quite narrowly defined, so hopefully I am still able to use my bag of LP tricks for other clients.

Tuesday, July 5, 2011

Min sum or min num

In an advertisement model we may have to deal with underdelivery: the delivery of less impressions, visitors, GRPs than contractually agreed upon. So what to do. We can minimize the SUM of this quantity with the possible effect of many orders with small underdelivery numbers (spreading the pain). When minimizing the MAX this effect is even guaranteed. Or we can minimize the NUMber of orders with underdelivery. 

image

In practice, a good approach is to minimize a linear combination of both the SUM and the NUM objectives.

GAMS: GDXDUMP and Equations

There is a fairly deep issue when comparing GDX files if one of the GDX files has the following properties:

  1. It is produced by GDXDUMP
  2. It contains Equation records

The following illustrates the problem:

C:\projects\tmp>gamslib indus89
Copy ASCII: indus89.gms

C:\projects\tmp>gams indus89 lo=0 gdx=1

C:\projects\tmp>gdxdump 1.gdx output=2.gms
*  GDX dump of 1.gdx
*  Library in use : C:\PROGRA~1\GAMS23.7
*  Library version: GDX Library      BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
*  File version   : GDX Library      BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
*  Producer       : GAMS Base Module BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
*  File format    :    7
*  Compression    :    0
*  Symbols        :  281
*  Unique Elements:  245

C:\projects\tmp>gams 2.gms lo=0 gdx=2

C:\projects\tmp>gdxdiff 1.gdx 2.gdx eps=1.0e-6
GDXDIFF          BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
File1 : 1.gdx
File2 : 2.gdx
Summary of differences:
   bdraft   Data are different
   brepco   Data are different
  ccombal   Data are different
  consbal   Data are different
     cost   Data are different
   demnat   Data are different
divcnlsea   Data are different
   divsea   Data are different
   fodder   Data are different
   grnfdr   Data are different
   laborc   Data are different
     nbal   Data are different
  nwfpalc   Data are different
     objn   Data are different
    objnn   Data are different
     objz   Data are different
    objzn   Data are different
  protein   Data are different
   prseaw   Data are different
  qcombal   Data are different
  subirrc   Data are different
   tdraft   Data are different
watalcpro   Data are different
watalcsea   Data are different
  watalcz   Data are different
waterbaln   Data are different
Output: diffile.gdx
GDXDiff finished

C:\projects\tmp>gdxdump diffile.gdx Symbols
*  GDX dump of diffile.gdx
*  Library in use : C:\PROGRA~1\GAMS23.7
*  Library version: GDX Library      BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
*  File version   : GDX Library      BETA  8Jun11 23.7.0 WEX 25828.25830 WEI x86_64/MS Windows
*  Producer       : GDXDIFF
*  File format    :    7
*  Compression    :    0
*  Symbols        :   26
*  Unique Elements:  249
   Symbol    Dim Type  Explanatory text
1 bdraft      4  Equ  Differences
2 brepco      3  Equ  Differences
3 ccombal     4  Equ  Differences
4 consbal     4  Equ  Differences
5 cost        3  Equ  Differences
6 demnat      3  Equ  Differences
7 divcnlsea   3  Equ  Differences
8 divsea      2  Equ  Differences
9 fodder      4  Equ  Differences
10 grnfdr      4  Equ  Differences
11 laborc      4  Equ  Differences
12 nbal        3  Equ  Differences
13 nwfpalc     2  Equ  Differences
14 objn        1  Equ  Differences
15 objnn       1  Equ  Differences
16 objz        1  Equ  Differences
17 objzn       1  Equ  Differences
18 protein     4  Equ  Differences
19 prseaw      3  Equ  Differences
20 qcombal     4  Equ  Differences
21 subirrc     4  Equ  Differences
22 tdraft      4  Equ  Differences
23 watalcpro   3  Equ  Differences
24 watalcsea   3  Equ  Differences
25 watalcz     4  Equ  Differences
26 waterbaln   4  Equ  Differences

C:\projects\tmp>

Apart from a small bug related to $ONEMPTY, it turns out that the way GDXDUMP exports equations records, it looses information about the equation type. This leads to the above results. Luckily in practice this will not be an issue, as equations are not often part of such an exercise.

Background.

During a course I taught about GAMS I was asked if it is possible to edit GDX files. The answer is: GDXDUMP will give you a GAMS representation of a GDX file which you can edit. Using the call:

GAMS gmsfile gdx=newgdxfile

you can make a new GDX file with the changes incorporated. The tool GDXDIFF will allow you to monitor the changes.

Although this is correct, I chose the INDUS89 example as illustration. That was unfortunate, as that failed because of the exotic Equation problem.

Excel issue

I was doing some reporting on the results of a large, complex MIP model. I made the unfortunate error of having some spurious blank cells. With the =SUM() function, skipping such cells is identical as considering them as zero, so there is no problem. With a =MIN() function however this is not the case. Even =MINA() will not help there:

image

It is sometimes good to work with different people on a project so that one of your collaborators can catch this before submitting results to the customer!

In this case the result was beneficial for us. With this, our estimated improvement in monthly revenue when using a MIP based model, increased from 250K euros to 300K euros.

Sorting in GAMS

This question came up during a workshop I taught at a large power company:

I want to generate a number of load-duration and price-duration curves, but it takes too long.

Here is a small example demonstrating some formulations for sorting a 1d parameter:

option profile=1;
set i /i1*i10000/
;
parameter
p(i);
p(i) = uniform(1,100);

display
p;

execute_unload "p"
,p;
execute "gdxrank p.gdx p2.gdx"
;

parameter
pindex(i);
execute_load "p2.gdx"
,pindex=p;
display
pindex;


* slow

parameter p2(i);
alias
(i,j);
p2(i)=
sum(j$(ord
(i)=pindex(j)),p(j));
display
p2;

* almost as slow

parameter p3(i);
alias
(i,j);
loop((i,j)$(ord
(i)=pindex(j)),
   p3(i)=p(j);
);

display
p3;

* fast using trick:

parameter p4(i);
p4(i+(pindex(i)-
ord
(i))) = p(i);
display
p4;

The timings are:

Method Time (seconds)
SUM (assignment to p2) 28.407
LOOP (assignment to p3) 22.230
TRICK (assignment to p4) 0.265