Monday, June 13, 2016

Loops considered harmful

In many interpreted languages, the use of loops is discouraged, and “vectorized” implementations are preferred. This is the case for Matlab, R, Python and also GAMS. There are two main reasons:

  • loopless code has less clutter, and may be easier to write and read.
  • vectorized code performs often much better.

Here is an example:

Sets
    i   
/1*5/
   
alias(i,j);

Parameters

    c(i)
   
/1    42
    
2    44
    
3    45
    
4    47
    
5    47.5/
    Q(i,j);

loop(i, loop(j, Q(i,j) = 100 $ (ord(i) eq ord(j))));

Variables
    x(i)
    f;

loop
(i,
    x.lo(i) = 0;
    x.up(i) = 1);


The first construct sets the diagonal of Q to 100. This really should be written as:

Q(i,i) = 100;

One could read this as an “implicit loop”. Similarly in the second construct we can drop the explicit loop and just write:

x.lo(i) = 0;
x.up(i) = 1
;

Timings

The loop used to populate the diagonal of the matrix Q is really the worst possible way to express this. Here are some suggestions for improvement:

  • Replace loop(i, loop(j, by loop((i,j),  This does not make a difference in performance but it looks better.
  • Put the $ condition on the left side of the assignment: loop((i,j), Q(i,j)$(ord(i)=ord(j)) = 100); This will improve the performance a lot. The result of putting the $ on the left is that we assign diagonal elements only, instead of assigning all elements.  
  • Put the $ condition on the loop sets: loop((i,j)$(ord(i)=ord(j)), Q(i,j) = 100); This is even faster than the previous version.
  • Finally use an implicit loop: Q(i,i) = 100;. This is the most readable and also the fastest.

The speed difference in the assignment of Q for a set i of size 2000 is remarkable:

Formulation n=2000, time in seconds Code
1. Original formulation 30 loop((i,j), Q(i,j) = 100$(ord(i)=ord(j)));
2. Loop with dollar condition on the LHS 1.2 loop((i,j), Q(i,j)$(ord(i)=ord(j)) = 100);
3. Loop with dollar condition on the loop set 0.6 loop((i,j)$(ord(i)=ord(j)), Q(i,j) = 100);
4. Vectorized 0 Q(i,i) = 100;
Note: $ Conditions

To be complete, let’s detail how the GAMS $ condition works when placed left or right of the assignment. The construct

   a(i)$s(i) = 1

means if s(i)=true then assign a(i)=1.

Opposed to this,

    a(i) = 1$s(i)

means if s(i)=true then assign a(i)=1 else assign a(i)=0.

I seldom use the second form while I use the $ on the left all the time. This example shows this is with good reason.