Friday, October 20, 2023

GAMS: SMAX and sparsity

This is a discussion about the SMAX function in GAMS and how it behaves for sparse data.

The data structure we were facing was something like:

set
i 'cases' /case1*case100000/
j 'attribute' /j1*j25/
k 'attribute' /k1*k25/
t 'type' /typ1*typ2/
;

parameter p(i,j,k,t) 'positive numbers';
* note: for each i we have only one (j,k)

The data for parameter \(\color{darkblue}p_{i,j,k,t}\) is originating from a database. A small (simulated) data set can look like:

---- 20 PARAMETER p positive numbers

typ1 typ2

case1 .j5 .k22 0.550 0.301
case2 .j8 .k6 0.350 0.856
case3 .j2 .k13 0.998 0.579
case4 .j25.k20 0.131 0.640
case5 .j4 .k7 0.669 0.435
case6 .j9 .k9 0.131 0.150
case7 .j15.k21 0.231 0.666
case8 .j20.k8 0.110 0.502
case9 .j5 .k22 0.265 0.286
case10.j15.k19 0.628 0.464

We see that for each case \(i\), we have exactly one \(j\) and \(k\). 


First slow SMAX implementation


We want to calculate:

parameter q(i,j,k) 'smax(t, p(i,j,k,t))';
q(i,j,k) = smax(t,p(i,j,k,t));

This obviously gives:

---- 51 PARAMETER q smax(t, p(i,j,k,t))

case1 .j5 .k22 0.550
case2 .j8 .k6 0.856
case3 .j2 .k13 0.998
case4 .j25.k20 0.640
case5 .j4 .k7 0.669
case6 .j9 .k9 0.150
case7 .j15.k21 0.666
case8 .j20.k8 0.502
case9 .j5 .k22 0.286
case10.j15.k19 0.628

Although this calculation is correct, it is painfully slow for large data. The reason is that parameter \(\color{darkblue}p\)  is very sparse: for every \(i\) we have one \((j,k)\). It is noted that GAMS uses sparse storage of data: i.e. "does not exist and being zero is the same". In GAMS, the sum function is very fast when running over sparse data: we can safely skip the zero values. However, when calculating the smax function, we can't skip safely all zeros. It will evaluate the maximum over all possible combinations \((i,j,k)\) including ones we have no data for. When using 100,000 cases this smax calculation takes 7.2 seconds. 

For \(|I|=10\) cases, we have \(|I|\cdot|J|\cdot|K| = 10\cdot25\cdot25 = 6,250\). For \(|I|=100,000\) this becomes 62.5 million. These are large numbers!


Faster alternatives: exploit sparsity


A fast version will protect us from running over the Cartesian product \(I\times J\times K\). A first version is:

parameter q2(i,j,k);
q2(i,j,k)$sum(t,p(i,j,k,t)) = smax(t,p(i,j,k,t));

This assumes \(\color{darkblue}p\ge 0\). The $sum construct says: only do the assignment if one or more of the \(\color{darkblue}p_{i,j,k,t}\) is non-zero. This version does not work always correctly if negative \(\color{darkblue}p\) are allowed. For instance \(-2+2=0\), so a sum of zero does not imply that each value is zero. In the case where we allow negative values, we can do something like:

parameter q3(i,j,k);
q3(i,j,k)$sum(t$p(i,j,k,t),1) = smax(t,p(i,j,k,t));

This version explicitly counts the number of nonzero elements \(\color{darkblue}p_{i,j,k,t}\) over \(t\). The profiling information 

--- Profile Summary (6 records processed)
     7.187   0.034GB        27 Assignment q (100000)
     0.047   0.043GB        36 Assignment q2 (100000)
     0.031   0.051GB        42 Assignment q3 (100000)
   

indicates that our fast versions are more than 100 times as fast (the first column contains the timings).  The difference in timings between versions q2 and q3 is just noise.

My preferred approach is to explicitly store the sparsity pattern of the \((i,j,k)\) tuples into a separate set. We can do:

set ijk(i,j,k);
ijk(i,j,k) = sum(t$p(i,j,k,t),1);
parameter q4(i,j,k);
q4(ijk) = smax(t,p(ijk,t));

Often, when we import sparse data tables, it is a good idea to form sets representing the sparsity patterns. That will allow us to more efficiently operate on the data.


Conclusion


Often, exploiting sparsity comes naturally in GAMS models. But sometimes, we need to pay close attention and make sure we don't operate on the Cartesian product of sets. Large, data-intensive models require much more care than small ones. 

Appendix: GAMS model


Experiments with randomly generated data.

 

set

*   i 'cases' /case1*case10/

   i 'cases' /case1*case100000/

   'attribute' /j1*j25/

   'attribute' /k1*k25/

   'type' /typ1*typ2/

;

 

parameter p(i,j,k,t'positive numbers';

* note: for each i, we have only one (j,k)

scalar s1,s2;

loop(i,

    s1 = uniformint(1,card(j));

    s2 = uniformint(1,card(k));

    p(i,j,k,t)$(ord(j)=s1 and ord(k)=s2) = uniform(0,1);  

);

 

option p:3:3:1;

display$(card(i)<=100) p;

 

option profile=1;

 

* version 1  (very slow)

* This calculates the smax for all combinations (i,j,k) (cartesian product)

parameter q(i,j,k'smax(t, p(i,j,k,t))';

q(i,j,k) = smax(t,p(i,j,k,t));

 

option q:3:0:1;

display$(card(i)<=100) q;

 

* version 2  (fast)

* This skips the smax for non-existent combinations (i,j,k)

assumption: p>=0

parameter q2(i,j,k);

q2(i,j,k)$sum(t,p(i,j,k,t)) = smax(t,p(i,j,k,t));

 

 

version 3: also works with negative values p (fast)

* Here, the $sum is counting nonzero values

parameter q3(i,j,k);

q3(i,j,k)$sum(t$p(i,j,k,t),1) = smax(t,p(i,j,k,t));

 

 

version 4. Introduce a set with existing combinations

* (also fast, but ijk+q4 takes a bit more time)

set ijk(i,j,k);

ijk(i,j,k) = sum(t$p(i,j,k,t),1);

parameter q4(i,j,k);

q4(ijk) = smax(t,p(ijk,t));

 

No comments:

Post a Comment