Thursday, October 1, 2015

SQL as optimization language?

Of course the answer is no. SQL is good to deal with data tables but trying to shoehorn optimization models into SQL looks to me like using the wrong tool.

There have been more attempts in this direction:


To me it feels a bit like writing a Basic interpreter in TeX (, interesting from a theoretical point of view but of limited use in practice.

Using relational databases to do some data wrangling is of course a different thing. Many or even most practical models have a significant amount of data manipulation. E.g. often the pure model equations take less than 10% in a real GAMS model. The interesting thing about using a relational database to store model is that we are more flexible: we have more freedom into deciding where to do certain data steps. We can do them in SQL or in the modeling language. Especially when looking at the skill set of developers, it may be a good idea to offload data manipulation to the SQL layer (there are more developers familiar with SQL than with GAMS or AMPL). As the data manipulation part of a modeling exercise is often the most significant part, this division of labor can pay off. 

Of course not all data stuff can be done conveniently in SQL. Currently I am working on a project where just to get some good big-M values needed in some MIP models, we actually solve a series of optimization models. Data steps requiring optimization is not that unusual: e.g. in economic models we often use some matrix balancing to get the data in shape.

Writing the optimization model itself in SQL, in my mind, has little value. For simple, small models it has no particular advantage above a Lindo style, LP file input. For large, more complex models it just will not work. A language is meant to express the problem at hand somewhat conveniently and also should help you to reason and think about the problem. I don’t believe SQL is doing a good job here.

Update: a more detailed list of arguments in favor of SQL: