Tuesday, June 7, 2011

Writing solutions back into a database

The issue of writing optimization results back into a database is often a little bit more complex than reading input data (see http://yetanothermathprogrammingconsultant.blogspot.com/2011/04/live-databases.html for some thoughts on reading data).

First with writing more things can go wrong: you need additional permissions, you can overwrite data, and conceptual there are some issues. Here are some questions and possibilities that come to mind:

  1. Do we have a say in how the tables look like or do we need to follow some existing data model?
  2. Update or Insert. Do we use an SQL INSERT or SQL UPDATE to populate solution tables?
  3. Some databases have SQL extensions that implement an UPSERT. E.g. MySQL has INSERT INTO … ON DUPLICATE KEY UPDATE, or REPLACE INTO, Oracle and SQL Server have MERGE.
  4. It may be better to erase the solution tables first.
  5. Or always INSERT with a new key indicating the run id. A new solution gets a new run id.
  6. For very large data some RDBMS systems have a BULK INSERT facility (SQL Server: BCP or BULK INSERT, SQL*Loader for Oracle).
  7. We can write to temporary tables and deal with the problem of merging results into the real tables further down the road.
  8. Just write a well-designed temp database (e.g. in Access) and let the IT people deal with getting the data into their systems.

In general I prefer 8, let others take responsibility for plugging things into the database. Note that this step can be expensive. A few weeks ago I was discussing a complex model where we needed to get the results into the RDBMS. To achieve a reasonable turnaround time, I was given a time limit of about an hour to come up with solutions for the model, while the system guy allocated 2 hours for himself to put the solution back into the database (much time is spent on checks).

2 comments:

  1. Your post about the many pitfalls of loading data back into databases is making me feel less angst about the fact that Pyomo still doesn't support this feature. Every time I think about how we would do this robustly, I run into the same sort of design questions. :P

    ReplyDelete
  2. It posses another question: "where should the client visualize the results?"
    Sometimes it is just better to have a GUI together with the model and let users view and rollout plans through that.

    ReplyDelete