JSON (JavaScript Object Notation) files [1] provide a flexible and simple way to store and exchange data. It is text, so we can easily look at it. JSON files typically have the extension .json. (using it may help syntax coloring in editors). JSON can represent rather complex tree-like data structures that may require some work to convert into simpler formats such as tabular data.
EIA (U.S. Energy Information Administration) provides bulk data in JSON format [2]. These files are not strictly JSON. They have a .txt extension. Basically, each line is a JSON object.
To make it a proper JSON file one would have to wrap this into an array.
If we reformat the first object a bit, we see:
To store this in a GAMS parameter, we can envision something like:
parameter series(id,t) 'data for each series';
where t is a time/date index.
Apparently, a value of "-" means: not available. GAMS has a special value (NA) for this. However, we need to be a bit careful as these JSON data files have different conventions and codes for this purpose:
- null: this is a JSON keyword indicating no data
- "-", "--", "- -": different forms of dashes are used
- "W", "NM", "NA": different strings indicating "withheld", "not measured", "not available".
$ontext $offEmbeddedCode id series errors execute_unload "%gdxfile%"; |
This will export a GAMS GDX file with the data in GAMS readable format. The errCount parameter shows how many times we saw a given string instead of a number:
---- 67 PARAMETER errCount observations recoded as NA
- 3431, NA 36015, NM 22, W 1450, - - 1787, -- 3
Of course, in GAMS we can see:
The values with NA are replicated in the errors sets. The empty cells correspond to zero values: GAMS does not store zeros (it uses a sparse storage scheme).
- Some of the files are very large. elec.txt is 1.1 GB. It takes a while to process this file.
- Not all lines contain data series. We only look at lines with a series_id.
- The series_id string has encoded information in it. I did not unpack that here.
- The dates are not ordered from old to new. GAMS will order things based on how it receives set elements. If you want to order things say from old to new, then just create a dummy set with dates that are ordered, and declare that dummy set at the beginning of the model.
- If you want to explicitly store the zeros in GAMS, you can use the EPS special value for that. It is numerically zero but is actually stored.
- Reading JSON often requires a bit of programming outside GAMS (like here in Python). Writing JSON can be done with the PUT facility (JSON is just text). I often write JSON or JavaScript data files to visualize solutions in the browser. Examples are [3,4].
Conclusion
With just a little bit of Python code, we can read JSON objects and store them in a GAMS parameter.
References
- JSON, json.org.
- EIA Open Data Bulk Download Facility, https://www.eia.gov/opendata/bulkfiles.php
- Shortest path in GAMS, https://yetanothermathprogrammingconsultant.blogspot.com/2021/11/shortest-path-in-gams.html
- MacMahon Squares, https://yetanothermathprogrammingconsultant.blogspot.com/2021/12/macmahon-squares.html
No comments:
Post a Comment