Tuesday, February 22, 2011

XML files: they are big

I received some XML files from a client. I am always surprised at how big they are:

  • XML: 235 MB (this is how I received them)
  • Zipped XML (max compression): 4.85 MB
  • Imported into Access (as MDB file): 89.5 MB
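To put these numbers in perspective, here is a small sketch of the arithmetic in Python. The dictionary keys are just labels I made up for illustration; the sizes are the ones listed above. The zipped file comes out roughly 48 times smaller than the raw XML, and the Access database about 2.6 times smaller.

```python
# Sizes in MB, taken from the list above; the key names are illustrative.
sizes_mb = {
    "xml": 235.0,
    "zipped_xml": 4.85,
    "access_mdb": 89.5,
}

xml_size = sizes_mb["xml"]

# How many times smaller each representation is than the raw XML.
ratios = {name: xml_size / size for name, size in sizes_mb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {sizes_mb[name]} MB, raw XML is {ratio:.1f}x this size")
```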


  1. Depending on the nature of the content, the ratio of "overhead" (tag characters) to data (meaningful characters) can be rather high. I like the transparency of XML, but for some things a proprietary binary format may still be the best option.

  2. The XML file has 40+ tables buried inside. Transparency then becomes a little bit less obvious.

  3. There is a nice post about the angle bracket tax by Jeff Atwood here: http://www.codinghorror.com/blog/2008/05/xml-the-angle-bracket-tax.html

    He suggests YAML, which is much cleaner and smaller, as an alternative to XML.

  4. That's why I was surprised to hear a few years ago that projects in COIN-OR are using (or were going to use) XML for saving model data. YAML or JSON would be better choices.
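The overhead-to-data ratio from point 1 is easy to estimate on a toy example. The sketch below builds a small synthetic table (not the client's data; the field names `id`, `name`, `value` and the element names `table` and `record` are invented for illustration), writes it as XML and as compact JSON, and reports what fraction of each file is markup rather than data. The exact numbers depend heavily on tag names and value lengths, so treat this as an illustration only.

```python
import json
import xml.etree.ElementTree as ET

# Synthetic rows standing in for one table; purely hypothetical data.
rows = [{"id": i, "name": f"item{i}", "value": i * 1.5} for i in range(1000)]


def to_xml(rows):
    """Serialize rows as <table><record><field>value</field>...</record></table>."""
    root = ET.Element("table")
    for row in rows:
        record = ET.SubElement(root, "record")
        for key, val in row.items():
            ET.SubElement(record, key).text = str(val)
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(rows)
json_text = json.dumps(rows, separators=(",", ":"))  # compact, no extra spaces

# "Meaningful" characters: just the values themselves.
data_chars = sum(len(str(v)) for row in rows for v in row.values())

# Fraction of each file that is not data (tags, quotes, keys, brackets).
xml_overhead = 1 - data_chars / len(xml_text)
json_overhead = 1 - data_chars / len(json_text)

print(f"XML:  {len(xml_text)} chars, overhead {xml_overhead:.0%}")
print(f"JSON: {len(json_text)} chars, overhead {json_overhead:.0%}")
```

In this layout every field name appears twice per value in the XML (opening and closing tag) and once in the JSON, which is where most of the difference comes from.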