25 November 2009

Scala == EPIC WIN

Find an excuse, any excuse, to try Scala. My excuse was that the (otherwise awesome) Python xlwt Excel-writing library didn't support hiding cells and took a minute and a half to process a huge spreadsheet. Apache produces the POI library, which supports hiding. It's also a lot faster.

To save myself the agony of writing a Java program, I tried Scala, which compiles to JVM bytecode and interoperates with any Java library. It also has ridiculously good XML integration -- there's a lightweight (non-DOM) XPath-like API for processing XML. You can apparently even paste arbitrary XML directly into your program -- the parser is smart enough to turn it into Scala objects.
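For a taste of the XML support, here is a minimal sketch (not the code from this project). It assumes Scala 2 with the scala-xml library on the classpath; the element and attribute names are made up for illustration.

```scala
import scala.xml.Elem

object XmlSketch {
  // Hypothetical data: an XML literal pasted straight into the source.
  val dump: Elem =
    <rows>
      <row id="1"><name>alpha</name><value>10</value></row>
      <row id="2"><name>beta</name><value>20</value></row>
    </rows>

  // `\` selects direct children by name; `\\` selects descendants at any depth.
  val names: Seq[String] = (dump \ "row" \ "name").map(_.text)
  val total: Int = (dump \\ "value").map(_.text.toInt).sum

  // A leading "@" selects an attribute instead of a child element.
  val ids: Seq[String] = (dump \ "row").map(r => (r \ "@id").text)
}
```

The `\` and `\\` operators are the XPath-like part: they return node sequences that behave like ordinary Scala collections, so `map` and `sum` work on them directly.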

Scala has been criticized by no less than Guido van Rossum for having a too-complex type system. He's probably right -- the type system is really complex. Scala is also more verbose than ideal: unlike OCaml, which infers them, you have to declare the type of every function parameter. But for my purpose, using a Java library without actually having to type
ExcessivelyLongClassTypeName excessivelyLongObject = new ExcessivelyLongClassTypeName();
a zillion times, it was great. It didn't take long to take the XML dump of my data and shove it into an Excel spreadsheet with POI.
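To make the difference concrete, here is a small sketch using only java.util classes. The names `cellsBySheet` and `addCell` are hypothetical, chosen just for this example.

```scala
import java.util.{ArrayList, HashMap}

object InferenceSketch {
  // The Java equivalent spells the type out on both sides:
  //   HashMap<String, ArrayList<Integer>> cellsBySheet =
  //       new HashMap<String, ArrayList<Integer>>();
  // In Scala the type of the value is inferred from the right-hand side.
  val cellsBySheet = new HashMap[String, ArrayList[Integer]]()

  // Parameter types must be declared; the return type (Int) is inferred.
  def addCell(sheet: String, column: Int) = {
    if (!cellsBySheet.containsKey(sheet))
      cellsBySheet.put(sheet, new ArrayList[Integer]())
    val columns = cellsBySheet.get(sheet) // inferred as ArrayList[Integer]
    columns.add(Integer.valueOf(column))
    columns.size
  }
}
```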

One gotcha in POI: if you make a reference to a cell on another sheet before that sheet exists, POI will fail silently, whereas xlwt would throw an exception. For instance, if you entered the formula "='Sheet 2'!D3" before you created Sheet 2, the output cell in the .xls will contain =$#REF!.D3 instead of what you entered. You can avoid this by creating all of the sheets at once and filling them in later. (Note that you can't use the setSheetOrder() call to reorder sheets after you create them; the sheet references get silently b0rked. I'm surprised this bug hadn't been caught before; I just reported it.) Otherwise, POI is epic win, too.
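The workaround might look roughly like this. It is a sketch rather than my actual program: it assumes POI's HSSF (.xls) API is on the classpath, the sheet names and cell positions are invented, and I'm taking "hiding" to mean hiding a whole column.

```scala
import java.io.FileOutputStream
import org.apache.poi.hssf.usermodel.HSSFWorkbook

object PoiSketch {
  // Builds a two-sheet workbook in memory; `build` is a hypothetical helper name.
  def build(): HSSFWorkbook = {
    val workbook = new HSSFWorkbook()

    // Create every sheet up front, before any formula refers to one of them.
    val summary = workbook.createSheet("Summary")
    val data = workbook.createSheet("Sheet 2")

    // Rows and columns are zero-based: row 2, column 3 is cell D3.
    data.createRow(2).createCell(3).setCellValue(42.0)

    // setCellFormula takes the formula text without the leading "=".
    summary.createRow(0).createCell(0).setCellFormula("'Sheet 2'!D3")

    // Hide column E (index 4) on the data sheet.
    data.setColumnHidden(4, true)

    workbook
  }

  def save(workbook: HSSFWorkbook, path: String): Unit = {
    val out = new FileOutputStream(path)
    try workbook.write(out) finally out.close()
  }
}
```

Something like `PoiSketch.save(PoiSketch.build(), "out.xls")` would then write the file. The important part is the ordering: both `createSheet` calls happen before the `setCellFormula` call that mentions 'Sheet 2'.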

(Also, you know how Joel Spolsky said that it's basically impossible to read or write Office formats, so you might as well just give up and use COM? I'm not so sure.)

About Me

blog at barillari dot org Older posts at http://barillari.org/blog