POI+HSSF: Using Excel with Java

Today, my work day was spent using POI HSSF, which can read and write Excel spreadsheets in Java. You can read all about it on their website: http://poi.apache.org/hssf/index.html. It does everything I nearly want to do, except one crucial part of today's task, which is cell validation. You can setup Excel to "validate" a cell by restricting what values can be entered. One such method is providing a list of values, and the cell will act like a dropdown menu.

Heartbroken that I couldn't use POI to generate my Excel document, I setup my code to read in an Excel document formatted exactly how I wanted and use POI to modify the current document with the list values I want. In Excel, list values can be designated with a Name. In order to name a list, select the cells you wish to use and then enter a name for them.

When I attempt to do this action using POI, the names are looking in radically different places than what I specified. I still haven't quite figured out why, but after digging down deep in the object models of POI, I've learned some interesting design choices by the developers.

Layered Design

POI seems to have two object layers in its API. There's a very low level object model that is very machine. The objects and values at the bottom are somewhat difficult to decipher, as they describe each individual part of the excel document in a very programmatic way.

The upper layer is a very easy to understand set of objects. If I want to make a cell, I only need to call one method that will set everything up me, rather than create an object and set all the necessary trivial data for Excel to read. This one method will do all that for me.

Hurrah for school, I've actually used this design a couple times before both in School and Work. This middle-man layer is known as the Policy Layer. It's main purpose is to create a bridge between low level details and the user. It separates your program or API into two chunks. One chunk being a code representation of the machine code you're operating on. One chunk being a very easy to use API that operates on the low level objects so the User doesn't have to.

This type of design is used much more with GUI design, where your program is the low level chunk, and all the GUI code is the upper level chunk. Because the GUI code and operational code are completely separate, it makes switching out and refactoring the GUI much much easier.

Actually, I suppose that's the same with all object packages. I mean, a package is nothing more than a jar of code someone else designed, tested, and published to help out other people with the same problems.

Shelled Layers

Well that's fine, except for my error I mentioned earlier. The policy layer of POI is screwing up my Excel document underneath, and unlike a good Policy and operation layer setup.. the Policy layer is completely obstructing my view of the operation layer.

I know exactly what low level stuff I need to change, but all the Policy objects have no getters, no setters, and no way of getting to it. The Policy layer acts as a shell with very little holes to exploit.

Policy layers should never get in the way of what people want to do, and this fails one of the golden rules of software design which is putting the User in control. Sure you don't want to confuse your User, and you certainly want to keep them from screwing up everything.. But you are also shutting out the Users that actually know what they're doing.

Fricken annoying... though, who knows.. I'll check it out again tomorrow and see if I missed something.. otherwise I may need to hack apart the library a little bit to get what I want.

Acronyms

I also find it really odd that through all this documentation and code, I still have no idea what the hell POI or HSSF stands for.. honestly, I really couldn't even guess what it stands for either.

So, I looked it up on wikipedia.. POI stands for Poor Obfuscation Implementation, and HSSF stands for Horrible SpreadSheet Format. Nice!

POI would be the Policy layer in this case, and HSSF being the low level operation layer. I don't think their implementation is that bad, but if I need to override classes and play around a little bit with reflection, then it is aptly named, my friend.