Oh man, I remember back in 2000, when I first started working in the industry, we had this database build process written in Java that took almost 30 days to run. The delivery schedule was monthly, and if anything went wrong we'd have to restart from a checkpoint and the database build would be late. It also pegged a 32-CPU SMP DEC Alpha machine for the entire time, which was, well, rough on the hardware: CPUs would regularly (once every other build or so) cook the socket they were in and have to be replaced. The GS-320 would hot-swap (semi-reliably) so it wasn't a HUGE deal, but it would slow things down and inevitably the build would be a day or two late.
Enter myself and a buddy of mine. The first thing we discovered was that they were using regular java.lang.Strings for all the string manipulation, and the process would garbage collect for 30 to 50 seconds out of every minute once it got rolling. It also used a positively criminal number of threads, our predecessor's desperate attempt to make it go faster. SO much time was spent swapping threads on and off CPUs and garbage collecting that almost no real work got done.
Enter the StringBuffer rotation scheme. John and I decided to use the backup GS-160 as a hub to read source data and distribute it among 16 of our floor's desktop machines as an experiment. The hub was written in C++ and did very little other than read a series of fixed-length records from a number of source files and package them up into payloads to ship over a socket to the readers.
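In rough shape, the hub's loop was something like the following (the original was C++; this is just an illustrative Java sketch with made-up record and batch sizes and a length-prefixed payload format, sending to a single reader rather than fanning out to all 16):

    import java.io.DataOutputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.net.Socket;

    public class RecordHub {
        static final int RECORD_LEN = 512;            // fixed record size (illustrative)
        static final int RECORDS_PER_PAYLOAD = 1024;  // records batched per payload

        public static void main(String[] args) throws IOException {
            String sourceFile = args[0];
            String readerHost = args[1];
            int readerPort = Integer.parseInt(args[2]);

            try (FileInputStream src = new FileInputStream(sourceFile);
                 Socket reader = new Socket(readerHost, readerPort);
                 DataOutputStream out = new DataOutputStream(reader.getOutputStream())) {

                byte[] payload = new byte[RECORD_LEN * RECORDS_PER_PAYLOAD];
                int filled = 0;
                int n;
                // Fill a payload's worth of fixed-length records, then ship it
                // as a length-prefixed blob. The real hub spread these across
                // 16 reader machines; this sketch sends to just one.
                while ((n = src.read(payload, filled, payload.length - filled)) != -1) {
                    filled += n;
                    if (filled == payload.length) {
                        out.writeInt(filled);
                        out.write(payload, 0, filled);
                        filled = 0;
                    }
                }
                if (filled > 0) {                     // flush the final partial payload
                    out.writeInt(filled);
                    out.write(payload, 0, filled);
                }
                out.flush();
            }
        }
    }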
The readers got a gut-rehab of the Java code that swapped out String for StringBuffer (and io for nio) to take the majority of the garbage collection out of the picture.
The trick we employed was to pre-allocate a hoard of StringBuffers with a minimum storage size and put them in a checkin/checkout "repository" where the process could ask for N buffers (generally one per string column) and it'd get a bunch of randomly selected ones from the repo. They'd get used and checked back in dirty. Any buffer that was over a "terminal length" when it was checked in would be discarded and a new buffer would be added in its place.
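Conceptually, the repository boiled down to something like this (a rough sketch in modern Java, not the original code; the class name and sizes are illustrative, and the real repo handed out randomly selected buffers rather than popping them off a deque):

    import java.util.ArrayDeque;
    import java.util.Deque;

    class BufferRepository {
        private static final int INITIAL_CAPACITY = 256;   // minimum storage size (illustrative)
        private static final int TERMINAL_LENGTH  = 4096;  // discard threshold (illustrative)

        private final Deque<StringBuffer> pool = new ArrayDeque<>();

        BufferRepository(int count) {
            for (int i = 0; i < count; i++) {
                pool.push(new StringBuffer(INITIAL_CAPACITY));
            }
        }

        // Check out N buffers, e.g. one per string column in a record.
        synchronized StringBuffer[] checkOut(int n) {
            StringBuffer[] out = new StringBuffer[n];
            for (int i = 0; i < n; i++) {
                StringBuffer buf = pool.isEmpty()
                        ? new StringBuffer(INITIAL_CAPACITY)
                        : pool.pop();
                buf.setLength(0);   // buffers come back dirty; reset on the way out
                out[i] = buf;
            }
            return out;
        }

        // Buffers are checked back in dirty. Anything over the terminal
        // length is discarded and replaced with a fresh buffer so the
        // pool's memory footprint stays bounded.
        synchronized void checkIn(StringBuffer[] buffers) {
            for (StringBuffer buf : buffers) {
                if (buf.length() > TERMINAL_LENGTH) {
                    pool.push(new StringBuffer(INITIAL_CAPACITY));
                } else {
                    pool.push(buf);
                }
            }
        }
    }

The point of the scheme is that the same backing character arrays get recycled over and over instead of every column of every record turning into short-lived String garbage.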
We poked and prodded, and when we were finally happy with it, we were down to one garbage collection every 10 minutes on each server. The final build was cut from 30 days to 2.8, and we got allocated a permanent "Beowulf cluster" to run our database build.