On further experimentation, the approach may be too aggressive in trading memory for speed: it runs out of memory on my 32-bit machine if you run it for the range 1-10 instead of 1-9.