A friend of mine does a lot of work that often boils down to neighborhood searches in high-dimensional spaces (20-200 dimensions). The "Curse of Dimensionality" means that tree-based spatial data structures are ineffective at speeding up these searches: there are too many different paths to arrive at nearly the same place in 150-dimensional space.
Usually the solution ends up being to use techniques like principal component analysis to bit-pack each item in the dataset as small as possible, then to buy a bunch of Tesla GPUs with as much RAM as possible. The algorithm then becomes: load the entire dataset into the GPUs' memory once, then brute-force linear-search the entire dataset for every query. The GPUs have enormous memory bandwidth and parallelism; with a bunch of them running at once, brute-forcing through 30 gigabytes of compressed data can be done in milliseconds.
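To make the pattern concrete, here is a minimal sketch of keep-it-resident, scan-everything search on a GPU. It assumes PyTorch and a CUDA-capable card purely for illustration (the original doesn't name a library), and the dataset size, dimensionality, and dtype are placeholders rather than anyone's real setup.

```python
# Minimal sketch of the GPU brute-force pattern described above.
# Assumes PyTorch and a CUDA-capable GPU; the dataset, dimensions,
# and dtype below are illustrative placeholders.
import torch

device = torch.device("cuda")

# Stand-in for the PCA-reduced, bit-packed dataset: N items x D dims.
# Loaded into GPU memory once and kept resident across queries.
N, D = 10_000_000, 64
dataset = torch.randn(N, D, dtype=torch.float16, device=device)

def nearest_neighbors(query: torch.Tensor, k: int = 10):
    """Brute-force linear scan: compute the distance from the query to
    every item in the dataset, then keep the k smallest."""
    q = query.to(device=device, dtype=dataset.dtype).unsqueeze(0)  # 1 x D
    # Euclidean distance to every row; one dense pass over the whole dataset.
    dists = torch.cdist(q, dataset).squeeze(0)                     # N
    return torch.topk(dists, k, largest=False)                     # (values, indices)

# Every query pays for a full scan, but the scan is a single
# memory-bandwidth-bound kernel, which is exactly what GPUs excel at.
query = torch.randn(D)
scores, indices = nearest_neighbors(query, k=10)
```

In practice the per-GPU shard and the distance metric would differ, but the shape of the solution is the same: no index structure at all, just raw bandwidth applied to a compressed copy of the data.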