Even in that article, with vectors much smaller than what GPT puts out (1536 dimensions), QPS drops below 100 once recall@1 goes above 0.4. That's to say nothing of the cost of regenerating this index with incremental updates. I don't get why people on HN are so adamant that no one ever needs to scale beyond one machine.
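
For anyone unfamiliar with the metric, here is a rough sketch of what recall@1 measures, plus a back-of-the-envelope size for raw 1536-dimensional vectors. The function names, the sample result ids, and the one-million-vector count are my own illustrative assumptions, not figures from the article, and float32 storage is assumed as well.

```python
def recall_at_1(approx_top1, exact_top1):
    """Fraction of queries whose approximate top hit is the true nearest neighbor."""
    if len(approx_top1) != len(exact_top1):
        raise ValueError("need one approximate and one exact result per query")
    if not exact_top1:
        raise ValueError("need at least one query")
    hits = sum(a == e for a, e in zip(approx_top1, exact_top1))
    return hits / len(exact_top1)


def raw_vector_bytes(n_vectors, dims=1536, bytes_per_value=4):
    """Size of the raw vectors alone, ignoring index overhead (assumes float32)."""
    return n_vectors * dims * bytes_per_value


# Hypothetical result ids for four queries: the ANN index misses the third one.
print(recall_at_1([10, 42, 7, 3], [10, 42, 8, 3]))  # 0.75
# One million vectors is an illustrative count, not a figure from the article.
print(raw_vector_bytes(1_000_000) / 1e9)  # 6.144 (GB)
```

So a recall@1 of 0.4 means the index returns the true nearest neighbor for only 40% of queries, and the raw vectors grow linearly with both the vector count and the dimension before any index structure is added on top.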