I don't understand the emphasis here on vertical scaling. Moving a database to a bigger machine = more storage and faster querying. Not exactly rocket science. Horizontal scaling is the real challenge here, and the complexity of vector indexes makes it especially hard. Milvus and Vertex AI both offer horizontally scaled ANN search, as well as parallel indexing. I appreciate the post, but this doesn't seem worthy of an announcement.
Completely true. You have to understand the economics behind this to see why their claim is hyperbole at best and flat-out misleading at worst. The fundamental problem of scalable vector search is that you are dealing with potentially huge dimensionality and huge datasets, which means memory consumption will be enormous even for modest (by today's standards) datasets. This problem has garnered a lot of research attention, so making such a bold claim makes you wonder what Pinecone has under the hood that others don't.
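To put rough numbers on that memory point: even the raw float32 vectors, before any graph or inverted-list structure on top, dominate. A back-of-the-envelope sketch (the dataset sizes here are illustrative, not any particular product's):

```python
def index_memory_gb(num_vectors: int, dim: int, bytes_per_component: int = 4) -> float:
    """Raw storage for the vectors alone (float32 by default),
    ignoring any index overhead on top."""
    return num_vectors * dim * bytes_per_component / 1e9

# 100M vectors at 768 dimensions (a typical transformer embedding size):
print(round(index_memory_gb(100_000_000, 768), 1))  # ~307.2 GB before any index structure
```

At that scale the vectors alone no longer fit in one machine's RAM, which is exactly why the sharding and quantization research the parent mentions exists.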
Pinecone is VC-backed and has taken in funding to the tune of $50M. They have to claim to be the "first" to solve these challenging technical problems; otherwise they'd have to explain that their "secret sauce" is not really groundbreaking but relies on a series of open-source components under the hood. VCs wouldn't want to be backing yet another donkey in the derby. The truth is that solutions like FAISS, ScaNN, Weaviate, Qdrant, Annoy and co. are working on this problem at a much more fundamental level, while Pinecone and Google's Vertex AI Matching Engine are working on it at the application level. If Pinecone's solution were truly groundbreaking, they'd publish it in a more scientifically rigorous way. So these claims are to be taken with a grain of salt for what they are: developer evangelism/marketing speak.
I led a production project that uses FAISS at a bank. A huge amount of the work was about making the index practical in a real IT environment. For example, we had to build a sharding system, both to let it scale and to let the index be rebuilt with zero downtime. There were many other significant engineering steps required to get it deployed.
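One common pattern for that kind of zero-downtime rebuild is a blue/green swap: build the replacement index entirely off to the side, then atomically switch the reference that queries go through. A minimal generic sketch (this is not the actual system described above; the class and method names are illustrative, and `build_fn`/`search` stand in for whatever the underlying ANN library provides):

```python
import threading

class SwappableIndex:
    """Serve queries from a live index while a replacement is built offline.
    Generic blue/green sketch; the wrapped index object is assumed to expose
    a search(query, k) method (as e.g. FAISS-style wrappers typically do)."""

    def __init__(self, index):
        self._lock = threading.Lock()
        self._live = index

    def search(self, query, k):
        # Grab a reference under the lock; the query itself runs outside it,
        # so a concurrent swap never blocks in-flight reads for long.
        with self._lock:
            index = self._live
        return index.search(query, k)

    def rebuild(self, build_fn, vectors):
        # Build the new index entirely off to the side...
        fresh = build_fn(vectors)
        # ...then swap the reference atomically. The old index is reclaimed
        # once the last in-flight query drops its reference.
        with self._lock:
            self._live = fresh
```

Readers only ever pay a brief pointer-swap lock, and the expensive rebuild happens with zero query downtime.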
So I would say that if Pinecone has solved these problems, they don't need breakthrough fundamental performance versus the open-source systems. On top of that, as every dev knows, there is a bunch of hygiene components and features that production software wants: connectors, admin interfaces, utilities. $50M is probably a bit low to cover all of these plus the marketing, to be honest - but it will go a long, long way, and I'd guess there's Series B funding to get over the line if they don't sell out first.
On the other hand, they must avoid over-committing to today's indexing approaches, because if someone makes an algorithmic step forward and Pinecone doesn't or can't take advantage of it, then the deployment-enabling features they provide are merely a matter of engineering for others to replicate. Also, at the end of the day I think vector DBs are going to be an important niche in the enterprise, not at the scale of data warehouses, lakes, or application DBs. I think fitting them into the enterprise IT puzzle is going to be very important in making them commercially successful and good VC investments.
This sounds like a perfect job for jina.ai: sharding, redundancy, automatic up- and down-scaling, security features, and most importantly the flexibility to use and switch between whatever (vector) database you like.
https://jina.ai/
A bigger machine doesn't automatically mean higher performance. The code needs to scale with the increased number of cores, take a share-nothing (or share-very-little) approach to avoid contention, and use efficient data structures to make use of the increased memory.
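The share-nothing idea for search can be sketched as: each shard owns a disjoint slice of the vectors, computes its own local top-k, and only the tiny result lists cross shard boundaries. A toy brute-force version (the function names and (id, vector) data layout are illustrative, not any library's API):

```python
import heapq

def local_topk(shard, query, k):
    """Each worker scans only the (vid, vector) pairs its shard owns --
    no shared state, so cores never contend on the data."""
    scored = ((sum(x * y for x, y in zip(vec, query)), vid) for vid, vec in shard)
    return heapq.nlargest(k, scored)

def sharded_search(shards, query, k):
    """The coordinator merges the small per-shard top-k lists; the vectors
    themselves never move between shards."""
    partials = [local_topk(shard, query, k) for shard in shards]  # embarrassingly parallel
    return heapq.nlargest(k, (hit for partial in partials for hit in partial))
```

Because each `local_topk` call touches disjoint data, it can be pinned to a dedicated core (or node) with no locking; the only cross-shard traffic is k (score, id) pairs per shard.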
Larger disk space does help effortlessly scale storage in single-node systems, but I agree with you that a shared-nothing (and/or shared-something) design is a necessary step for extracting maximum performance from a larger machine. Shared-nothing matters in distributed architectures as well: decoupling storage from compute, and stateless from stateful components, helps minimize resource allocation for billion-scale vector storage, indexing, and search. Milvus 2.0 implements this type of architecture - here's a link to our VLDB 2022 paper, if you're interested: https://arxiv.org/abs/2206.13843
Just having "moar disk!" ≠ "scalability." Unlike CPUs - where you can run a single thread per core (or hyperthread) on a shard-per-core basis, align NUMA memory to those cores, etc. - there's no way to make your storage "shared-nothing."
At ScyllaDB we've put years of non-trivial effort into IO scheduling to optimize it for large amounts of storage. You also need to consider the type of workload. Because optimizing for reads, writes, or mixed workloads are all different beasties.
I agree in the typical case, but they support concurrent add/delete on at least one of their index options. Handling consistency/contention while modifying whatever graph/tree/etc. structure they use is probably nontrivial, and the resulting cache invalidations would also likely hurt QPS.
P.S. Great work on your site, by the way - it's a really inspiring project!
Seems to me you can do that in a way that ensures low contention between consumers by using a read-biased MRSW lock. Such a construction isn't free, but it really shouldn't eat into your read performance all that much: acquiring and releasing the lock adds hundreds of nanoseconds to your query time. Unless you're already serving millions of queries per second per thread, that's piss in the ocean.
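A minimal sketch of such a read-biased MRSW (multiple-reader, single-writer) lock, in Python for illustration only (a production index would implement this in native code, and the class name is made up):

```python
import threading

class ReadBiasedRWLock:
    """Any number of concurrent readers; a writer waits until no reader
    holds the lock. Read-biased: readers only wait out an *active* writer,
    never a queued one, so reads stay cheap under read-heavy load."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:       # only an active writer blocks readers
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

The trade-off of the read bias is that a writer can starve under a constant stream of readers, which is usually acceptable for an index that rebuilds or mutates rarely relative to query volume.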
> With vertical scaling, pod capacities can be doubled for a live index with zero downtime. Pods are now available in different sizes — x1, x2, x4, and x8 — so you can start with the exact capacity you need and easily scale your index.