Totally agree. The thing is that ElasticSearch does not meet our requirements in vector searching.
I am currently running with Milvus + ElasticSearch, works perfect. The latest Milvus version is super fast and scalable (>50M vectors). Haven't tried Zilliz Cloud. Have to find out what the cost is.
I am old school. IMO ElasticSearch is only good for keyword search and these so called "vector databases" products are only good for vector search.
Others have made other suggestions, but Vespa has two unique features. First it is battle tested at a large scale, second it supports combining the keyword and vector scores in several ways. The latter is something that other hybrid systems don't do very well in my experience.
Didn't even realise Milvus was so lacking. https://github.com/marqo-ai/marqo also has a hybrid approach. It's just a more complete/end-to-end platform than pinecone, so it really just depends on what you're building
Could you please elaborate on how you utilize both of them together, and for which specific use case? I'm attempting to gain a better understanding of the hybrid approach.
The thing is to make ElasticSearch scores "comparable" to Milvus scores. Lots of ways to do this, but there's no single good solution. For example you could calculate BM25 score offline, or use TF-IDF score to do some kind of filtering. Again there's no single perfect answer. You'd have to do a lot of experiment according to your own use case and your own data to get the best results.
Also a lot of tuning needs to be done during all phases:
1) query pre-processing
2) query tokenizing
3) retrieval
4) ranking and reranking
I personally would not trust any universal "hybird-search" solutions. All toy demos.
It usually takes 5-10 good engineers to build a decent search engine/system for any real use case. It also requires a lot of turning, tricks, hand-written rules to make things work.
I am currently running with Milvus + ElasticSearch, works perfect. The latest Milvus version is super fast and scalable (>50M vectors). Haven't tried Zilliz Cloud. Have to find out what the cost is.
I am old school. IMO ElasticSearch is only good for keyword search and these so called "vector databases" products are only good for vector search.