snikolaev · 2025-03-09T10:15:34 1741515334

This article may be more relevant https://manticoresearch.com/blog/github-semantic-search/

snikolaev · 2025-03-07T07:37:28 1741333048

Meilisearch doesn't support BM25, does it?

Kerollmops · 2025-03-07T11:21:13 1741346473

Nope, it doesn't. It's based on Cascade Ranking, also called [bucket sorting][1]. We released our new Hybrid search ranking system, combining the best full-text search results (our Cascade Ranking) with semantic results (with arroy, our full-Rust Vector Store). You can try that at https://wheretowatch.meilisearch.com.

[1]: https://en.wikipedia.org/wiki/Bucket_sort
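To make the bucket-sorting idea concrete, here is a minimal Python sketch. It is not Meilisearch's implementation: the rule names (`matched_words`, `typos`, `proximity`), their order, and the sample documents are assumptions made up for illustration. Each rule splits the current bucket into sub-buckets, and a later rule only breaks the ties left by the earlier ones.

```python
from itertools import groupby

# Hypothetical per-document signals; not Meilisearch's real data model.
docs = [
    {"id": "a", "matched_words": 2, "typos": 1, "proximity": 3},
    {"id": "b", "matched_words": 2, "typos": 0, "proximity": 5},
    {"id": "c", "matched_words": 1, "typos": 0, "proximity": 1},
    {"id": "d", "matched_words": 2, "typos": 0, "proximity": 2},
]

# Ordered ranking rules as (key function, higher_is_better) pairs (assumed).
rules = [
    (lambda d: d["matched_words"], True),   # more matched query words first
    (lambda d: d["typos"], False),          # then fewer typos
    (lambda d: d["proximity"], False),      # then closer query terms
]


def cascade_rank(bucket, rules):
    """Split a bucket by the first rule, then refine each sub-bucket with the rest."""
    if not rules or len(bucket) <= 1:
        return list(bucket)
    (key, higher_is_better), rest = rules[0], rules[1:]
    ordered = sorted(bucket, key=key, reverse=higher_is_better)
    ranked = []
    # Documents with the same key value form one bucket; only they go to the next rule.
    for _, group in groupby(ordered, key=key):
        ranked.extend(cascade_rank(list(group), rest))
    return ranked


ranking = [d["id"] for d in cascade_rank(docs, rules)]
print(ranking)
```

Unlike a single blended score such as BM25, a lower-priority rule here can never outrank a higher-priority one; it only orders documents that the earlier rules consider equal.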

snikolaev · 2024-12-29T03:24:46 1735442686

Check out Manticore Search for your use case. It's open-source, cost-effective, and doesn't require keeping everything in memory.

Key points:

- Columnar Storage: Efficiently handles large datasets on disk, ideal for terabyte-scale data. It's not enabled by default but can be set up easily with "CREATE TABLE ... ENGINE='columnar'".

- Faceted Search: Probably easier than anywhere else with just "FACET <field name>" added to your "SELECT" query.

- MySQL Protocol and SQL Support: If you’re familiar with SQL and MySQL, it's easier to get started compared to other search engines.
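As a rough sketch of the points above, the Python snippet below only builds the SQL strings one might send over the MySQL protocol. The table name `products`, its columns, and the helper `facet_query` are hypothetical; only `ENGINE='columnar'` and the `FACET <field name>` clause come from the comment itself.

```python
# Hypothetical table and column names, for illustration only.
CREATE_TABLE = (
    "CREATE TABLE products (title text, brand string, price float) "
    "ENGINE='columnar'"
)


def facet_query(table, match, facet_fields, limit=10):
    """Build a SELECT with one FACET clause per field (sketch only, no escaping)."""
    facets = " ".join(f"FACET {field}" for field in facet_fields)
    return f"SELECT * FROM {table} WHERE MATCH('{match}') LIMIT {limit} {facets}"


query = facet_query("products", "running shoes", ["brand", "price"])
print(CREATE_TABLE)
print(query)
```

Any MySQL client or connector should be able to send these statements as-is; real code would need proper escaping of the search text rather than plain string formatting.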

iambateman · 2024-12-30T16:16:11 1735575371

Thanks for your recommendation. I ended up going with Quickwit, since it lets me store data on S3.

snikolaev · 2024-09-23T06:19:00 1727072340

https://manual.manticoresearch.com/Creating_a_table/NLP_and_...

snikolaev · 2024-07-29T07:51:04 1722239464

> We filtered out PISA Search and Manticore because neither of them offers search-as-you-type and facet search features

Manticore does support faceted search, and it's quite powerful:

- docs - https://manual.manticoresearch.com/Searching/Faceted_search#...

- interactive course - https://play.manticoresearch.com/faceting/

Search-as-you-type depends more on the client than on the backend. However, Manticore provides autocomplete and fuzzy search features (both still in beta). More info here: https://github.com/manticoresoftware/manticoresearch/issues/...

snikolaev · on Sept 12, 2023

Manticore Search

snikolaev · on June 26, 2023

MySQL's full-text ranking capabilities are quite limited, and AFAIK full-text search hasn't been a priority for them lately. The related article is "Rankings with InnoDB Full-Text Search" [1].

If it works for you, great. If you need more flexibility in data tokenization, matching, and ranking, you can consider Manticore Search [2] instead of Elasticsearch. It's a continuation (a fork made in 2017) of the Sphinx search engine mentioned in the article on mysql.com, and it integrates better with MySQL than Elasticsearch does (e.g. you can use the Linux mysql client or any programming language's MySQL connector to query Manticore).

[1] https://dev.mysql.com/blog-archive/rankings-with-innodb-full...

[2] https://github.com/manticoresoftware/manticoresearch

snikolaev · on June 9, 2023

> I decided to experiment with this setup and the NY Taxi Dataset. The initial goal was to populate ElasticSearch with ~14 million rows, loading data from a compressed parquet file of ~350 MB.

> I tried multiple times, but the operation failed continuously, due to JVM memory constraints

Here's a script https://github.com/db-benchmarks/db-benchmarks/blob/main/tes... which loads 1.7B NYC taxi ride documents into Elasticsearch.

snikolaev · on March 30, 2023

> Meilisearch focuses on simplicity, relevancy, and performance.

> excellent relevance out of the box

> if ease of use, performance, and relevancy are important to you, Meilisearch was made for you

Is there a benchmark that shows Meilisearch outperforming Elasticsearch in terms of relevance score? I couldn't find Meilisearch listed on https://github.com/beir-cellar/beir.

Kerollmops · on March 30, 2023

We are not in contact with beir or the owner of the beir-cellar organisation.

However, we started tracking our relevancy with the TREC 4 & TREC 5 data, which are provided by the NIST organisation [1]. I can only say that the results are very good and that we continue to improve them. We will talk about that in a blog post.

[1]: https://nist.gov

snikolaev · on March 27, 2023

It hasn't been open source since 2017. The open source fork is https://github.com/manticoresoftware/manticoresearch