Hacker News | jw903's comments

thank you!


Impressive performance. In your experience, is there a range of vector dimensions that yields faster search results?


We have not run microbenchmarks to see which dimension ranges perform best, but those are coming soon! Below is an anecdotal answer:

We run our CI/CD benchmarks on 128-dimensional SIFT vectors. We also have some demos using CLIP embeddings (512 dimensions) and BAAI/bge embeddings (768 dimensions).

Generally, smaller vectors allow higher throughput and result in smaller indexes, but the effect on performance is small. Once we merge the PR implementing vector element casts to 1- and 2-byte floats, the effect on throughput should be even smaller.
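To illustrate why casting vector elements to narrower floats shrinks an index, here is a rough back-of-the-envelope sketch using NumPy. The vector counts and the assumption that index size scales roughly with raw vector storage are mine for illustration; this is not the library's actual index format.

```python
import numpy as np

# Hypothetical storage cost of 1M vectors at the dimensions mentioned
# above, comparing 4-byte (float32) vs 2-byte (float16) elements.
n_vectors = 1_000_000
for dim in (128, 512, 768):
    f32_bytes = np.zeros(dim, dtype=np.float32).nbytes * n_vectors
    f16_bytes = np.zeros(dim, dtype=np.float16).nbytes * n_vectors
    print(f"dim={dim}: float32 ~ {f32_bytes / 2**20:.0f} MiB, "
          f"float16 ~ {f16_bytes / 2**20:.0f} MiB")
```

The raw vector storage halves with 2-byte floats regardless of dimension, which is why the element-cast PR should narrow the gap between small and large embeddings.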


It's one of the most widely fine-tuned models right now. Take a look at this Colab for fine-tuning it on your own dataset: https://github.com/mlabonne/llm-course/blob/main/Fine_tune_L...


A detailed look at XGBoost hyperparameter tuning.


Check out this blog post on tuning XGBoost hyperparameters; it walks through experimenting with the various settings.

