I found that for local setups (running on my MacBook), the overhead of managing a vector index wasn't worth the hallucinated retrievals it produced. Cosine similarity is just too fuzzy for code/tech docs.
I switched to using *Optimal Transport (Wasserstein Distance)* in-process. It essentially treats memory retrieval as a geometry problem: if the "transport cost" from the query to a memory chunk is too high, that chunk gets rejected outright instead of being returned as a fuzzy match.
It's way lighter than running a local Chroma/LanceDB instance, and in my tests coherence comes out around ~0.96 vs ~0.67 for standard embeddings.
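To make the idea concrete, here's a minimal sketch of OT-based retrieval filtering (not the repo's actual code): each text becomes a cloud of token vectors, the Wasserstein/earth-mover's distance between the query cloud and the chunk cloud is the "transport cost", and chunks above a cost threshold are dropped. It assumes the POT library (`pip install pot`); `embed_tokens` is a hypothetical stand-in for whatever local embedding model you use.

```python
import zlib
import numpy as np
import ot  # Python Optimal Transport (POT)


def embed_tokens(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical per-token embedder: deterministic pseudo-random vectors
    keyed on a CRC of each token, just so the sketch runs end to end.
    Swap in a real embedding model in practice."""
    rows = []
    for tok in text.lower().split():
        rng = np.random.default_rng(zlib.crc32(tok.encode()))
        rows.append(rng.standard_normal(dim))
    return np.vstack(rows)


def transport_cost(query: str, chunk: str) -> float:
    """Wasserstein (earth mover's) distance between the query's and the
    chunk's token clouds: uniform mass per token, Euclidean ground cost."""
    xq, xc = embed_tokens(query), embed_tokens(chunk)
    a = np.full(len(xq), 1.0 / len(xq))       # uniform weights over query tokens
    b = np.full(len(xc), 1.0 / len(xc))       # uniform weights over chunk tokens
    M = ot.dist(xq, xc, metric="euclidean")   # pairwise ground-cost matrix
    return float(ot.emd2(a, b, M))            # optimal transport cost


def retrieve(query: str, chunks: list[str], max_cost: float) -> list[str]:
    """Keep only chunks whose transport cost to the query is under the threshold."""
    return [c for c in chunks if transport_cost(query, c) <= max_cost]
```

The `max_cost` threshold is the knob here: a low ceiling behaves like a hard geometric gate (reject anything that would take too much "work" to morph into the query), which is what replaces the fuzzy top-k cosine ranking.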
It's free (MIT license) and open-source btw.
https://github.com/merchantmoh-debug/Remember-Me-AI