
From looking at this, I think it's a risky mental model for an engineer to start from.

Claims like "clustered by meaning" and "optimized for analytics" are questionable.

The clustering depends on the embedding you calculate. If you think the embedding is a good semantic approximation of the data, then maybe this is a fine way of thinking about it. But it's not hard to imagine embeddings that violate this. E.g., if I run an audio file and a text file that are identical in meaning through the same embedding process, then unless the model is multimodal they will likely end up distant in the embedding vector space.
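
To make that concrete, here's a toy sketch (the 4-d vectors are made up for illustration, not the output of any real model):

    import numpy as np

    def cosine_similarity(a, b):
        # 1.0 = pointing the same way, ~0.0 = unrelated directions
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pretend these came from a text encoder and an audio encoder
    # that were never trained to share a vector space.
    text_vec  = np.array([0.9, 0.1, 0.0, 0.1])  # "dog barking", as text
    audio_vec = np.array([0.1, 0.0, 0.9, 0.2])  # a recording of a dog barking

    print(cosine_similarity(text_vec, audio_vec))  # low, despite identical meaning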

I fully expect to see embeddings that put things close together in the vector space based on utilization rather than semantic similarity. If I'm building a recommender system, I don't want to group different varieties of one-off purchases closely. For instance, the most semantically similar item to a flight is another flight to the same destination at a different time, or a flight to a nearby airport. But what I'd actually want to group with it are hotels often purchased by people who have previously bought that flight.
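
One way to get that behavior (a sketch only; the item IDs and sessions are hypothetical) is to train a word2vec-style model over purchase sessions instead of sentences, so frequently co-purchased items land near each other regardless of semantic similarity:

    from gensim.models import Word2Vec

    # Each "sentence" is one user's purchase session.
    sessions = [
        ["flight_JFK_LAX", "hotel_downtown_LA", "car_rental_LAX"],
        ["flight_JFK_LAX", "hotel_santa_monica"],
        ["flight_JFK_SFO", "hotel_union_square"],
        ["flight_JFK_LAX", "hotel_downtown_LA"],
    ]

    # Skip-gram: items that co-occur in sessions get nearby vectors.
    model = Word2Vec(sentences=sessions, vector_size=16, window=5,
                     min_count=1, sg=1, epochs=50, seed=42)

    # With enough real data, the flight's neighbors become the hotels
    # people book alongside it, not other flights.
    print(model.wv.most_similar("flight_JFK_LAX", topn=3))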

Vector databases also let you encode extra dimensions into the data, like time awareness. Nothing forces you to use a vector that encodes only semantic meaning.
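
For example, you could bolt a recency feature onto an otherwise semantic vector (a sketch; the half-life and weight are arbitrary tuning knobs):

    import numpy as np

    def with_recency(semantic_vec, age_days, half_life_days=30.0, weight=0.5):
        # Exponential decay from 1.0 (brand new) toward 0.0 (ancient).
        recency = np.exp(-age_days / half_life_days)
        return np.concatenate([semantic_vec, [weight * recency]])

    doc_new = with_recency(np.array([0.2, 0.7, 0.1]), age_days=1)
    doc_old = with_recency(np.array([0.2, 0.7, 0.1]), age_days=365)
    # Identical semantic content, but the extra dimension now
    # separates them by age in any distance-based search.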

And from this, you can see that they're optimized for lookups or searches based on an input vector. That is not analogous to OLAP queries; it's more akin to Elasticsearch than Snowflake. If you're reaching for a vector database expecting reporting or large-scale analytics over the vector space, AFAIK there isn't a readily available offering.
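
The access pattern looks like this (using FAISS here as one representative nearest-neighbor library; the data is a random placeholder):

    import numpy as np
    import faiss

    d = 64                                    # embedding dimensionality
    corpus = np.random.rand(10_000, d).astype("float32")

    index = faiss.IndexFlatL2(d)              # exact L2 nearest-neighbor index
    index.add(corpus)

    query = np.random.rand(1, d).astype("float32")
    distances, ids = index.search(query, 5)   # "the 5 nearest vectors to this one"
    print(ids)  # point lookups by similarity, not aggregate scans over a table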




Calculating the embeddings is still a mystery to me. I get going from a picture of an apple to a vector representing "appleness", and then comparing that vector to other vectors using all the usual math. What I don't get is: who/what takes the image as input and outputs the vector? Same goes for documents. Let's say I want to add a dimension (another number in the array): what part of the vector database do I modify to include this dimension in the vector calculation? Or is going from doc/image/whatever to the vector representation done outside the database in some other way?

edit: it seems like calculating embeddings would be something an ML model does, but then, again, you have to train that one first... it's training all the way down.


Yup, it happens outside of the system, but there are a number of perks to being able to store that data in a DB, including easily adding metadata, updating entries, etc.
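
The typical flow looks roughly like this (a minimal sketch; sentence-transformers and Chroma are just example choices of embedder and store, and the metadata fields are invented):

    from sentence_transformers import SentenceTransformer
    import chromadb

    # Step 1: the embedding is computed OUTSIDE the database, by a model you choose.
    model = SentenceTransformer("all-MiniLM-L6-v2")  # a common text embedder
    vec = model.encode("red trail-running shoe")     # -> 384-d vector

    # Step 2: the database just stores the vector plus whatever metadata you like.
    client = chromadb.Client()
    products = client.create_collection(name="products")
    products.add(
        ids=["sku-123"],
        embeddings=[vec.tolist()],
        documents=["red trail-running shoe"],
        metadatas=[{"price": 89.99, "in_stock": True}],  # hypothetical fields
    )

    # Step 3: queries embed the input with the SAME model, then search stored vectors.
    hits = products.query(
        query_embeddings=[model.encode("waterproof sneakers").tolist()],
        n_results=1,
    )
    print(hits["metadatas"])

So to the "add a dimension" question upthread: you'd change (or retrain) the model, or append features yourself before inserting; the database indexes whatever vectors it's handed.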

I think in 10 years we will see retail systems heavily utilizing vector DBs, and many embedding-as-a-service products that take into account things like conversion. In this model you can add metadata about products to the vector DB and direct program flow from it, instead of querying back out to one or more databases to retrieve relevant metadata.

They'll also start to enable things like search-via-image, for features like "show us your favorite outfit" pulling up a customized wardrobe based on individual items extracted from the photo and run through the embedder.

Just one of many ways these products will exist outside of RAG (retrieval-augmented generation). I think we'll actually see a lot of the opposite: GAR, generation-augmented retrieval.



