But in order to generate the vectors, I understand it's necessary to use OpenAI's Embeddings API, which would grant OpenAI access to all client data at vector-creation time. Is this understanding correct? Or is there a way to create high-quality semantic embeddings, comparable to OpenAI's, in a private cloud or on-premises environment?
Enterprises with Azure contracts use the embeddings endpoint from Azure's OpenAI offering.
It is possible to generate embeddings with Llama or BERT models using LocalAI (https://localai.io/features/embeddings/). This is something we are hoping to enable in LLMStack soon.
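To make the self-hosted option concrete, here is a minimal sketch of querying a LocalAI server's OpenAI-compatible embeddings endpoint, plus a cosine-similarity helper for comparing the resulting vectors. The base URL, port, and model name are assumptions; adjust them to your own deployment. The toy vectors at the bottom only illustrate the comparison step, since no server is contacted here.

```python
# Sketch: calling a LocalAI (OpenAI-compatible) embeddings endpoint.
# ASSUMPTIONS: LocalAI runs at http://localhost:8080 and an embedding
# model named "bert-embeddings" is loaded; change both for your setup.
import json
import math
import urllib.request


def get_embedding(text, base_url="http://localhost:8080", model="bert-embeddings"):
    """Request an embedding vector from a local OpenAI-compatible API."""
    payload = json.dumps({"input": text, "model": model}).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Response shape mirrors OpenAI's: {"data": [{"embedding": [...]}, ...]}
    return body["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Compare two embedding vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy vectors for illustration; a real call would be:
#   v = get_embedding("client document text")
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
print(cosine_similarity(v1, v2))
```

Because the endpoint follows OpenAI's request/response format, client data never leaves your network: the same code that would call api.openai.com can instead point at the local server.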