For the purpose of the tutorial that we built, it really comes down to the type of data that you're using.
If you have data with PII:
One option is to have Airbyte land the data in files or a local database rather than writing directly to the vector store, add an extra step that strips all PII from the data, and then configure Airbyte to move the clean files/records to the vector store.
The option that jmorgan mentioned is also relevant here: using a "self-hosted" model.
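For the first option, the extra PII-stripping step could look roughly like the sketch below. It assumes the records landed by Airbyte have already been loaded as Python dicts; the names `scrub_text` and `scrub_record` and the two regex patterns are hypothetical, and the patterns only catch simple email addresses and US-style phone numbers. Real PII detection usually needs a dedicated tool, so treat this as an illustration of where the step sits, not as a complete scrubber.

```python
import re

# Assumed, illustrative patterns: they match only simple emails and
# phone numbers written as 3-3-4 digits (e.g. 555-123-4567).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub_text(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def scrub_record(record: dict) -> dict:
    """Scrub every string field of a record; leave other values untouched."""
    return {
        key: scrub_text(value) if isinstance(value, str) else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    # Hypothetical record, standing in for one row landed in the local store.
    raw = {"id": 7, "note": "Reach Ana at ana@example.com or 555-123-4567."}
    print(scrub_record(raw))
```

The cleaned records would then be written back to the file or table that the second Airbyte connection reads from, so only scrubbed data reaches the vector store.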