Anything currently private that these companies can't access and train their models on is the one valuable competitive advantage you have going for you. Giving them access to it for a bit of convenience seems short-sighted.
I assumed GP was referring to OpenAI, not Danswer (given that they mentioned those companies training models). And you're still using OpenAI's API, so neither open source nor self-hosting changes the data collection.
You can plug in any model of your choice! Self-hosted, open source models are a great choice if you're very concerned about keeping your data safe and secure.
> Note: On the initial visit, Danswer will prompt for an OpenAI API key. Without this Danswer will be able to provide search functionalities but not direct Question Answering.
There are OpenAI-compatible chat/completion endpoints for local LLMs. You point the URL to your self-hosted version and use the API key you started it with.
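For example, a minimal sketch using the official openai Python package; the URL, model name, and key are placeholders, assuming something like Ollama serving its OpenAI-compatible endpoint locally:

    # Point the standard OpenAI client at a local, OpenAI-compatible
    # server instead of api.openai.com. Assumes e.g. Ollama is serving
    # on localhost:11434 (placeholder URL and model).
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # your self-hosted endpoint
        api_key="whatever-key-you-started-it-with",
    )

    response = client.chat.completions.create(
        model="llama2",  # whichever local model you pulled
        messages=[{"role": "user", "content": "Summarize our refund policy."}],
    )
    print(response.choices[0].message.content)

The rest of your code never needs to know it isn't talking to OpenAI.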
But it only works if you have already written all of the documentation manually and kept it up to date. It's basically a chatbot that knows all of your documentation.
The scenario is a customer opens a chat box on your website and asks the LLM some questions.
You wouldn't expect your customers to search your internal Confluence pages. The LLM would be trained on all of your internal documentation, which is not exposed publicly.
With the current generation of LLMs, hallucination is mostly a problem of insufficient training.
Edit: Maybe not "all" of your internal docs should be exposed via the LLM. But the idea is that this is an interactive support agent for customers.
That sounds like a dangerous scenario. If your docs are intentionally internal and not public, why would you let a publicly accessible LLM answer questions with info from them?
An LLM trained on public docs for the public could be a better interface for projects with lots of public documentation.
An LLM trained on internal docs, only accessible to internal users, might be similarly useful.
Even a private LLM on public docs for your support agents to use could increase their efficiency.
But I would never expose to the public an LLM that has been trained on data I don't want public.
Actually, even the free ChatGPT gives better results than Google or even DuckDuckGo. Unless you want to find the nearest pizza joint, of course.
Of course you have to verify, but at least it gives you material related to what you're searching for; a traditional web search has become pretty useless for that because of all the spam pages that are clones of each other.
The memes of society are hallucinations. That's worked OK so far.
If you want to live by raw logic, well, you're one of billions; idgaf what you want.
^^ There's your social life under raw logic: sort of like regular life, where I have no obligation to your existence, except everyone reminds you of it explicitly instead of cordially hallucinating otherwise.
Hallucinations may not be all that bad, unless they're hallucinations that lead to atrocity. Like the hallucination that we can keep burning resources to make AI bots.
To use custom GPTs you need a ChatGPT Plus subscription. So your customers need a ChatGPT Plus subscription to get support? As far as I know, there is no API for integrating custom GPTs into your own product.
One example that says "no" to your question: https://ollama.ai/ (and there are surely more). It can be used with something like LangChain or LlamaIndex to give the locally hosted LLM access to local data, plus a bit of Python "glue code" to tie it all together.
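As a rough sketch of that glue code, you can even skip LangChain and hit Ollama's HTTP API directly. This assumes `ollama serve` is running with a model already pulled; the file name, model, and question below are all made up:

    # Feed a local document to a locally hosted LLM via Ollama's HTTP API.
    # Assumes `ollama serve` is running locally and "llama2" has been
    # pulled; "notes.txt" and the question are hypothetical placeholders.
    import requests

    with open("notes.txt") as f:  # some local data you want answers from
        context = f.read()

    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        "Question: What did we decide about the Q3 roadmap?"
    )

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2", "prompt": prompt, "stream": False},
    )
    print(resp.json()["response"])

LangChain/LlamaIndex mostly add the retrieval layer (chunking, embeddings, vector search) on top of exactly this kind of call, so larger document sets don't have to fit in one prompt.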