When I look at all the LLM SQL tools, I think: what a cheap and accessible way to get the wrong answers.
SQL is easy. Knowledge management is hard. Does the LLM know that there was a bug in June that changed the data? Does it know that this one column is improperly named and confusing? Does it know that you recently released a mobile app whose data lives in a different table?
No, of course not. Those things are never explicitly documented, so they are invisible to an LLM.
If it were that easy, you wouldn't have a tradition of devs reaching for any alternative.
There's obviously a use case for this sort of product. Objections that appeal to how easy a technical language or toolset is are unlikely to convince the majority who are not comfortable with it.
The history of programming is devs getting frustrated at the arbitrary limitations of a syntax or model and inventing new ones, with their own arbitrary limitations of syntax or modelling.
When your only tool is a FOR loop hammer, every set based operation frustratingly looks less like a nail than a screw.
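To make the FOR loop vs. set-based point concrete, here is a rough sketch using Python's built-in sqlite3 module. The `orders` table and its rows are made up for illustration; both versions compute the same per-customer totals, but the second one describes the result and leaves the iteration to the engine.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("bob", 5), ("ann", 7)],
)

# FOR-loop style: pull every row out and accumulate by hand.
loop_totals = {}
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    loop_totals[customer] = loop_totals.get(customer, 0) + amount

# Set-based style: state the grouping and let the database do the work.
set_totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)
```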
Love this! I don't think you could be more right about the practical challenges of implementing something like this. In my experience, this same problem is what makes it so challenging to onboard new data scientists/analysts.
It takes a lot of training to get a team member up to speed - with the same amount of training, do you think an LLM can compete?
SQL is easy, the problem is that some enterprise database schemas have gotten incredibly complex over time. I think LLMs might help the maintainers navigate such a complex landscape. Especially if comments are added to each table with clarifications. The only serious limit here is the LLM's context length...
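A minimal sketch of what "comments on each table, limited by context length" could look like. Everything here is an assumption for illustration: the `TABLE_NOTES` contents, the `build_schema_context` helper, and the use of a character count as a crude stand-in for a real token budget.

```python
# Hypothetical table notes; in practice these might be pulled from
# table/column comments in the database catalog.
TABLE_NOTES = {
    "orders": "One row per order. amount is in cents, not dollars.",
    "users": "Web signups only; mobile signups are in mobile_users.",
    "mobile_users": "Signups from the mobile app.",
}


def build_schema_context(tables, notes, max_chars):
    """Join one note line per table until the character budget runs out."""
    lines = []
    used = 0
    for table in tables:
        line = f"-- {table}: {notes.get(table, 'no notes')}"
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break  # budget exhausted; remaining tables are silently dropped
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

The `break` is the uncomfortable part: once the budget is spent, whatever tables come later never reach the model, which is exactly the limit described above.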
Hit the nail on the head! Not only is the context length a limitation, but the speed of response gets impacted as well.
With a human in the loop, even a "mostly" correct SQL query that takes a swing at the right joins between the relevant tables reduces the data practitioner's work significantly. Of course, as more questions are asked, the tool gets better at writing the SQL. Almost like a human in a Database Management and SQL class...
Sort of. Having perfect data engineering is a requirement if you want to connect an LLM straight to your data warehouse. For real-world scenarios, you need a way to add context over time (including examples of how to answer questions from messy data). The same way a new team member would need to be onboarded, the tool needs to learn the context of the data and business logic, store it under supervision from an admin, and be able to retrieve it when generating the SQL.
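Roughly, the "learn context, store it under admin supervision, retrieve it when generating SQL" loop could be sketched like this. It is a toy under stated assumptions, not how any particular product does it: the `Note` and `ContextStore` names are hypothetical, and retrieval is plain word overlap where a real tool would more likely use embeddings.

```python
from dataclasses import dataclass


@dataclass
class Note:
    text: str
    approved: bool = False  # only admin-approved notes may reach the prompt


class ContextStore:
    """Toy store for business-logic notes (hypothetical design)."""

    def __init__(self):
        self.notes = []

    def propose(self, text):
        # Anyone can propose a note; it stays unapproved until reviewed.
        note = Note(text)
        self.notes.append(note)
        return note

    def approve(self, note):
        # Stand-in for the admin review step.
        note.approved = True

    def retrieve(self, question, limit=3):
        # Rank approved notes by how many words they share with the question.
        words = set(question.lower().split())
        scored = []
        for note in self.notes:
            if not note.approved:
                continue
            overlap = len(words & set(note.text.lower().split()))
            if overlap:
                scored.append((overlap, note.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]
```

The retrieved notes would then be prepended to the prompt alongside the schema, so the "bug in June" kind of knowledge is written down once and reused.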
Personally I think that's a great response. Continuous schema mapping, gotcha-patching, and formalization of undocumented knowledge are imperative, and engineers can solve this with an ongoing process. Kudos on the launch.
I did an NL-to-SQL startup, but I think now is a much better time to do this.