Yes, we have already submitted the model for evaluation on the Spider holdout te...

philipodonnell · on Aug 31, 2023

I don’t think it’s necessarily about a “universal” solution, just a “better” one: making the column names more verbose, changing numeric enums to text ones, disambiguating column names, etc. One of the Spider datasets has a stadium table with a column named “average”, which means average capacity, but it’s super ambiguous. If you asked an LLM to “make these table columns more verbose”, I bet it would call that “average_capacity”, and all of a sudden some NLQ queries that confused the AVG function with the column name would start to work.
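
To make that concrete, here is a rough sketch of one way it could be wired up, assuming SQLite: expose the table through a view with the more verbose names, and point the text-to-SQL model at the view instead of the raw table. The `create_verbose_view` helper, the `stadium_verbose` view name, the toy rows, and the hard-coded rename map are all made up for illustration; in practice the map would come from the LLM, and the real Spider table has more columns than shown here.

```python
import sqlite3


def create_verbose_view(conn, table, renames, view_name):
    """Create a view over `table` whose columns use the names in `renames`.

    Hypothetical helper: columns missing from `renames` keep their name.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    select_list = ", ".join(f'"{c}" AS "{renames.get(c, c)}"' for c in cols)
    conn.execute(
        f'CREATE VIEW "{view_name}" AS SELECT {select_list} FROM "{table}"'
    )


# Toy stand-in for the stadium table (illustrative rows, not Spider data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stadium (name TEXT, capacity INTEGER, average INTEGER)")
conn.execute("INSERT INTO stadium VALUES ('A', 1000, 700), ('B', 2000, 900)")

# Assumed output of asking an LLM to "make these table columns more verbose".
renames = {"average": "average_capacity"}
create_verbose_view(conn, "stadium", renames, "stadium_verbose")

# The generated SQL no longer has to juggle AVG(...) and a column called "average".
result = conn.execute(
    "SELECT AVG(average_capacity) FROM stadium_verbose"
).fetchone()[0]
print(result)  # 800.0
```

The nice part of a view is that the underlying data doesn't change, so you can A/B the original and the verbose schema against the same questions.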