What I'm saying is that I, as a human, can spot simple situations under which semantically-correct optimizations are available. I can also create queries where literally cutting and pasting IDs over and over again into a WHERE clause is quicker than joining or using an IN clause. These things are dumb.
I personally think there are massive opportunities for data description languages and query languages that make expressing a single set of semantics simpler. SQL is supposedly a declarative language, but almost always requires you to understand _all_ the underlying implementation details of a database to get good performance.
Beyond that, I'd be more than happy with ML that enhanced, heuristically, some parts of the query optimiser. Hell, I'd be happy with a query optimiser that just went away and optimised, 24 hours a day, or at least didn't just spend 19ms planning a query that is then going to run for 8 hours.
ML with human-in-the-loop (HITL). There is a feedback loop that learns based on human expectation. Peloton[1] is already learning based on data access patterns, but doesn't consider subjective feedback. Maybe there could be a query planner that learns which query plans are better via reinforcement learning (the reward is given by the human).
I personally think there are massive opportunities for data description languages and query languages that make expressing a single set of semantics simpler. SQL is supposedly a declarative language, but almost always requires you to understand _all_ the underlying implementation details of a database to get good performance.
Beyond that, I'd be more than happy with ML that enhanced, heuristically, some parts of the query optimiser. Hell, I'd be happy with a query optimiser that just went away and optimised, 24 hours a day, or at least didn't just spend 19ms planning a query that is then going to run for 8 hours.