After a quick look, I'm not sure I would call this “industrial strength”. In particular, the join optimizer (typically the heart of a large-scale SQL optimizer) looks very rudimentary. And the statistics it uses have no notion of correlation between columns, and no histograms, only min/max…
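To make the correlation point concrete, here is a toy sketch (not DataFusion's code; the table, column names, and helper functions are all made up for illustration). It shows how an estimator that treats two predicates as independent can undershoot the real row count when one column determines the other.

```python
# Toy illustration of the independence assumption in cardinality estimation.
# Everything here is hypothetical; it does not mirror any real optimizer's code.

def build_rows():
    # 100 cities, 10 cities per country, 100 rows per city.
    # Each city belongs to exactly one country, so the columns are fully correlated.
    city_to_country = {f"city_{i}": f"country_{i // 10}" for i in range(100)}
    return [
        (city, country)
        for city, country in city_to_country.items()
        for _ in range(100)
    ]

def estimate_independent(rows, city, country):
    # Per-column selectivities multiplied together, as if the columns were unrelated.
    n = len(rows)
    sel_city = sum(1 for c, _ in rows if c == city) / n
    sel_country = sum(1 for _, k in rows if k == country) / n
    return sel_city * sel_country * n

def actual_count(rows, city, country):
    # Ground truth: rows matching both predicates.
    return sum(1 for c, k in rows if c == city and k == country)

rows = build_rows()
estimated = estimate_independent(rows, "city_42", "country_4")
actual = actual_count(rows, "city_42", "country_4")
print(f"estimated rows: {estimated:.1f}, actual rows: {actual}")
```

In this made-up table the independence estimate comes out roughly ten times too low, because the country predicate filters nothing once the city is fixed. Errors like that compound across joins, which is why join ordering depends so much on the quality of the statistics.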
I was wondering about the same claim. However, I believe that JOINs are a common weakness among OLAP database engines, and DataFusion is built on top of a columnar in-memory format, Apache Arrow.
(You can also use it in your own projects.)
It is quite similar to what is described in this post