But SQLite is not even present in the TPC-DS benchmark graphs, while Pandas (whi...

lnkuiper · on Aug 27, 2021

Customer fits in memory, whereas catalog_sales does not.

We chose to remove SQLite from the results because it was so much slower. The plots become much less readable when they are stretched out by something that is slower by an order of magnitude.

masklinn · on Aug 27, 2021

> Customer fits in memory, whereas catalog_sales does not.

But that didn't prevent using pandas, which had to rely on dynamic swapping. Or is in-memory SQLite unable to use that much memory?

> We chose to remove SQLite from the results because it was so much slower. The plots are much less readable when they are stretched out by something that is slower by an order of magnitude

So you're using on-disk SQLite because it fits in memory (unlike pandas which also fits in memory) but you're dropping it anyway because it's too slow when it works on-disk?

lnkuiper · on Aug 27, 2021

You are right, we could probably re-run SQLite purely in memory, but only because macOS dynamically allocates additional swap.

However, I would not expect much better performance, because I do not believe that SQLite uses a different sorting strategy when running in memory. It would only save some I/O operations, which are very cheap on the MacBook anyway.

Either way, it would be an interesting experiment.
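
For anyone who wants to try it, here is a rough sketch of what such a comparison could look like, assuming Python's built-in `sqlite3` module. It is not the benchmark setup: the table `t`, the helpers `time_sort` and `compare`, and the synthetic random integers are all hypothetical stand-ins for the TPC-DS tables, and the row count is far too small to cause any swapping.

```python
import os
import random
import sqlite3
import tempfile
import time


def time_sort(db_path: str, n_rows: int = 100_000) -> float:
    """Fill a one-column table with random integers and time a full ORDER BY.

    Hypothetical helper: the schema and data are synthetic, not TPC-DS.
    """
    con = sqlite3.connect(db_path)
    try:
        con.execute("CREATE TABLE t (x INTEGER)")
        rng = random.Random(42)  # fixed seed so both runs sort the same data
        con.executemany(
            "INSERT INTO t VALUES (?)",
            ((rng.randrange(1_000_000),) for _ in range(n_rows)),
        )
        con.commit()

        start = time.perf_counter()
        rows = con.execute("SELECT x FROM t ORDER BY x").fetchall()
        elapsed = time.perf_counter() - start

        assert len(rows) == n_rows
        return elapsed
    finally:
        con.close()


def compare(n_rows: int = 100_000) -> dict:
    """Time the same sort against an on-disk and an in-memory database."""
    with tempfile.TemporaryDirectory() as tmp:
        on_disk = time_sort(os.path.join(tmp, "bench.db"), n_rows)
    in_memory = time_sort(":memory:", n_rows)  # SQLite's in-memory database
    return {"on_disk": on_disk, "in_memory": in_memory}


if __name__ == "__main__":
    print(compare())
```

Timings from a toy like this say little about the real workload. A single run is noisy, and the on-disk file will mostly sit in the OS page cache at this size, so it would need much larger tables and repeated runs to tell us anything.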