My main takeaway is that if you care about performance so much, just ETL the data into Avro, Thrift, Protobuf, HDF5, NetCDF, Parquet, Arrow: anything but plain text.
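
A one-time conversion pays for itself on every subsequent read. A minimal sketch with pyarrow (the file names and the CSV input are hypothetical):

    # Hypothetical one-time ETL step: parse the text once, read binary forever.
    import pyarrow.csv as pv
    import pyarrow.feather as ft

    table = pv.read_csv("events.csv")           # the only text parse you ever do
    ft.write_feather(table, "events.arrow")     # Arrow IPC: typed, columnar

    # Subsequent reads skip text parsing entirely:
    table = ft.read_table("events.arrow")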

When I was grinding away on the 128B-edge graph on my laptop, literally 85% of the time was spent parsing integers. Get your data into native binary as soon as you can (edit: or out of plain text at least, per parent).
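
You can see the gap with a toy comparison of a text parse against a bulk binary read (a sketch; the sizes and file names are made up):

    # Toy comparison: text integer parsing vs. raw binary read with numpy.
    import numpy as np

    edges = np.random.randint(0, 2**31, size=(1_000_000, 2), dtype=np.int64)
    np.savetxt("edges.txt", edges, fmt="%d")   # plain-text edge list
    edges.tofile("edges.bin")                  # native binary dump

    # Text path: every integer goes through a string-to-int parser.
    from_text = np.loadtxt("edges.txt", dtype=np.int64)

    # Binary path: one bulk read straight into an array, no parsing at all.
    from_bin = np.fromfile("edges.bin", dtype=np.int64).reshape(-1, 2)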

Yep. After working with JSON logs from 50GB up, converting them to Parquet (plus Snappy compression) was really liberating. It also led me to Apache Drill, one of the best on-prem data exploration tools IMO.
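
The conversion itself is a few lines with pyarrow (a sketch; "logs.json" is hypothetical and assumed to be newline-delimited JSON, one object per line):

    # Hypothetical conversion of newline-delimited JSON logs to Parquet.
    import pyarrow.json as pj
    import pyarrow.parquet as pq

    table = pj.read_json("logs.json")   # one record per line
    pq.write_table(table, "logs.parquet", compression="snappy")

Drill, or any Parquet-aware engine, can then query logs.parquet directly with SQL and read only the columns a query touches.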