
I can't speak for XSV, but when I had to do a similar analysis of log files in the range of hundreds of thousands of records, SQLite would consistently seize up on my machine and make the analysis impossible. Had I known about XSV I might have tried it, but GNU Join saved my bacon that week and did the analysis in minutes.



That's why you can `EXPLAIN` your query, and maybe even add indices, to speed things up!
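For anyone who hasn't tried it, here's a rough sketch of what that looks like from Python's built-in `sqlite3` module. The table names, column names, and index name are all made up for illustration, and the exact wording of the plan varies between SQLite versions, so treat the output as something to eyeball rather than rely on.

```python
import sqlite3


def join_plan(conn):
    """Return the human-readable plan steps SQLite reports for the join."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT r.path, s.status "
        "FROM requests AS r JOIN responses AS s ON r.id = s.request_id"
    ).fetchall()
    # The last column of each row holds the descriptive text of the step.
    return [row[-1] for row in rows]


# Hypothetical schema: two log-like tables joined on a request id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id TEXT, path TEXT)")
conn.execute("CREATE TABLE responses (request_id TEXT, status INTEGER)")

# Plan with no user-created index on the join column.
plan_before = join_plan(conn)

# Add an index on the join column and ask for the plan again.
conn.execute("CREATE INDEX idx_responses_request_id ON responses (request_id)")
plan_after = join_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

If the second plan mentions the index by name, the join is using it. If it doesn't, that's usually the hint that the index is on the wrong column or the query is written in a way the planner can't use.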

Plus, in the last few years SQLite has gotten a lot more powerful: it has added lots of JSON support and improved its index usage. I'd avoid working with lots of data as flat ASCII text because of all the I/O and wasted cycles spent reading the data, inserting it into some data structure, and then writing it back out as flat ASCII text, though it can be super debuggable.


> GNU Join

The join utility is actually a part of POSIX[1], so every UNIX should have one. Here's one from OpenBSD[2], for example. The GNU version probably has more flags though.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/j...

[2] https://man.openbsd.org/join.1


Plus POSIX join requires pre-sorted inputs.
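To make the sorting requirement concrete, here's a minimal sketch of a merge-style join in Python. This is the general textbook technique that sorted inputs make possible, not a claim about how any particular `join` implementation is written; `merge_join` and the sample rows are invented for illustration.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) pairs that are ALREADY sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # left key has no partner yet; advance left
        elif left_key > right_key:
            j += 1  # right key has no partner yet; advance right
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with this key against that whole run.
            while i < len(left) and left[i][0] == left_key:
                for k in range(j, j_end):
                    out.append((left_key, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


# Made-up sample data, sorted by key.
sorted_left = [("a", 1), ("b", 2), ("d", 4)]
sorted_right = [("b", "x"), ("b", "y"), ("c", "z"), ("d", "w")]
joined = merge_join(sorted_left, sorted_right)

# Same algorithm on unsorted input silently drops matches.
unsorted_left = [("b", 2), ("a", 1)]
right_ab = [("a", "x"), ("b", "y")]
broken = merge_join(unsorted_left, right_ab)
fixed = merge_join(sorted(unsorted_left), sorted(right_ab))
```

Each input is walked once, front to back, which is why this approach is fast on big files and also why unsorted input gives you missing rows instead of an error.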

BTW, there's never a bad time to mention http://johnkerl.org/miller/doc/ ...


Yeah my colleague went down the rabbit hole and swore never again. Good luck with carriage return bugs!


Did you create indices on the columns you wanted to join?



