> The syntax doesn't seem noticeably clearer? From the very first example: csvgr...

Malarkey73 · on May 28, 2014

I take you're point and those other well made points about line continuations below -- but you're over egging it here as

cut -d, -f1 2010.csv | grep ILLINOIS

works just fine for this data. I emphasise again - I take your overall point though.

acdha · on May 28, 2014

Yeah, I'm more sensitive to this class of errors now that I live in DC since the "Washington, District of Columbia" format is common enough to show up sporadically.

erikb · on May 28, 2014

I also don't really understand why you want to grep the whole block instead of the single word 'ILLINOIS'.

acdha · on May 28, 2014

My examples anchored it to avoid matching outside of the expected field. That probably wouldn't matter for the easy data used in the csvkit example but, for example, suppose you were looking at business data and your search for California included records from every state with a California Pizza Kitchen, California Tortilla, etc. Or, the reverse, you want your search for Washington to include records from the state but not DC.

This class of error is somewhat treacherous since it's common for people not to notice it before they start working with a full data file. Using tools which don't require constant caution to avoid data errors is simply a basic good habit.