You can’t parse CSV with a regular expression (successfulsoftware.net)
4 points by charly357 on March 3, 2023 | 4 comments


It is unclear whether "regular expression" in the short note refers to pattern-matching techniques in general, an application like regex, or a built-in operator in some language, such as =~ and !~ in Perl.

I can use regular expressions to parse CSV; it is just not pretty. Regex solutions do not need to be single runs.

I frequently use regex in multiple iterations to clean up the data, be it in code or on the command line, then process it for one-off scenarios.
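
A rough sketch of what such a multi-pass approach could look like in Python, for illustration only. The function name `parse_csv_multipass` and the NUL-byte placeholder scheme are made up for this example, and it assumes well-formed input (balanced quotes, quotes escaped by doubling, no NUL bytes in the data):

```python
import re

# Pass 1 pattern: a whole quoted field, where "" inside stands for a literal quote.
QUOTED = re.compile(r'"(?:[^"]|"")*"')
# Placeholder left behind by pass 1: NUL, index into the stash, NUL.
PLACEHOLDER = re.compile(r'\x00(\d+)\x00')


def parse_csv_multipass(text):
    """Parse well-formed CSV in several regex passes (hypothetical helper)."""
    stash = []

    def hide(match):
        # Strip the outer quotes, unescape doubled quotes, remember the value.
        stash.append(match.group(0)[1:-1].replace('""', '"'))
        return f"\x00{len(stash) - 1}\x00"

    # Pass 1: hide quoted fields so their commas and newlines cannot interfere.
    masked = QUOTED.sub(hide, text)

    rows = []
    # Pass 2: with quoted fields hidden, plain splitting is safe.
    for line in masked.splitlines():
        fields = line.split(",")
        # Pass 3: put the hidden values back.
        rows.append(
            [PLACEHOLDER.sub(lambda m: stash[int(m.group(1))], f) for f in fields]
        )
    return rows
```

Note that the state has not gone away: the stash list and the placeholders are carrying it between passes.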

> This is because a regular expression doesn’t store state.

This depends on how much state I need to store and in what context (see first sentence).


Regular expressions are a very useful tool in a programmer’s toolbox. But they can’t do everything. And one of the things they can’t do is to reliably parse CSV (comma separated value) files. This is because a regular expression doesn’t store state. You need a state machine (or something equivalent) to parse a CSV file.
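
For comparison, a minimal sketch of the kind of state machine the article is talking about, in Python. This is not the article's code; the function name `parse_csv` and the state names are assumptions for illustration, and it assumes comma separators, double-quote quoting with doubled quotes as escapes, and "\n" line endings:

```python
def parse_csv(text):
    """Parse CSV text into a list of rows using an explicit state machine."""
    rows, row, field = [], [], []
    # States: FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED
    state = "FIELD_START"

    for ch in text:
        if state == "QUOTED":
            if ch == '"':
                # Either the closing quote or the first half of an escaped "".
                state = "QUOTE_IN_QUOTED"
            else:
                # Commas and newlines are ordinary data inside quotes.
                field.append(ch)
        elif state == "QUOTE_IN_QUOTED" and ch == '"':
            # "" inside a quoted field is a literal quote character.
            field.append('"')
            state = "QUOTED"
        elif ch == '"' and state == "FIELD_START":
            state = "QUOTED"
        elif ch == ",":
            row.append("".join(field))
            field = []
            state = "FIELD_START"
        elif ch == "\n":
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            state = "FIELD_START"
        else:
            field.append(ch)
            state = "UNQUOTED"

    # Flush the last record if the input does not end with a newline.
    if field or row or state != "FIELD_START":
        row.append("".join(field))
        rows.append(row)
    return rows
```

The only thing that makes a comma a separator or a newline a record boundary is the current value of `state`, which is exactly the information a single stateless pattern match has nowhere to keep. For real data, Python's standard `csv` module is the safer choice.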


But can I parse HTML?


No, and the article links to a StackOverflow question about this.



