
> Regex mostly crops up in validations.

I’ve just grepped my codebase for regex matches, and that isn’t true there. The most common use case is matching a filesystem path or a URL that is known to conform to a schema (e.g. file names prefixed with dates) and extracting the parts of that schema from it.
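To make that concrete, here is a rough sketch of the kind of extraction I mean. The schema ("YYYY-MM-DD-slug.ext") and the names `DATED_FILE` and `parse_dated_name` are made up for illustration, not taken from my actual code:

```python
import re

# Hypothetical schema: "YYYY-MM-DD-<slug>.<ext>", e.g. "2023-04-01-release-notes.md".
DATED_FILE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})-(?P<slug>.+)\.(?P<ext>\w+)"
)

def parse_dated_name(name):
    """Return the schema parts as a dict, or None if the name does not conform."""
    m = DATED_FILE.fullmatch(name)
    return m.groupdict() if m else None
```

Named groups keep the call site readable: `parse_dated_name(name)["slug"]` says what it extracts, and a name that does not fit the schema simply yields `None`.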

> Just Google the first result for 'email address regex validation.'

That is an abomination and not a good way to validate emails, because, as you say, it’s super complicated and barely understandable. Draw a finite-state automaton corresponding to this regex to see why. Equivalent code written without regex, implementing the same FSA, would easily be >100 LOC and equally incomprehensible.

In practice, it’s better to check whether the string contains an @ and maybe a dot, and that’s it. Sure, you won’t be RFC 5322 compliant, but who cares? Your users are much more likely to mistype the domain name than to enter characters that would make the address syntactically invalid. Just send an email and see if it arrives.
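A minimal sketch of that check, no regex needed. The function name `looks_like_email` is hypothetical, and requiring a non-empty part before the @ and a dot in the domain is my own reading of "maybe a dot", not a standard:

```python
def looks_like_email(s):
    """Cheap sanity check: something before the last '@', and a dot after it."""
    local, sep, domain = s.rpartition("@")
    return bool(sep) and bool(local) and "." in domain
```

This deliberately accepts plenty of strings that RFC 5322 would reject. The real validation is still sending the mail and seeing whether it arrives.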

All of the regexes in that codebase are simple. The longest is 75 characters, a straightforward check for UUIDs that you can understand at a glance:

    [0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}
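For what it’s worth, using it looks roughly like this in Python. The names `UUID_RE` and `is_uuid` are just for illustration, and I’m assuming you want the whole string to be a UUID rather than to find one inside a longer string:

```python
import re

# The same 75-character pattern as above, compiled once.
UUID_RE = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)

def is_uuid(s):
    """True if the entire string has the 8-4-4-4-12 hex layout of a UUID."""
    return UUID_RE.fullmatch(s) is not None
```

`fullmatch` matters here: with `search` or `match`, a string with trailing garbage after a valid UUID would still pass.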


