Email Address Regular Expression That 99.99% Works

Etheryte · on Oct 24, 2023

Yeah no, if you try to check this into any codebase it should get rejected straight away. There is pretty much never a need to regex emails: when you use an email for signup or similar, you send a confirmation email, which serves the same purpose of making sure the address is valid and correct. Use `<input type="email">` or check that "@" is present if you must, but anything beyond that is nonsense.

shaky-carrousel · on Oct 24, 2023

Not everything is web development.

Etheryte · on Oct 24, 2023

That is fair, but I don't think it changes the outcome. The only way you can verify an email address is correct is to send a verification email. If you want to help the user ensure their input was what they intended, having two inputs for the same value is a better way to do that than a regex on a single field. As an example, my email is 20 chars long, the whole email regex thing only validates one or two of those 20 chars and doesn't help against typos in any way. In short, it's a lot of complexity without any real upside.

ceejayoz · on Oct 24, 2023

Imagine how fast this falls down when writing, say, an email client. There’s a place for validation.

Etheryte · on Oct 24, 2023

What exactly changes when validating in an email client? foo@localhost, foo@bar and f@b are all valid email addresses and your user may want to send an email to those addresses. There is literally nothing there for you to validate.

ceejayoz · on Oct 24, 2023

If you are writing Gmail no one is sending to foo@localhost. It’s a typo.

(And you certainly don’t want to send every bcc a verification email they have to click on!)

Thiez · on Oct 24, 2023

What are you talking about with the bcc thing? Generally you receive someone's email address once, and then optionally verify. If you fill the To/Cc/Bcc fields from your address book, everything should be fine. If you insist on typing something yourself and you enter a wrong email address, it just doesn't arrive (or reaches another recipient, depending on the typo).

ceejayoz · on Oct 24, 2023

> What are you talking about with the bcc thing?

That neither solution proposed upthread - double inputs, or a verification email - is as user-friendly as catching "oops you forgot a dot" validation on the field in an email client.

Etheryte · on Oct 24, 2023

I think you're wildly overestimating how useful that is compared to the complexity of the solution in the linked article. Mistyping a "," for a "." is as likely a typo as typing "gnail" instead of "gmail". My personal email is over 20 characters long, and statistically nearly every typo I can make is something you simply can't check for.

To put it another way, as anecdotal experience, I've encountered many forms online that won't accept my email address or phone number or what have you because of buggy validation. I can't recall when I last mistyped my email, though surely I've done it at some point; the difference is in the price I pay. If I mistype my email and don't get a confirmation email, I can always try signing up again. However, if I can't use the form because the validation is bugged, there is no straightforward recourse I can take as a user.

ceejayoz · on Oct 24, 2023

https://en.wikipedia.org/wiki/Perfect_is_the_enemy_of_good

kartoffelmos · on Oct 26, 2023

Except in this case "good" will fail to deliver emails to perfectly valid email addresses.

kwhitefoot · on Oct 24, 2023

What difference does that make?

tiborsaas · on Oct 24, 2023

Don't use a regex, test only for a @ and a dot after it and call it a day.
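A minimal sketch of that regex-free check, assuming Python; `looks_like_email` is a hypothetical name, and the rule (one "@" with a dot somewhere in the part after it) is only the loose heuristic described above, not a validity test:

```python
def looks_like_email(address: str) -> bool:
    """Loose check: an '@' separator and a dot inside the part after it."""
    # Split on the last '@' so the text after it is treated as the domain.
    local, sep, domain = address.rpartition("@")
    if not sep or not local:
        return False
    # The last dot must have at least one character before and after it.
    dot = domain.rfind(".")
    return 0 < dot < len(domain) - 1
```

This accepts foo@bar.com and rejects foo@gmailcom, but it also rejects dotless addresses such as foo@bar.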

Btw, the content is unreadable thanks to aggressive adblocker detection, so much so that this feels like spam.

bazoom42 · on Oct 24, 2023

Even foo@bar may be a valid email.

tiborsaas · on Oct 24, 2023

I know about this, but how many domains exist with just a root domain? In the vast majority of cases, a missing dot is a typo. I also wouldn't want to imagine how painful using services must be with a foo@bar style address.

bazoom42 · on Oct 24, 2023

The vast majority of common typos result in a syntactically valid email address anyway. A regex is just not an appropriate tool for email validation.

ceejayoz · on Oct 24, 2023

"We can't catch all typos, so we won't catch any" is a silly approach to form validation.

I can't check that 123-45-6789 is a valid Social Security Number, but I can still reject "FARTS".

The same is true for "foo@gmailcom" for a public-internet facing contact us form.

bazoom42 · on Oct 24, 2023

Why not reject invalid top level domains then? This is going to catch a vastly larger amount of typos (including foo@gmailcom) without eliminating any valid emails.
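As a rough sketch of what a TLD check could look like, assuming Python: `KNOWN_TLDS` below is a tiny made-up sample rather than a real list, and `has_known_tld` is a hypothetical helper name. A real version would need a maintained list of top-level domains.

```python
from typing import AbstractSet

# Hypothetical, deliberately tiny sample; a real list is far longer and changes over time.
KNOWN_TLDS = frozenset({"com", "org", "net", "io", "to"})

def has_known_tld(address: str, known_tlds: AbstractSet[str] = KNOWN_TLDS) -> bool:
    """Return True if the text after the last '@' ends in a label from known_tlds."""
    _, sep, domain = address.rpartition("@")
    if not sep or not domain:
        return False
    # The TLD is the last dot-separated label; a dotless domain is its own TLD.
    tld = domain.rstrip(".").rpartition(".")[2]
    return tld.lower() in known_tlds
```

With this sample list, foo@gmailcom is rejected because "gmailcom" is not a listed TLD, while a dotless address like x@to still passes.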

ceejayoz · on Oct 24, 2023

Now you're maintaining an ever-changing external list for which there's no official API.

bazoom42 · on Oct 24, 2023

Yes, but at least it provides some actual value to the user. Alternatively I guess you could look up the mx record?

addandsubtract · on Oct 24, 2023

This is where the linked regex fails.

ceejayoz · on Oct 24, 2023

Or correctly handles, for most of the use cases you’d want it for.

If you’re building a web app, foo@gmailcom is almost certainly a mistake.

bazoom42 · on Oct 24, 2023

Of course, since there isn’t a top level domain called gmailcom. But a regex can't tell you that.

ceejayoz · on Oct 24, 2023

Gmail doesn't need to support "you own an entire TLD and use its root for email" use cases, so you can at least check for a dot after the @ and catch a bunch of typos that leave frustrated users.

evrimoztamur · on Oct 24, 2023

Each time that this or its variants get posted here, the response is the same:

Check if there’s an @ and send a verification email.

Retr0id · on Oct 24, 2023

Archived: https://web.archive.org/web/20231024092745/https://emailrege...

erikgahner · on Oct 24, 2023

Thanks! I had problems accessing the webpage (due to my adblocker).

HPsquared · on Oct 24, 2023

Very ironic if that's a regex bug or "special case" in the adblocker filter list.

a9ex · on Oct 24, 2023

I usually just test for the presence of @ - that’s it. Works for IDN addresses as well.

I remember some story back in the day where someone had an email address using only the top-level “to” domain. Like “x@to”, pretty cool but probably a pain to use (:

creshal · on Oct 24, 2023

Why does this site need an adblocker-blocker?

arp242 · on Oct 24, 2023

Use "mail.parse()" or whatever there is for your language. In general just parse stuff instead of letting some regexp loose on it.

For a quick check to make sure people don't mix up fields or accidentally hit enter before they finished typing, just /.+@.+\..{2,}/ is more than enough (technically foo@com is valid, but no one uses that – note that root@localhost or cron@sysops CAN be valid, so in some contexts you want to use just /.+@.+/, but that doesn't really apply for signups and the like).
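To make the two patterns concrete, here is a small sketch, assuming Python's `re` module; the constant names and the `passes` helper are made up for illustration, and `search` is used to mirror an unanchored /.../ match:

```python
import re

# Patterns copied from the comment above; the names are hypothetical.
SIGNUP_CHECK = re.compile(r".+@.+\..{2,}")  # '@', then a dot followed by 2+ characters
LOOSE_CHECK = re.compile(r".+@.+")          # just something on both sides of '@'

def passes(pattern: re.Pattern, address: str) -> bool:
    """Return True if the pattern matches anywhere in the address."""
    return pattern.search(address) is not None
```

Here `passes(SIGNUP_CHECK, "foo@bar.com")` is true, while foo@com and root@localhost pass only `LOOSE_CHECK`.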

swores · on Oct 24, 2023

  > "just /.+@.+\.{2,}/ is more than enough"

Isn't that asking for 2x '.'s after the @, which isn't required?

Shouldn't it instead be /.+@.+\..+/, which will allow foo@bar.com but not foo@bar?
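A quick way to see the difference, sketched in Python under the assumption that `re.search` stands in for the unanchored /.../ match; the dictionary keys and the `compare` helper are hypothetical names:

```python
import re
from typing import Dict

# Hypothetical labels for the two patterns being compared.
PATTERNS = {
    "quoted": re.compile(r".+@.+\.{2,}"),   # \.{2,} = two or more literal dots in a row
    "suggested": re.compile(r".+@.+\..+"),  # \..+ = one literal dot, then at least one character
}

def compare(address: str) -> Dict[str, bool]:
    """Report which pattern finds a match anywhere in the address."""
    return {name: pattern.search(address) is not None for name, pattern in PATTERNS.items()}
```

With these, `compare("foo@bar.com")` reports a match only for the suggested pattern, and foo@bar matches neither.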

arp242 · on Oct 24, 2023

Eh yeah, I missed a dot there >_< I may be a sad case, but I don't write unit tests for my HN comments.

mmh0000 · on Oct 24, 2023

This[1] is, or at least was, PHP's validator for email. The author provides a lot of insight into the complexity of regexing emails.

[1] https://fightingforalostcause.net/content/misc/2006/compare-...

arun-mani-j · on Oct 24, 2023

I have uBlock Origin and it works fine in my Firefox.

Sorry, I didn't know about this adblock detection nonsense, otherwise I wouldn't have posted it. :(

MilStdJunkie · on Oct 24, 2023

Why would you do this?! Mail parse libraries are everywhere

sam_lowry_ · on Oct 24, 2023

The first thing I tried, root@localhost, did not match ;/