[flagged] Email Address Regular Expression That 99.99% Works (emailregex.com)
16 points by arun-mani-j on Oct 24, 2023 | 37 comments



Yeah no, if you try to check this into any codebase it should get rejected straight away. There is no need to regex emails pretty much anywhere, when you use an email for signup or similar you send a confirmation email which serves the same purpose - to make sure the address is valid and correct. Use `<input type="email">` or check that "@" is present if you must, but anything beyond that is nonsense.


Not everything is web development.


That is fair, but I don't think it changes the outcome. The only way you can verify an email address is correct is to send a verification email. If you want to help the user ensure their input was what they intended, having two inputs for the same value is a better way to do that than a regex on a single field. As an example, my email is 20 chars long, the whole email regex thing only validates one or two of those 20 chars and doesn't help against typos in any way. In short, it's a lot of complexity without any real upside.


Imagine how fast this falls down when writing, say, an email client. There’s a place for validation.


What exactly changes when validating in an email client? foo@localhost, foo@bar and f@b are all valid email addresses, and your user may want to send an email to those addresses. There is literally nothing there for you to validate.


If you are writing Gmail no one is sending to foo@localhost. It’s a typo.

(And you certainly don’t want to send every bcc a verification email they have to click on!)


What are you talking about with the bcc thing? Generally you receive someone's email address once, and then optionally verify it. If you fill the To/Cc/Bcc fields from your address book, everything should be fine. If you insist on typing something yourself and you enter a wrong email address, it just doesn't arrive (or reaches another recipient, depending on the typo).


> What are you talking about with the bcc thing?

That neither solution proposed upthread - double inputs, or a verification email - is as user-friendly as catching "oops you forgot a dot" validation on the field in an email client.


I think you're wildly overestimating how useful that is compared to the complexity of the solution in the linked article. Mistyping a "," for a "." is as likely a typo as typing "gnail" instead of "gmail". My personal email is over 20 characters long, and statistically nearly every typo I can make is something you simply can't check for.

To put another way, as anecdotal experience I've encountered many forms online that won't accept my email address or phone or whathaveyou because of buggy validation. I can't recall when I mistyped my email, but surely it's possible I've done it at some point, however the difference is in the price I pay. If I mistype my email and don't get a confirmation email or w/e I can always try signing up again. However if I can't use the form because the validation is bugged there is no straightforward recourse I can take as a user.



Except in this case "good" will fail to deliver emails to perfectly valid email addresses.


What difference does that make?


Don't use a regex, test only for a @ and a dot after it and call it a day.

Btw content is unreadable thanks to aggressive adblocker detection so much that this feels like spam.


Even foo@bar may be a valid email.


I know about this, but how many domains exist with just a root domain? In the vast majority of cases, a missing dot is a typo. I also wouldn't want to imagine how painful using services must be with a foo@bar style address.


The vast majority of common typos result in a syntactically valid email address anyway. A regex is just not an appropriate tool for email validation.


"We can't catch all typos, so we won't catch any" is a silly approach to form validation.

I can't check that 123-45-6789 is a valid Social Security Number, but I can still reject "FARTS".

The same is true for "foo@gmailcom" for a public-internet facing contact us form.


Why not reject invalid top level domains then? This is going to catch a vastly larger amount of typos (including foo@gmailcom) without eliminating any valid emails.
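A rough sketch of that idea in Python, for illustration only. The `KNOWN_TLDS` set is a tiny hand-picked sample and `has_known_tld` is a made-up name; a real check would have to load the full, regularly updated list of top-level domains from somewhere.

```python
# Tiny illustrative sample, NOT the real list of top-level domains.
# A real check would need to load and refresh the full list.
KNOWN_TLDS = {"com", "org", "net", "to"}


def has_known_tld(address: str, known_tlds: set[str] = KNOWN_TLDS) -> bool:
    """Return True if the part after the last '@' ends in a known TLD."""
    _local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" at all, so there is no domain to inspect.
        return False
    # The TLD is whatever follows the last dot, or the whole domain
    # when there is no dot (as in "x@to").
    tld = domain.rsplit(".", 1)[-1].lower()
    return tld in known_tlds
```

With this sample set, `foo@gmailcom` is rejected because "gmailcom" is not in the set, while `foo@gmail.com` and a dotless address such as `x@to` both pass.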


Now you're maintaining an ever-changing external list for which there's no official API.


Yes, but at least it provides some actual value to the user. Alternatively I guess you could look up the mx record?


This is where the linked regex fails.


Or correctly handles, for most of the use cases you’d want it for.

If you’re building a web app, foo@gmailcom is almost certainly a mistake.


Of course, since there isn’t a top level domain called gmailcom. But a regex can't tell you that.


Gmail doesn't need to support "you own an entire TLD and use its root for email" use cases, so you can at least check for a dot after the @ and catch a bunch of typos that leave frustrated users.
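For what it's worth, the "dot after the @" check needs no regex at all. A minimal sketch in Python, where `has_dot_after_at` is a hypothetical helper name, and which deliberately rejects dotless domains like `foo@bar` or `root@localhost`:

```python
def has_dot_after_at(address: str) -> bool:
    """Loose typo check: non-empty local part, an '@', and a dot in the domain."""
    # Split on the LAST "@", so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and "." in domain
```

This only catches the "forgot the dot" class of typo; it says nothing about whether the address exists or can receive mail.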


Each time that this or its variants get posted here, the response is the same:

Check if there’s an @ and send a verification email.



Thanks! I had problems accessing the webpage (due to my adblocker).


Very ironic if that's a regex bug or "special case" in the adblocker filter list.


I usually just test for the presence of @ - that’s it. Works for IDN addresses as well.

I remember some story back in the day where someone had an email address using only the top-level "to" domain. Like “x@to”, pretty cool but probably a pain to use (:


Why does this site need an adblocker-blocker?


Use "mail.parse()" or whatever there is for your language. In general just parse stuff instead of letting some regexp loose on it.

For a quick check to make sure people don't mix up fields or accidentally hit enter before they finished typing, just /.+@.+\..{2,}/ is more than enough (technically foo@com is valid, but no one uses that – note that root@localhost or cron@sysops CAN be valid, so in some contexts you want to use just /.+@.+/, but that doesn't really apply for signups and the like).
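A minimal sketch of both suggestions in Python. It assumes the standard library's `email.utils.parseaddr` as a stand-in for the "mail.parse()" placeholder above; the names `looks_like_email` and `extract_address` are made up for illustration.

```python
import re
from email.utils import parseaddr

# The loose check from the comment above: something, "@", something,
# a dot, then at least two more characters.
QUICK_CHECK = re.compile(r".+@.+\..{2,}")


def looks_like_email(text: str) -> bool:
    """Return True if the whole string passes the loose signup-style check."""
    # fullmatch anchors the pattern to the entire input.
    return QUICK_CHECK.fullmatch(text) is not None


def extract_address(header_value: str) -> str:
    """Pull the address out of a 'Name <addr>' string using the stdlib parser."""
    _name, addr = parseaddr(header_value)
    return addr
```

For example, `looks_like_email("foo@bar.com")` is true, while `foo@com` and `root@localhost` are rejected, which is the trade-off described above. Note that `parseaddr` is a lenient parser for header-style input, so it extracts an address rather than judging whether it is deliverable.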


  > "just /.+@.+\.{2,}/ is more than enough"
Isn't that asking for 2x '.'s after the @, which isn't required?

Should instead be /.+@.+\..+/ Which will allow foo@bar.com but not foo@bar ?
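The difference between the two patterns is easy to see by running both. A small Python sketch, assuming whole-string matching; `classify` is a hypothetical helper name:

```python
import re

# As quoted: "\.{2,}" means two or more literal dots after the "@" part.
AS_QUOTED = re.compile(r".+@.+\.{2,}")
# The proposed fix: one literal dot followed by at least one character.
PROPOSED = re.compile(r".+@.+\..+")


def classify(address: str) -> tuple[bool, bool]:
    """Return (matches_as_quoted, matches_proposed) for the whole string."""
    return (
        AS_QUOTED.fullmatch(address) is not None,
        PROPOSED.fullmatch(address) is not None,
    )


if __name__ == "__main__":
    for candidate in ["foo@bar.com", "foo@bar", "foo@bar.."]:
        print(candidate, classify(candidate))
```

Under these assumptions the quoted pattern rejects `foo@bar.com` and accepts the odd `foo@bar..`, while the proposed one accepts `foo@bar.com` and rejects `foo@bar`.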


Eh yeah, I missed a dot there >_< I may be a sad case, but I don't write unit tests for my HN comments.


This[1] is, or at least was, PHP's validator for email. The author provides a lot of insight into the complexity of regexing emails.

[1] https://fightingforalostcause.net/content/misc/2006/compare-...


I have uBlock Origin and it works fine in my Firefox.

Sorry, I didn't know about this adblock detection nonsense, otherwise I wouldn't have posted it. :(


Why would you do this?! Mail parsing libraries are everywhere.


The first thing I tried, root@localhost, did not match ;/



