
Having gone up and down this problem a number of times, my opinion is that the only way to truly evaluate email address validity is a fairly elaborate state-machine-based approach, one that tells you what is wrong so you can decide how to deal with it (or not). Here's one example:

https://github.com/dominicsayers/isemail
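To make the idea concrete, here is a minimal sketch of a state machine that reports where and why an address fails. It is not isemail's API or algorithm; the function name `diagnose`, the state names, and the messages are all made up for illustration, and it deliberately covers only the simple dot-atom@hostname form (no quoted strings, comments, IP literals, or length limits).

```python
import string

# Characters allowed in an unquoted local part (RFC 5322 "atext").
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")
# Characters allowed inside a hostname label.
LABEL = set(string.ascii_letters + string.digits + "-")


def diagnose(address):
    """Return (ok, reason) for a simplified dot-atom@hostname address.

    Hypothetical helper: a sketch of the technique, not a full validator.
    """
    state = "local_start"  # expecting the first character of a local-part atom
    prev = ""
    for pos, ch in enumerate(address):
        if state == "local_start":
            if ch in ATEXT:
                state = "local"
            else:
                return False, f"position {pos}: unexpected {ch!r} in state {state}"
        elif state == "local":
            if ch in ATEXT:
                pass
            elif ch == ".":
                state = "local_start"  # a dot must be followed by another atom
            elif ch == "@":
                state = "label_start"
            else:
                return False, f"position {pos}: unexpected {ch!r} in state {state}"
        elif state == "label_start":
            if ch in LABEL and ch != "-":
                state = "label"
            else:
                return False, f"position {pos}: unexpected {ch!r} in state {state}"
        elif state == "label":
            if ch in LABEL:
                pass
            elif ch == "." and prev != "-":
                state = "label_start"
            else:
                return False, f"position {pos}: unexpected {ch!r} in state {state}"
        prev = ch

    # Only a completed domain label is an accepting state.
    if state != "label":
        return False, f"address ended in state {state}"
    if prev == "-":
        return False, "domain label ends with a hyphen"
    return True, "ok"
```

Because the caller gets a reason back rather than a bare yes/no, it can choose to reject, warn the user, or accept and flag the address, which is exactly what a regex makes hard.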

The regexes floating around out there are horrible.

Validating email addresses doesn't necessarily mean that you hurt the user's experience. I think of it as an opportunity to avoid losing a potential customer due to a silly mistake. One such example would be a one-page sign-up site where you are trying to collect the email addresses of those interested in your offering. In this context it is important to try to catch errors. You have a visitor who wants to keep in touch with you. He or she mistypes the email address. If you don't detect it you might lose them forever.

Granted, not all errors are detectable. If someone types jeo@example.com instead of joe@example.com, there's precious little you can do about it in terms of automated detection.

You can accept obviously bad email addresses, store them in your database and simply tag them as such. This is where ML or human intervention might be able to fix the address or choose to discard it. Email list pollution can be dealt with in other ways. For example, if you use this list to reach out to prospective customers, bad emails will simply bounce.
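As a rough sketch of the accept-and-tag idea, assuming a SQLite table and a deliberately loose check: the table name `signups`, the `suspect` column, and the helpers `open_store`, `looks_suspect`, and `record_signup` are all hypothetical, not anything from a particular product.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) sign-up store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS signups ("
        "email TEXT NOT NULL, suspect INTEGER NOT NULL)"
    )
    return conn


def looks_suspect(address):
    """Loose heuristic: flag anything that is not roughly something@host.tld."""
    local, at, domain = address.strip().rpartition("@")
    # Require an "@", a non-empty local part, and a dot inside the domain.
    return not (at and local and "." in domain.strip("."))


def record_signup(conn, address):
    """Always store the address; tag it instead of rejecting it."""
    suspect = looks_suspect(address)
    conn.execute(
        "INSERT INTO signups (email, suspect) VALUES (?, ?)",
        (address, int(suspect)),
    )
    conn.commit()
    return suspect
```

Nothing is thrown away at sign-up time. A later pass (a person, a smarter checker, or just bounce handling) can query the rows where `suspect = 1` and fix or discard them.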

In the end, what is important is to avoid losing real potential customers as much as possible. I think a little software-based verification along with giving the user the opportunity to catch the mistake is enough. All the junk easily falls through the cracks of a multi-stage filter after the fact.


