Hacker News
How not to check the validity of an email address (dellsystem.me)
437 points by limelight on Sept 4, 2013 | 235 comments



In college I was hired to build an auction site. I was billing my client $20 / hour and subcontracting out the work to some of my fellow classmates at $10 / hour. I was swamped with other work and didn't have much time to review the code. I just made sure it satisfied the specifications and shipped it. We launched the site and did a few hundred thousand dollars' worth of transactions in the first 24 hours. Then something strange happened... all of the bids mysteriously disappeared from our admin panel and users started emailing in asking why their bids weren't showing up anymore. I got a panicked call asking what had happened. I had no clue, but promised to look into it. I started digging through the server logs and noticed that all the bids had been deleted around the time that Google had discovered and crawled the site. Sure enough, my friend had added links to delete bids via the admin panel that were executed via GET requests. It wouldn't have been that big of a deal except the poor guy had used JavaScript for authentication! Google's crawlers had carefully hit every single Delete link and wiped out the site. I fixed the authentication system, refunded everyone's credit cards and relaunched the site with a huge apology for the issues. Needless to say, from that day on I became far more diligent about doing code audits.


The first time someone wanted to hire me was even before I started college. "So how much will you bill me?" "Well, right now I get 7€ ($9.20 at the current rate) for unloading trucks ..." He interrupted me before I could finish the sentence and demand 8€. "I cannot pay you 7€! That is just too much!"

Sooooo I declined but someone took the job ... for under 8$ an hour.

How much quality can you expect for that price???


This sadly still happens. I remember looking through Craigslist jobs and legit companies wanting someone with html/css/javascript/php experience for $8/hr, kid you not.


Part of the problem is, it's difficult to convince companies that (html/css/js/php) coding work is worth something, unless the company you're working for is actually a startup. Especially when they say "Can't you just do it in Wordpress, or Wix or something?"

Another part of the problem is, developers take that work. I probably would myself. I'd hate myself for it but if I have to take shit pay to make ends meet I'd rather do it coding than not.


Do they really allow hourly billing? I wonder if they realise it can cost less to get an experienced person at a higher hourly rate who'll do the job in fewer hours.


I'm guessing they don't realize much of anything related to the work they're looking to have done.


This seems analogous to "The Market for Lemons"* The proprietor of a small business has a very hard time judging the competence of a designer/developer even for simple jobs.

* http://en.wikipedia.org/wiki/The_Market_for_Lemons


This story almost had me in tears (a mix of schadenfreude and shame for my profession). I hope they learned not to give business critical work to college students. Makes me think that IT risk management should be right at the top of what MBAs have to learn.


>I hope they learned something not to give business critical work to college students.

I've had the same kind of experience hiring 'real' consulting firms. We were sent a PhD, who banged out a pretty standard looking interface for us that worked terribly. Ostensibly we could've sued, but that just takes more time, and doesn't really fix things.


Wow. Just because someone is a college student doesn't mean they are incompetent! Plenty of people do business critical work as college students, haven't you ever heard of co-op before? A degree doesn't make you competent either, I've worked with enough people who have degrees who are completely incompetent. College students might need some extra supervision to make sure they are doing the right thing, but so do jr engineers.

Anyways, those are the things that a code review would catch.


I think you got me wrong, sorry for not making myself clearer. I worked through college as an IT consultant as well, however, as you noted, under supervision of a consulting company with seniors. Hiring a college student directly, without a company attached that can be made liable, is just a bad idea, both for the customer and the student. Just imagine the customer had sued the parent. I wasn't talking about technical competence - it's all about liability as well as having proper processes for design, implementation, review, testing and rollout in place - a thing that can easily be seen even before signing a contract.


I agree with the sentiment, but in this particular case, do you really think an MBA'd IT manager was in charge of a project that paid $20 per hour? (As opposed to Accenture rates.)


No, I don't think so. On the other hand I do know a case with an MBA-type manager who gave away the job of doing their website to a complete amateur, just because he knew him. So it wouldn't surprise me either, to be honest.


How did GoogleBot get access to the (presumably) private admin panel to crawl the links in the first place? Also, this sort of thing requires server-side authentication, not just JS. Just because GoogleBot won't crawl your PUT requests doesn't mean others won't -- including users (malicious or otherwise)! Just sayin'.


If I recall correctly the code looked something like this:

<?php

if (!admin())
  echo "<script>window.location = '/signin';</script>";
delete_bid($_GET['bid_id']);

?>


When you think about it it's actually pretty elegant -- if your purpose is to write something that appears to work but will do the worst possible thing when crawled.


Something like the underhanded C contest then?


I don't understand. Was it supposed to delete the thing anyway, and then redirect if the user wasn't an admin?


a simple 'else' would have at least stopped the deleting part :/
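
For what it's worth, here is a minimal sketch of the shape the fix takes, written in Python rather than the original PHP. Everything in it (`bids`, `handle_delete`, the `session` dict, the status codes chosen) is hypothetical and only illustrates two ideas from this thread: refuse destructive actions over GET, and stop processing as soon as the server-side admin check fails.

```python
# Hypothetical stand-ins: `bids`, `handle_delete`, and the session dict are
# invented for illustration; they are not from the original site.
bids = {1: "first bid", 2: "second bid"}

def handle_delete(method, session, bid_id):
    """Return an HTTP-style status code; delete only for an authenticated admin."""
    # Destructive actions should not be reachable through GET, which
    # crawlers and link prefetchers follow freely.
    if method != "POST":
        return 405
    # Check authorization on the server and stop right here on failure,
    # instead of emitting a client-side redirect and carrying on.
    if not session.get("is_admin"):
        return 403
    bids.pop(bid_id, None)
    return 200

print(handle_delete("GET", {}, 1))                   # a crawler following a link
print(handle_delete("POST", {}, 1))                  # an anonymous POST
print(handle_delete("POST", {"is_admin": True}, 1))  # a real admin
print(sorted(bids))                                  # ids of the bids that remain
```

The early `return` is what the missing 'else' would have bought: the delete is simply never reached for a non-admin, whatever the browser does with the response.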


This could have been written in the mindset that JavaScript and PHP ran at the same time. Lots of people still believe that.


This sounds like a scary story that nerds would tell by a fire while camping.


Every single legacy application I've ever worked on has had analogous code buried in it somewhere.

An application I've just been "repairing" recently has a spot where it uses two separate queries to pull two full table sized lists of values, then manually joins them with a loop, and then manually re-orders the joined values into groups, selectively ignoring some rows, and then embeds the whole reordered list in a web page. The page takes around 20 seconds to load. Switching it to use a single properly written SQL query reduced load times to under a second.
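
To give a rough idea of the difference, here is a small sketch using Python's built-in `sqlite3` module. The `customers`/`orders` schema and the "ignore totals under 10" rule are invented for illustration and have nothing to do with the real application; the point is only that the hand-rolled loop and the single JOIN return the same rows.

```python
import sqlite3

# Hypothetical schema: `customers` and `orders` are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 5.0);
""")

def slow_report(conn):
    # The legacy pattern: fetch both tables whole, then join, filter and sort by hand.
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    orders = conn.execute("SELECT customer_id, total FROM orders").fetchall()
    rows = []
    for cust_id, name in customers:
        for order_cust_id, total in orders:
            if cust_id == order_cust_id and total >= 10:
                rows.append((name, total))
    return sorted(rows)

def fast_report(conn):
    # The same result from one query: the database joins, filters and orders.
    return conn.execute("""
        SELECT c.name, o.total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE o.total >= 10
        ORDER BY c.name, o.total
    """).fetchall()

assert slow_report(conn) == fast_report(conn)
print(fast_report(conn))
```

On three rows nobody will notice, but the nested loop does work proportional to the product of the two table sizes, while the database can use indexes for the join.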

Another legacy app I'm employed to "repair" has one single 'template' for every page on the whole site. Its first ~500 lines conveniently consist of a giant and highly nested if/else clause to set the page variables and inline javascript.

Such things are the result of "IT experts," "Software Managers," and "Product Administrators" who've never done real software/web development in their lives hiring random "programmers" who have history or psychology degrees and think they can program because they made a form in PHP.

It only gets lovelier when eventually somebody realizes it's a huge security risk and hires an outside development firm to "secure" it. (Giant eye roll. If they couldn't vet a programmer, you can bet they're great at vetting security consultants and contract developer shops.) Did you know that randomly moving code into folders named "private" and "public" for a few thousand dollars can solve giant architectural and security issues like ridiculously easy XSS and SQL injection?

I don't know what the deal is, but a huge proportion of people writing code are plain incompetent.

At my last company we fired someone who created huge amounts of work for everyone (he thought he could secure page content and alert messages by using base 64 encoding as a stand in for hashing and encryption, for example) and a few months later he was hired as lead developer by a pretty reputable educational business.

... sigh


> Every single legacy application I've ever worked on has had analogous code buried in it somewhere.

I'm not old enough to be responsible for stuff like that but I am incompetent enough.


I love your comment. Funny and humble.


Rest assured that while the person responsible for the legacy design flaw is perhaps long gone, I'll be right there...waiting, oblivious.


> I'm not old enough to be responsible for stuff like that but I am incompetent enough.

Now that is a great conversation starter! I assume you think you know more than the senior/lead/architect on the team. You might, but have fun with that mentality. It's not sure to last. :)


You must have misread the parent in an interesting way.


Not old enough as in hasn't produced any legacy code yet. Not in the sense of avoiding senility.


What are you on about?


the GP said "incompetent enough"


What?


>Another legacy app I'm employed to "repair" has one single 'template' for every page on the whole site. Its first ~500 lines conveniently consist of a giant and highly nested if/else clause to set the page variables and inline javascript.

oh, oh! I'm doing one of those. Only, it's a modern, MVC version, so there's actually a couple of dozen controllers with a single function each, and all actions snake through The Great Maze of Ifelsedom to set their rightful values, before traversing it once more on the page view file

Manager: "Uhm, I thought you said changing that label would be a 5 minute task?"


Whenever something like that happens, this is my reference: http://developerexcuses.com


  git commit -m "$(curl -s developerexcuses.com | sed -n '/<a.*>/{s/<[^>]*>//g;p}')"


Nice bit of sed. :-) Here's the best awk I could muster:

  awk -F'>' -v RS='<' '/a href/{print$2}'


There's also:

    lynx -dump -nolist developerexcuses.com


> Switching it to use a single properly formatted SQL reduced load times to under a second.

We need you to change it back because the system that scrapes that page is relying on the page load time.


That is, the computer running the sign on the front of the store pulls data from here, and you've completely broken it because now the page loads a second time before it can finish drawing the screen. It /needs/ that delay on the HTTP request in order to catch up. A real programmer would have thought of that. Add some loops, sleeps and rands; we can talk about your reckless conduct in the morning.


What's worse is when fake delays are put into the code because (stupid) people think that the computer can't possibly be doing a good job if it retrieves the results in under a second. If it's "thinking", it's working well!


I thought the fake delays were so the programmer could cash in on doing optimisations at a later date.


Also, your ads are viewed for more seconds = more revenue


Those two things are just added bonuses!



See: banking websites.


Just put a four way Cartesian join in a subselect that gets discarded. All unit tests pass! Ship it!


> hiring random "programmers" who have history or psychology degrees and think they can program because they made a form in PHP.

That's pretty unfair to people coming from history or psychology who actually can write good code. Just because you don't have a degree in CS doesn't mean your code is shit. This is purely anecdotal, but my predecessor at my current job was a CS graduate and wrote code like in the OP.


As a history major, I would agree with you.

One of the real issues has to do with the mentality of coding. There are people regardless of background who approach coding as a job, and those who approach it as a craft. You want the latter, not the former.

Here's my rule:

If you don't look back at code you wrote a decade ago with some degree of horror, you are either an extraordinarily good coder, or you aren't a good coder at all.


A decade is a long time at a single job... Try six months for a good start :)


If you are coding at all outside of work you will have code to look back at.

The question is:

Are you improving? Or are you beyond improvement?


Six months? I come in after the weekend and constantly want to rewrite the whole goddamn thing.


This^ and I'm only a student; I'll write a program in the evening and by the following morning I'm all, no, no, no!


I tend to spend a lot of time planning (almost as much as coding). I do tend to notice big changes over a period of 2-3 years, but there are areas where I can notice improvements in, say, six months.


I think I spend too much time planning and not enough time just getting shit done. It's one of the things that I feel like I have to work on this year.


I don't think it is necessarily a bad thing. I spend a lot of time planning because I find my coding productivity is higher. Often it's better to let problems sit for a bit than to code them first, or if one does a mock-up it is an exploration that is part of the planning, to be discarded and done right a second time.

But what this means is I rarely come in the next week and wonder what I was thinking (it does happen, but rarely). More often I look at things, over a few months figure out better solutions to coding problems and my style changes accordingly.


That's good, but the rate slows down over time.


I look at stuff I wrote 3 months ago with horror.

Then again I started learning iOS programming then so I might have a bias/reason for it. >.<


I became an English major because I'd been coding since I was six years old and preferred to learn how to become a better writer.

At the time, the majority of incoming CS freshmen did not even harbor the most basic ideas or curiosity about how a computer functioned. They had all heard they could make mounds of money. When I took elective CS classes in college, I had graduate CS students attempting to copy off of me.

It's possible CS students and programs have changed since the 90s, but based upon the CS graduates I encounter, I expect they haven't improved that much.


Here here. There are plenty of people who have CS degrees and can't programme. Why else do we need "FizzBuzz".


It's "hear, hear!" - an abbreviation of "Hear him, hear him!".


This one annoys me as much as "bare with me"..


Yeah, it "literally" kills me every time


I knew a music major who was a better coder than most of the CompSci grads.

At one job, I judged coding ability by the number of times your SQL woke me up due to huge I/O usage[1]. Many a CS graduate cannot read a query plan.

1) if it looked like a reasonable query and it was just Sybase's crap optimizer, I gave them a pass. Although, after a while, you learned to force the indexes.


Yeah, I didn't mean to sound so derogatory, but out of the non-formally-trained developer co-workers I've had, only a handful didn't create work for everyone.

The two most common issues are inability to create sanely defined and decomposed schemas and clearly written queries, and poor ability to structure code in an organized way.

That said, I have known a couple very good coders without any kind of formal education. They just seem to be outliers.

When dealing with formally trained developers I've certainly encountered the incompetent also, but I'd say an easy majority of the trained coders I've worked with are competent enough to not be hard to work with.

Mind you, I do not claim to be a "superstar developer" or anything. The state of the art tends to change just a bit faster than I can keep up with these days.


Conversely, I'd hazard a guess that <10% of my outgoing CS class were competent in the way you describe... I'm fairly certain that the "ability to create sanely defined and decomposed schemas and clearly written queries, and structure code in an organized way" is the outlier here.


Speaking as a former history major, thanks! :-) Code like that is written by people who don't really understand how their code works and there are plenty of CS majors in that category. With any luck, they are swiftly promoted to management where they can do less harm.


I know it's unfair to generalize, but seriously, if we had to put numbers on it and run statistics (something most people with psychology degrees are pretty capable of, even though the way they do it might raise questions), I'm rather sure the outcome would be close to: sorry, but 90% cannot write good code. Why do I think this? Because I've seen and fixed code written in multiple languages by psychology graduates (about 20 in total, from more than one institution) for the last ten years. And there are just no words for how awful it was. Hell, I've seen MATLAB scripts where I would pass the same dataset for both dataset arguments and the result would still be a statistically significant (p < 0.01) difference between the two.


Indeed. It would have been better as "CS degrees from JavaSchools."


I'm starting to really dislike the term "legacy code". It implies the code is bad because it's old. It perpetuates the misconception that code gets "stale" and problems build up. But that's not true. "Legacy code" is bad because it's BAD. All code is "legacy code" because unless the project is brand new, it has some history. We just don't call good old code "legacy code" because it hasn't caused us any problems. There is no reason to correlate bad code with age.


I think you're misunderstanding the term. Legacy code -- like legacy lots of other things -- is the intersection of "bad" and "it's too late to fix now".

Code which can be fixed easily isn't "legacy". Code where a replacement would need to be bug-for-bug compatible in order to avoid breaking things is "legacy".


My favorite definition of legacy code is bad code for which no tests exist. It's terrible because people have no idea what to do with it and no idea what they're breaking when they change it.

You could be writing legacy code today.


I've heard of that scenario, but I'm curious about what the best way to produce a bug-for-bug replacement would be. Should you do it right, and then put a layer in front of it that switches/transforms the output to match the old errors?


Code sitting on the same machine can keep on performing the same way forever.

But other things change. The library changed its interface. You coded your crypto lib when MAC-then-Encrypt was all the rage. You aren't handling HTTP/1.0 headers. Windows 7 doesn't even have those 16-bit drivers any more.

I hate the "throw everything away and rewrite with the brand new framework every two years" attitude that some parts of the web development community seem to have (and then get angered when you ask about a version over six months old because it is what you installed everywhere). But sometimes software does rot if it's old and no one is trying to keep it up-to-date.


In 'Working Effectively with Legacy Code', Michael Feathers defines legacy code as code without sufficient tests.

The great thing about this definition is that it doesn't matter how old the code is - if it can be changed without worry about introducing unwanted behaviour because of test coverage, then it's not legacy.

It also means that you can write new legacy code right now!


> It also means that you can write new legacy code right now!

I once made the argument that a team within my organization was actively producing new legacy code. It wasn't my most diplomatic moment.


The only legacy code that people talk about is bad legacy code. People never fondly remember the function they wrote ten years ago that's still doing what it should do, no debugging required.

It's always the function that's been rewritten twenty times that gets labelled as legacy code.


While code may not go bad with age, the surrounding world (compilers, frameworks, dependencies) changes constantly. So while the code quality of a million LOC Java 1.0 application might be stellar, maintaining it may still be a royal pain. So in one way code does get "stale" when left untouched for too long.


Look at it this way: any legacy code that people care enough to still support is code that works, at least a little bit.


I take the term more to mean 'under-maintained code' than just old code.


> I don't know what the deal is, but a huge proportion of people writing code are plain incompetent.

Hooray for short term job security!


Gradebusters / Making the Grade, or something with names like that, used to use a Java applet to "secure" the web site with student grades. You could just download the applet and decompile it to figure out their trivial encoding of the IDs and PINs (which were just params in the HTML).

Or you could figure out just an ID (typically a student ID number, although more than a few were social security numbers, apparently), and use "1066" since they had a backdoor PIN in quite a few releases. Battle of Hastings, eh?

Want to know how users did web security instead of asking their admins for a proper .htaccess/server-level config setup? That's how.


Heh, I've seen a great example of "high end" software for partitioning servers for a specific vertical (so multiple hostile customers could share the same hardware). Java applet to provide "rich" experience cross platform. OK, debatable.

Security? On start, the Java app downloads a connection string to the MySQL DB. Which contains the root login for both SSH and MySQL. Then it prompts for your password and queries the Users table to see if you're allowed. And it helpfully logs this info into user's temp directory. "Ah yes, this is a known limitation in our current design."

Edit: They repeatedly lied to customers about various security fixes (I didn't do full disclosure on the numerous issues I found). They didn't care as they were sold for a world of money, then the entire product line was nixed. Most customers don't seem to care (I've found similar stuff across the board); hackers are targeting lower hanging fruit for the time being.


Education software is literally the worst.

Moodle was pretty ick to begin with, but you should've seen the state of one install I had to work on by the time I got to it.

I still have nightmares.


I've worked with Moodle adding custom reports and features to it. The shop where I worked was stuck at Moodle 2.x and couldn't upgrade because it was a clusterfuck of custom code mixed with outdated plugins.

I feel you.

It worked, but god damn; I literally quit that job because of the stress of working with Moodle day in day out.


Needs disrupting


It's extremely hard to break into because, like healthcare, these are gigantic institutions where nobody can get fired, which made choices ten to twenty years ago that have now become "the way it's done", and they won't accept a better solution.


I know. I am at one of those institutions. We practically have a social sciences lab for the Dunning-Kruger effect in place of IT decision makers.


In fairness, my university changed systems every year or so while I was there. They all sucked, so sticking with your existing solution is possibly not the worst decision.


Blackboard is a known aggressor with its patent portfolio and cash. They will acquire and extinguish or just kill you with a patent lawsuit if you present a significant threat to their products.


So were unauthorized users able to modify grades? Or just view them?


View. It was a system where the teacher would export things from their personal machine and it would dump a bunch of files into their ~/public_html.


Clearly they should have optimized this by stripping the @mail.mcgill.ca on the server side before serving the list.


You know, for all the ridiculousness of this code, at least it actually performed its function correctly, i.e. it would correctly identify all McGill email addresses. From the title I was expecting another multi-page regexp.


Unless a new user was added after that page was loaded.


Clearly you must be joking. They should've kept the emails on the server-side the entire time. Then it would be a matter of validating the request by searching for the email in, let's say, SQLite. Would it be done? Probably not. But at least it'd be a lot stronger than sending sensitive data to clients every damn request.
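
Something along these lines, as a rough Python sketch with the built-in `sqlite3` module. The table name `allowed_emails`, the function `is_registered` and the sample addresses are all made up for illustration; only the @mail.mcgill.ca domain comes from the article.

```python
import sqlite3

# Hypothetical table and addresses, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE allowed_emails (email TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO allowed_emails VALUES (?)",
    [("alice@mail.mcgill.ca",), ("bob@mail.mcgill.ca",)],
)

def is_registered(conn, email):
    """Answer yes/no on the server; the full list never leaves the database."""
    normalized = email.strip().lower()
    # A parameterized query keeps user input out of the SQL text.
    row = conn.execute(
        "SELECT 1 FROM allowed_emails WHERE email = ?", (normalized,)
    ).fetchone()
    return row is not None

print(is_registered(conn, "Alice@mail.mcgill.ca"))
print(is_registered(conn, "mallory@example.com"))
```

The client only ever learns a single bit per request, and a user added a minute ago is found without anyone reloading a page.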


> mmariani 41 minutes ago | link

> Clearly you must be joking

No shit.


You're missing the sarcasm :P


Unless it's clearly stated sarcasm gets garbled over the wire. So in the hope of helping those who don't get it I decided to do the right thing just to be on the safe side. ;)


From now on, when you see a post taking the form "Clearly [obviously bad idea]", please read it out loud in The Simpsons' Comic Book Guy voice and place extra emphasis on the irony. This ensures you will get the correct intent 99% of the time.


Clearly, acting like a dumbfuck in order to become enlightened is obviously the right thing to do.


'eh; keep trying, you'll catch on someday.


Oh come on, then sarcasm loses its humor. Anyone who didn't get it the first time probably falls into the same category of developer that is being criticized here. :P


wooooow


I've had something similar delivered to me on a project I hired out. The most frustrating part was not the code but the developer's reaction to why it was so bad. He had no idea what the big deal was and thought I was being nitpicky.

Worse yet, was an initial claim that it was more efficient to do it that way. That was followed up with a claim that doing it differently wasn't possible.

Needless to say, I stopped working with that team of "developers".


>The most frustrating part was not the code but the developers reaction to why it was so bad. He had no idea what the big deal was and thought I was being nitpicky.

This is always the worst. I've had experiences like that on many an occasion, where the person is simply like "huh? what's wrong?"

You can't really fix that level of sheer incompetence, ignorance, and arrogance all wrapped into one.


> You can't really fix that level of sheer incompetence, ignorance, and arrogance all wrapped into one.

Sure you can. You can tell them why it's wrong and point them at reading material to fix it. Fixes incompetence and ignorance, and if you're lucky, arrogance. All wrapped into one.

Everyone had to start somewhere.


True, but it depends how arrogant and stubborn they are. If they're just simply arrogant and don't understand, you can do something, but in many cases they'll insist "this is fine. what's your problem?"


Or more simply, you can't help people who don't want to be helped =)


> Worse yet, was an initial claim that it was more efficient to do it that way

This is a classic excuse... so what if it leaks e-mail addresses (who looks at source code anyway?), makes clients wait (at least server's CPU is intact), and is just plain illegible (there is always a debugger if you want to fix a bug).


It's not possible to teach these people without encountering a massive, irreparable hack. Nothing short of complete obliteration of their project through their own incompetence will work because the Dunning-Kruger effect is in full swing.

The smartest people I know are acutely aware of how little they actually know compared to how much there is to know.


Man. Reading posts like these has several effects on me. One is utter shock that anyone could be so stupid. Another is to remind me of how little I know (because I'm sure in the eyes of someone who actually knows anything about security, I'd probably provoke the same reaction). I'm also amazed that some of the people responsible for these things can still find work.

Here's my own personal story. The other day, I had a brain fart regarding my password for my online banking account. So, I got lazy and just clicked the "forgot password" link, answered the security questions, and within seconds, I got an email. It had my old password in it. Yes, my bank stores passwords in clear text. sigh.

BTW, I'm also looking for a new job right now, so if you're after someone with 2 brain cells to rub together who also happens to be a decent Python programmer, shoot me an email. (It's in my profile.)


Every financial institution I have interacted with has failed to store hashed passwords. So I looked up the reason why. Turns out that the Payment Card Industry Security Standards Council Data Security Standard (PCI DSS[1]) stipulates only that passwords and other sensitive data are to be stored with encryption, and also transmitted as such†, but only amongst other financial institutions‡. So while they can’t keep your password in cleartext, they are free to send it by email, text message, or town crier.

[1]: https://www.pcisecuritystandards.org/documents/pci_dss_v2.pd...

†Requirement 8 (part 4):

> Render all passwords unreadable during transmission and storage on all system components using strong cryptography.

‡Introduction and PCI Data Security Standard Overview:

> PCI DSS applies to all entities involved in payment card processing–including merchants, processors, acquirers, issuers, and service providers, as well as all other entities that store, process or transmit cardholder data.

(It appears this does not include cardholders.)


>> Render all passwords unreadable during transmission and storage on all system components using strong cryptography.

>(It appears this does not include cardholders.)

Ah, very informative. So, here is the problem: the fact that crypto is specified when it's the wrong tool for this particular job (that of storing my online banking password). I don't want my password encrypted. I wanted hashed and salted with something like bcrypt, and I don't want it sent in the clear over email.
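
To make the distinction concrete, here is a minimal sketch of salted, deliberately slow password hashing. bcrypt itself is not in Python's standard library, so this swaps in PBKDF2-HMAC-SHA256 from `hashlib`, which is a different algorithm with the same basic shape. The function names and the iteration count are my own assumptions, not anything a bank actually uses.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password):
    """Return (salt, digest); store both, never the password itself."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("password123", salt, digest))
```

With storage like this there is nothing to decrypt, so a "forgot password" flow can only offer a reset; it has no old password to email back.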


Not necessarily. They may be using a reversible hash. Not much more secure, but it beats plaintext.


Yeah, maybe they just look up the MD5 hash on http://md5.gromweb.com/ and sent him the result.

I used that site to show my boss his plaintext password to explain why MD5 alone is barely more than security through obscurity when trying to convince him that we needed to salt them as well - he agreed with me on the spot.
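
Here is roughly what that demonstration boils down to, as a small Python sketch. The three-word `wordlist` stands in for the enormous precomputed tables a lookup site keeps, and the salt value is a made-up example.

```python
import hashlib

def md5_hex(data):
    return hashlib.md5(data.encode("utf-8")).hexdigest()

# A tiny stand-in for the huge precomputed tables that lookup sites keep.
wordlist = ["letmein", "password123", "dragon"]
lookup = {md5_hex(word): word for word in wordlist}

# Unsalted: the stored hash maps straight back to the password.
stored_unsalted = md5_hex("password123")
print(lookup.get(stored_unsalted))

# Salted: the same password hashes differently, so the generic table misses.
salt = "f3a9c1"  # hypothetical per-user salt, stored next to the hash
stored_salted = md5_hex(salt + "password123")
print(lookup.get(stored_salted))
```

Salting only defeats the precomputed table. MD5 is still fast, so an attacker who has the salt can brute-force each hash individually, which is why a slow, purpose-built password hash is the better fix.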


I actually did that for a system. I had grown a major case of the ass about my job. I hated working there and I hated spending effort on the work they made me do. They hired me to fix the legacy systems the previous .NET developer had made (and they were all OS X people who didn't have the courage to even look at a Windows machine), but every time something went wrong, they wouldn't let me fix the issue, they would only approve me time to fiddle the data in the database. "Just brute force it" was some kind of mantra from our CEO. Somewhere in the last 5 years, it seems non-technical people overheard the "brute force" meme and display the notion that they believe it's the always-practical, never-difficult solution to a problem when the programmer would prefer an overly engineered solution of negligible or negative gain.

So I had gotten tired of people forgetting their single-english-word passwords and making me overwrite their MD5 hashed password to a known-value that mapped to something like "password123" (yes, no salting for the hashes). So instead of manually resetting the password in the database all the time, I banged out a small web app that ran on my machine for printing every user and reversing all of their MD5 hash'd passwords. It didn't work for the ones who had chosen actual, random strings for their passwords, but that was maybe 1% of cases.

And then I shared the IP address to my machine as a link for every other engineer in the company (all 3 of them). One of the other engineers freaked out that I had "exposed" the passwords, but as far as I was concerned, the passwords were already exposed. He shut up when I pointed out that the work was done and that I had other things to do, things that were his responsibility but he couldn't do because he had a habit of taking on too much work.

From that point on, any time I had more than 2 repetitions to do something, I'd write the most basic of web app to do it, and I'd shove it onto that little server on my machine. The future repetitions would invariably come in and I'd save tons of time not doing it the manual way.

Seriously, this was easy stuff. Don't take this to mean I'm bragging about it. I'm mentioning it because it is so simple and so obvious of work to do in these cases, and it eventually got me fired. I made the mistake of trying to get credit for the work I did, for saving the company time, freeing myself up to do other work, and all the CEO could see was that I was no longer able to charge 3 hours to create new accounts in the system now that it took less than a minute through my admin app. "Wasn't bringing enough value to the company." One of my reports found a quarter million dollars in lost licensing revenue. Wasn't bringing enough value to the company.

So it's not just programmers who can be grossly incompetent. Oh, they certainly can be, I've had to clean up my fair share of systems. But I've found far more often that systems are bad because the programmer's manager was an asshole idiot who made unreasonable demands and forced the programmer to make compromises. Maybe that programmer wasn't the best programmer, but nobody can do as good of work as they are capable in that situation.


How does it beat plaintext?

A reversible hash isn't so much a hash as a bijective function. The security of all stored passwords then depends on the secrecy of the "hash" function.


Isn't that an oxymoron?


Gosh. For some reason, the "right answer" I expected to see was "do not try to validate the address; just send the e-mail and handle the bounce if it fails".

There is a whole other layer which is very good at handling incorrect or undeliverable addresses.


I recently sat in on a presentation regarding javascript module loading. Someone in our company had taken it upon themselves to roll their own solution, because, shit, they're being paid and apparently have no oversight.

The solution involved creating an entire cache/hash layer on the client using local storage.

At the end of the presentation I had to try to be as respectful as possible when I asked why basic browser caching and content expiration weren't enough. Apparently it wasn't even considered -_-


Perhaps they wanted it to work even when offline? It can be a struggle to get basic browser caching to work nicely in online/offline scenarios. As arethuza points out though, that is better addressed using appcache.

I could imagine doing something like this if you wanted to support say es6 modules, and so each file needed to be processed after it was loaded to work in the current browser. In that case, I can imagine wanting to cache a processed one on the client side, although if you control your whole stack, it'd be better to do that work on the server.


So they had used HTML5 local storage and missed the fact that HTML5 has a pretty nice offline caching feature?


Don't judge them too harshly. I am currently developing a web app with offline capabilities. FT, which provides the scaffolding code, recommends using the cache manifest only for the absolute minimum requirements (jquery + 10 lines of html) and using local storage for all the rest.

I suppose they know what they are talking about.


Maybe, but I'd worry that they're only suggesting that because the appcache can be a pain to work with.

localStorage is synchronous, and so putting huge amounts of data in there can delay page load.


AppCache is a pain to work with.

Admittedly, when I was working on it, we were trying to wrangle it into something it really wasn't meant to do, but that fact - that it wasn't a good fit for our application - wasn't at all obvious until I finally started encountering all the "gotchas", and reading more about them.


Not sure you read the article - there is an additional constraint in that only email addresses pertaining to the institution in question are allowed.


Yes, I decided to ignore that constraint because it doesn't make sense :).

The article suggests performing "server-side membership testing, which is O(1)", but I think this is a bit too much — you can do even easier server-side validation without the list of all valid e-mail addresses, just the information that "@[anything but these two domains] is not an OK target".
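
A minimal sketch of that domain-only check in Python, assuming two placeholder domains (the thread never names the real ones) and a hypothetical helper name:

```python
# Hypothetical allow-list; the thread only says there are two acceptable domains.
ALLOWED_DOMAINS = {"example.edu", "mail.example.edu"}

def has_allowed_domain(address):
    """True if something precedes the last '@' and the part after it is an allowed domain."""
    local, sep, domain = address.rpartition("@")
    return bool(local) and sep == "@" and domain.lower() in ALLOWED_DOMAINS

print(has_allowed_domain("alice@example.edu"))
print(has_allowed_domain("alice@gmail.com"))
```

This says nothing about whether the mailbox exists; it only rejects targets outside the institution.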


> Yes, I decided to ignore that constraint because it doesn't make sense :).

Then the customer won't pay you because you ignored their requirements. They might even sue you because you breached contract.

If the customer asks you for a mound of poo you write them a mountain of poo; you try your damned hardest to make sure that the room smells like poo when they are using the app. If they go home and tell their wife and kids about the giant mound of steaming poo they have been using all day then you have succeeded because they will go back to the one developer who knows how to stick to their senseless requirements.

At the end of the day all the other developers are giving them apps built on principles that only make sense to developers - principles that only really make sense for millions of users and not the few thousand that they have. Principles don't put a house over your head and food on the table - money does.


Heh, yes, the smiley face there encodes something like ", although of course I know that in the 'real world' people can't just ignore product requirements that don't make sense, and hopefully people will challenge at the design phase insane requirements like 'client-side validation of all possible e-mail addresses' rather than implementing them."


As I understand it, it wasn't the customer who requested client-side email validation; the customer just requested validation against a list of known good addresses (instead of just against a list of known good domains), which is a perfectly reasonable requirement (for reasons given in other posts in this thread). It was the programmer who (insanely) decided to do the email validation client side instead of server side.


> Yes, I decided to ignore that constraint because it doesn't make sense :).

It does if you want to give the user instant feedback when they type an address that is in a valid domain but does not actually exist, as Ntrails pointed out. If you think a requirement doesn't make sense, you're supposed to ask the client, not just arbitrarily ignore it.


Which is a much easier problem to solve. I suppose this web server is in the same LAN, or at least whitelisted, so send the email directly to the institution's SMTP server, which is presumably not an open relay.


Checking against the two viable email endings and handling bounces should they occur is reasonable. It wouldn't give the immediate user feedback though in case of spelling error etc.


Certainly that solution would have avoided this whole fiasco in the first place. Though immediate user feedback when entering an invalid email address is a somewhat useful feature as well, so maybe that's why the vendor included it.


A timely reminder to everyone:

http://thedailywtf.com/ is still going strong! Be there or be ... competent?


I quit them when Alex tried to take over Programming Praxis by force when negotiations didn't seem to be moving fast enough for him.



Use of potent instead of competent saves three characters and conveys the idea better... or ruder.


Am I the only one who thinks that the repeated attacks on drugs and drug usage are really unnecessary (and not well informed on the topic)?


Nope, blog post was loaded with snark - her attitude offended me more than the poor code D:


Glad to hear that I wasn't the only one.


You know what would be awesome?! If the developer in question happens to frequent HN and responds to this thread, eh?


Developers like these don't even know HN exists.


One coder used a framework he didn't completely understand. Originally there was no password requirement for users, but when the site started holding more confidential data, a password feature had to be added. How was it done? When the user entered a login and clicked OK, a page came up asking for the password. But if you changed the URL at that point, everything worked. When I checked the actual code, the login form logged the user in, and the password form logged the user out if the password was incorrect. Oh boy, I did praise that guy. The same guy used base64 "encryption" so URLs would be unguessable. Guess whether there's any real authentication at all if you modify the URL content correctly. Well, of course there isn't. By the way, why aren't there negative recommendations on LinkedIn?
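
On the base64 point, a small Python illustration of why it offers no protection. The parameter format `user_id=42` and both helper names are invented for the example; the comment does not say what the URLs actually carried.

```python
import base64

def encode_param(value):
    """'Protect' a URL parameter with plain base64, as described above."""
    return base64.urlsafe_b64encode(value.encode()).decode()

def decode_param(token):
    """Anyone can undo it; no key or secret is involved."""
    return base64.urlsafe_b64decode(token.encode()).decode()

token = encode_param("user_id=42")
print(token)                # looks opaque at a glance
print(decode_param(token))  # but decodes straight back

# Forging a different parameter needs nothing but the same public function.
forged = encode_param("user_id=1")
print(forged)
```

Base64 is an encoding with no secret, so anything built on it being "unguessable" is an authorization check that isn't there.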


It's good to be reminded how many employed programmers don't actually know how to program.

Actually, I'm not quite sure why or whether it's good to be reminded of this. Maybe it's better to try and forget it.


I love how she gives away the name of the actual software (desire2learn) in the tag. :-)


she


McGill seems to use a LMS branded as "myCourses", apparently the newest label Blackboard is using for its "Learn" line.


McGill switched to 'Desire2Learn' last summer as Blackboard no longer supports 'webCT', I think (access to old courses hosted on webCT is now very limited). 'myCourses' was the branding chosen by McGill... I think now they just call it 'myCourses2', after the switch.


Hm... how come Desire2Learn hasn't been sued by Blackboard yet? Usually, as soon as a uni switches away to a competing course management software, the competitor gets sued based on the LMS patent: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=H...

UPDATE: they //did// get sued, but won, and the above patent is invalid now. Way to go! http://en.wikipedia.org/wiki/Blackboard_Inc.#Legal_matters


I've once come across:

   if (!Boolean.FALSE.equals(aBoolean)) { 
      // ...
   }
I was pretty baffled.


I've seen a surprising amount of code that does:

    if (aBoolean == true) {
       ....
    }


At university I worked with small unmanned aircraft. We had a crash due to a piece of code with this form:

  int landing_flag;
  ...
  if( landing_flag ) {
    do_landing();
  }
It was C code that pre-dated a boolean type. A single corrupt data packet in a wireless link made landing_flag == 2345923 (some arbitrary large value) and thus the landing routine was triggered mid-flight.

We changed every instance of if( flag ) to if( flag == specific_flag_value ) to ensure that particular bug didn't rear its ugly head again. I keep doing that now.
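
A rough illustration of that change, in Python rather than the original C. `LAND_COMMAND` and both function names are hypothetical; the corrupted value is the one from the story.

```python
LAND_COMMAND = 0xA5  # hypothetical magic value meaning "start landing"

def should_land_truthy(flag):
    # Original style: any nonzero value counts as "land".
    return bool(flag)

def should_land_exact(flag):
    # Revised style: only the one expected value counts.
    return flag == LAND_COMMAND

corrupted = 2345923  # garbage value from a corrupt packet
print(should_land_truthy(corrupted))  # the truthy check would trigger a landing
print(should_land_exact(corrupted))   # the exact check ignores the garbage
```

The exact comparison only narrows the window: a corrupt value can still happen to equal the magic constant.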


So, you changed a somewhat reproducible bug into one that has about a four billion times lower chance of occurring?

Good luck to the poor chap who will have to figure out what happened when that bug hits.

Also, you introduced a new error condition: a corrupt packet that should set a value of 1, but arrives as a value of 2 will not initiate the landing routine.

The right thing to do, IMO, is to prevent corrupt data packets from doing such stuff. Checksum the packets or, better yet, checksum and encrypt them. That prevents the enemy from taking over your plane.

Finally, I do not see how 'no proper bool' is relevant here. If the packet contained a single bit indicating the value of the flag, it still could get corrupted.


> So, you changed a somewhat reproducible bug into one that has about a four billion times lower chance of occurring?

Yes. You seem to imply that's a bad thing?

> Also, you introduced a new error condition: a corrupt packet that should set a value of 1, but arrives as a value of 2 will not initiate the landing routine.

A corrupt packet should not do anything, so that's good, not an error. We do not want the landing routine to be accidentally triggered in flight. Missing a valid packet is much better than triggering on an invalid packet. (It's a UDP protocol, so the entire system is designed to handle missed packets. Ground station re-sends commands until positive acknowledgement is received from the aircraft)

> The right thing to do, IMO, is to prevent corrupt data packets from doing such stuff. Checksum the packets or, better yet, checksum and encrypt them. That prevents the enemy from taking over your plane.

Exactly right. We were already using a checksum in the datalink, and the corrupted packet that caused the crash passed the checksum as valid! During the post analysis of the crash, I discovered that it was using an 8-bit XOR checksum implemented years earlier. 8-bit XOR is ok for detecting single bit errors, but is not good at detecting burst errors -- it does not detect ~12% of highly corrupted packets. I also updated the system to use a significantly more robust checksum after that incident.
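
To show the kind of corruption an 8-bit XOR checksum lets through, here is a short Python sketch. The packet bytes are invented, and CRC-32 (via the standard `zlib` module) is only my stand-in for "a more robust checksum"; the comment doesn't say which one was adopted.

```python
import zlib

def xor8(data):
    """8-bit XOR checksum: XOR of every byte in the packet."""
    acc = 0
    for b in data:
        acc ^= b
    return acc

good = bytes([0x01, 0x00, 0x00, 0x00, 0x10, 0x20])  # hypothetical packet
# Burst error: the same bit pattern (0xFF) is flipped in two adjacent bytes.
bad = bytes([0x01 ^ 0xFF, 0x00 ^ 0xFF, 0x00, 0x00, 0x10, 0x20])

print(xor8(good) == xor8(bad))              # the two flips cancel out in the XOR
print(zlib.crc32(good) == zlib.crc32(bad))  # CRC-32 sees the difference
```

Any error that flips the same bit positions an even number of times is invisible to XOR, which is exactly what noisy bursts tend to produce.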


It was a bad thing, the way you described it. Now that I know you also fixed the root cause of the problem, I can see it as an additional line of defense.

I think I wouldn't add it, though. Time is better spent on tooling that checks the variable doesn't get an incorrect value.


Like you said, an additional line of defense. For a web application there's no need. For a flight critical application where a failure means you just lost a few $100k worth of hardware, then I'll take every measure possible.


But that's not quite the same - older versions of C didn't have a real Boolean type so you could well run into those kinds of problems.

In a language that does have a proper Boolean type I still think checking equality with literal true/false values is a bit silly.


Typically that isn't done from ignorance, just some (imho misguided) idea about readability.


Probably the same kind of place that bans use of the ternary conditional operator because it's "too complicated" <sigh>


Eh, I can't say that I have any love for the ternary if operator. It is an ugly construct made necessary by the other flow control constructs being statements. Scala lets you just use if/else like that, the result is much clearer at the cost of 4 or so characters.


Guilty as charged. Although it's practically the same, I prefer the TRUE part. Same with the !aBoolean and FALSE.


I saw this in some coding standard for PHP.


Where it actually makes sense (assuming === and not == is used for comparison).


eh, that's not so bad really.

If I were writing something, and I really wanted to make it clear that I was testing a boolean to be true, I might write that. Then I can be absolutely certain the person reading it in 5 years won't misread it.

Actually, the more I think about it, the more I'd be inclined to do that in the false case ( ie if(aBoolean == false)... rather than if(!aBoolean).. ) because I worry that it's too easy to skip over the '!' - and I personally read that as "not aBoolean" rather than "aBoolean is false"...

I'll bet it compiles to the same thing anyway, so it's just about readability at this point.


Why stop there? To really get the point across, you may want to do:

if (((((a == true) == true) == true) == true) == true)

That way, you can be even more certain!


People have different preferences when it comes to coding styles. There are some that are subjective (balance between clarity vs. simplicity), and there are idiotic styles such as !Boolean.FALSE.equals(aBoolean), or if (((((a == true) == true) == true) == true) == true).


The point is that using (a == true) instead of a is idiotic. Now, there are good semantic reasons to use (a == true) in some languages, but we're not talking about that, we're talking purely about style. It's no different from doing (a && true) or (i + 0) or (f * 1.0) when a, i and f would suffice. Not only is it longer, it forces you to stop and think, was there a good reason why this was being done? Can a hold truthy values that the programmer's trying to exclude? Otherwise, it's a sign of a programmer that doesn't know what he's doing - I've spent a decent amount of time teaching and grading and you see this a lot in first-year school assignments, rarely in high-quality production code. In almost all cases, it's a holdover from someone going in their head "if a is true" (because "if a" doesn't read well in human languages) and writing that out, not some kind of conscious effort to make the code more readable.
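
As a concrete case of "good semantic reasons in some languages", Python is one where the three spellings genuinely differ once `a` is not guaranteed to be a bool. The `describe` helper is just for illustration:

```python
def describe(a):
    """Show how three common tests treat the same value."""
    return {
        "if a": bool(a),         # truthiness: any nonzero / non-empty value
        "a == True": a == True,  # equality: also true for the integer 1
        "a is True": a is True,  # identity: only the bool object True
    }

for value in (True, 1, 2, "yes"):
    print(repr(value), describe(value))
```

When the variable really is always a bool, all three agree, and the comparison adds nothing but characters.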


> In almost all cases, it's a holdover from someone going in their head "if a is true" (because "if a" doesn't read well in human languages) and writing that out,

It's interesting you admit that people are doing it "the long way" because that's how it naturally flows out of their head.

Doesn't it make sense that it would naturally flow into their head the same way?

What's the goal here - write very tight, concise code that fits some arbitrary standard of "correct"?

Or to write code that flows out of and into people's heads easily?


I was describing first-year students who are reading code as though it's written in a foreign language, not anyone with some degree of competence. For obvious reasons, professionals should not be internally verbalizing as they write or read code, any more than a fluent French speaker should be internally translating from English as they speak French.


Potentially, I can construct an independent statement to adequately convey my innervation to your statement of factual basis, that while being grammatically correct, purposefully adds needless complexity to the original statement of intent I am adequately attempting to convey.

Should I?


What are you talking about?

If it's somehow clearer to say if (x == true) instead of if (x), why isn't it also clearer to say if ((x == true) == true) instead of merely if (x == true)? And so on?


Huh. You sound just like me 5-10 years ago, everything black and white.

There is a grey zone in the middle, move in there.


There's grey zone and there's being obtuse. We're not talking about fancy tricks here - the clearest way to retrieve the truth value of a boolean variable is to refer to it directly, not comparing it for equality against a constant.


That might be the clearest way for you to write it today, though I'm talking about it being clear for someone else to read many years from now.

As so many posts on HN have said lately; writing code is easy, it's the reading that's difficult.


How is if (x == true) easier to read than if (x)? Deliberate obtuseness makes things harder to read.


Read my original comment. It would very much depend on context, and, as I said, I'd be more inclined to do it in the false case.

It's funny, you still seem to think there is a "right and wrong" here, and you can't see that coding style is just like writing a poem - each individual will do things a little differently.


What context does it depend on? Where is it reasonable to say (x == true) instead of x?

Edit: your original post said "If I were writing something, and I really wanted to make it clear that I was testing a boolean to be true, I might write that." What does this even mean? Simply writing x isn't really really clear, so you write (x == true) to make it extra clear? if (x) is testing boolean to be true and it's crystal clear. What else can it mean?


I personally think it depends on the complexity of the current function, and the complexity of the variable x. i.e. if x has been read and written to a bunch of times in the function already, it might add clarity to be very clear that you mean "if x is false".

The clearest way I can think of to write that, that's hopefully less prone to misinterpretation is if(x == false) rather than if(!x)

Like I said, these are just my personal opinion and an expression of how I code (and likely things that I find difficult or often misread when reading the code of others)


I've seen undergrads trying to write Ada:

  case x is
    when true => y;
    when others => z;
  end case;
You know, so the code doesn't stop compiling when someone comes along and adds values to the Boolean type.


Well, FileNotFound is a popular addition from what I hear.


Came across this one yesterday, slightly paraphrased:

  boolean updateMsg(boolean pPassed) {
    if (pPassed) {
      incrMsgCounter();
      logger.info("Message sent");
    } else {
      logger.info("Failed to send");
      return false;
    }
    return true;
  }
Yes, the calling code actually checked the return value. The code is full of stuff like this. Somehow I've got a morbid fascination and can't stop marvelling at how grotesque it is. It even overpowers the urge to read HN.


What's so bad about this one?

Other than its returning a boolean being pointless, I mean?


As 1stamour suggested the whole function was pointless. Not a big deal by itself but if you have layers upon layers of this stuff, code quickly becomes a buggy, unmaintainable mess. There are some awesome static code analysis tools like PMD and PHPMD for measuring code quality. It's not just "all programmers think everyone else's code is crap".


The code doesn't return just any boolean. No, it returns the boolean it was passed. And the calling code checks the result, what, against the boolean it passed in, I wonder? I'd worry about unintended side-effects depending on the language...


Somewhat unrelated, but out of curiosity, does anyone know of a site that lists — for all popular languages — various libraries/code snippets/routines which one can use to correctly (according to the RFCs) check the validity of e-mail addresses?

If not I may be compelled to create one.


The HTML5 standard actually defines a mostly-sane regex, which achieves that sanity through "a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here."

I like it.

Link: http://www.whatwg.org/specs/web-apps/current-work/multipage/...


The RFC-compliant email address is actually incredibly complex. There are loads of things that are RFC-valid email addresses that basically hardly anyone uses (like spaces).


Just send an email and see if it arrives.


/.@./ is pretty much the best you can do.

And even the dots are a bit iffy.


@"^[^@]{1,}[@]{1}[^@]{1,}\.{1}[^@]{1,}$"

I check it with this RegEx to make sure it's in the right format of [string]@[string].[string] to make sure that user at least tries to enter an email address, but beyond that sending an email and getting a response is the only way.


> @"^[^@]{1,}[@]{1}[^@]{1,}\.{1}[^@]{1,}$"

Is not

> [string]@[string].[string]

It's merely ^.+@.+\..+ (in both POSIX extended and PCRE dialects) that properly represents the latter.

And even such regexp filters out many technically-valid but obscure cases of RFC-compliant email addresses (for example ai, io, kh and ws TLDs have MX records, and supposedly hostmaster@io should be a valid email address).
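
The difference between the two patterns is easy to see by running them side by side. A quick Python check, with the first pattern copied from the parent comment minus what looks like a C# verbatim-string wrapper, and sample addresses of my own choosing:

```python
import re

# Pattern from the parent comment (without the @"..." string wrapper).
POSTED = re.compile(r"^[^@]{1,}[@]{1}[^@]{1,}\.{1}[^@]{1,}$")
# The looser [string]@[string].[string] pattern from the reply.
LOOSE = re.compile(r"^.+@.+\..+")

SAMPLES = ["user@example.com", "a@b@c.d", "hostmaster@io", "no-at-sign.example"]

for s in SAMPLES:
    print(s, bool(POSTED.search(s)), bool(LOOSE.search(s)))
```

The posted pattern rejects anything with a second "@", which the loose one accepts, and both reject the dotless `hostmaster@io` case mentioned above.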


Check out this incredibly complex rfc822 regex in action: http://regex101.com/r/fZ6cD5


There isn't. Look up the regex for validating an email address if you'd like to know why.


People who write similar code are what I like to call low-hanging fruit factories. I'm counting on them to make things as rubbish as possible, as long as they're not involved in any service I use, so anyone tempted to find vulnerabilities will go there first.


I'm sad to see so many smart people wasting their time discussing what some stupid person did. :(

I'm also disappointed I lost a couple of minutes of my life reading about this stupidity as well... just because it got 233 points.

So, I'm looking at YOU 233 who upvoted this. WHY DID YOU DO IT?


As expected. :)

Just click the downvote button on my comment. Don't bother to explain anything.

Oh, I know, you disapprove I called you smart. Sorry about that.


I've been on HN for a while (lurking), never seen a downvote button but hear people referencing them on occasion. Why do I not see downvote buttons? Or is this some inside joke?


You need 500 karma to get a downvote button.

"Why don't I see down arrows?"

http://ycombinator.com/newsfaq.html


Not all users can downvote things. I'm not sure if it's directly tied to how much karma you have, but it's a measure to give more weight to the opinions of the more respected members of the community instead of anyone who has an account. Hopefully this will mean that Hacker News never turns into Reddit.


Hate to burst people's bubble here on the privacy of email addresses, but it's routine at universities to have open, relatively unprotected LDAP directories or even web listings. That said, under Canadian PIPEDA (privacy laws), email addresses are considered personal information, so this would be a severe breach... As are all the times I get CC'd a bulk email rather than BCC'd.


"I stumbled upon this snippet while doing research for my thesis, 'Prolonged drug use and its effect on code quality.'" Lol.


Old slogan, booze bottle a day, keeps good code away.


[deleted]


From the article:

> [...] which is not actually called Hot4Learning, but the actual moniker is no less trite or gimmicky so I'll spare you


Obviously veiled reference to Desire2Learn (D2L).


This made my day. After spending the last 6 months turning down developers based on horrible code reviews, I finally feel vindicated for sticking to my instincts and hoping to eventually find someone that doesn't think code like that from TFA is "programming" and subsequently destroying my codebase and crushing my soul.


In order to truly understand the madness at play here, we really need to know how the list of email addresses was generated.


This would have been so simple to just make right in the beginning. They should have just hashed each of those user names and put the hash into a dictionary. Whenever someone entered an email check if it is in the dictionary. O(1) time and it wouldn't be a "data leak vulnerability". So close...
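
Roughly what that could look like in Python, assuming SHA-256 and a couple of placeholder addresses (the real list and any normalisation rules are unknown to me):

```python
import hashlib

def digest(address):
    """SHA-256 of the lowercased, stripped address, as hex."""
    return hashlib.sha256(address.strip().lower().encode()).hexdigest()

# Hypothetical addresses standing in for the real list.
KNOWN = ["alice@example.edu", "bob@example.edu"]
KNOWN_DIGESTS = frozenset(digest(a) for a in KNOWN)

def is_known(address):
    """Average-case O(1) membership test against the hashed set."""
    return digest(address) in KNOWN_DIGESTS

print(is_known("alice@example.edu"))
print(is_known("mallory@example.edu"))
```

One caveat: if the hashed set is shipped to the browser, unsalted hashes of guessable usernames can still be enumerated offline, so this shrinks the leak rather than eliminating it. Doing the lookup on the server avoids the question entirely.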


Check out our Eduware startup Coursio http://coursio.com/ for hassle-free education!

I vouch for the code, wrote it myself with some really good advisors around! ;-)

Get in touch with me personally, I'll give you a brief introduction!


Unrelated fact: Desire2Learn likes to employ large quantities of first year CS student interns.


Gee, what a newb. Here's how to do it in O(1):

    return userNamesStr.indexOf(curForwardUserName) >= 0
;)

(I'd love to not have to explain sarcasm, but people have an incredibly difficult time understanding it here.)
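
For anyone tempted to take the joke literally, the substring search is also wrong, not just slow. A Python equivalent, assuming the names live in one comma-separated string (the sample names are invented):

```python
# Hypothetical stand-in for the big string of user names.
user_names_str = "joann.smith,bob.jones,carol.lee"

# Built once; membership tests against it are O(1) on average.
USER_NAMES = frozenset(user_names_str.split(","))

def substring_check(name):
    # Python equivalent of userNamesStr.indexOf(name) >= 0; scans the whole string.
    return user_names_str.find(name) >= 0

def exact_check(name):
    return name in USER_NAMES

print(substring_check("ann.smith"), exact_check("ann.smith"))
```

`ann.smith` is not a user, but it is a substring of `joann.smith`, so the `indexOf` version happily accepts it.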


lol, I was thinking that too -- "but indexOf is faster..."


Am I the only one who thinks that the amount of flaming is kind of unproductive?


It's a common thing I see with programmers. I know I've been guilty of it before. I see some poorly coded module and I'm just disgusted by how bad it is and then proceed to say some of the nastiest things about the developer who wrote it, sometimes just aloud to myself, other times to my peers.

These days I do my best to try to remember we all make mistakes, and that instead of flaming whoever is responsible for a mistake when it rears its ugly head, it's probably best to take it as an opportunity to discuss what I saw as shortcomings in the code with them and turn it into a learning experience if possible. Sometimes of course this isn't possible when you inherit code from devs who are long gone. Either way, I'm not sure any good is done throwing all that negative energy into the air.


It seems to be part of HN culture. Putting down this code makes people feel better about themselves. Because obviously this is bad, even they can see that.

There is certainly value to the post but, yes, the flaming gets old - very quickly.


Unproductive, probably; fun to write, definitely.


For me it depends on what "validity" means.

1.) Prevent typos etc.: regex, Mailgun, or Kicksend is enough.

2.) Prevent bounces and wrong signups: one needs to do double opt-in.


The article has nothing to do with this aspect of address validation


1.) Yes, the article was about choosing the right approach for validation

2.) I urge developers to step back and think about the requirements they get. Concerning those I would have thought what validity in this situation means, e.g. light validation in JS, send validation / validation list on the server etc.


[deleted]


VRFY is usually disabled. The only way to verify an address is to send a message to it. If it doesn't bounce back, it... well, probably exists.


there is a huge market (education and higher education) where institutions (sometimes governments) will pay huge money for these sorts of apps, and for whatever reason, the only real players are often of this quality

at my university we use a similar system called OWL, which replaced a system called Web-CT. Both are horrendously slow, with fantastically poorly thought out interfaces.

There is a ton of money to be made here. Low hanging fruit.


How ironic that you mention WebCT, because that's what McGill in the article used until very recently.


this should be submitted to thedailywtf.com...


It's #1 on HN, it has plenty of exposure already.


    if input_email in valid_emails_set:
        send_email(input_email, another_param, etc)
Their solution, while not wrong, could still be improved with a somewhat modified RFC 2822 regex with a stricter domain rule. But I would also assume you could just query the db.


I don't see how regular expressions come in here. They'd have a list of valid addresses (in a Python dict or in a database or wherever) and they check against them.


You assume "in valid_emails_set" doesn't make a query.

It's fairly trivial to define your own __contains__ in python.


Indeed, since this is an internal system validating the uni's own emails it could easily query the uni's MSA for valid addresses.

Although for a set of 80k items and long-running processes (fcgi or wsgi) you could also load the whole thing in memory directly and not bother with a custom `__contains__`.
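
Both options are only a few lines. A sketch using the standard `sqlite3` module with an in-memory database; the table name, column name, class name, and sample addresses are all assumptions made for the example:

```python
import sqlite3

class DbBackedEmails:
    """Container whose `in` operator runs a query (hypothetical schema)."""

    def __init__(self, conn):
        self._conn = conn

    def __contains__(self, address):
        row = self._conn.execute(
            "SELECT 1 FROM valid_emails WHERE address = ? LIMIT 1", (address,)
        ).fetchone()
        return row is not None

# In-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE valid_emails (address TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO valid_emails VALUES (?)",
    [("alice@example.edu",), ("bob@example.edu",)],
)

# Option 1: `in` hits the database on every check.
valid_emails_set = DbBackedEmails(conn)
print("alice@example.edu" in valid_emails_set)

# Option 2: load everything once and keep it in memory.
in_memory = frozenset(r[0] for r in conn.execute("SELECT address FROM valid_emails"))
print("mallory@example.edu" in in_memory)
```

Either way the calling code stays `if input_email in valid_emails_set:`; the trade-off is freshness of the list versus a query per check.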


Dang, that programmer sure gets around! They did a stint on my team, too.


Sadly, this looks like something my boss might try to pull off...


[deleted]


The verbiage is sometimes used to exaggerate the condition of being stupid. I tend to enjoy that sort of flamboyant post on occasion, but others don't have much tolerance for it. YMMV.

[EDIT] Original poster took the "drug usage" vernacular a bit too literally.


To be fair (I didn't see the original comment), I'm not sure which drugs would result in designing such a terrible solution. I'm sure addicts around the world would not want to be painted with such a broad brush ;)


i wouldn't blame drugs... this is just stupidity, or perhaps even naivete. :)


lol


I was waiting for the part of story where she would describe how the developer had an actual drug habit or something, so much she talked about it. I wonder if she's ever really met someone who 'uses drugs', this mockery seems incredibly infantile.


> I would really like to know the combination and quantities of drugs consumed that resulted in this code. Do you know? Can you hook me up?

Meth. Two week binge. SilkRoad.

I can't imagine opiates did that.

And the only thing you'd do on coke is more coke in combination with hating yourself; not coding shit like this up.



