
Oh, my point was that the two kinds of names should not be used interchangeably. Whether or not to store the .lower() result is a matter of taste. (Personally, I would store both names, just to avoid wasting computing time.)



I think storing both names is bad, because you double the amount of data that could possibly become wrong.

With lower(), we can expect to get the same transformation of string A every time. If we instead store string A, and then store string B as a copy of A.lower(), then A.lower() will always be A.lower(), but it's much easier for someone to come along, screw with the database, and change B.


I'm not sure how they can avoid storing both.

They need to store the verbatim username in order to know how to display the username in the UI.

They need to store the canonical username in order to efficiently know whether a given canonical username is in use.
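To make the two roles concrete, here is a minimal in-memory sketch, not anyone's actual implementation. The class name `UsernameRegistry` and its methods are hypothetical, and `str.lower()` is only assumed as the canonicalisation because that is what the thread has been discussing.

```python
class UsernameRegistry:
    """Hypothetical registry: canonical name is the key, verbatim name is the value."""

    def __init__(self):
        self._by_canonical = {}

    @staticmethod
    def canonicalise(name: str) -> str:
        # Assumption: canonical form is simply the lowercased name.
        return name.lower()

    def register(self, verbatim: str) -> bool:
        """Store the verbatim name unless its canonical form is already taken."""
        key = self.canonicalise(verbatim)
        if key in self._by_canonical:
            return False
        self._by_canonical[key] = verbatim
        return True

    def display_name(self, any_spelling: str):
        """Return the verbatim spelling to show in the UI, or None if unknown."""
        return self._by_canonical.get(self.canonicalise(any_spelling))
```

The lookup by canonical key is what makes the "is this name taken?" check cheap, and the stored value is what the UI would display.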


Not necessarily, in PostgreSQL you could simply add a canonicalised index.
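A rough sketch of the idea, using Python's built-in sqlite3 module as a stand-in so it runs without a server (SQLite also supports indexes on expressions). The table, index, and function names are made up for illustration. In PostgreSQL the corresponding statement would be along the lines of `CREATE UNIQUE INDEX users_username_lower ON users (lower(username));`.

```python
import sqlite3

# In-memory database; only the verbatim username is stored as a column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT NOT NULL)")

# Unique index on the lowercased name: the database derives and maintains
# the canonical form itself, so there is no second column to drift.
conn.execute(
    "CREATE UNIQUE INDEX users_username_lower ON users (lower(username))"
)


def try_register(conn, username):
    """Insert the username; return False if its lowercased form is taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, registering "Alice" and then "ALICE" would reject the second insert, while the table still holds only the verbatim spelling.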


Well, in that case you're still storing it, you're just letting the db store it for you.

But when the issue here is the reliability of the implementation of the canonicalisation function, having it done once in Python, and then again by PG, is going to be a huge issue.
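To illustrate how two canonicalisation implementations can disagree, here is a small sketch that uses two Python built-ins, `str.lower()` and `str.casefold()`, as stand-ins for "the application's function" and "the database's function". That pairing is an assumption for the sake of the example; it says nothing about what PostgreSQL's `lower()` actually returns for a given locale.

```python
def canonicalise_app(name: str) -> str:
    # Stand-in for the canonicalisation done in application code.
    return name.lower()


def canonicalise_db(name: str) -> str:
    # Stand-in for a second, slightly different implementation elsewhere.
    return name.casefold()


def disagreements(names):
    """Return the names for which the two implementations differ."""
    return [n for n in names if canonicalise_app(n) != canonicalise_db(n)]
```

For example, "Straße" and "STRASSE" are the same name under `casefold()` but two different names under `lower()`, so a uniqueness check done in one layer would not match the check done in the other.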


Yeah, I see. I'm not a web developer, so maybe that is why I did not think of this way to break the data.

Well, I still think that it is better to have two (hopefully) correct fields in the database, rather than only one. (Consistency of the two fields can be checked once in a while).
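Such a periodic check could look roughly like this. It is only a sketch: the function name is hypothetical, the rows are assumed to arrive as (verbatim, canonical) pairs, and `str.lower()` is assumed to be the canonicalisation.

```python
def find_inconsistent(rows):
    """Return the (verbatim, canonical) pairs whose canonical field no longer
    matches the lowercased verbatim field."""
    return [
        (verbatim, canonical)
        for verbatim, canonical in rows
        if verbatim.lower() != canonical
    ]
```

Any pair it returns means one of the two fields was changed without the other, which is exactly the kind of breakage described above.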



