Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

People need to comment SQL more!

We have a few ~200 line long queries in one of our applications. With comments one of them is 460 lines.

When things get hairy, don't be afraid to explain why you needed this CTE, what it's doing from a high level, point out that little gotcha that you ran into a few times while developing the query, why you needed to do X instead of the simpler Y, etc...

While there are downsides to excessive commenting (the big one is that the comments can get "out of sync" with the code and just cause confusion), when it comes to big/complicated SQL, I've found that more is better for the most part.



You can say that again! I started a DB review team at my company. Not only queries, but Postgres allows for tables to be commented on, you can place a comment on table, columns, constraints, domain, triggers, view, pretty much anything.

Initially a few folks rolled their eyes at that, but now everyone loves it. When you have thousands of tables, comments help, likewise when you have massive amount of queries. I'm also in the camp of let the DB do the work if it can do it faster, and likewise have 100-400 line queries that will turn into 5x LOC programs which take 20x time to execute. Writing tons of comments is important. Usually the query is right and when it works, no one remembers it for a year or two till business rules change.

My approach to commenting such queries after modification is to delete every single comment and start afresh. From memory, when I'm done. If I can't then I have no idea what the query was doing which is very bad. Sometimes folks modify query and get a happy result. If I can document it without reference to old one, great! After that, I look at the old one to make sure that I'm not missing anything.


I tend to take the "rubber duck debugging" approach to writing comments for big hairy SQL statements.

After i'm done (I tend not to comment these behemoths until at or near the end), I go and explain what the query is doing, sometimes almost line-by-line, to a fake "rubber duck" as if it were a junior developer. Even sometimes commenting what I didn't do, and why I didn't do it. Because as it turns out, future me is a bit of a cocky asshole that always thinks past-me was some kind of idiot that must have never thought to try that before...


I can totally relate to the future me being a cocky asshole part. I just had this experience coding the other day. I got about 15 minutes in thinking "this will be way better", git mv'ing files and committing etc until I ran into the exact same roadblock and it all came back to me. Now there is a sheepish trail of commits with messages explaining why I was wrong and it wasn't way better. I could have force pushed the commit before I started the changes but I wanted to shame myself so maybe I'll remember next time...


OP here, couldn't agree more regarding commenting[1]. Far too often people don't treat their SQL queries or database the same way they do code. Giving SQL the same attention as you do code with good formatting and comments can make life much easier the next time someone else or even yourself has to come back to some query years later.

[1]. http://www.craigkerstiens.com/2013/07/29/documenting-your-po...


> comments can get "out of sync" with the code and just cause confusion

I generally find it a safe assumption that were the comments don't agree with the code both are wrong, or will be next time someone makes a related change (as sod's law says they'll chose to trust the one that is least correct). Even when all the actors involved are me at different times.


I feel like for queries that are hard to understand at a glance, a picture is worth a thousand words. For the majority of people, myself included, writing helpful, high quality comments is really hard.

Instead, for a CTE, I would rather add a sample of initial data, and how they are transformed by the first two or so iterations of the loop.


> We have a few ~200 line long queries in one of our applications. With comments one of them is 460 lines.

You can decompose large queries into smaller views/temp tables, and such queries will be much more readable..


It depends how reused those subcomponents are, sometimes separating portions of the logic into views can obscure what's going on and end up increasing the LoC you need to parse to grok where a bug may be.

It's a trade off.


With most databases, you don't need to reuse it to get the maintainability win with temporary tables. (The notable negative exception being Oracle.) Furthermore it gives you power to control the query plan - a feature that I've found very helpful with both MySQL and PostgreSQL on occasion.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: