This is good intiative. We need to simplify pedagogy.
On a related note, I always found the syntax of SQL to be great. Someone told me that simplicity of SQL made relational databases even more popular. It is one of the best designed language where even someone new to the field won't feel intimidated. I'm surprised designers of SQL haven't won the Turing Award yet.
I agree except one thing: SELECT cols FROM table.
This should really be FROM table SELECT cols as every modern SQL editor will then be able to show you a list of available columns.
See LINQ. There they had the opportunity to fix it and it works just great.
More generally, statements should appear in the same order as they are executed, as far as this is possible.
This nullifies some other common sources of confusion, such as which statements are executed before a GROUP BY and which ones after (and why HAVING exists as a keyword). It's also similar to how CTE syntax is generally much more readable than subqueries.
The order of clauses in a SQL statement should have no relation on the order of execution because SQL is a declarative query language. The query specifies what the user wants and not how the system should compute the answer.
This declarative property also provides a (somewhat weak) motivation of why SELECT should come first. Precisely because it answers part of the WHAT, i.e., the schema of the result.
> Once you type the column names, it can guess the table name easily.
If the query is simple enough that a natural join would work and you aren't doing FROM clause aliasing (which can be useful to make reusable queries self-documenting), sure.
For more complex queries and obviously any time table or column aliasing are used, that becomes somewhere between less likely and logically impossible.
I'd rather have the enjoyment of being able to think in normal English speak rather than Yoda speak just to fix one edge case that appeared decades after they came up with the syntax.
I teach SQL to financial people and, for the most part, it isn't too bad. When I get to joins, I teach a subset of SQL where all joins are done through INNER JOIN / LEFT JOIN syntax and that helps a lot. Joins are the part that everyone has trouble with.
Urban Dictionary (https://www.urbandictionary.com) or even just typing something like "what does tbh mean?" into your preferred search engine will often help you find likely interpretations of unknown abbreviations, in my experience.
> Joins are the part that everyone has trouble with.
There is significant problem-solution mismatch in joins and some other SQL constructs which are therefore semantically quite controversial in many use cases:
In Postgresql, Lateral Joins enable function chaining for set returning functions. ie you can feed the results of earlier srf function to the arguments of following function.
This gets around one of Postgresql most glaring omissions - table valued parameters. Is incredibly useful once you get used to them.
I might be misunderstanding your point, but Postgresql functions can take (and return) tables:
CREATE FUNCTION fullname(person) RETURNS TEXT AS $$
SELECT $1.firstname || ' ' || $1.lastname
$$ LANGUAGE SQL
Could you go into a more detail wrt using lateral joins with set returning functions? I'm used to using them as a way to "extend" the row with later terms depending on prior terms as you say, but I'm not comprehending the srf case so well (or tbh the results that pop up when I search for "posrgresql set returning function").
In my experience it’s relatively rare that you need them. I think of them as the for-each of joins. It evaluates row by row, not a single time. Another analogy would be a correlated subquery.
Yes,I also suggest standards committee allow the option of putting FROM clause first - this would make it easier for beginners to grok what’s going on and enable better tooling such as auto-complete.
This is gorgeous, and a great topic. No matter what field you're working in, understanding databases is extraordinarily valuable...and SQL is among the longest lasting techs I've ever learned (e.g. name another programming language that is as valuable today as it was 20 years ago, if not moreso). Even with the rise of NoSQL databases, I suspect we'll all still need to know a little SQL in another 20 years (and it was already more than 20 years old when I first encountered it in the 90s). And, even if not the same language, conceptually understanding how to query data is probably timeless.
But Java and Python? Both languages have changed beyond recognition over the past 20 years. A programmer who hasn't touched either for 20 years would probably be overwhelmed.
Another difference is that even though SQL has changed, there isn't much you need to know outside of the DBMS you are using. With Java and Python on the other hand, you need to keep up with the changing library landscape to stay relevant.
Perl. And no, that's not just a factor of its current popularity, as the language has evolved and has regular releases.
Perl has amazing backwards compatibility, but has also solidified on much better coding practices and libraries in the last 20 years. You can still run that code from 20 years ago with a new interpreter and it will be safer and faster, or you can choose a more modern style or library and it will work just as well.
And to forestall anyone trying to bring up Perl 6 as a counterargument, unless you actually know what it is and how it interacts with Perl 5 and the community, I doubt the argument is actually saying what you think...
Sure, and I still use Perl as my primary language, just as I did 20 years ago, with some forays into other languages (like Python and C) along the way. But, this is survivor bias (one could argue SQL is also benefiting from survivor bias, though SQL was the dominant data query language then and it is the dominant data query language today, whereas one could argue only C was/is dominant in its space at both points in history). But, where's Pascal, Visual Basic (or Basic of any sort), Tcl, etc.? There's dozens of other languages that came and went, including data-oriented languages. All of which were popular during the 90s, and effectively gone now. And, only C among your examples is actually exhibiting the same kind of longevity (SQL is more than 40 years old, so it predates Python, Java, Perl, etc. by quite a lot)...I probably should have put the starting point for this thought experiment at 30+ years ago, but 20 years ago is when I first started using SQL, so it's what I used, which isn't really useful. Who cares when I learned the value of SQL, when it's an industry-wide phenomenon?
My point wasn't that one can't name languages with staying power. There are several. But, SQL is still special, I think. It's almost universally needed. If you go to work in a Node or Go shop today, you're probably still gonna need to know some SQL, just like if you went to work in a Python or Perl or Java shop in the 90s.
>e.g. name another programming language that is as valuable today as it was 20 years ago, if not moreso
If we include non-Turing-complete languages as “programming languages”, which we must for SQL to get that 20-year label since SQL wasn't Turing-complete until recursive CTEs were part of the language, then XML.
Also, even without that caveat, candidates include Bash, C, C++, Objective-C, Perl, elisp, Java, JavaScript, MATLAB, VHDL, and others. Some, while still being as useful as they were 20 years ago, are down from a peak in the intervening period, but arguably that's true of SQL, as well.
To your point. I learned SQL almost 30 years ago. Along Unix, vi, Perl it is a skill that I still use often. It is one of the best learning investments that I ever made.
I'm not the guy for trigger warnings but I find the choice of data unfortunate. There is a big percentage of people I won't recommend the page because of the dataset.
Congratulations! I love this kind of interactive tutorials as their helping more and more people learning stuff without a boring book and no exercises.
That said: Doesn't work in Firefox and Edge? Common, are we back in 1998?
Sorry for the rant, but please don't do that.
Thanks Dave! I share many of the sentiments in your writeup, especially about existing tutorials. Admittedly, I have been impressed by the quality of a few of the tutorials that others have suggested here. Will let you know if I think of anything!
Slightly embarrassing -- but the problem may be that the site isn't compatible with FireFox (at least pre-v63 FireFox). The site makes use of custom HTML elements which are not supported by default in pre-v63 FireFox (https://developer.mozilla.org/en-US/docs/Web/Web_Components/...). The good news is that all you need to do is to turn on a flag as set forth in the link provided. The bad news is that many other less tech-savvy people are going to be stymied by this.
edit: the site will now warn you if your browser is incompatible.
In 1998 you had to be running Windows. Now you can run any OS, even if the browser you have to use for a certain site isn't your favorite. It's an improvement.
People have always wanted to use "the web" as an application platform. It didn't work that well in 1998, but it's pretty good now. Progress has been made.
I was confused too. It appears the page does not work on Firefox. I get a "customElements is not defined" error in the console in Firefox and none of the interactive sections render.
Building a site for Google Chrome is like building for IE ActiveX back in the day. I'm surprised this isn't the main topic discussed here.If it doesn't work in Firefox, it isn't worth supporting. We need an open web.
Nice job, this is a great way to teach something like SQL. Any plans to add more chapters? Would love to see something around window functions AND CTEs as this format would go a long way in showing when and how to use them.
I also discovered the ability to select from multiple tables and do joins using where clauses instead, which to me is a lot more intuitive than explicit/imperative join statements. I didn’t see anything covering queries like those (maybe because there’s only one table in the dataset, idk i’m on mobile).
The O'Reilly "SQL Pocket Guide" is very well written, and, IMO, one of the best tools for learning SQL. It's extremely concise, and yet packed chock full of information. It's missing some things, sure, but once you have a strong foundation, the rest you can get from RDBMS docs. The fact that this is such a short book, and not a reference guide but an actual guide to using SQL, makes this book approachable and unintimidating. I highly recommend it.
I am looking for something to recommend to a new dev, and your link is definitely preferable. While the story in the opening post is cute, I just don't care about said story.
thanks a lot. it's an awesome project. in the introduction it was stated that this project is using Github pages for hosting but i couldn't find the repository address anywhere.
One suggestion to make things even more clear than they already are with this paragraph:
>The SQL query may look like an ordinary sentence, but you should view it as three Lego blocks: SELECT FROM executions LIMIT 3. As with Lego, each block has a fixed format and the different blocks have to fit together in particular ways.*
Explain what SQL is.
---
One example for this could be:
SQL means Structured Query Language, and with SQL we can build (structured) questions (query) using a syntax (language). To find what we want in the data, much like we build structures with various Lego bricks, we can use the following statement:
SELECT * FROM executions LIMIT 3
The SQL query may look like an ordinary sentence, but viewed as three Lego blocks, each block has a fixed format and the different blocks have to fit together in particular ways.
---
Just thinking of how to distill it even further...
It’s a lovely product and you can tell a lot of work went into making this course. But I fail to understand the author’s logic behind “a good course has to be free”. Why though? What’s wrong with creating value (like The OP) and then extracting it from the people that benefit from it?
In fact, there’s tons of anecdotal evidence that free MOOCS haven’t faired well in terms of retention precisely because of a lack of a buy-in from the people taking those courses.
Imagine people in low paying jobs who dont have the means to pay - there should be stellar free resources like this for them to grasp things with a solid enough foundation to get onto more complex projects.
Outstanding work. SQL was the gateway drug to coding for me and I wish this was around. SQL is my top recommendation for people looking to become more technical, and this will likely be what I send them first.
I just finished working through it. It is well written with an interesting if macabre example dataset. The interactive bits worked well once I enabled web components in Firefox.
I find the writing style interesting, but I'm not like others at my work who could benefit from learning SQL (business managers at a bank back office who rely mostly on Excel, then get stuck when Excel crashes with large datasets).
Are these funky, quirky books generally well-received outside the programming community? I quite liked Why's Poignant Guide to Ruby, for example, but I would never recommend it to anyone at work.
I really like this. Bookmarked. I have spotted a few bugs that you may want to check into.
Search the page for the word “llanguage”.
The sum/count checker told me it was incorrect for using lowercase nulls.
I normally write all my sql lowercase for brevity, but I also mostly use sql server.
I really like what you have done here and I hesitate to report it in your show hn thread because this really is great.
I really like your explanations. I have never been able to wrap my head around joins.
You say that: "Programming is best learned by doing", but I am struggling to do anything...
It would be great if there was some online area like SQL Fiddle with the data loaded, so I can run queries and see results in my browser without having to set anything up or am I missing something.
One option is to learn on bigquery (a google cloud service) - its "pre-loaded" with all kinds of public datasets that you can just start querying, with a web interface for doing sql.
Its pretty easy to stay within the free tier (1TB of data per month)
All the blue boxes are actually interactive code editors. You can run arbitrary queries against the dataset in them. I changed some of the wording to make it clearer. Hopefully that helps!
I have picked up SQL and statements from my time in industry. I really like how this book provides not only interaction but a dataset that actually means something. Also, interactive systems like this for teaching are amazing! I learned about how "_" is a wildcard for one character (I have been making do with % for LIKE statements).
Why did you use death sentence cases as data? I've spent around 20 minutes just reading the last words of various executed people. It's... disturbing and distracting.
The reason is explained in the front matter: “On one level, the data is simply a part of a mundane programming book. On another, each row represents immense suffering, lives lost, and in some cases amazing redemption and acceptance. In preparing for this dataset, I was deeply moved by a number of the statements and found myself re-evaluting my position on capital punishment. I hope that as we examine the data, you too will contemplate the deeper issues at play.”
It's deep and moving, it makes this course more than just a course. I just fetched a few rows and read Beazley's last statement and I now have a great respect for the creator of this course.
I strongly agree. Rarely do I see such a humanistic approach in tech. It brings an important matter to our attention and is meaningful. As a side note, what is meaningful is pedagogically valuable, because what has meaning is easier to remember.
it's data that asks to be investigated. There is no reason to query meaningless data. A query that tells you something you didn't know is infinitely more useful than a query that tells you what you already knew, and a dataset that encourages you to ask your own questions and craft your own queries is infinitely more effective than, say, an imaginary book catalog or some other nonsense that doesn't at all tempt the student to ask more questions of the data.
and besides: technology is a facet of society, it is not a purely rationalist act.
This breaks the site guidelines. You unfortunately have a history of posting uncivilly to HN. If you would please review
https://news.ycombinator.com/newsguidelines.html and follow the rules when posting here, we'd appreciate it. The idea here is: if you have a substantive point to make, to make it thoughtfully; and if you don't, to not comment until you do.
Why did you choose executions over something mundane like blogs or comments? Reading last statements isn't exactly something I want to be distracted by. The data is interesting sure and if this just happened to be based on something you did recently, I get it. It's not something I'd recommend to students doing data science or getting acquainted with SQL when there's less morbid source material out there. The interactivity provided is cool though, no doubt.
I found it engaging, and a truly real world example of how what we do can impact others, provide us with insights we might not normally have, and can help us influence the world around us.
1. In The Long Tail > Nested Queries, the first example contains the function LEN, which gives an error as LENGTH is expected.
2. On the questions with multiple answers, when showing hints, I would use different font colors for correct/incorrect statements as it will be easier to check your results.
As someone who has never really touched SQL except very briefly during a degree module, which basically taught me nothing, I enjoyed this. I think it taught me some good stuff. I will need to go back through it as I didn't pick it all up.
That said, as a novice, the challenge questions are not in anyway suitable for a beginner. I wouldn't, and didn't, have any idea how to do them; or know where to look to find a starting point. The previous chapters do not prepare you for the challenges and I came away thinking I was useless until I click the solutions and realised there was no way I would realistically ever been able to solve them without some form of further training in SQL.
Is there a trend towards pure SQL lately? Not that i m complaining (quite the opposite but let me not jinx it), but it seems we might stop seeing a new ORM each week for a while.
I can only imagine how much work this must have been (and not very accurately, I think). I did not have time to look over it in great detail, but I like what I have seen.
On a related note, I always found the syntax of SQL to be great. Someone told me that simplicity of SQL made relational databases even more popular. It is one of the best designed language where even someone new to the field won't feel intimidated. I'm surprised designers of SQL haven't won the Turing Award yet.