Advanced Python Mastery (github.com/dabeaz-course)
779 points by a_bonobo on July 19, 2023 | hide | past | favorite | 155 comments



Wow. A now-CC-licensed 4-day (when presented in person) Python training course that's been iterated on for 16 years!

David wrote https://www.dabeaz.com/generators/ which remains one of my all-time favourite Python tutorials. Looking forward to digging into this.


Beazley's Concurrency From the Ground Up is one of my favorite tech talks ever: In about 45 minutes he builds an async framework using generators, while live coding in an emacs screen that shows only about 20 lines and without syntax highlighting, and not breaking stride with his commentary and engaging with the audience.

It's 8 years old, but definitely worth a watch: https://www.youtube.com/watch?v=MCs5OvhV9S4


Exactly what I thought while watching that video; it's as if he's spitting out the characters as he speaks:

"A fantastic, entertaining and highly educational talk. It always bothers me that I can't play the piano and talk at the same time (my wife usually asks me things while I'm playing). But David can even type concurrent Python code in Emacs in Allegro vivace speed and talk about it at the same time. An expert in concurrency in every sense of the word. How enviable!"


There's also the one where he live codes a WebAssembly interpreter. But my favourite is his talk on lambda calculus. It's incredibly fun to follow through.


Yes, that talk is legendary. Very impressive how he is able to both talk and write the code at the same time.


Is that the one where he modified his interpreter to provide slapstick comedy as part of the talk?


The love of programming and engineering... this is what motivated me in the first place to do CS


Wow - that was awesome - wish I had known of this vid for years - but thank you.


It is entertaining and intelligent, but you won't learn Python from it and you won't get anywhere near a production ready implementation, since it glosses over all the hard parts.


Yes! My favorite Python talk, dude is a wizard


I wish they turned off ads on this. If this is PyCon surely they get PSF sponsorship money anyway...


You can pay YouTube to remove ads and help support content creators as well.


Why? I don't watch YouTube.

Moreover, this video is from PyCon. Which is sponsored by PSF. Which is sponsored by google, meta, AWS etc. So who is pocketing the ad money? Why do I want to pay to support that person? As well as Google and its advertisers?

The logic in your comment is incoherent.


Since we are sharing resources: Fluent Python is my favorite reference on Python. It covers so many advanced features like concurrency, functools, etc. It's not the kind of book you read cover to cover; it's one that you go to as you need it. When I was working on Python stuff I would read it once a month.

My favorite introductory book (not an introduction to programming but an introduction to the language) is "Introducing Python" by Lubanovic, because it's one of the only beginner books that actually covers the Python module system with enough depth, and the second half of the book gives a quick overview of a lot of different Python libraries.


If we're talking favorite python tutorials, I am a huge fan of this tutorial on python entrypoints: https://amir.rachum.com/python-entry-points/


I'm guessing this was submitted after the "Ask HN" about leveling up to a production python programmer and i'm surprised no one mentioned these books:

1. Test-Driven Development with Python

2. Architecture Patterns with Python

The 2nd one is the closest you're gonna get to a production-grade tutorial book.

Related to this topic, these resources by @dbeazley:

Barely an Interface

https://github.com/dabeaz/blog/blob/main/2021/barely-interfa...

Now You Have Three Problems

https://github.com/dabeaz/blog/blob/main/2023/three-problems...

A Different Refactoring

https://github.com/dabeaz/blog/blob/main/2023/different-refa...

His youtube channel:

https://youtube.com/@dabeazllc


Am I the only one who is not particularly impressed by any of these links? Maybe I should see them as illustrative examples, but they would not make it through a code review.

-> Currently serving as Application Architect for a medium sized Python application.


For the purposes of discussion, it'd probably be helpful to describe the issues you would identify during review


I’ve just found one. Shortening “input” to “inp” [0] is a big no.

[0] https://github.com/dabeaz/blog/blob/main/2023/three-problems...


Guess it's pretty decent code when the point of contention is about a single variable name being abbreviated.


This was the first and most obvious problem I found upon looking for a few seconds. I don't know if this is the only issue with the code.


The symbol "input" is a Python built-in.


Good catch. I would prefer shadowing an unused built-in over using “inp”. Alternatively, “input_” should do fine.


Don't shadow. Just don't. We've got enough variable names in the universe. :)


Strong agree. To clarify, I wouldn't shadow. I wouldn't use “inp” either.


One fairly common convention is to suffix with `_` to avoid shadowing, e.g. `input_`


shadowing is what instantly would have me reject a peer review. there is literally no excuse.

inp is just standard in python, along with uin (user input), or sometimes also raw or iraw.

You cannot win this battle. There is nothing wrong with 'inp'


Please don't shadow ever :(


Man..


That’s seriously the glaring issue you’ve found? It’s a tutorial ffs. Have you actually done code reviews professionally?


I didn't say it was a glaring issue. But, unnecessary cognitive burden on the reader is unacceptable.

> Have you actually done code reviews professionally?

Yes, I have.


That's the silliest thing for you to bikeshed about.

In a serious code review, that isn't even a starter of an issue, if you have context surrounding that variable. Furthermore `input` is an actual Python function, and shortening an example for learning purposes is not the same as asking other people to do the same in production code.


haha you should see the code review comments at my <current job>. Random people spend hours bickering over variable names, while completely overlooking real issues in the code and rubberstamping their friends. It's like a friggin cult.


'written by the same author' -- I thought it was dbeazley, but it's somebody else in fact.


Fixed it. I meant that both of the books mentioned are written by Harry Percival (and the 2nd was written as a sequel).


damn, I feel really dumb trying to follow the logic of all those function compositions in the three problems post


You shouldn't. Author seems pretty tongue-in-cheek about it:

> lambda has the benefit of making the code compact and foreboding. Plus, it prevents people from trying to add meaningful names, documentation or type-hints to the thing that is about to unfold.

Disclaimer, I did not read the entire post.


Python isn't the best language for exploring this sort of thing, by the author's own admission.


David Beazley will forever have my respect for his talk where he uses Python to untangle 1.5T of C++ code on an airgapped computer, as an expert witness in a court case:

https://youtu.be/RZ4Sn-Y7AP8

It's 47 minutes and totally worth it.


I don't even write Python but after watching Beazley's concurrency talk I will watch David Beazley talks all day. Adding this to the list


I taught this course to corporate clients for three or four years before developing my own materials.

The course materials for this course and the introductory course (“Practical Python”[1]) are quite thorough, but I've always found the portfolio analysis example very hokey.

There's enormous, accessible depth to these kinds of P&L reporting examples, but the course evolves this example in a much less interesting direction. Additionally, while the conceptual and theoretical material is solid, the analytical and technical approach that the portfolio example takes quickly diverges from how we would actually solve a problem like this. (These days, attendees are very likely to have already been exposed to tools like pandas!) This requires additional instructor guidance to bridge the gap, to reconcile the pure Python and “PyData” approaches. (Of course, no other Python materials or Python instruction properly address and reconcile these two universes, and most Python materials that cover the “PyData” universe—especially those about pandas—are rife with foundational conceptual errors.)

Overall, David is an exceptional instructor, and his explanations and his written materials are top notch. He is one of the most thoughtful, most intelligent, and most engaging instructors I have ever worked with.

I understand from David that he rarely teaches this course or Practical Python to corporate audiences, instead preferring to teach courses directly to the public. (In fact, I took over a few of his active corporate clients when he transitioned away from this work, which is what led me to drafting my own curricula.) I'm not sure if he still teaches this course at all anymore.

However, I would strongly encourage folks to look into his new courses, which cover a much broader set of topics (and are not Python-specific)! [2]

Also, if you do happen to be a Python programmer, be sure to check out his most recent book,“Python Distilled”[3]!

[1] https://dabeaz-course.github.io/practical-python/

[2] https://www.dabeaz.com/courses.html

[3] https://www.amazon.com/Python-Essential-Reference-Developers...


> https://www.dabeaz.com/courses.html

Well, unfortunately, the 5-day courses listed there are $1,500 each.

If the free-of-charge course discussed here is really that good, it's a nice promo to go and pay for another. Ed-tech lo-fi style.


Something I hate in my own code is this pattern of instantiating an empty list and then appending to it in a loop when reading files. Is there a better way than starting with lst = [] and then later doing lst.append()?

This is an example from the linked course https://github.com/dabeaz-course/python-mastery/blob/main/Ex...:

    # readport.py
    import csv

    # A function that reads a file into a list of dicts
    def read_portfolio(filename):
        portfolio = []
        with open(filename) as f:
            rows = csv.reader(f)
            headers = next(rows)
            for row in rows:
                record = {
                    'name' : row[0],
                    'shares' : int(row[1]),
                    'price' : float(row[2])
                }
                portfolio.append(record)
        return portfolio


Whenever you see this pattern, think of using a generator instead:

    def read_portfolio(filename):
        with open(filename) as f:
            rows = csv.reader(f)
            headers = next(rows)
            for row in rows:
                yield {
                    'name' : row[0],
                    'shares' : int(row[1]),
                    'price' : float(row[2]),
                }
Now you can call read_portfolio() to get an iterable that lazily reads the file and yields dicts:

    portfolio = read_portfolio(filename)
    for record in portfolio:
        print('{shares} shares of {name} at ${price}'.format_map(record))


or use the built-in csv.DictReader :)
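A rough sketch of what that might look like, assuming the header row of the file actually contains the column names name, shares and price (as in the course's portfolio data); the numeric conversions still have to be done by hand:

    import csv

    def read_portfolio(filename):
        with open(filename) as f:
            # DictReader uses the header row as the dict keys
            for row in csv.DictReader(f):
                row['shares'] = int(row['shares'])
                row['price'] = float(row['price'])
                yield row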


It's not better than a generator, but I'm surprised nobody has mentioned the very terse and still mostly readable

    header, *records = [row.strip().split(',') for row in open(filename).readlines()]
but then you need a way to parse the records, which could be Template() from the string library or something like...

    type_record = lambda r : (r[0], int(r[1]), float(r[2]))
At this point, the two no longer mesh well, unless you would be able to unpack into a function/generator/lambda rather than into a variable. (I don't know but my naive attempts and quick SO search were unfruitful.) Also, you're potentially giving up benefits of the CSV reader. Plus, as others have clarified, brevity does not equal readability or relative lack of bugs:

In the course example, it's reasonably easy to add some try blocks/error handling/default values while assigning records, giving you the chance to salvage valid rows without affecting speed or readability. In fact, error handling would be a necessity if that CSV file is externally accessible. Contrast that with my two lines, where there's not an elegant way to handle a bad row or escaped comma or missing file or virtually any other surprise.

Anything else I can think of off-hand (defaultdict, UserList, "if not portfolio:") has the same initialization step, endures some performance degradation, is more fragile, and/or is needlessly unreadable, like this lump of coal:

    portfolio = [record] if 'portfolio' not in globals() else portfolio + [record]
So... your technique and generators. Those are safe-ish, readable, relatively concise, etc.


> It's not better than a generator, but I'm surprised nobody has mentioned the very terse and still mostly readable

> header, *records = [row.strip().split(',') for row in open(filename).readlines()]

Better would be:

    header, *records = [row.strip().split(',') for row in open(filename)]
No need to read the lines all into memory first.

Edit: Also if you want to be explicit with the file closing, you could do something like:

    with open(filename) as infile:
        header, *records = [row.strip().split(',') for row in infile]
That is if we wanted to protect against future changes to semantics for garbage collection/reference counting. I always do this, but I kind of doubt it will ever really matter in any code I write.


> No need to read the lines all into memory first.

It looks like that code does read the whole file:

(with a foo.csv that is 350955 bytes long:)

  % python -V
  Python 3.11.4
  % python
  >>> f = open("foo.csv")
  >>> f.tell()
  0
  >>> header, *records = [row.strip().split(',') for row in f]
  >>> f.tell()
  350955
I thought that using a list comprehension to bind header and records was eagerly consuming the file, so I changed it to a generator comprehension with

  >>> f.close()
  >>> f.open("foo.csv")
  >>> header, *records = (row.strip().split(',') for row in f)
  >>> f.tell()
  350955
nope, I guess the destructuring bind does it?

  >>> f.close()
  >>> f.open("foo.csv")
  >>> headers, records = f.readline().strip().split(','), (row.strip().split(',') for row in f)
  >>> f.tell()
  125
not as neat, though. Is there a golf-ier way to do it?


The parent poster was pointing out that this requires having two in-memory complete copies of the file:

    [... for row in open(filename).readlines()]
The readlines return value is one copy, and the list comprehension is another copy. However, that first copy can be avoided with:

    [... for row in open(filename)]
The entire file must still be read to evaluate the list comprehension.

Additionally, this doesn't do what you think it does:

    >>> header, *records = (row.strip().split(',') for row in f)
Compare to this, using a variable for clarity:

    >>> gen = (row.strip().split(',') for row in f)
    >>> header, *records = next(gen)


    def read_portfolio(filename):
        record = lambda r: {
            'name': r[0],
            'shares': int(r[1]),
            'price': float(r[2]),
        }
        with open(filename) as f:
            rows = csv.reader(f)
            headers = next(rows)
            return [record(r) for r in rows]


Swap the square brackets for parentheses in the return statement and it will return a generator expression.

That will read the file as needed (ie as you iterate over it) instead of loading the entire thing in memory.

    for record in read_portfolio(fn):
        # do stuff


If you do that, it will try to read from a closed file.


Or even:

  def read_portfolio(filename):
      with open(filename) as f:
          rows = csv.reader(f)
          headers = next(rows)
          return [
              {
                  "name": r[0],
                  "shares": int(r[1]),
                  "price": float(r[2]),
              }
              for r in rows
          ]


I don't think you can really improve on this.

You could use a list comprehension, but that can be unclear and hard to extend, depending on the situation. It can be a nice option if most of the parts in the generator can be broken out into functions with their own name, though.

You could turn it into a generator, which can cause some fun bugs (e.g. everything works fine when you first iterate over it, but not afterwards), so IMO that's best used when it needs to be a generator, for semantics or performance.
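A tiny illustration of that gotcha (my own example, not from the course): a generator can only be consumed once, so a second pass over it is silently empty:

    def read_rows():
        yield {'name': 'AA', 'shares': 100}
        yield {'name': 'IBM', 'shares': 50}

    rows = read_rows()
    print(list(rows))  # [{'name': 'AA', 'shares': 100}, {'name': 'IBM', 'shares': 50}]
    print(list(rows))  # [] -- already exhausted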

You could turn it into a generator, then add a wrapper that turns it into a list (keeping the inner function private), or use a decorator that does the same, but it's less clear than this pattern.

So, i'd just learn to live with it.


Yeah I think using a list comprehension is overkill. The main reason I like list comprehensions is that I don't introduce variables (even temporarily) that I don't really need. I think that clarifies the code. But putting the code in a separate function also avoids introducing those variables to the current scope, at the cost of putting the code somewhere else (which I personally think has a cost). In this case I would just use a function or (probably) just inline it in the way you don't like.


> which I personally think has a cost

Yeah, so many people don't get this, but too many small functions can be hard to understand -- that's why I qualified that option.

In this case i agree that inlining it is fine, i was talking about the general pattern.


There is always list comprehension.

So `portfolio = [{'name': row[0], 'shares': int(row[1]), 'price': float(row[2])} for row in rows]`

But if it's more complicated than this (like if there is conditional(s) inside the loop), I'd recommend just stick with the current approach. It's possible to have even multiple conditionals in list comprehension, but it's not really very readable. If you do want to, walrus operator can make things better

(something like `numbers = [m[1] for s in array if (m := re.search(r'^.*(\d+).*$', s))]`)


They can be more readable than that at least, e.g.:

    keys = "name", "shares", "price"
    portfolio = [
        dict(zip(keys, row))
        for row in rows
    ]
If I had to do more complex stuff than building a dict like this I'd move it into a function. That tends to make the purpose more clear anyway.

That said, it's fine to append to a list too, I just prefer comprehensions when they fit the job. In particular, if you're just going to iterate once over this list anyway, you can turn it into a generator expression by replacing [] with () and save some memory.
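For instance, a sketch of that last suggestion, reusing the `keys` and `rows` names from the snippet above:

    keys = "name", "shares", "price"
    portfolio = (dict(zip(keys, row)) for row in rows)  # generator expression: records are built lazily
    for record in portfolio:
        print(record)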


You can alternately stick the logic into a function, which maintains the readability.

    def get_record(row):
        return {
                'name': row[0],
                'shares': int(row[1]),
                'price': float(row[2])
        }
    return [ get_record(r) for r in rows ]
or

    return list(map(get_record, rows))


I do like to use ad hoc functions to make things cleaner (mainly for the handy "early" return behavior), but in this case I don't find it's much better than "just create an empty list first and do a for loop".


Personally, I enjoy the pattern of making the return type a dataclass and making this function a static method on the dataclass, something like `def from_data(data: Dict) -> PortfolioRow`.
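A rough sketch of that pattern, with illustrative names (PortfolioRow and from_data are not from the course):

    from dataclasses import dataclass

    @dataclass
    class PortfolioRow:
        name: str
        shares: int
        price: float

        @staticmethod
        def from_data(data: dict) -> "PortfolioRow":
            # data is one parsed record, e.g. {'name': 'AA', 'shares': '100', 'price': '32.2'}
            return PortfolioRow(
                name=data['name'],
                shares=int(data['shares']),
                price=float(data['price']),
            )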

In 2023, it's rude to return dictionaries. :)


> In 2023, it's rude to return dictionaries. :)

Why do you say that?

I think you were partially kidding, but also half serious. What’s the issue with returning dictionaries, and why should we be returning dataclasses instead?

Asking for my own learning.


Data classes are "self-documenting" with respect to the "keys" you can expect to be present. Relatedly, they enable meaningful type hints:

  def some_method(input: Dict[str, int]):
    ...
You can just as easily call `some_method({"foo": 1})` as `some_method({"bar": 2})`.

vs.

  def some_method(input: MyDataClassWithFoo):
    ...
Now you can't pass a dict where the key is "bar" and presumably get a KeyError when it tries to look up a "foo" key, you can only pass a MyDataClassWithFoo.
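Where MyDataClassWithFoo might be nothing more than (illustrative only):

    from dataclasses import dataclass

    @dataclass
    class MyDataClassWithFoo:
        foo: int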


Regarding the 2023 part, it's because dataclasses in Python are pretty fully-featured and part of the stdlib, so building rich data structures is quick, ergonomic, and helps remove/centralize boilerplate and parsing (aka validating, see https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...).

Regarding rude, yea it was a bit tongue-in-cheek (I would never hold a grudge against somebody for returning a dict).

You should generally define an interface for your function that's as precise as its logic allows for. `Dict` is as good as `Any`: it doesn't tell you very much about what the function internally expects. The sibling comment to mine does a good job going through this.


Sigh (re: sibling comments) whatever happened to PEP 20 in particular:

> There should be one-- and preferably only one --obvious way to do it.


This is absolutely the most misinterpreted line in the Zen of Python. The key word there is obvious, not one.


    with open(filename) as f:
        rows = csv.reader(f)
        next(rows)
        return [
            {
                'name': row[0],
                'shares': int(row[1]),
                'price': float(row[2]),
            } for row in rows
        ]


Why don't you like it? I am asking because you are asking for a better way. Better in what way?


You could “yield” the record instead of constructing the list. This makes “read_portfolio” into an iterator instead of returning a list. Use a list comprehension or list constructor to convert the iterator to a list if needed.
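For example, given a generator version of read_portfolio like the one shown elsewhere in the thread (the filename here is hypothetical):

    # one lazy pass over the file
    for record in read_portfolio('portfolio.csv'):
        print(record)

    # or materialize it when a real list is needed
    portfolio = list(read_portfolio('portfolio.csv'))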


You could make read_portfolio a generator ( https://wiki.python.org/moin/Generators ). But that might confuse inexperienced Python programmers.

Personally the way you've done it is the most Pythonic IMO. List comprehensions are great but would be less readable in this case.


If there is no library for your case like pandas (or even csv.DictReader), you could always use an iterator:

    def iter_portfolio(rows):
         for row in rows:
             yield {'name': row[0]}
    
    rows = ...
    portfolio = list(iter_portfolio(rows))


I too have run into this situation, and while list comprehension makes it possible, it's never clean looking.

Honestly, this is the approach I've been using even though I hate it. Especially if your code is going to be read by anyone other than you.


I actually kinda like that pattern in terms of readability, though I think a generator would outperform it.


If it’s a lot of data and part of a pipeline you’ll get memory saving. 1,000 lines and reading from a CSV it won’t matter really.


You could try to squeeze it all into a list comprehension.


I have also encountered this quite often. I'll say the ideal solution would be "postfix streaming methods" like `.filter` and `.map`. Unfortunately, Python doesn't have those (prefix `filter`s and `map`s are not even close), and you have comprehension expressions at best. To make things worse, complex comprehensions can also create confusion, so for your particular example I'll probably say it's acceptable. It could be better if you use unpacking instead of indexing though, as others have pointed out.
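A small sketch of the "unpacking instead of indexing" suggestion, reusing the rows iterable from the course snippet and assuming each row has exactly three fields:

    portfolio = [
        {'name': name, 'shares': int(shares), 'price': float(price)}
        for name, shares, price in rows
    ]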


Another good (and entertaining) resource is James Powell's talk "So you want to be a Python expert" [1], the best explanation I've seen of decorators, generators and context managers. Good intro to the Python data (object) model too.

[1] https://youtu.be/cKPlPJyQrt4


Let's assemble an 'expert python' curated list.


Yep, just recently rewatched this. James is an extremely clear presenter. I recommend this to everyone trying to get to that next level with Python.


David Beazley, also known as the Jimi Hendrix of Python


I saw his talk live in 2014 and the dude is amazing. I loved his summary of building Python libraries from the ground up during legal discovery: he discovered a hidden Python installation on the terminal his opponents gave him, which allowed him to parse thousands of documents very quickly.

https://youtube.com/watch?v=RZ4Sn-Y7AP8


This is very cool, good for Beazley for making this freely available. I really should take the time to work through this material. For 40 years I have been a “Lisp guy”, slightly looking down on other languages I sometimes used at work like C++, Java, etc.

However, because of available ML/DL/LLM frameworks and libraries in Python, Python has been my go to language for years now. BTW, I love the other comment here that Beazley is the Jimi Hendrix of Python. Only those of us who enjoyed hearing Hendrix live really can get this.


You've listened to Hendrix live? Well now I'm envious ;)


Well, it was just the one time.


The author of this course is also one of the original authors of SWIG, which got discussed here just yesterday. Two HN front pages in two days!

https://news.ycombinator.com/item?id=36769912

https://www.swig.org/guilty.html


People should not take that as an endorsement of SWIG.

Please use ctypes, cffi or https://github.com/wjakob/nanobind

Beazley himself is amazed that it (Swig) is still in use.


It's probably not the case this time, but a lot of the time someone will browse around after clicking on a link here and find another interesting thing worth posting. Love that about hn.


Also, the Python content of Fred Baptiste, a 4-part course on Udemy, is a gold mine. Deeper and more detailed than I ever imagined :-)


I've attended two week-long courses of the author, David Beazley, and they were both amazing. Highly recommended.


Beazley is a mad man, so much fantastic stuff from him. My favorite is probably Ply and Sly as CFG parsers. I didn't even think to check for books he wrote but now I have to go down the rabbit hole.


It was for Python 2, and the book was published many years ago, but Beazley's Python Library Reference book (if I remember the name correctly) was one of the best software reference books that I bought and read.

I googled, and this seems to be the current edition:

https://www.amazon.in/Python-Essential-Reference-Essentia-De...


    # pcost.py
    total_cost = 0.0
    with open('../../Data/portfolio.dat', 'r') as f:
        for line in f:
            fields = line.split()
            nshares = int(fields[1])
            price = float(fields[2])
            total_cost = total_cost + nshares * price
    print(total_cost)

yikes what a terrible reference implementation! Least they could do is reduce([...], +) as a two-liner
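For what it's worth, a minimal sketch of the kind of two-liner the comment has in mind, using sum over a generator expression (functools.reduce with operator.add would be equivalent):

    with open('../../Data/portfolio.dat') as f:
        total_cost = sum(int(fields[1]) * float(fields[2]) for fields in map(str.split, f))
    print(total_cost)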


https://www.artima.com/weblogs/viewpost.jsp?thread=98196

map/reduce (for better or worse) get a bad rap. Some blog or training I took when first starting out with the language told me 'map/reduce Bad' and I have generally avoided them ever since.


Beazley is one of the masters and his other books are great. My nitpick here is that, as in any other language, the basics are there forever and the advanced features/techniques get old and replaced every now and then.


This is very good.

David is also doing online immersive courses - 1 week long, but I believe he also splits them into one-day sessions now. Highly recommended!

He has a real talent for explaining complicated concepts in a very simple and approachable way.


I have greatly appreciated every talk of Dave's that I've seen. Naturally, I'm quite thrilled to see what he's got for us here. Thanks.


Is there a C version of something like this?


C Unleashed by Heathfield et al. and Modern C by Jens Gustedt are two I am aware of. Oh, and also Expert C Programming by Peter van der Linden.


Just skimming through it, and wow I'm impressed! I wish people had something like this for other languages!


Python is a tool for people from diverse backgrounds to work together with code. "Advanced mastery" works against this goal by elevating your level beyond what others can read and understand.

But then again, I don't use list comprehensions, because I don't comprehend them, so what do I know.


Just because you can't read or understand masterful code, does not mean you should deny someone else's mastery of Python.

Are the "people from diverse backgrounds" (whatever that means) denied any of their rights and privileges by a programmer's use of "advanced" code? No. Quite the contrary: less-experienced people may still read, learn from and improve themselves by the advanced/beyond-their-level code they encounter. So what's your real issue?

Code is written to solve problems; not to please the lowest common denominator. If you can't read code, you're that denominator, and that's on you.

I find your statement terribly odd too. A true Python master tends to write highly readable code; idiomatic and Pythonic code tends to be readable, unlike in other languages. So if you can't read advanced Python (or even use list comprehensions, which are basic in the Python scheme of things), you're not qualified to opine on what "advanced mastery" of Python entails.


> Python is a tool for people from diverse backgrounds to work together with code.

Where did you get that idea? Python is a programming language. That some find it more accessible than others is orthogonal to the work needed to master it.


So people should suck at writing Python code for the sake of teamwork? What?


Mastering your tools is step one of being a professional.


Ah, the lowest-common-denominator argument for why your code is crap. That is the reason that Python code has become what PHP was 10 years ago: The mark of projects to avoid at all cost.


That's not nice. Everyone is on their own journey, their own learning curve, to a level experience needed to fulfill their own ambition. Rather than saying, "your code is crap" wouldn't it be more productive to encourage this person to challenge themselves to get outside their comfort zone?


That person explicitly dismissed "advanced mastery", and thus challenging and bettering oneself, as being non-inclusive. I find this attitude highly offensive, because my motivation towards mastery is not of that kind. I'm happy to include people, explain my code to them, and help them along towards mastery.

Software also has an ongoing quality crisis, all while being more and more influential in people's lives. That person's attitude towards writing quality software helps to deepen that crisis and is therefore harmful.


> Everyone is on their own journey

Not everyone is trying to be nice either.


This is a perfectly sane argument, at least from a business point of view. It is being treated with a tantrum of downvotes. That in turn shows a lot about the state of affairs in a certain bunch of programmers. Most importantly that they put their own narrowly defined version of excellence above the importance of maintainability and thus tangible business value.

To put it bluntly. I'd hire the downvoted guy/gal in a heartbeat but would shy away from the downvoters. Why? Because I need to deliver business value which pays for our salaries. And I need to do it today and tomorrow and in years from now.

This is a message to normal people that understand that coding is a _social_ activity that has an audience in the present (your coworkers) and in the future (poor maintainers). Not only you're not alone in this but you are the majority.


I think downvoting that comment is stupid, but I also think the comment is wrong: Mastery does not imply incomprehensibility, but rather the opposite.


I have several decades of experience reading and using other people's code. When code is written at "master" level, that excludes most of the people who could generate business/social value out of using or improving it.

I appreciate the aesthetic beauty of great code. But it has a cost compared to average code.

This is doubly true for a language like Python, which occupies a niche of "lingua franca between users with wildly different backgrounds bringing value to the table by being able to use and change the same software".


What you are talking about is not code that is written at a "master" level, what you are talking about is code that is clever. An important part of mastery is understanding what kind of code to write, when. Mastery means writing code that is easy for others to read and modify, when writing code in a professional setting. Mastery can also be writing clever code with "aesthetic beauty", in a different setting like an academic or personal project. But people who write clever inaccessible code while working professionally with a team have a few more steps to walk toward mastery.


Mastery of Python, in my opinion, is mastering "Pythonic" code, which incorporates readability as a fundamental tenet. In addition, the gigantic universe of not only the standard library but also numerous external libraries and tools, of which a master can leverage to undertake tasks both rapidly and efficiently.

So, to me, a Python zen master would not write incomprehensible code, but instead write readable code very quickly that effectively and efficiently solved the problem they are facing due to their comfort working inside the Python ecosystem.


My personal goal is to get as close to Norvig as possible for general Python code: https://github.com/norvig/pytudes/blob/main/ipynb/Advent%20o...

(short but exact comments, inline testing, good function and variable naming, overall good use but not overuse of the standard library, functions very rarely more than a dozen lines, generally understandable code)


> I think downvoting that comment is stupid,

Why?


Not OP. Nothing in the original comment was technically wrong. It does feel misguided and perhaps a little naive. But it seems more like an opportunity for a real conversation to both understand why they think that way and as a way to educate why there might be a better way.

I reserve downvotes for posts that are flagrant, factually wrong, or are otherwise against the rules. Flagging might also be used. But using downvotes to have a voice not be heard feels wrong, too. What was said doesn't hurt anyone, even if a vast majority of people around here might disagree with it. Downvoting because you disagree feels wrong.


Downvoting to indicate disagreement is how HN has always worked: https://news.ycombinator.com/item?id=16131314


This might come as a shock, but pg has been wrong about lots of stuff, including this. Downvotes should be used for bad comments, not comments that you disagree with. These aren't the same thing. It's fine that downvotes and upvotes aren't symmetrical in this sense. They aren't anyway - highly upvoted comments don't get bolded the way that downvoted comments get grayed.


> Downvotes should be used for bad comments, not comments that you disagree with.

I disagree with this. But I can't downvote this comment because it is a reply to my comment. This restriction specifically exists because downvoting to disagree is how HN has always worked.

> highly upvoted comments don't get bolded the way that downvoted comments get grayed.

Highly upvoted comments float to the top and therefore have more visibility.


> This restriction specifically exists because downvoting to disagree is how HN has always worked.

I don't think so? For instance, I can upvote replies to my comments, so it's not they have set up some "you can either comment or vote" system. This restriction is just a nudge toward positivity rather than negativity.

> Highly upvoted comments float to the top and therefore have more visibility.

Yes, but downvoted comments float down and are grayed out. The point is just that the two things aren't totally symmetrical.

(Also I think some of this stuff was implemented years after pg's pronouncement about downvotes.)

And listen, I didn't say "people who downvoted that comment aren't using HN correctly and should be booted off the site!". I just said "I think it is stupid to downvote that comment". And I do. It's stupid to downvote perfectly reasonable comments that you simply disagree with. Again, most people don't use HN that way, irrespective of anything pg said in 2008, or we would see a lot more gray comments, and I would have gotten a lot more downvotes over the years on stuff I've said that people disagree with, instead of comments telling me why I'm wrong.


Bad comments should be flagged.


Abusive comments should be flagged. Bad comments should be downvoted. There are lots of comments that are bad because they are not constructive, or are off topic, or are arguing in bad faith. These don't deserve to be flagged. They deserve to be downvoted. And there are lots of comments that are constructive and in good faith, but just reasonable to disagree with. These don't deserve downvotes, they deserve a comment disagreeing, or an upvote on an existing comment disagreeing with them.

For what it's worth, I contend that - notwithstanding what pg and dang said many years ago, this is the revealed preference of most HN users, because it's quite rare to see a comment that is downvoted, just because lots of people disagree with it.


> revealed preference of most HN users, because it's quite rare to see a comment that is downvoted, just because lots of people disagree with it.

This does not match my experience at all.


Outside of this thread, which seems to have brought "downvotes are for disagreement!" crusaders out of the woodwork, I essentially never see grayed out comments that are in good faith but just wrong. That may be because there are roughly the same number of people who agree with the wrong take as disagree, but I doubt it. I also get five to twenty upvotes on a comment every once in awhile, but I don't think I've ever seen more than about two downvotes on a comment, including on ones that get strong disagreement in replies. People don't just agree with me a lot more than they disagree with me, it's that most people aren't using the votes in this symmetrical agreement/disagreement way. Which I think is a good thing, even if it's not what the site's powers that be intended.


It is technically and factually wrong.

Advanced != incomprehensible. Incomprehensible isn’t a feature of advanced either, you can be a novice and still write incomprehensible Python code.


Because downvotes aren't for disagreement. There is nothing wrong with the comment, it isn't aggressive or trolling or in bad faith or anything. It's just reasonable to disagree with it.


> Because downvotes aren't for disagreement.

On Reddit, not HN: https://news.ycombinator.com/item?id=16131314


I'm not going to downvote you just because I disagree with you, I'm going to write you a comment about what I think instead :)

I don't consider that appeal to authority canonical. My opinion is that downvotes should be used for bad comments, not for comments you disagree with. These aren't the same thing.


> I'm not going to downvote you

You can't downvote me because my comment was in response to yours. It's not an appeal to authority. It's a statement of fact as also seen in the implementation itself.


It was tongue in cheek :)

It is the definition of an appeal to authority to quote an authority figure as the final word on some debate. I've been here since before that first comment from pg about this, and no, I don't agree that it's a "fact" that this is "how HN has always worked", and no, there is nothing in the implementation that unambiguously makes downvotes be for disagreement.

But again, I'm not arguing for strict rules that work the way I prefer. I'm saying, this is a community I participate in, and I have my own opinions about how best to participate in it, which aren't necessarily aligned with the people who created the site 15 years ago. Other people are entitled to differing opinions about this, and can use the site how they prefer, but I still have my own opinions and will advocate for them.


> This is a perfectly sane, at least from business point of view,

It's perfectly sane to actively avoid trying to understand the tools of your trade better?


I too would have disagreed 10 years ago. 15 years ago I may also have downvoted.

Now I recognise that we should, unless there's a very good reason (not for style), keep our code stupid-simple.

And doing that is harder than making it clever.


It was a bit dumb to make a definitive statement about what Python is used for, rather than just saying it's fairly often used for that.


Thanks, but I have a job. :)


I agree with you, though I wouldn't draw the line before list comprehensions, those are very basic.


Thank you. Timely as my team is looking to skill up on advanced python.


Is there something similar for Java/Kotlin?


>generators

Oh man, that is some job security!

Generators seem unpythonic.


dabeaz signed my copy of python essential reference for 1.5.2


The legend :-)


[flagged]


I agree that the average quality of Python instructional material is quite low. The language is very popular, and it's often pitched as a beginner's language or a language for those who are not (or do not want to be) professional programmers. Free (uncurated) platforms like YouTube and LinkedIn make it very easy to distribute poor quality material (and provide very weak feedback to encourage quality improvement.)

I strongly reject any assertion that David Beazley's materials, his instructional abilities, or his capabilities as a programmer are lacking. Having worked with David, I can provide testimonial to his skills (though his body of work speaks for itself.)

The example you highlight amounts to little more than nitpicking, and it suggests a fundamental misunderstanding of the instructional process. Trust me: I have taught this exact course (as well as “Practical Python”) to numerous corporate audiences.


For context, the bullet points you reference are on slide 1-10 of the supplementary slide deck[1] and are provided as part of an accelerated review. This is a recapitulation of materials covered in the introductory “Practical Python” course in unit 01-02[2]. The “Practical Python” course is designed to be taken by attendees who have minimal experience with Python, including those who have minimal prior experience with programming at all.

In the context of an intermediate/advanced course, these are clearly being provided as an overall framing, and are not intended to be read as a precise description of Python's execution model. They are, instead, intended to be glossed over (perhaps by an attendee who has somehow skipped the introductory course.) As a result, it would not be appropriate for these bullet points to discuss the finer points of Python's expression/statement dichotomy. It is clear that their intention is to express, with simplification, the general nature of Python's execution model and to distinguish it from tools which the attendee may already be familiar with (e.g., C++.)

There are a number of mistakes made in your own explanation, some of which have been highlighted by other posters. I will provide my own corrections to illustrate ① how distracting, pointless, exhausting, and useless a precise accounting of the mechanisms would be (especially in the context of this course and given the likely profile of a course participant) and ② that there may be some unearned confidence leading back to the source of this criticism.

I have my own criticisms of this course, which I have shared in another comment. But, as a personal aside, I have often found that, when I pit myself against the world—everyone else is stupid and wrong—it has provided me with a good opportunity for self-reflection. I have found a lot of personal growth in interrogating and questioning my own confidence and striving to find meaning and truth in my instructional work.


>> A Python program is a sequence of statements

> Here's a Python program that contains no statements: 42.

It is true that Python's grammar features a statement/expression dichotomy, unlike many other tools. If we want to speak to Python's grammar, we should make sure to consider two general eras—before and after the introduction of the PEG parser. The PEG parser was introduced to Python with PEP-617[3].

Let's consider first the grammar used in Python 3.8, prior to the PEG parser. You can find this in Grammar/Grammar[4]. As we can clearly see, there are a number of “entry points” for a well-formed Python programme[5]:

    single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
    file_input: (NEWLINE | stmt)* ENDMARKER
    eval_input: testlist NEWLINE* ENDMARKER
We can see from the above that, with the exception of `eval_input`, we consider a well-formed Python snippet to be a sequence of statements. The programme snippet `42` would be parsed as an `atom` which forms an `atom_expr` which is part of a rightward-chain that begins with `test` which eventually rolls up to `expr_stmt`, where a bare `testlist` is considered to be an expression statement. This elides a number of details (because, for most users, even this simplification is exhausting and useless) and may itself be slightly incorrect, but it illustrates that, as another poster asserts, the CPython reference implementation grammar prior to the PEG parser considers single expressions in the context of a `file_input` to be `expr_stmt`—expression statements.

The Python parser considered a file input to be a “sequence of statements.”

But, of course, who cares?

Remember that the goal of an instructor is to present just the right level of detail that an attendee can do something useful. It is the case that the expression/statement dichotomy is useful, especially when considering the common expression⇋statement dualities we see in the grammar, but this is not a topic for day one, slide one of an intermediate/advanced course.

By the way, if we look at the PEG grammar, we see something similar[6]. A `file[mod_ty]` input is comprised of `statements` (and an `eval[mod_ty]` input is comprised of `expressions`.)

Therefore, it is incorrect to say that “a Python program may contain statements or expressions.” Instead, we should say that the Python interpreter parses a simple, single-file Python program as a sequence of parser-level `statements` which may themselves be value-producing entities (which we might refer to as “expression”) or non-value-producing entities (which we might refer to as “statement.”) We might note that there are places where “statement” grammatical entities are invalid to use and places where “expression” grammatical entities are invalid to use, and that this “statement”/“expression” dichotomy is one that is found in other programming languages (but that there are other programming languages which avoid this distinction.) We might further note that this dichotomy has affected the evolution of Python by introducing dualities—for all but a few “statement” forms, there is an equivalent “expression” form. There may be contortions required to exactly match one to the other. (e.g., `while`) There are cases where, absent significant contortions, there is no dual (e.g., `try/except/finally/else` or `match/case`.) As a consequence, there is not considered to be a simple way to transform any multi-line Python programme into a single-line, single-expression equivalent. But, of course, probably nobody really cares. You've lost the class on slide one.
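A small illustration of the statement/expression duality described above (my own example, not from the slides):

    x = 5

    # statement form
    if x > 0:
        sign = 1
    else:
        sign = -1

    # expression form (a conditional expression)
    sign = 1 if x > 0 else -1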

Next, it is true that semicolons can separate Python statements in some cases. It is important to note that, with the exception of silencing last-expression output in a Jupyter notebook, it is possible to never encounter the use of a semicolon in real Python code.

It is unfair to assume the author intends to convey that the execution of a Python programme is strictly in the order of the appearance of the lines of code in a file, without considering that function definitions contain a body of statements which are executed only on function evaluation. Instead, we should interpret this bullet point to mean that a Python programme is executed top→down with statements executed at runtime in a manner dissimilar to how C++ works. For example, `def f(): pass` is executable code in Python, and this statement is, in fact, executed. This is why we might argue that the “mutable default argument” problem is largely a matter of misunderstanding Python's execution model. We could consider that the “execution” of the `def` starting statement means the parsing and compilation of its contents into bytecode, rather than the execution of the contents directly. After all, even `f()` on `def f(): …` is not guaranteed to actually evaluate the body of `f` in all circumstances.
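A minimal example of why "the `def` statement is itself executed" matters in practice: the well-known mutable default argument behaviour the paragraph alludes to:

    def append_to(item, bucket=[]):   # the [] is evaluated once, when the def statement executes
        bucket.append(item)
        return bucket

    print(append_to(1))  # [1]
    print(append_to(2))  # [1, 2] -- the same list object is reused across calls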

It is incorrect to suggest that the presence of features like `sys.meta_path` or `sys.path_hooks` invalidates the top→down nature of Python's execution model. Sure, you can implement an import hook that `exec(''.join(open(…).readlines()[::-1]), …)` but this could easily be considered a modification of the input rather than a modification of the `exec` mechanism. In fact, there are many ways to generate bytecode without passing through the `exec` machinery, but these are generally considered esoteric. If we consider only the `exec` machinery, then we will see that it will invoke the standard tokenisation and parsing process which will execute its payload line-for-line from beginning-to-end. Sure, we can say that the presence of import hooks mean that a programmer could subject a file input to preprocessing prior to passing it to the `PyRun_`/`PyEval_` mechanisms or that a programmer could generate executable bytecode in any arbitrary fashion bypassing these mechanisms… but it's going to be rare to encounter these… on the first slide… on the first day… of an intermediate/advanced Python course. So who cares?

This has already been pretty exhausting, but I think I have adequately demonstrated that it is not the case that “every statement [here] is a lie” and that we can casually dismiss the work of David Beazley as that of a “witch doctor” or “snake oil peddler.”

In fact, I would suggest that I have put forth some evidence that the original criticism belies a weaker and less thorough understanding of not only the Python interpreter but of the instructional process than its confident tone might suggest. I hesitate to suggest further.

It's definitely unfair to cast aspersions on the CPython core development team. Python is an established, mature language with a growing, increasingly diverse development team. It is bound to be the case that there will be new contributors who do not have as strong an understanding of the entire language and the entire interpreter. It is not the case that the CPython core development team is as untalented as you suggest. In fact, I would assert that there are programmers among the CPython core team who are some of the most talented people I have ever met. Many are experts not only in Python, but in C, in C++, and across all languages that they use in their work.

[1] https://github.com/dabeaz-course/python-mastery/blob/main/Py...

[2] https://github.com/dabeaz-course/practical-python/blob/maste...

[3] https://peps.python.org/pep-0617/

[4] https://github.com/python/cpython/blob/3.8/Grammar/Grammar

[5] https://github.com/python/cpython/blob/1663f8ba8405fefc82f74...

[6] https://github.com/python/cpython/blob/main/Grammar/python.g...


> The Python parser considered a file input to be a “sequence of statements.”

You missed the whole point and decided to answer a completely different question. Intentionally. Because you don't like the inconvenient fact that OP wrote something idiotic. OP wrote in their course that Python runs a program that is a sequence of statements. Which is nonsense. Python doesn't run or execute statements at all. This is not how it is implemented, nor was ever intended to be. Statements and expressions are read not run.

Now, for some reason you decided to ignore the part where Python is executed interactively, and conveniently found that in some other context 42 is a statement? -- well, so what? Python the implementation, the documentation, the infrastructure -- all of this is made by incompetent programmers. It's all bad. They ended up putting what they claim to be an expression into a statement in the parser because that's how it was convenient to write the parser? -- well, who cares? This shouldn't be the test for when something is an expression or a statement. Expressions evaluate, i.e. have a defined evaluation result, statements don't. That's all you need to know when describing the difference. Your archeological digging into the pile of dung that is the Python implementation is misplaced and only misguides those who'd like to know anything about this language.

> to mean that Python programme executed top→down with statements executed at runtime

This is a pile of absolute horseshit. No, Python doesn't execute statements. Stop writing nonsense. Statements don't exist at runtime. What the hell is even this "top->down" you are talking about. Do files have top and bottom in your world? What is the top of the file?


Very verbose and obfuscating reply to a comment that has been censored and that we cannot see.

It is symptomatic for the Python space that criticism is drowned out by consultants, who write loads of perfect English and focus on the irrelevant.

The only part I'd agree with is that Beazley is competent. Has he written and maintained production ready code though? I wish that people who actually do something would also receive such defenses.

But no, in the Python space the trainers and consultants are superior to the people doing the work. and they rake in the cash.


I learned a lot from your replies. Thank you for the effort


> or his capabilities as a programmer are lacking.

I don't know anything about his programming abilities, but PDF linked from OP is full of asinine bullshit. He doesn't know what he's talking about and is very confident about it.

This is not nitpicking, this is very illustrative of how the author thinks -- he doesn't. He cannot think clearly, he cannot analyze the subject he's discussing. He cannot even see the problem with whatever nonsense he's spouting.


> A Python program may contain statements or expressions. Here's a Python program that contains no statements: 42.

According to the Python Language Reference, that is a Python program made of one expression statement:

https://docs.python.org/3/reference/simple_stmts.html#expres...


Python documentation is written by morons who don't understand well the subject they are discussing. Yes, it may be surprising, but it's very possible and happens all the time that a person can do something proficiently, but not know what or how they are doing.

Python documentation is full of nonsense, contradictions, bad definitions etc.

There's no such thing as "expression statement", it doesn't matter that they say so or write it in documentation. It's like saying "fraction integer" -- it has to be one or the other.


They're in good company—the C standard also has something called an "expression statement" which is similar.


Well, you know what they say about polishing certain things. Python is just not meant to write serious applications, despite all the startups using it. Let's stick to OpenCV and ML applications, NOT line-of-business. Doesn't even have static typing, might as well use JavaScript.


Beazley is one of the good ones, but no other language has this level of marketing. Python is the Coca Cola of languages.


Java marketing in the 90s was more intense and much less organic, since it was totally driven by Sun.


I completely agree with this -- no programming language has ever had as much marketing behind it as Java. It was absolutely overwhelming for about 6-8 years from maybe 1997 - 2005.

And what a disaster that has been -- a bunch of people now consider object-oriented programming to be a sensible approach.


Java by Sun back in the day is much more the Coca Cola of languages. Today, maybe C# by Microsoft.

Python's organic growth and adoption is more of an OpenCola model.



