Nice course. Looks like good, practical material for people who want to dive in....

jakear · on June 19, 2020

I’m in the midst of a python project right now - I’m fairly familiar with python but I haven’t used it in a while, preferring TS. I’m currently majorly kicking myself for not starting out with mypy. Unfortunately even with mypy, soundness guarantees are nowhere near as powerful as TS, and the typing ecosystem is lacking (no definitely-typed equivalent I know of), so working with libraries is pretty much just guesswork. I really have no clue why people love this language so much.

pansa2 · on June 19, 2020

In my experience, the people who love Python are those who are happy with dynamic typing - usually because they only use the language for small projects.

When starting a large project, it would be better to skip Python altogether and use a proper statically-typed language.

Type hints only make sense when you’re working on an existing large Python project - just as Guido was doing at Dropbox when he added them to the language. This is because type annotations are the worst-of-both-worlds - they require the verbosity of static typing and provide few of its benefits.

> This is not a course for software engineers on how to write or maintain a one-million line Python application. I don't write programs like that, nor do most companies who use Python, and neither should you.

ramraj07 · on June 19, 2020

If you already know your project is going to be as big as Dropbox's entire codebase, salut don Corleone!

But otherwise, I feel like trying to choose the language that scales to a million lines sounds like dooming yourself from the very beginning by over-engineering. I attempted that myself but kept coming back to python because it's fast to code, it's forgiving as hell and it's customizable. Once you're used to its quirks you can breeze through it fast. I'm sure ruby and js have the same as well, and with ts you get better typing, but with the typing you can do with 3.8, I'll argue that python gives the _best_ of both worlds, especially if it's mixed with a good IDE like pycharm. If you really want, you can also incorporate pytype or something from the get-go. If my greenfield project's biggest problem is that python doesn't scale, then I'm a very happy man.

flyinglizard · on June 19, 2020

Python doesn’t scale well beyond one file, one developer, one time write. Once a program needs maintaining, updating, expanding, debugging - the experience quickly deteriorates.

ramraj07 · on June 19, 2020

The codebase I work on now is python and I prefer it to both the java and scala codebases I've worked on before that were much smaller. Python gives a better debugging experience than java and a better updating experience than scala (especially someone else's scala) in my opinion.

at_a_remove · on June 19, 2020

I guess I should ask what you consider a small project. I have some five-digit LOC projects written entirely in Python. I have never noticed this typing issue everyone is very concerned about and am often baffled by the emphasis placed upon it. What am I missing?

I am willing to entertain the idea that I have been very lucky or that I have some programming mannerism which has caused me to skate by this kind of thing, but I just don't get it.

DannyB2 · on June 19, 2020

It is not impossible to create a large program in a dynamic language. It simply requires extreme discipline.

At the same time, it is not impossible to write a large program entirely in assembly language, it just requires extreme discipline, even more so than python.

It's just a question of degree.

But when you DO get a type problem, and in any sizable code base, you WILL, it only manifests AT RUNTIME, and possibly at a customer site or in production, rather than manifesting at compile time.
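
To make that concrete, here is a minimal sketch of a type problem that stays hidden until the offending call actually runs. The function `total_price` and the string argument are hypothetical, made up purely for illustration.

```python
def total_price(quantity, unit_price):
    # No annotations: nothing stops a caller from passing text instead of a number.
    return quantity * unit_price


# A hypothetical caller forwards raw text input by mistake.
# Nothing complains until this line is actually executed.
try:
    total_price("3", 9.99)
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True
```

With annotations (`quantity: int, unit_price: float`), a static checker such as mypy would be expected to reject the call before the program ever runs.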

jakear · on June 19, 2020

It’s certainly possible to write untyped code and have it function, but you’re sacrificing development velocity by requiring both you and future maintainers to manually do all the things that a type checker can do automatically.

at_a_remove · on June 19, 2020

Perhaps I have not run into this because I tend to avoid situations where "velocity" is something anyone within two degrees of me uses to describe a project.

jakear · on June 19, 2020

Replace it with whatever language you use to describe the rate at which things get done and re-read.

ed25519FUUU · on June 19, 2020

Are you the only one who works on the project? You might not notice it in that case, but the engineer who comes after you will.

at_a_remove · on June 19, 2020

The only times I have heard from people who have worked on projects after me is the odd phone call to thank me for the documentation and the clarity of code, just as an intro to ask me a "what should we do about this?" or some kind of obscure historical follow-up.

Case in point, I had written an apartment search website in Perl. Now, at this point I had been pretty disenchanted with Perl as a whole due to the "there's more than one way to do it" culture combining with the "executable line noise" syntax to give rise to a lot of very impenetrable write-once, read-never-again code from others. So, when I did this, I used real, appropriately-named variables, eschewing the convenience of $_, and I made sure each line of code did one thing and only one thing if at all possible. I did not pack a lot into a line. Each line should be obvious. Where I felt that it might be subject to interpretation, I added comments as to what I was doing and why. Each function had its own comment section which discussed what it was for and why, possible room for improvement, and so on.

Eventually the duty passed to another set of hands, a bunch of students who had never seen Perl (the rise of PHP was strong at that point) and in the follow-up, they mentioned that they found it very straightforward to simply re-write each line.

I heard back in the past few months about one of those very large projects I did in a previous job, just keeping touch with people. I asked about one of my personal pet projects. "No, it just keeps running." "It's very obvious where to make any changes." All Python. No type stuff.

I just don't see what I am missing.

ngcc_hk · on June 19, 2020

I wonder if anyone has similar experience with js vs ts. Python does not have that ...

yodelshady · on June 19, 2020

It's unquestionably more for short scripts - what I mostly do - instead of big applications. My memory is of learning Java, where you're all but forced to create custom classes for everything, just in case you need extensibility later. With Python? It can .quack(), and that's all I should care about.

I understand generics have helped here, but they still don't seem quite there. And ironically, I'm finding ML tasks in python to be something that could really benefit from type hints.

Side note, and this bugs me: if people love it - empirically, they do - and you don't understand why - surely that should bring about some introspection? It seems all-too-often to bring about the opposite reflex.

Badly indented python code doesn't run, which it shouldn't - my understanding and the machine's are different. Well-indented python code has a lot less visual noise than other languages. That alone should give pause for thought. Why would you not want that feature?

jakear · on June 19, 2020

Your comparison seems entirely about python v Java. I encourage you to check out some modern gradually typed languages to help understand what Python lacks. TypeScript is particularly good for this.

> Why would you not want that feature?

Easy. Braces/etc. become basically invisible after using the language for a bit so they don’t bother me whatsoever. However moving code around in Python is always a pain because you must make sure everything is placed at the correct indentation level rather than simply letting the formatter take care of it. I’ve definitely experienced bugs from moved code having its first/last line at the wrong level, and it can be particularly confusing when the last line is at a different indentation than the previous. It’s so much easier to just grab a brace-enclosed block and smack it down somewhere (which also provides a good sanity check that you’ve yanked the entire block and aren’t missing any lines).

> Badly indented python code doesn't run

I wish this were the case. What’s actually the case is that badly indented python code will give you potentially different results than what you expect, which ranges from syntax errors to failed tests to very hard to diagnose bugs.
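
A small sketch of the kind of thing I mean, with made-up function names: both versions below are valid Python and both run, but one `return` sits a level too deep and quietly changes the result.

```python
def sum_positive_buggy(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
        return total  # one level too deep: returns after the first item


def sum_positive(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total  # correct level: returns after the whole loop


buggy = sum_positive_buggy([1, 2, 3])  # 1
fixed = sum_positive([1, 2, 3])        # 6
```

No syntax error and no IndentationError in either case; only a test or a careful read catches the difference.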

> Side note, and this bugs me: if people love it - empirically, they do - and you don't understand why - surely that should bring about some introspection?

Side note, and this bugs me: if people hate it - empirically, they do - and you don't understand why - surely that should bring about some introspection?

yodelshady · on June 19, 2020

"invisible" syntax is precisely what I don't want. That's practically the definition of a bug!

I know I set myself up for the last line. I can only say I've really made a good faith effort to try to understand the explanations for braces, and none make any sense to me. I've had nearly ten years writing more or less python alongside braced languages (I agree it's not a massive pain), and outside of the REPL - I have literally never seen an IndentationError (err.. that's the same as "doesn't run" to me), or "hard to diagnose bugs". Almost never a TypeError, either. Maybe a dozen or so?

I need to understand, from source code, how the instructions flow. For that, I and essentially all other humans need indentation. I never want the machine to interpret the instruction flow differently to me. I genuinely cannot understand how someone can fail to produce python code to that standard. I trivially can in any language which includes syntax specifically for giving a machine a separate understanding of program flow to humans.

sooheon · on June 19, 2020

> why people love this language so much

Because no matter what problem you have, after 12 minutes of googling, you can pip install foo. The language almost doesn't matter because most programming is via library api.

jakear · on June 19, 2020

I can do the same with npm, but if there’s a @types available (or better yet built in), I barely have to read long-form documentation and I get smart completions with documentation, types, etc right in my editor, specific to the exact expression I’m editing.

dragonwriter · on June 19, 2020

And...exactly the same with mypy, either via in-library typings or the typeshed.

Projects in the npm ecosystem may be somewhat more likely to have typings if the project exists, but I find that, for anything other than web frontend, where JS is obviously king, the tool I'm looking for is more likely to actually exist in the python ecosystem.

jakear · on June 19, 2020

Are there mypy typings available for things like sci-py and simpletransformers, and if so how do I acquire them?

ghostwriter · on June 19, 2020

You can gradually annotate these packages locally for your project with [1], and then contribute it to [2].

Automatic stub generation may also be helpful [3]

[1] https://github.com/python/mypy/wiki/Creating-Stubs-For-Pytho...

[2] https://github.com/python/typeshed

[3] https://mypy.readthedocs.io/en/stable/stubgen.html

jakear · on June 19, 2020

There’s a couple dozen packages on there. This can’t be described as “exactly the same” as npm/typescript if the scale and level of community involvement [1] is nowhere near the same.

1: among the most important things here, in addition to the strength of the type system, where mypy also lacks

ptx · on June 19, 2020

People love the language because the language doesn't matter?

I don't think that quite explains it. We loved Python at version 1, before pip and PyPI were around - although the "batteries included" standard library played a similar role to some extent. For me it's mostly the syntax and the simplicity.

sooheon · on June 19, 2020

Just my opinion, as someone who loves the ecosystem and tolerates the language.

rxhernandez · on June 19, 2020

> I really have no clue why people love this language so much.

You get used to it somehow. I did C and Java for about 10 years before Python. My first exposure to Python was a Computational Physics class and it was maddening not knowing what types went where. Eventually you realize it doesn't really matter as long as you're disciplined and rely on others who are disciplined enough to document their code well. I haven't had the chance to use mypy in the 7 years I've been using Python (across 3 industries; medical devices, scientific devices, and web) but I would be surprised if it significantly reduced the number of bugs I've seen in practice.

jakear · on June 19, 2020

It’s certainly possible to write out types in English and cross your fingers and hope they’re correct and stay correct as the file gets edited by more and more people, but why do that when you can do it in a language that an intelligent type checker can understand and validate, and further use to provide smart editor tab-completions/etc?

This is especially big for any refactoring work - just yesterday I had to change the format of a config file that gets read by both TS and Python scripts. For ts, I updated the typedef and I instantly got every error in the codebase annotated. For python, I had to manually go through each line, using my own brain as the type checker. My brain is much faster and better at being a “squiggly red line spotter” than a type checker, so I finished my TS work in a few minutes and took probably an hour going through the python, despite the projects being similar in size.

the_af · on June 19, 2020

This.

My experience with Python goes like this:

"I have to write a really small script... ok, not so small that I can do it with bash. I know, I'll use Python! This is going to be a small, throwaway utility anyway"

(Some weeks later, it's turned into a large monstrosity and I need to refactor something, and everything breaks because refactoring anything nontrivial written in Python is a dangerous proposition)

"Ouch! Python: never again!"

(Repeat a few weeks later)

ghostwriter · on June 19, 2020

You can try Turtle next time, if you prefer fully type-checked shell scripts

https://hackage.haskell.org/package/turtle-1.5.19/docs/Turtl...

the_af · on June 19, 2020

That's awesome. I love Haskell. Will definitely try Turtle next time!

disgruntledphd2 · on June 19, 2020

I mean, you need to write tests before you refactor, but that's true in all languages.

I do agree (and have the scars to prove it) that this isn't a panacea but it does help, a lot.

That being said, one of the worst bugs I had in python was in passing in the wrong types to a constructor function, and my tests didn't catch that. To be fair, Mypy would have, but I hadn't annotated that part. At the end of that debugging situation, I would have gladly killed someone for enforced static types in Python.

That being said, the data model is a work of art, and a core reason why I enjoy coding in Python. It's just a shame that pandas kinda sucks.

the_af · on June 19, 2020

I find the problem with writing tests for Python is twofold:

- Most Python programs I write start their life as tiny scripts, always with the certainty they'll never grow (Narrator: they always grow). I don't know many people who write tests for their scripts...

- Testing in Python means too much effort in the wrong places. Consider that most refactoring problems would have been caught for free by a language with static typing. I've experienced runtime errors because I changed the return type of a function from a single value to a list (or vice versa). A statically typed language would have let me know of my mistake for free, so that I could focus on tests that really matter.
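
A rough sketch of the return-type case from the second point, using hypothetical names (`find_owners`, `greeting`, and the `"owners"` key are invented for illustration). The function used to return a single `str`; after the refactor it returns a list, and one caller was never updated.

```python
from typing import List


def find_owners(record: dict) -> List[str]:
    # Before the refactor this returned one str; now it returns every owner.
    return record["owners"]


def greeting(record: dict) -> str:
    owner = find_owners(record)
    # Stale caller: still treats the result as a single string.
    # A checker such as mypy would be expected to flag str + List[str] here.
    return "Hello, " + owner


try:
    greeting({"owners": ["ann", "bob"]})
    stale_caller_failed = False
except TypeError:
    stale_caller_failed = True
```

Without annotations and a checker, the only signal is the `TypeError` when that code path finally runs.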

CraigJPerry · on June 19, 2020

Are types a free lunch or are there trade offs, what are you getting and what are you losing by adopting types?

How do the benefits you get, relate to the difficulty of problems you’re solving when programming?

Eg if the cost of types is increased coupling of code but the trade off benefit is it makes fixing typos easy then that’s a bad trade. You’ve made a hard problem more brittle in exchange for making a trivial problem even easier.

jakear · on June 19, 2020

Could you give an example of how code could be coupled at the type level, but not logically coupled? It seems impossible to me.

> trivial problem

Type checking is quite non-trivial, especially as the logic and types get more advanced (conditional types, index types, mapped types, etc etc). Not just “typos”.

CraigJPerry · on June 20, 2020

>> coupled at the type level, but not logically coupled?

    float multiply(float a, float b) { return a * b; }

You can’t add doubles with this code because it’s coupled to the float type. Summing a for b times has no logical coupling to floats or doubles just as floats or doubles have no logical coupling to either implementation of addition - both types are added by the same operations. You can swap the types or the algorithm.

Before you reach for the polymorphism or further pollute the universe of types consider that it costs fewer lines of code to erase types here.

What errors would preserving types help with here? What has more utility, fewer lines of code spent for the same outcomes or an implementation that exists further along the spectrum of typing in the “at compile time” direction? - I don’t believe that question can be answered but in practice I find fewer lines of code correlates more strongly with outcomes I care about than the degree of typing applied.

jakear · on June 20, 2020

> What errors would preserving types help with here?

Pretty simple, it helps when you do a refactoring that changes the type of the value passed somewhere from number to number[] and the compiler instantly tells you that that’s illegal.

I did something similar just the other day when I had a shared config file read by both TS scripts and Python scripts and I needed to change the format to support a new feature. On the ts side I updated the typedef and the compiler pointed out to me every area that needed updating, on the Python side I had to spend a good chunk of time manually going through the script to figure out what needed updating.
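
For what it's worth, the Python side of a change like this could be sketched with a `TypedDict` (in the standard `typing` module since 3.8), assuming the scripts are annotated and run through a checker. The `Config` shape and the `hosts` key are hypothetical, not the real config file.

```python
from typing import List, TypedDict


class Config(TypedDict):
    # Hypothetical shape: this used to be a single `host: str`.
    hosts: List[str]


def first_host(config: Config) -> str:
    # Code still reading config["host"] would be reported by a checker
    # as an unknown key once the definition above changes.
    return config["hosts"][0]


config: Config = {"hosts": ["a.example", "b.example"]}
primary = first_host(config)
```

That only helps where the annotations already exist, which was not the case for the scripts in question.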

CraigJPerry · on June 20, 2020

>> it helps when you do a refactoring that changes the type of the value passed somewhere from number to number[]

Typing isn't a strong enough tool to combat that class of problem; you can fix the type signatures while introducing a semantic break. You need tests and if you have tests, what does typing bring? Keep your tests fast, measure and maintain a speed of >250 tests per second and remember to test behaviour not implementation - you won't go far wrong.

IDEs have been performing refactoring changes on our behalf in dynamic languages for years at this point. Don't manually edit when we've never had such powerful tooling available to us

  * https://www.jetbrains.com/help/pycharm/product-refactoring-tutorial.html
  * https://www.jetbrains.com/help/pycharm/structural-search-and-replace.html

Even concepts that I'd say are strictly the domain of static typing, like automatically pruning dead code behind retired feature flags, make headway today: https://github.com/jendrikseipp/vulture

Typing has a time and a place, that is without a doubt but the world is better viewed without the static typing lens permanently affixed in place, your efficacy will increase.

jakear · on June 20, 2020

I don’t have tests for these small utility scripts and they work perfectly well. Tests also don’t pinpoint the error.

I’d never choose to have myself do more work if it’s not needed and the computer can do it.

Those automated refactorings would not have helped in the situation I mentioned above.

CraigJPerry · on June 21, 2020

Testing isn’t solely about whether code works today - you’re unlikely to check in something completely broken. In fact I advocate NO tests for throwaway code, move fast.

It’s whether it continues to work tomorrow after you’ve added or removed a behaviour. Tests are the secret sauce that keep the cost of change low.

Refactoring (a commit with zero changes to test code) is the other key pillar of long term codebase health.

iso8859-1 · on June 19, 2020

Are you counting null pointer exceptions? Because I think Mypy is pretty effective at combatting those.
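
A minimal sketch of what that looks like, with a made-up `USERS` table and function names. The lookup returns `Optional[str]`, so a checker with strict optional checking (mypy's default) would be expected to complain about calling `.upper()` on the result unless the `None` case is handled first.

```python
from typing import Dict, Optional

USERS: Dict[int, str] = {1: "ada"}  # hypothetical data


def find_user(user_id: int) -> Optional[str]:
    # dict.get returns None when the key is missing.
    return USERS.get(user_id)


def shout(user_id: int) -> str:
    name = find_user(user_id)
    if name is None:
        # Without this branch, name.upper() below would be flagged.
        return "UNKNOWN"
    return name.upper()
```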

leu-mas · on June 19, 2020

I've been really enjoying pydantic lately. Run-time validation of types without forcing you to learn something other than normal type annotations.

https://pydantic-docs.helpmanual.io/

carapace · on June 19, 2020

Back in the day, coming from C, Python was magic. (And it wasn't C++ or Java.)

dragonwriter · on June 19, 2020

> Static-typed languages have too many benefits right off the bat. You can (mostly) avoid all of these types of checks.

Python’s major typecheckers, which add features faster than the type systems of most major statically-typed languages, already support a more robust type system than many industrially-popular statically-typed languages.

It's not Haskell, or even Scala, but then neither is Go or Java.

Now, obviously, you don't get the performance benefits of type-informed static compilation with Python, but performance isn't the issue most people seem to be discussing here.

iso8859-1 · on June 19, 2020

A large majority of projects out there are not using type hints. You'd have to roll your own type hints. And sometimes, the type hints are impossible to upstream:

For example, let's say you use Mongoengine. Now, you can query a collection by using MyObject.objects.get(...) and it will return a MyObject. You might be able to make that work in Mypy. Now, you use some fancy aggregation feature, passing a dict into Mongoengine and it changes the type of the result you get back. How will you typehint that? I think the only way would be to special case it for all your queries.

Dynamic languages have 'reflection' all over the place. Type systems are meant to terminate quickly, and if they don't, you are hacking them and they perform terribly (game of life in C++ templates and such).

So even if performance is not the goal, you can't even get the correctness property right.

In Haskell, there are libraries that will generate code from schemas, including migrations. They are typed to various degrees. There are also shallow embeddings of the PostgreSQL query language (with types for the returned type, unlike the Mongoengine example before!). I just want to demonstrate that in Haskell, typing is not all-or-nothing either, but it is a spectrum, where a Haskell lib will typically end up being more typed than a Python lib. You could 'simplify' their type signatures (get dynamic typing) with GHC.Generics and you'd get something like what is common practice with Python. But it is pretty much impossible to go the other way.

kqr · on June 19, 2020

> For example, let's say you use Mongoengine. Now, you can query a collection by using MyObject.objects.get(...) and it will return a MyObject. You might be able to make that work in Mypy. Now, you use some fancy aggregation feature, passing a dict into Mongoengine and it changes the type of the result you get back. How will you typehint that? I think the only way would be to special case it for all your queries.

Isn't this the case any time you use a DSL to query external data? Only sometimes someone else has type-hinted the things for you.

I work in a strongly typed language (F#) that interops with JS, and one of our principles is to do the type checking on the F#--JS boundary and then not have to worry about it again.

dragonwriter · on June 19, 2020

> How will you typehint that?

For that specific contrived scenario (mongoengine, which I've never used, apparently uses .aggregate for the latter rather than overriding .get) with two stub @overloads of [...].get for typehinting, one covering what it takes and gives in the aggregate case of interest and one for what it takes and gives in the simple case.
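
Roughly, as a sketch: the `Document` and `QuerySet` classes below are hypothetical stand-ins and not Mongoengine's actual API, and in a real project the two `@overload` signatures would live in a stub file rather than next to the implementation.

```python
from typing import Any, Dict, List, Union, overload


class Document:
    """Hypothetical stand-in for a model class."""


class QuerySet:
    @overload
    def get(self, query: str) -> Document: ...

    @overload
    def get(self, query: Dict[str, Any]) -> List[Dict[str, Any]]: ...

    def get(
        self, query: Union[str, Dict[str, Any]]
    ) -> Union[Document, List[Dict[str, Any]]]:
        # Toy implementation so the sketch runs: a dict means "aggregate".
        if isinstance(query, dict):
            return [dict(query)]
        return Document()


simple = QuerySet().get("some-id")            # checker sees: Document
aggregated = QuerySet().get({"group": "x"})   # checker sees: List[Dict[str, Any]]
```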

j88439h84 · on June 19, 2020

That's not idiomatic python, just use the type annotation system.

pansa2 · on June 19, 2020

I’d argue the opposite - that `isinstance` checks are idiomatic, whereas type annotations have not been widely adopted.

j88439h84 · on June 19, 2020

I don't think you'll find a long time pythonista who thinks isinstance is good python style.

pansa2 · on June 19, 2020

I agree that, in general, duck-typing is preferable. However in cases like this, where you really want to ensure the count of shares is a whole number, I can see an argument for `isinstance`.

What I can’t understand is the argument that occasional use of `isinstance` is bad, but also that pervasive nominal type-checking via annotations is good.

4ec0755f5522 · on June 19, 2020

It usually means there is a better way to do what you are doing.

If it's user input you should sanitize there. In Python this is often easy because you can usually cast as the type you need e.g. with int() and then raise exception if it cannot.

If it's your code, then you should make sure you are passing in an int if that's what is required. If you type hint the input as int, and somewhere in your code you pass in a string, you will get a warning in the IDE.

I would argue that's a very different time and place than checking for instanceof in the running code and that's why isinstance is bad and type-checking is good.

There are always exceptions but it's bad to write them in examples when teaching code.

pansa2 · on June 19, 2020

> If you type hint the input as int, and somewhere in your code you pass in a string, you will get a warning in the IDE.

Statically-typed languages can guarantee to get this right 100% of the time. Can type-checkers for a highly-dynamic language like Python guarantee the same?

mic47 · on June 19, 2020

> Statically-typed languages can guarantee to get this right 100% of the time.

No they don't. They usually provide escape hatches for things the type system does not cover, so there are cases where the typechecker will just trust that you know what you are doing (even if you don't).

But more importantly, you don't need 100% to be useful. For aid in IDE, high precision (with somewhat lacking recall) is good enough. Of course, for refactoring, the higher the recall, the better (but you could substitute lacking recall with tests, which is suboptimal, but viable).

But it's an interesting question what python/mypy (python typechecker) can actually do. The answer here is it depends on configuration. Mypy with default configuration typechecks only typed code (i.e. functions which have type annotations) so you get guarantees only there. But you can configure it to be more and more strict (checking untyped defs, not allowing untyped code, and more), which increases the guarantees you get (and it also increases the number of valid programs that it rejects). You can get really strictly typed code in python, but you can also hit the wall if you need libraries that do not provide proper type hints (unless you write type hints yourself).
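
A small sketch of the default behaviour, with made-up function names. The first function has no annotations, so mypy's default configuration does not check its body; `--check-untyped-defs` would check it anyway, and `--disallow-untyped-defs` would report the missing annotations themselves.

```python
def untyped_helper():
    # No annotations: skipped by mypy unless configured otherwise.
    count = "3"
    return count + 1  # str + int, only fails when the function is called


def typed_helper(count: int) -> int:
    # Annotated: the body is checked even with default settings.
    return count + 1


try:
    untyped_helper()
    untyped_failed = False
except TypeError:
    untyped_failed = True
```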

ghostwriter · on June 19, 2020

> What I can’t understand is the argument that occasional use of `isinstance` is bad, but also that pervasive nominal type-checking via annotations is good.

it doesn't have to be nominal, MyPy supports structural sub-typing through Protocols [1]

[1] https://mypy.readthedocs.io/en/stable/protocols.html
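
A minimal sketch of that, assuming Python 3.8+ where `Protocol` is in the standard `typing` module. The class names are made up. Note that `Duck` never inherits from or mentions `SupportsQuack`; it matches purely by having the right method.

```python
from typing import Protocol


class SupportsQuack(Protocol):
    def quack(self) -> str: ...


class Duck:
    # No inheritance from SupportsQuack: the match is structural.
    def quack(self) -> str:
        return "Quack"


def make_noise(thing: SupportsQuack) -> str:
    return thing.quack()


noise = make_noise(Duck())
```

Passing an object without a compatible `quack` method would be expected to fail type checking, while duck typing still works as usual at runtime.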

hhmc · on June 19, 2020

Even in this case, would it not be better to cast via `int()` and deal with the failure as appropriate?
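
Something like the sketch below, where `parse_share_count` is a hypothetical helper name and the error message is only one way to deal with the failure.

```python
def parse_share_count(raw: str) -> int:
    # Convert at the boundary; everything downstream can rely on an int.
    try:
        return int(raw)
    except ValueError:
        raise ValueError(
            f"share count must be a whole number, got {raw!r}"
        ) from None


shares = parse_share_count("100")

try:
    parse_share_count("1.5")
    rejected = False
except ValueError:
    rejected = True
```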

heavyset_go · on June 19, 2020

It's idiomatic. Type annotations aren't enforced at runtime.
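
A quick sketch of what "not enforced" means in practice (the function name is made up): the annotations are stored on the function object, and the interpreter never consults them when the function is called.

```python
def double(x: int) -> int:
    return x * 2


# The interpreter ignores the annotation and happily repeats the string.
result = double("ab")

# The hints are just metadata attached to the function.
hints = double.__annotations__
```

A static checker would be expected to reject the `double("ab")` call, but only if one is actually run.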

int_19h · on June 19, 2020

It's not idiomatic to force a specific type like that even so. Idiomatic code would accept any type that has the same operations (and their semantics) as int.

xthetrfd · on June 18, 2020

I believe that some recent Python version has added (optional) type annotation support. I haven't ever used it though.

dragonwriter · on June 19, 2020

> I believe that some recent Python version has added (optional) type annotation support.

Annotation support was added in 3.0, which isn't really a recent version of Python, having been released about 11.5 years ago.

Also, mypy, while it can use Python 3.x annotations, also supports static checking of Python 2.x code using type comments.
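
For reference, a sketch of the type comment form from PEP 484 (the function itself is made up). The signature goes in a comment on the first line of the body, so the code stays valid for interpreters that have no annotation syntax.

```python
def repeat(text, times):
    # type: (str, int) -> str
    return text * times


repeated = repeat("ab", 2)
```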

pansa2 · on June 19, 2020

> Annotation support was added in 3.0

However, support for type annotations, specifically, wasn’t added until Python 3.5 (via PEP 484).

dragonwriter · on June 19, 2020

What PEP 484 did was standardize the use of annotations as type annotations and provide ancillary out-of-the-box support for type hinting via annotations, particularly the typing module.

Mypy was actually using python 3.x annotation for type annotations before PEP 484 standardized them and brought the stdlib typing module, but with PEP 484 there was a common, language-defined standard baseline for mypy and other efforts.

scubbo · on June 19, 2020

It's super-easy to use! More flexible than Java, but (so far as I can tell) just as powerful in terms of catching errors.

willcipriano · on June 19, 2020

You can even benefit from it without bothering with the type annotations (though they make it more effective).

https://google.github.io/pytype/

ed25519FUUU · on June 19, 2020

Maybe your IDE experience is different than mine, but a typed function will gladly accept an untyped var without complaining, and during runtime couldn't care less.

ghostwriter · on June 19, 2020

> but a typed function will gladly accepted an untyped var without complaining

this can be prohibited by MyPy's --strict flag [1], and enforced with a mypy git pre-commit hook locally, and with a MyPy CI job.

[1] https://mypy.readthedocs.io/en/stable/command_line.html#cmdo...

j88439h84 · on June 19, 2020

That's a mypy setting

anentropic · on June 19, 2020

There is type inference in all of the Python type checkers, it's not necessarily an "untyped" var because it lacks an annotation.

Running a type checker in Python is a way to avoid runtime type errors, just like in compiled languages, but without a performance benefit.
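
A small sketch of the inference point, with a made-up function: only the signature is annotated, yet the local variables are not "untyped" as far as a checker is concerned.

```python
def count_words(text: str) -> int:
    words = text.split()  # no annotation; inferred as a list of str
    total = len(words)    # inferred as int
    return total


word_count = count_words("types are inferred here")
```

Returning `words` instead of `total` would be expected to be reported, even though neither local carries an annotation.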

pansa2 · on June 19, 2020

Type annotations aren’t equivalent to this check, though, because they’re not enforced at runtime.

h8hawk · on June 19, 2020

Why should type annotation be enforced at run time? In statically typed languages there is no type checking at run time, your type system already proved the type of variable.

pansa2 · on June 19, 2020

Because Python’s type annotations don’t prove anything. That’s why, for example, they can’t be used to increase the performance of the interpreter.

ghostwriter · on June 19, 2020

> That’s why, for example, they can’t be used to increase the performance of the interpreter.

They can (as in - there's API for that, the rest is up to a community effort) improve performance of a final program, if type-annotated code is passed through Cython with ``annotation_typing=True`` flag:

http://docs.cython.org/en/latest/src/tutorial/pure.html#stat...

https://github.com/cython/cython/issues/1672#issuecomment-29...

ed25519FUUU · on June 19, 2020

It’s not enforced at all, and only serves to provide development feedback.

dragonwriter · on June 19, 2020

The annotations are a language syntax feature that the runtime doesn't enforce. There are a number of separate static typecheckers, such as mypy, that do allow AOT static verification (which is, after all, all languages like Haskell have; runtime enforcement of types that have been statically verified in advance isn't super common or necessary.)