
Python does a great job at putting together powerful scripts quickly. But I have a really hard time opening up a big Python codebase and making sense of it quickly. The lack of static types can make data pretty hard for me to follow. This may just be something I need to improve at, but I have a distinctly better feeling working with unfamiliar code with static types. Python makes me feel I’m at the mercy of the documenter to be informed, whereas the types self-document in many cases in other langs.


I think this is actually a big difference between SWEs and programmers who are scientists/data scientists/analysts etc. first. In my experience, people with more formal education in programming find type systems to be very helpful, while people from other backgrounds find them confusing and a nuisance.


I don’t think you need a “formal education” to see the benefits of strong typing. Whether someone is attracted to it or not probably has more to do with their specific use case.

Strongly typed languages are simply a trade-off where you get peace of mind by paying for it with extra time and effort spent. For some people that’s a no-brainer and for some people that’s just something that gets in the way.

My background is mostly in sysops and monitoring, and for me types are a life-saver because I value stability and predictability over almost anything else. If my job was something more similar to “ship new stuff fast” I’m not sure that would be the case.


I agree it's a trade-off, but for me the ratio of effort spent to effort saved in lack of bugs and peace of mind is like 1/1000. Such a tiny bit of effort for a huge payback in effort/mental energy saved.


> If my job was something more similar to “ship new stuff fast”

Ironically, the slowest moving codebase I have ever seen was written in Python. It was impossible to "ship new stuff fast" without breaking things. I think Python works up to a certain size/complexity level, then it completely falls apart.


Yes; I've come to realize the yarn "Internal quality and productivity rise and fall together" is more true than I ever thought. And with Python (and IME most dynamically typed languages), the internal quality has far more ways to go awry.


> the benefits of strong typing

A couple of comments here:

1. Python is very strongly typed: every single item has a clearly defined type. That said, it is dynamically typed, meaning that variables are not "containers" for predefined types, but simply labels on objects.

2. What brings benefits is not static typing itself, but static analysis which is mandatory in most statically typed languages (i.e. the program won't compile/interpret if the analysis breaks). With tools like mypy Python has a perfectly capable static analysis - even without any type annotations, although they massively improve it - it's just not mandatory and the program will happily run even if it fails.
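To make the two points concrete, here is a minimal sketch (the variable names are just for illustration). The first part shows strong typing at runtime, the second shows a name being rebound to a different type. As far as I know, mypy with default settings would flag both the `"1" + 1` expression and the rebinding, but the interpreter only objects to the first one, and only when that line actually runs.

```python
# Strong typing: objects keep their type, and there is no implicit str/int coercion.
try:
    result = "1" + 1
except TypeError as exc:
    result = f"TypeError: {exc}"

# Dynamic typing: a name is only a label, so it can be rebound to any object.
label = 42
first_type = type(label).__name__
label = "forty-two"
second_type = type(label).__name__

print(result)
print(first_type, second_type)
```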


> I don’t think you need a “formal education” to see the benefits of strong typing.

Which is not what P said. They said those that have it tend to favor strong typing; those that don't tend to not.

IME I've seen the same thing. It's not a rule, it's a generalization that is often wrong, but more often right. Again, IME. YEMV.


Just anecdotal, but I dropped out after a year or so of CS and I value types more than the vast majority of developers I've ever worked with, most of whom have much more formal education than I do.


You may well be right. Personally, I dislike writing python because it lacks static types and all too often I end up triggering runtime exceptions, especially when integrating with third-party code. Do you suppose scientists/etc are more tolerant of runtime errors, at least while writing the code?


In my experience, the biggest issue in numerical/scientific code is getting correct results, and those errors will hardly ever be caught by typing. Moreover, in many scientific programs you deal with well-defined inputs, i.e. the chances of e.g. calling a function with an incompatible type are quite low. The most common error I have encountered is a mismatch in dimensionality/shape of input arrays (which I believe isn't straightforward to type either, in particular if you want the shapes to be flexible but all inputs to have the same shape).
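In practice that kind of check tends to end up at runtime. A rough sketch of what I mean, using plain nested lists instead of real arrays so it stays self-contained; `shape_of` and `require_same_shape` are hypothetical helpers, not part of any library:

```python
from typing import Tuple


def shape_of(data) -> Tuple[int, ...]:
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2).

    Only the first element along each axis is inspected, so ragged
    input is not detected by this sketch.
    """
    shape = []
    while isinstance(data, (list, tuple)):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return tuple(shape)


def require_same_shape(*arrays) -> Tuple[int, ...]:
    """Raise ValueError unless all inputs share one shape; return that shape.

    Expects at least one input.
    """
    shapes = [shape_of(a) for a in arrays]
    if len(set(shapes)) > 1:
        raise ValueError(f"shape mismatch: {shapes}")
    return shapes[0]
```

The shapes stay flexible (any shape is accepted) while still requiring all inputs to agree, which is exactly the constraint that is awkward to express in annotations.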


Try pyflakes, it will find a majority of those.


Wemake [1] is much more comprehensive, although you might find it overwhelming at first. It includes a lot of useful flake8 rules that tackle complexity, enforce consistency and best practices and much much more.

[1]: https://wemake-python-styleguide.readthedocs.io/en/latest/


Styling is not the problem the poster was complaining about, which is why I specifically chose pyflakes out of dozens of such tools.



I feel naked without static types and my background is in humanities.


Python has optional type annotations. Most new code tends to have them, and they help tremendously navigating and understanding large codebases.


It's honestly not very good compared to a "real" type system (platform). I used it about a year ago for a couple of projects, and it was a painful and ultimately not very useful exercise.


I've used Mypy since 2016 on big as well as small codebases, and it has been extremely useful for me. It caught numerous bugs at typecheck time. The benefits of better readability and e.g. better jump to definition and autocomplete are harder to quantify, but subjectively they feel substantial. If you are gradually annotating a big codebase, it does require a "critical mass" before types start to pay off, I agree that adding them later can be painful.

Mypy's type system is quite advanced compared to some statically typed languages like C and Go; it has generics, union types, inference, and it can prove properties of types from control flow. I work mostly in Rust, Haskell, and Python, and I rarely find Mypy limiting.
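A small sketch of what generics, unions and control-flow narrowing look like in annotations (the function names are made up for the example). The comments describe what I would expect a checker such as mypy to infer; the code itself runs with or without one.

```python
from typing import List, Optional, TypeVar, Union

T = TypeVar("T")


def first_or_none(items: List[T]) -> Optional[T]:
    """Generic: the element type of the list carries over to the result."""
    return items[0] if items else None


def describe(value: Union[int, str, None]) -> str:
    if value is None:
        return "nothing"
    # From here on the checker treats value as Union[int, str].
    if isinstance(value, int):
        return f"int {value + 1}"
    # Only str is left, so .upper() is accepted.
    return f"str {value.upper()}"
```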

You do have to embrace types though; if you have a dictly typed program where everything is a Dict[str, Any], then putting that in annotations isn’t going to be very helpful; converting the dicts to named tuples or dataclasses is.
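To illustrate the difference, here is a hedged before/after sketch. `Order` and its fields are hypothetical; the point is only that the dataclass version gives the checker (and the reader) field names and types to work with, while the dict version accepts anything.

```python
from dataclasses import dataclass
from typing import Any, Dict


# "Dictly typed": the annotation is technically correct but says almost nothing.
def total_price_dict(order: Dict[str, Any]) -> float:
    return order["quantity"] * order["unit_price"]


@dataclass(frozen=True)
class Order:
    sku: str
    quantity: int
    unit_price: float


# A misspelled field such as order.quantty can be caught before the code runs.
def total_price(order: Order) -> float:
    return order.quantity * order.unit_price
```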


As an alternative perspective, I've found mypy to be pretty poor in comparison to other rich type schemes. Particularly in comparison to the Javascript/Typescript ecosystem, it really feels like taking a huge step backwards.

Partly that's because the ecosystem support isn't there - lots of libraries don't have types, or have types that behave oddly, or even require their own plugin for mypy. I suspect that's going to slowly change over time, but I feel like the infrastructure doesn't feel as strong as it did in the early days of Typescript, and the gradual change feels even more gradual.

Partly it's just that the syntax is painful as soon as you want to do anything remotely complex. For example, generics are defined in different ways depending on what you're making generic, and the TypeVar system is usable, but ugly and very unintuitive. There's also missing syntax for things like inline type assertions, which aren't great, but are often useful for mixing static and dynamic code together.
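For anyone who hasn't seen it, this is roughly the style I mean; a sketch, with `last`, `Stack` and the config values invented for the example. A generic function only needs the `TypeVar`, a generic class additionally has to inherit from `Generic[T]`, and the closest thing to an inline assertion that I know of is the `typing.cast` function call, which does nothing at runtime.

```python
from typing import Generic, List, TypeVar, cast

T = TypeVar("T")


# Generic function: T is introduced just by using it in the signature.
def last(items: List[T]) -> T:
    return items[-1]


# Generic class: T has to be declared again through the Generic base class.
class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


# cast() only informs the checker; no conversion or check happens at runtime.
raw: object = {"port": "8080"}
config = cast(dict, raw)
port = int(config["port"])
```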

And I think partly it's also that Python has a much higher level of dynamism in the type system - lots of "clever" things are possible in Python that just wouldn't be possible in the same way in Javascript - which means that mypy has a much more difficult time in trying to describe the idiomatic Python patterns that people are actually using. In Typescript, figuring out the correct types feels like an extension of writing natural Javascript code and understanding the flow of data through the program. With mypy, it feels like I'm restricted to writing the subset of Java-like code that conforms to the type checker.

It's a disappointing feeling, because I like static types, and I had hoped it would help solve the problem of large codebases in Python. But in my last project it became so painful to use, and felt like it was adding so little (and preventing so few actual issues), that I kind of regretted pushing for it.

I hope a lot of these problems are just teething issues, and that over time it'll get better. But right now I would be very cautious about using it in production.


> - lots of "clever" things are possible in Python that just wouldn't be possible in the same way in Javascript

JS has Object.setPrototypeOf(), among other things, so I doubt this.


In practice, that's very rarely used, partly because inheritance is so uncommon in Javascript programs. However, for a long time, Javascript lacked anything like the __getattr__/__setattr__ methods, it has no real operator overloading, and decorators are more limited and generally rare outside of certain ecosystems. These sorts of metaprogramming techniques that are very normal in Python are more rare in Javascript, and when they are present, they're generally easier to type.
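As an example of the kind of thing I mean, here is a minimal `__getattr__` sketch (the `Record` class is hypothetical). It is perfectly ordinary Python, but a static checker has no way of knowing which attribute names exist on an instance.

```python
class Record:
    """Exposes keyword arguments as attributes through __getattr__."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # Going through __dict__ avoids recursing if _fields is not set yet.
        fields = self.__dict__.get("_fields", {})
        try:
            return fields[name]
        except KeyError:
            raise AttributeError(name) from None


person = Record(name="Ada", age=36)
print(person.name, person.age)
```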


> You do have to embrace types though; if you have a dictly typed program where everything is a Dict[str, Any], then putting that in annotations isn’t going to be very helpful; converting the dicts to named tuples or dataclasses is.

Or even just TypedDict, which often works without any changes besides annotations.
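A minimal sketch of that, with a made-up `Order` shape. The value is still a plain dict at runtime, so existing code that indexes it by key keeps working; only the annotations are new.

```python
from typing import TypedDict


class Order(TypedDict):
    sku: str
    quantity: int
    unit_price: float


def total_price(order: Order) -> float:
    # A checker can verify the keys and the value types here.
    return order["quantity"] * order["unit_price"]


order: Order = {"sku": "abc", "quantity": 3, "unit_price": 2.5}
print(total_price(order))
```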


>> It's honestly not very good compared to a "real" type system (platform).

> I've used Mypy since 2016 on big as well as small codebases, and it has been extremely useful for me.

You can both be right. It's both not very good compared to a strict type system, and still way better than not having it.


The mypy experience is awful with numpy and pandas.


While I agree that type hinting by its very own nature feels a bit bolted on, I vastly prefer going into code bases which include type hinting. I personally always add type hinting to the code I write, as I actually consider it quite useful.

In what ways do you think it's painful?


I remember mypy being slow and buggy. I remember one mypy upgrade broke all our builds because they changed some type of resolution thing. IIRC after some outcry they backtracked and started providing some migration path.

The other thing which rubbed me the wrong way was that Python was happy to run the code with completely wrong type hints.

I guess I went into it with wrong expectations, even though it says right in the name - it's "type hints". The whole experience felt more like a formalized documentation with partially working optional verification (which can't really be relied upon).
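A tiny example of what I mean (hypothetical function, obviously). A type checker should reject the call, but the interpreter just stores the annotations and carries on:

```python
def double(x: int) -> int:
    return x * 2


# Wrong according to the hints, yet this runs and repeats the string.
value = double("ab")

print(value)
print(double.__annotations__)
```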


I've also had a mixed experience with mypy. Take a look at using pyright for static type checking instead, it's worked quite well for me.

But do write type hints. I recently got thrown into a large-ish project where neither types nor docs were used. Trying to figure out wth a parameter was supposed to be wasn't a pleasant experience for a newcomer. In addition to improving DX, I also believe it's a lot more effective in the long run.

I saw how these guys were developing: write code, run code, deal with the runtime crashes they encounter, then run code some more and deal with other unexpected runtime crashes. It would have been a lot faster and more stable if they'd just used type hints and static type checking, as their IDE could've easily found many of these bugs for them immediately.


Type hints are basically for documentation and metadata. You also find a bunch of third-party libraries and frameworks, like pydantic, fastapi, etc., that make use of them.
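To show the "metadata" side, here is a toy sketch that reads annotations at runtime with `typing.get_type_hints`. This is not how pydantic or FastAPI are implemented; `User` and `validate` are invented for the example, and the check only handles plain classes as annotations (no `List[int]` and the like).

```python
from dataclasses import dataclass
from typing import List, get_type_hints


@dataclass
class User:
    name: str
    age: int


def validate(obj) -> List[str]:
    """Return one message per field whose value does not match its annotation."""
    problems = []
    for field, expected in get_type_hints(type(obj)).items():
        value = getattr(obj, field)
        if not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


print(validate(User("Ada", 36)))
print(validate(User("Ada", "36")))
```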


> Type hints are basically for documentation and metadata.

They are also for static checking.

If you choose to run the code despite that failing (which you can, because there is no execution dependency on type-checking) that’s a choice, but it’s kind of odd to complain that Python lets you do that.


I'm not the original poster and not complaining about it. I once worked at a shop that ran mypy checks as a precommit hook and never really found it terribly useful, but to each their own.


I don't follow. Python's static types can be gradually introduced to an untyped code base. Sure, they may be unhelpful if you don't use them everywhere. But how can they be painful? It's not like they prevent your code from being compiled?

Having a Scala code base not compile because someone went all-in on type-level programming, and now simple changes require an understanding of category theory and the associated compiler flags .. that's real pain.


I have experience in both and I agree with OP.

Python's type-documentations are useful, but sometimes they are just wrong, which makes it impossible to actually trust them.

> Having a Scala code base not compile because someone went all-in on type-level programming, and now simple changes require an understanding of category theory and the associated compiler flags .. that's real pain.

Well, the same can happen in python if someone goes crazy and uses a lot of reflection / dynamic changes (i.e. overwriting builtin methods etc.). In both cases it sucks and should have been prevented by other people, but at least in the case with Scala you still have a compiler that can help you "unroll" those changes because it tells you when you screw up. In python you can only have tests or pray.


Oh wow you should definitely give this a shot again. Type hints in Python save the entire language for me.

People do have bad habits of cramming everything into dictionaries but if you do some type hinting and use data classes heavily you’ll really have a good experience.

I could take or leave mypy personally.


Indeed.

I wish I'd kept it but a few years ago someone did an analysis of all the public python code in github to see what functions were called the most.

#1 was `type()`. <shocked face>


It’s as real as typescript… not useful for code generation or optimization, but very helpful for correctness.


It's a far cry from an actually typed language. I like Python and deal with its code bases for a living, but frankly it has gotten too big for its britches.


> Python has optional type annotations

Moreover, it has static type checkers with varying degrees of support for type inference.


I find they clutter the code more than they help. Good names and docstrings and comments are more valuable.

And static types are for people who are too lazy to write test code.


Sorry but this is a garbage take. Typing negates the need for writing useless tests that just verify the correct value types are passed - it does not negate the need for writing functional tests.

Where did you come up with this?


You seem to not understand duck typing.

Functional tests will test the correct types, too.

Edit: If you need an object that swims and quacks, you don't need to care if it is a rubber duck or an animal.
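For what it's worth, duck typing can also be written down for a static checker: `typing.Protocol` describes "anything that swims and quacks" without requiring a common base class. A sketch with made-up classes:

```python
from typing import Protocol


class SwimsAndQuacks(Protocol):
    def swim(self) -> str: ...
    def quack(self) -> str: ...


class Mallard:
    def swim(self) -> str:
        return "paddles"

    def quack(self) -> str:
        return "Quack"


class RubberDuck:
    def swim(self) -> str:
        return "floats"

    def quack(self) -> str:
        return "Squeak"


# Neither class inherits from SwimsAndQuacks; having the methods is enough.
def exercise(duck: SwimsAndQuacks) -> str:
    return f"{duck.swim()} and says {duck.quack()}"


print(exercise(Mallard()))
print(exercise(RubberDuck()))
```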


For quick scripts, I've recently taken to Deno

It has typescript built in so I can very quickly make a script.ts file anywhere and run it in the CLI with `deno run script.ts`. Works flawlessly and I get access to TS's amazing type system without any build step or setup or even any additional files

for any data-intensive scripts I still sorely miss pandas though


What's the story in Deno with the standard lib and filesystem interaction? Because those are the two vital parts of scripting for me. I dabbled in Node scripting for a bit and ditched it because I didn't feel it was a viable option compared to Bash or Python.


It's all built-in. No extra dependencies or even imports necessary. Relative and absolute file paths work as expected. You can save and read any files freely

You can import synchronously, asynchronously, as a stream, as a string, etc. All the functionality you'd expect is built-in with the Deno module (e.g. `Deno.readTextFile('./my/path.json')`) or even with module imports


When reading large projects written in C I find myself surprisingly little helped by static types. Almost everything is structs, pointer to structs, or structs with pointer to more structs.

I could imagine that if developers choose to use less complex objects and less indirection then types in such large projects would be more useful in explaining the data. It just hasn't been my experience so far.


I feel exactly the opposite, compared to Python at least. It's usually easy to find the definition of the structs, and then you can quickly make sense of what a given function can and cannot do, and what it has access to.

With Python I never know, since something might have dynamically added or changed some methods or fields along the way. I almost always end up sprinkling dir() all over just to figure out what exactly is going on.
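A contrived sketch of the situation (all names hypothetical): the class definition tells you about one attribute, and some helper elsewhere quietly adds more, so comparing `dir()` before and after is the only way to see what an instance really carries.

```python
class Sensor:
    def __init__(self, name: str) -> None:
        self.name = name


def attach_calibration(sensor: Sensor) -> None:
    # Imagine this living in a different module from the class definition.
    sensor.offset = 0.5
    sensor.read = lambda: 20.0 + sensor.offset


sensor = Sensor("probe")
before = set(dir(sensor))
attach_calibration(sensor)
added = sorted(set(dir(sensor)) - before)

print(added)
```

A type checker would most likely complain about the assignments in `attach_calibration`, which is sort of the point: without one, nothing does.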


I think that's because it's C, where you have to use void* occasionally, structs are often anonymous to avoid compile-time increases from including another header file, and there's a hellscape of pointers involved in doing anything significant. If you picked up a modern language with static typing, like Rust, Go, Kotlin, or even C++/Java, you might see some significant readability benefits.


Imagine how much worse those structs with pointers to structs would be without static typing though. This feels like a complaint against C, not a complaint against static types.


It is a comment about how large projects, at least in my experience, seem to use complex data structures that take several indirections to follow.

In both C and Python I have also seen something akin to a mini-language inside large projects, where understanding the code becomes almost impossible without documentation. In C, both macros and pointers to structs with more pointers can do a lot of heavy lifting to hide every detail of how something is being done and just leave the intention. Great if one wants to see a high-level concept and create new features, but terrible if you want to know where that one bit of information is being stored and how to inject your own code into the core parts of the project. Similarly, large projects in Python tend to use meta, monkey patching and other dark magic patterns to really hide the low-level details while making it easy to create new features.


I think that's a take in the category of 'you don't want types, you want better names'.

Static types are very useful for compilers but looking at a function and seeing int -> int -> int -> string -> bool -> int, says very little about the semantics of a program. It's always the names and documentations that tell human beings how to make sense of a program.

When we put things in record types, the sense-making value isn't in the static analysis but in the fact that our vague collection of parameters now has a proper name that indicates what it's all about.


> Static types are very useful for compilers but looking at a function and seeing int -> int -> int -> string -> bool -> int, says very little about the semantics of a program.

Sure, but on the other hand a type of APIProxyEvent -> LambdaContext -> APIGatewayProxyResponse says quite a bit more about the semantics.

Unless a function is highly abstract, int -> int -> int -> string -> bool -> int is probably an underspecific type signature.

EDIT: To be clear, I generally find that the thesis “you don’t want types, you want better names” comes from assuming bad types, and suggesting replacing them with good names. And, for casual inspection, yes, good names may be superior to bad types. On the other hand, I can’t statically check good names, but I can statically check types (good types, or bad-because-underspecific types; but good types, as well as telling me more as a reader, will also statically catch more possible errors). Ultimately, what I want is good types and good names.
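A sketch of the difference in Python terms, with an invented `RetryPolicy` example. Both functions have good parameter names, but only the second has a signature a checker can do much with: in the first, swapping two ints is invisible to it.

```python
from dataclasses import dataclass


# Underspecified: int -> int -> int -> int. Swapped arguments still type-check.
def worst_case_wait_raw(attempts: int, delay_ms: int, timeout_ms: int) -> int:
    return attempts * (delay_ms + timeout_ms)


@dataclass(frozen=True)
class RetryPolicy:
    attempts: int
    delay_ms: int
    timeout_ms: int


# RetryPolicy -> int: the signature itself now carries the meaning.
def worst_case_wait(policy: RetryPolicy) -> int:
    return policy.attempts * (policy.delay_ms + policy.timeout_ms)


policy = RetryPolicy(attempts=3, delay_ms=100, timeout_ms=1000)
print(worst_case_wait(policy))
# Arguments accidentally swapped; nothing complains, the answer is just wrong.
print(worst_case_wait_raw(100, 3, 1000))
```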


It's not so much assuming bad types as saying that names and types properly used serve different needs. For example, if you have types defined in your program, you could literally replace them with some random characters and the type-checking would be equally good, for the compiler, even if you don't understand a thing. The value of types really is in the structure they represent, and enforcing certain constraints, not in any human understanding of the program. You could even have badly structured types with decent labels.

When you're using types as a means to check names you're likely to misuse types. Synonyms are a good example of this, where people will make so many types each type only ever occurs once, instead of having a well named variable of a more generic type.


> It's not so much assuming bad types as saying that names and types properly used serve different needs.

Different but overlapping needs, yes.

> For example, if you have types defined in your program, you could literally replace them with some random characters and the type-checking would be equally good, for the compiler, even if you don't understand a thing

Sure, but for human consumption you want good names of types, not just good logic of types, just like you want good names of variables.

But, while you can (without types) overload names of variables with the human information that would be in names of types, this is both less ergonomic than separating the two kinds of names, and doesn’t support type logic the way types that are both well-structured and well-named do.

> The value of types really is in the structure they represent, and enforcing certain constraints, not in any human understanding of the program.

I could not agree more strongly with this, it is no more true of types than it is of the rest of code: yes, with any part of code the logic is what matters functionally, but humans maintain the code, so if it’s not written (both names and structure, within the variations that produce correct behavior) for human understanding, it’s less useful, and potentially useless.

> When you're using types as a means to check names you're likely to misuse types.

I don’t know what this is referring to. If there is a statically verifiable feature, it is a real type constraint.

> When you're using types as a means to check names you're likely to misuse types. Synonyms are a good example of this, where people will make so many types each type only ever occurs once, instead of having a well named variable of a more generic type.

Synonyms/aliases (at least in languages I am familiar with) are explicitly not types, but alternate names for types. They aren’t checkable, only the underlying type is. They are useful in much the same way as named constants.
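In Python's typing terms, the distinction looks roughly like this (a sketch; `Meters`, `Seconds` and `wait` are made up). A plain alias is literally the same object as the underlying type, while `typing.NewType` gives the checker a distinct type even though the value is still an ordinary float at runtime.

```python
from typing import NewType

# Alias: just another name for float, interchangeable with it everywhere.
Meters = float

# NewType: distinct from float for the checker, identical to float at runtime.
Seconds = NewType("Seconds", float)


def wait(duration: Seconds) -> float:
    return duration * 2


alias_is_same_object = Meters is float
elapsed = wait(Seconds(1.5))
runtime_type = type(Seconds(1.5)).__name__

print(alias_is_same_object, elapsed, runtime_type)
```

I would expect a checker to reject `wait(1.5)` here, whereas a function annotated with the `Meters` alias would accept any float.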


The other day I had to make sense of some Python code that used numpy heavily. It was an absolute disaster to figure out what the code was actually doing because of Python's lack of proper typing.

Take for example, the numpy.dot function. This is from the documentation:

> numpy.dot(a, b, out=None)#

> If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).

> If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.

> If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.

> If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.

> If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:

So to figure out what this call to dot actually does and what its output type is, I need to know the actual types of the 2 inputs. Of course, those inputs are the results of other function calls, which are also dependent on the types of multiple inputs. When reading code you have to manually keep track of each type as you read through it, and you have to look up every function to see what it does and what it can return under what conditions.
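To show what I would have liked to see, here is a toy pure-Python stand-in (not NumPy, and I'm not claiming NumPy's own annotations look like this) where `typing.overload` spells out which input combination gives which output. It only covers two of the cases from the docs above, using nested lists.

```python
from typing import List, overload

Vector = List[float]
Matrix = List[List[float]]


@overload
def dot(a: Vector, b: Vector) -> float: ...
@overload
def dot(a: Matrix, b: Vector) -> Vector: ...


def dot(a, b):
    """Toy dot product: vector with vector, or matrix with vector."""
    if a and isinstance(a[0], list):
        # Matrix times vector: one inner product per row.
        return [sum(x * y for x, y in zip(row, b)) for row in a]
    # Vector times vector: a single inner product.
    return sum(x * y for x, y in zip(a, b))


print(dot([1.0, 2.0], [3.0, 4.0]))
print(dot([[1.0, 0.0], [0.0, 1.0]], [3.0, 4.0]))
```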

Programming languages are not for computers, they are there for humans to understand what is going on. Python absolutely fails at this because python code is almost totally unreadable. It is at best a write-only programming language.


You're complaining about a library, not Python.


> Static types are very useful for compilers but looking at a function and seeing int -> int -> int -> string -> bool -> int, says very little about the semantics of a program. It's always the names and documentations that tell human beings how to make sense of a program.

Sure, agreed, but if a program contains poorly named types then it's a good bet that if types were not mandatory, those authors who could not be bothered with properly naming their types would be equally incompetent at naming their variables, parameters and functions.

IOW, if author competence is important for typed programming to be readable, then it is even more important for untyped languages!


One thing about math heavy fields is that their core primitives are well rounded and concise, you can write complex math formula because you have a mental map of domains and projections.

On the other hand every business comes up with its own little world.. and suddenly you're in the dark. Here strong and static typing helps.


Datatypes are a great way of making static types. Python's power is that you can use as little or as much typing as is right for you.



