A retrospective on Requests (stapletoncordas.co)
125 points by zdw 10 months ago | 42 comments



They should just go ahead and make breaking changes after a big announcement, so that everyone is clear that there will be no backward compatibility at the next major release. Otherwise the majority of design, development and mental energy will be spent on compatibility issues, leading to full burnout with little to no feature gain.

You only need to look at C++ papers, where superhumanly smart and driven IQ 200+ people desperately try with all their will to define a language or stdlib feature/improvement that preserves ABI backward compatibility, and then finally give up and throw in the towel after two dozen iterations and rejections. They also suffer from severe burnout in the process.


Breaking changes in requests would be a disaster.

The problem is that they're not just a library people use directly: they're a library that is often used by other libraries.

The worst possible form of dependency hell is when you want to use libraryA - which depends on requests<3.0 - and libraryB, which depends on requests>=3.0, at the same time.

At that point, you're simply stuck. You cannot use both libraryA and libraryB at the same time until they both upgrade to the same underlying requests dependency.
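To make the conflict concrete, here is a minimal sketch. The two constraint functions mirror the hypothetical libraryA and libraryB above, and the candidate version list is made up for illustration rather than taken from real releases.

```python
def parse(version):
    """Turn '2.31.0' into (2, 31, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def library_a_accepts(version):
    # Hypothetical libraryA declares requests<3.0
    return parse(version) < parse("3.0")


def library_b_accepts(version):
    # Hypothetical libraryB declares requests>=3.0
    return parse(version) >= parse("3.0")


# Illustrative candidate versions, not a real release list.
candidates = ["2.28.0", "2.31.0", "3.0.0", "3.1.0"]

usable = [
    v for v in candidates
    if library_a_accepts(v) and library_b_accepts(v)
]
print(usable)  # empty: no single version satisfies both constraints
```

The two ranges do not overlap, so a resolver that must pick exactly one version has nothing to pick.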

Python projects are affected by this because it isn't currently possible to have two versions of the same library installed at the same time, because they clash with each other in the sys.modules global namespace.
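A rough illustration of the sys.modules clash, using two in-memory stand-in modules. The package name foo and its version strings are invented for this sketch.

```python
import sys
import types

# Two hypothetical "versions" of the same package, built in memory.
foo_v1 = types.ModuleType("foo")
foo_v1.__version__ = "1.0"
foo_v2 = types.ModuleType("foo")
foo_v2.__version__ = "2.0"

# sys.modules is keyed by module name only, so there is one slot for "foo".
sys.modules["foo"] = foo_v1
import foo
first = foo.__version__

# Registering the second version under the same name replaces the first.
sys.modules["foo"] = foo_v2
import foo
second = foo.__version__

print(first, second)

del sys.modules["foo"]  # clean up the fake entry
```

Every `import foo` in the process resolves to whatever currently sits in that single slot, which is why two libraries cannot each get their own copy.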

This isn't true of all modern programming languages - I've been meaning to dig around and figure out which ones are affected by this problem and which are not - I think Node.js and Rust have mechanisms for this, but I don't think Go or Java do.


> I don't think Go or Java do.

Go's module system was specifically designed with this problem in mind: https://go.dev/blog/versioning-proposal

I think Java projects get around this when they have to with shading, but that's a bit clunky.


Java had some magic classloader ideas back in the day. OSGi, I think it was? I don't remember them using the term shading, but I also don't remember ever liking how that went. Usually very confusing for all involved. (Unless you were on the happy path, I think.)


It can be handled "kludgingly" by incrementing the name. e.g. foo --> foo3


Not sure why that's "kludgey".


Dart doesn't have this problem.

There is a global cache of downloaded packages, but it's just a pure cache and happily stores multiple different versions of the same package.

Each Dart application (or package) has its own set of selected package versions independent of any other application on the same machine.


Can you run two versions of the same Dart library/package within the same application?

Python (thanks to virtual environments) works fine at multiple package versions, it's only when they need to be running in the same application process at the same time that you run into problems.


No, by design within a single program you only ever have a single version of a given package.

Because Dart is used primarily for client-side applications (and initially only web applications), we care a lot about code size. NPM's approach of silently allowing multiple versions of the same package adds a lot of bloat, which is fine for a server language but not great for a client one.

Also, nominal static typing makes having multiple versions of the same package potentially very unpleasant for users. Say you have an application that depends on packages "a" and "b". "a" and "b" both depend on a package "foo" and each gets its own version of "foo". The "foo" package (both versions) defines a type Foo. It's entirely possible for an instance of Foo from "a"'s version of "foo" to flow through the program over to "b" and then get passed to "b"'s version of "foo".

The end result will be compile time and/or runtime errors like "Expected a value of type Foo but got a value of type Foo." This is not a great user experience. NPM avoids this problem by being dynamically typed. Go mitigates it by being structurally typed for interfaces.
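The "Foo is not Foo" situation can be sketched in Python too. Python is dynamically typed, so an explicit isinstance check stands in here for the nominal check a static type system would perform. The helper `load_foo` and the package name are hypothetical.

```python
import types


def load_foo(version):
    """Build a stand-in for one installed copy of a hypothetical 'foo' package."""
    module = types.ModuleType("foo")

    class Foo:
        pass

    def accept(value):
        # Nominal check: only *this copy's* Foo class is accepted.
        if not isinstance(value, Foo):
            raise TypeError(
                f"Expected a value of type {Foo.__name__} "
                f"but got a value of type {type(value).__name__}"
            )
        return True

    module.Foo = Foo
    module.accept = accept
    module.version = version
    return module


foo_for_a = load_foo("1.0")  # the copy package "a" would see
foo_for_b = load_foo("2.0")  # the copy package "b" would see

value = foo_for_a.Foo()
try:
    foo_for_b.accept(value)
    message = "accepted"
except TypeError as exc:
    message = str(exc)

print(message)
```

Both classes are named Foo, but they are distinct objects, so the check fails with a message that names the same type twice.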

Obviously, better error messaging can help, but it's just generally confusing to have multiple versions of what a user thinks of as "one" package when really there are multiple floating around in the same program.

For the most part, selecting a single version of any shared dependency works pretty well. It's definitely not perfect. It can make it harder for heavily used packages to make breaking changes. But the overall trade-offs seem to work fairly well.


With npm, you either have as many copies of a dependency as there are uses of it, or you flatten it and have the same problem.

go codifies foo -> foo2 as foo -> foo/v2


NPM and other Node.js package managers happily deduplicate dependencies that are compatible within the specified version constraints. So if 3 different modules in your build specify `foo^2` and two other modules specify `foo^3`, in general you will have two copies of `foo` loaded.
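A simplified sketch of that deduplication, written in Python for illustration. It assumes every constraint is a caret range on a major version of 1 or higher, where one copy per major version is enough. Real resolvers handle many more range forms, and the module names are made up.

```python
from collections import defaultdict

# Hypothetical dependents and the caret range each one requests for `foo`.
requested = {
    "mod1": "^2.0.0",
    "mod2": "^2.1.0",
    "mod3": "^2.3.0",
    "mod4": "^3.0.0",
    "mod5": "^3.2.0",
}


def caret_major(range_spec):
    """Return the major version a caret range like '^2.1.0' is pinned to."""
    return int(range_spec.lstrip("^").split(".")[0])


# Dependents whose ranges share a major version can share one installed copy.
groups = defaultdict(list)
for module, spec in requested.items():
    groups[caret_major(spec)].append(module)

copies = len(groups)
print(copies, dict(groups))
```

Five dependents collapse into two installed copies, one per major version.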


Contrary view, they shouldn't make breaking changes to a library. Stability is a nice thing.

If there are truly better ways of doing things /that matter/ in some space, find a way to incorporate them. Easy parallels to screw driving options. I'd be surprised to not see a flat head on outlet covers in any house. I'd be surprised to see traditional drives (Phillips, flat, or the one that looks like Phillips...) on most anything made by a crew nowadays. There are objectively better drivers, but it would be obnoxious to constantly have a different driver for every item in your house.


Or just make a new project and let the community deal with requests if they want.


I don’t see a market need for a new one. Python’s own stdlib urllib is pretty good for most cases, except async, and aiohttp is covering that gap.


HTTPX is the new one, and it's very popular. I've been using it in place of Requests for a few years now.

https://pypistats.org/packages/httpx - 2 million downloads a day (requests has 15 million https://pypistats.org/packages/requests)


Download count is an interesting measure, but I can’t tell if urllib3 is more popular than requests or if it ranks higher only because it’s a direct dependency of botocore! I wish there was a way to categorise these downloads by python versions.


> I wish there was a way to categorise these downloads by python versions.

I don't know if I maybe misunderstand your wish.

But if you scroll down on the pypistats link, there are the download numbers split up by python major and minor versions.


Pretty sure Deno for Node has similar problems. Sometimes you just gotta let the past go.


Only if the delta of the improvements is big enough which isn't the case for Deno


Sorry what? I'm saying Deno suffers due to Node backward compatibility. A write up about it:

https://www.baldurbjarnason.com/2024/disillusioned-with-deno...


Yikes. I always had a weird feeling about Requests looking from the outside, but it's worse than I thought.

If you're looking for alternatives: https://www.python-httpx.org/


Or just the built-in urllib is pretty usable as-is at this point.


My main issue with urllib is it requires too much boilerplate and it ends up being harder to read the code at a later date in my experience. In general I prefer the standard library, but it’s a bit of a toss up in this case.
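For a sense of the boilerplate involved, here is a sketch of a JSON POST with only the standard library. The URL and payload are placeholders, and the request is built but deliberately not sent.

```python
import json
import urllib.request

# Placeholder URL and payload, chosen for illustration only.
payload = {"name": "example"}

request = urllib.request.Request(
    "https://example.com/api/items",
    data=json.dumps(payload).encode("utf-8"),  # body must be bytes
    headers={
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    method="POST",
)


def send(req, timeout=10):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# send(request) would perform the network call; it is left uncalled here.
print(request.get_method(), request.full_url)
```

With Requests, roughly the same thing is `requests.post(url, json=payload).json()`, which is the readability gap being described.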


Plain urllib3 starting from v2 is also pretty nice to use.


I use and have contributed to httpx but I'm scared of moths and had to block the image in my browser. Looks like they updated the image to be an even more detailed one :(


For reference, it's a butterfly, not a moth.

Source: https://github.com/encode/httpx/issues/834


I’ve seen comments here on HN about people having similarly visceral reactions to the name CockroachDB and swear they will never use it for that reason alone. I always laughed at the absurdity but now I actually have some sympathy towards this type of reaction. If I opened the docs for an unsuspecting python package and was greeted with a detailed, half-screen photo of any arachnid I would probably throw whatever device I was using.


For folks that don't recognize the author (Ian Stapleton Cordasco), he's one of the core maintainers for python-requests (sigmavirus24 on github).


It's good to see an API critique. I don't think enough attention is paid to their design.

The problematic design is something I see fairly frequently from people who aren't familiar with sum types, aka modelling data that has logical ors. It's very common in the Javascript world as well, for example, and in Go it's baked in with functions returning four possible values (success and error, success and nil, nil and error, nil and nil) when only two are valid.
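As a rough sketch of the sum-type idea in Python: a result is either an Ok or an Err, never both and never neither. The names `Ok`, `Err`, `parse_port` and `describe` are invented for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: str


@dataclass(frozen=True)
class Err:
    reason: str


# Exactly two valid shapes, instead of four (value, error) combinations.
Result = Union[Ok, Err]


def parse_port(text: str) -> Result:
    """Validate a TCP port given as text."""
    if text.isdigit() and 0 < int(text) < 65536:
        return Ok(text)
    return Err(f"invalid port: {text!r}")


def describe(result: Result) -> str:
    # A type checker can verify that both variants are handled here.
    if isinstance(result, Ok):
        return f"port {result.value}"
    return f"error: {result.reason}"


print(describe(parse_port("8080")))
print(describe(parse_port("http")))
```

Python does not enforce this at runtime, but with a type checker the "success and error at once" and "nothing at all" states cannot be expressed.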

A better design uses combinator libraries / builders / fluent APIs. These are little finite state machines, and the best libraries make it so invalid transitions cannot be compiled (which requires a type system).


I feel this is a bit of a recurring theme in the Python world, where nice simple things are dragged into the serious (enterprise) production world and then accumulate a gazillion bits of cruft, hiding the originally nice simple thing. It's quite a fundamental problem, trying to please both the ad-hoc small script and X million line production system users; Python straddling the gap imho ends up pleasing neither.

My preference would be to focus on excelling on the small-scale stuff and leave the complex stuff for ecosystems better aligned with that; basically let Java be Java and Python be Python.


It's one of the stereotypical failure cases of dynamically-typed languages. I cite this not to bury them in general, but because it's something that people programming in dynamic languages need to be aware of. If object oriented languages seem to frequently end up with "God Objects" dynamically-typed languages end up with "God Functions", where Just One More Parameter at a time, a core function grows slowly and steadily until it is a monstrosity of several dozen parameters and there is simply no conceivable way to document how they all interact with each other, test all the combinations, conceptualize all the combinations, etc. It's very dangerous because it happens often just one parameter, or even fractions of a parameter ("let's make this string also able to be an array of strings") at a time, each step seeming sensible at the time but the totality becoming a nightmare.

Statically typed languages can incrementally end up with too many parameters too, but the static typing generally helps reduce the complexity, and there are generally more incentives to start binding the parameters up into meaningful other structs. Dynamically typed languages, on the other hand, have the tendency to start widening what types each parameter can take, which makes the problems even worse. Is "targetURL" a string, an array of strings that will automatically turn the return into an array of results, an object that implements ".toURL()", a function that will be automatically called with some magic parameters, a None which will then invoke other magic to get the URL from somewhere else, etc.?
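One way to apply the "bind parameters into structs" discipline in Python is sketched below. `FetchOptions`, `Timeouts` and `fetch_plan` are hypothetical names, and no network access happens.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Timeouts:
    connect: float = 5.0
    read: float = 30.0


@dataclass(frozen=True)
class FetchOptions:
    # Always one string: no lists, callables, or None-means-magic.
    target_url: str
    timeouts: Timeouts = field(default_factory=Timeouts)
    proxy: Optional[str] = None


def fetch_plan(options: FetchOptions) -> str:
    """Describe what would be fetched; no network access is performed."""
    via = f" via {options.proxy}" if options.proxy else ""
    return (
        f"GET {options.target_url}{via} "
        f"(connect={options.timeouts.connect}s)"
    )


print(fetch_plan(FetchOptions("https://example.com")))
```

Related settings travel together in one object, and each field keeps a single narrow type instead of widening over time.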

(Don't worry, dynamic typing fans, static languages have their own characteristic failures, like the aforementioned God Object.)

Dynamically typed code bases are not doomed to end up this way. It can be avoided with discipline, which starts with the awareness that it is a problem at all. And it's a very good idea to learn to avoid them, because they're terribly difficult to tear back apart once constructed. I've worked in a code base that darned near had a "JustDoEverythingWeCanPossibleDo(...)" function, where (...) doesn't represent the true horror because there were rather a lot of global variables getting set that had huge impacts on the process as well. Trying to pull that apart to do any real work was, ah, an experience.


I would argue that Python itself is a case of this. It peaked early as an ultra-dynamic language, and was very consistent at it - e.g. all declarations are executable code, which is actually fairly straightforward conceptually and easy to explain - but this approach doesn't mesh well with things like class declarations and imports, which turns out you need for more serious development.

More recently, this is the case with typed Python. Again, the dynamic-ness of Python - which easily exceeds that of JS - was part of the original design for type annotations. Turns out it doesn't work well with things like forward references or type parameters, so ugly hacks (like stringifying type names) were introduced to deal with that. Now there's yet another revamp to fix the resulting ugliness and inconsistencies (take a look at https://peps.python.org/pep-0649/ to see what I mean).
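The stringified forward reference hack looks roughly like this. `Node` is a made-up class, and the names are passed to `get_type_hints` explicitly so the sketch does not depend on where it is run.

```python
from typing import Optional, get_type_hints


class Node:
    # "Node" does not exist yet while this signature is being evaluated,
    # so the annotation has to be written as a string.
    def __init__(self, value: int, next_node: "Optional[Node]" = None):
        self.value = value
        self.next_node = next_node


# At runtime the annotation is stored as plain text...
raw = Node.__init__.__annotations__["next_node"]

# ...and has to be evaluated later to recover the real type.
resolved = get_type_hints(
    Node.__init__,
    localns={"Node": Node, "Optional": Optional},
)["next_node"]

print(repr(raw), resolved)
```

The annotation only becomes a real type once something evaluates the string after the class exists.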


Modern python is basically a bucket of puke in between the Annotated crap, async/await, decorators everywhere..the elegance is gone for good. Sad.


> My preference would be to focus on excelling on the small-scale stuff and leave the complex stuff for ecosystems better aligned with that; basically let Java be Java and Python be Python.

Agreed 100%. Strongly prefer python for the simple scripting and conceptually simple mental model, and others for more complex stuff.

Also, don't use tools that were not designed for the job from the beginning and then complain that it does not fit your use case.


Don’t you think that ship has sailed? Python powers software of incredible complexity these days including back-ends for large companies and all of AI (yes, I know the “hard parts” are written in C++, but still).


> Don’t you think that ship has sailed?

No, those ships are setting sail with anchors down, and complaining why they're moving so slow.


>No, those ships are setting sail with anchors down, and complaining why they're moving so slow.

It's 2024 and you still think the choice of language in which you build your product is what makes or breaks a company?


It's quite rare for companies to fail because of one bad decision, but I believe that choosing python for large complex projects will have long term negative consequences, irrespective of whether it will on its own kill the company or not.


So do you think for example FAANG-like companies would have done pretty much as well if they wrote all their stuff in Scratch[0] instead, or do you think the choice of programming language matters?

[0] https://en.wikipedia.org/wiki/Scratch_(programming_language)


Interesting post on a library I've used a lot.

> Requests has always had modules intended for its use and only its use. At some point, we tried to document them for ourselves and that led to people using them externally and then filing bugs against them. This is a perfect example of 3 adults who did not consent explicitly to the use, but Python says that we must have because we didn't hide them well enough.

I don't understand the problem? It's bad that users reported bugs in internal modules?


It's bad that users used modules marked explicitly as not for public use, because now those modules can't be changed without breaking user code.


> I gave it but realized that most people only know that Requests does the right things for them for TLS

as long as you don't work somewhere that has its own internal CA infrastructure

requests decides to use its own built-in CA store instead of the carefully tweaked OpenSSL default (the location of which you may not know declaratively)

meanwhile urllib works fine
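If it helps, the standard library can report where the OpenSSL defaults point. This is a sketch: the helper name `system_ca_bundle` is invented, and what it returns depends entirely on how the local Python and OpenSSL were built.

```python
import ssl


def system_ca_bundle():
    """Return the CA file OpenSSL would use by default, or None if unknown."""
    paths = ssl.get_default_verify_paths()
    # cafile is None when the resolved file does not exist on this machine;
    # openssl_cafile is the compiled-in default path.
    return paths.cafile or paths.openssl_cafile or None


bundle = system_ca_bundle()
print(bundle)
```

Requests accepts a path like this through its `verify=` argument or the `REQUESTS_CA_BUNDLE` environment variable, which is the usual workaround for an internal CA.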

(yes I read further down where this point is sort of mentioned)



