Httpx: A next-generation HTTP client for Python (python-httpx.org)
463 points by tomchristie on Jan 10, 2020 | 115 comments



I've been using httpx 0.9.3 in production now for a couple months. I switched from requests when I realized I needed async support and it has been a dream to use.

The only issue I've run into has been with my attempt to reuse the same AsyncClient to make multiple concurrent requests to the same remote host. It looks like this issue may have been fixed in 0.10 or 0.11 so I'll be upgrading soon to check.

Also, be sure to check out the other fantastic projects by Encode. https://github.com/encode I stumbled upon httpx after using Starlette and Uvicorn for one of our microservices and have been pleasantly surprised by how easy they are to set up and use.


I'm currently using aiohttp. There are features I don't find in the butterfly's docs, like connection limits, or limits on connections per host or per time frame.

Do you know if those exist in butterfly?

I'm already deep in aiohttp and it's not an easy task to learn an async client (at least in my case, but I'm no dev), so if I do switch it would be for more features (a retry option on exceptions, limiting requests per host and per time frame...). But that's only my opinion.


I prototyped moving to aiohttp from requests/multiprocessing. The speedup was amazing. Saw something like a 60% reduction in runtime for our use case. Only reason the code hasn't gone live: we currently use requests_negotiate_sspi for authentication, which sadly isn't supported by aiohttp. Not sure if httpx supports it. Looks like it might. Planning on giving it a shot next week.


We don’t have any third party packages for that authentication style yet, but we do have an API for supporting custom auth flows... https://www.python-httpx.org/advanced/#customizing-authentic...

If you’re interested in trying httpx you’d be very welcome to raise an issue related to NTLM/Negotiate authentication - it’d be really helpful for us to work through that and figure out if our auth API is sufficient for implementing it, or if there’s anything we’re missing.
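
For reference, a minimal sketch of what a custom auth class might look like with that API (a simple bearer-token flow, written against the current httpx auth interface, which may differ slightly from the 0.11-era one; a real NTLM/Negotiate implementation would also inspect responses and re-issue requests from the same generator):

    import httpx

    class BearerAuth(httpx.Auth):
        """Illustrative custom auth flow: attach a bearer token."""

        def __init__(self, token: str):
            self.token = token

        def auth_flow(self, request: httpx.Request):
            # Modify the outgoing request, then yield it to be sent.
            request.headers["Authorization"] = f"Bearer {self.token}"
            yield request

    client = httpx.Client(auth=BearerAuth("my-token"))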


Thanks, will do.


> Fully type annotated.

This is a huge win compared to requests. AFAICT requests is too flexible (read: easier to misuse) and difficult to add type annotations to now. The author of requests gave a horrible type annotation example here [0].

IMO at this time when you evaluate a new Python library before adopting, "having type annotation" should be as important as "having decent unit test coverage".

[0]: https://lwn.net/Articles/643399/


The huge win against requests is that httpx is fully async: you can download 20 files in parallel without too much effort. Throughput in Python asyncio is amazing, something similar to Node or perhaps better... That's the main point.
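
For concreteness, a minimal sketch of the parallel-download pattern (the URLs are placeholders):

    import asyncio
    import httpx

    async def fetch(client: httpx.AsyncClient, url: str) -> bytes:
        response = await client.get(url)
        return response.content

    async def main() -> None:
        urls = [f"https://example.com/file{i}" for i in range(20)]
        async with httpx.AsyncClient() as client:
            # All 20 requests run concurrently over a shared connection pool.
            contents = await asyncio.gather(*(fetch(client, url) for url in urls))
        print(sum(len(c) for c in contents), "bytes downloaded")

    asyncio.run(main())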


FWIW, requests3 has "Type-annotations for all public-facing APIs", asyncio, HTTP/2, connection pooling, timeouts, etc https://github.com/kennethreitz/requests3



Sorry, have you checked the source? Are these features there, or only announced? Has requests finally added a timeout by default?


It looks like requests is now owned by PSF. https://github.com/psf/requests

But IDK why requests3 wasn't transferred as well, and why issues appear to be disabled on the repo now.

The docs reference a timeout arg (that appears to default to the socket default timeout) for connect and/or read https://3.python-requests.org/user/advanced/#timeouts

And the tests reference a timeout argument. If that doesn't work, I wonder how much work it would be to send a PR (instead of just talking smack to Ken and not contributing any code)
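
For reference, the timeout argument in requests accepts either a single number or a (connect, read) tuple:

    import requests

    # Wait up to 3.05s to establish the connection and 27s for the read.
    response = requests.get("https://example.com", timeout=(3.05, 27))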


>But IDK why requests3 wasn't transferred as well, and

That's the thing... Who knows..


TIL requests3 beta works with httpx as a backend: https://github.com/not-kennethreitz/team/issues/21#issuecomm...

If requests3 is installed, `import requests` imports requests3


> "having type annotation" should be as important as "having decent unit test coverage".

I mean this is pretty spot on IMO. I've worked with many languages, and have concluded that having a powerful type system catches soooo many bugs before you even try to run the code.

And they're usually "stupid" bugs too, forgetting to sanitize inputs etc. Even worse is when a language tries to be "smart", so you end up with "1" + 2 = "12" and no errors at all.


Type systems and static checks in general are great. What do you think of the example type in the post above? It seems like the type system might not be expressive enough to handle existing APIs, that’s something which was hard for TypeScript too, but has slowly been getting better. Perhaps that will be the case in Python too.


Personally, I use type aliases in cases where I need to describe very dynamic types that make intuitive sense but are too wordy to pattern-match visually. So you might decompose that example like:

    # Imports added; basestring/file are Python 2 built-ins and Headers
    # comes from the original example's context.
    from typing import Iterable, Mapping, Optional, Tuple, Union

    FileSourceType = Union[basestring, file]
    FileSpecType = Union[
        # (filename, file_source)
        Tuple[basestring, Optional[FileSourceType]],
        # (filename, file_source, content_type)
        Tuple[basestring, Optional[FileSourceType], Optional[basestring]],
        # (filename, file_source, content_type, custom_headers)
        Tuple[basestring, Optional[FileSourceType], Optional[basestring], Optional[Headers]]
    ]

    ...

    files: Optional[
        Union[
            Mapping[basestring, FileSpecType], 
            Iterable[Tuple[basestring, FileSpecType]]
        ]
    ]
You theoretically lose some "glance value" because now you have to look in two places for the type...but in practice, I think you can figure it out a lot easier than the original example. Obviously you don't want to do this in simple cases, but it can make pathological cases like the above a lot easier to chew on.


Some parts of that example seem pretty silly. URLs for example: he describes how just about anything can be passed in as a URL, and requests attempts to call __str__ or __unicode__ on it and then parse it. Considering (essentially?) all types in Python have __str__, this is a perfectly reasonable place to use the Any type. The issue isn't with the expressiveness of the typing system, but with the absurd flexibility of input handling by requests.


The right type is ‘object’, not ‘Any’.

‘object’ is the base class of all types: you can put anything into an ‘object’, and you can only do very generic operations like str() on what you get out of an ‘object’ without further checks like isinstance().

‘Any’ is an unsound escape hatch that disables type checking: you can put anything into an ‘Any’, and you can do anything with what you get out of an ‘Any’, and the type checker will make no effort to stop you from doing something wrong.

https://docs.python.org/3/library/typing.html#the-any-type

https://mypy.readthedocs.io/en/latest/dynamic_typing.html#an...
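
A small example of the difference under mypy:

    from typing import Any

    def stringify(value: object) -> str:
        return str(value)       # OK: str() is valid on any object
        # value.upper()         # error: "object" has no attribute "upper"

    def unchecked(value: Any) -> str:
        return value.upper()    # the checker allows it; may crash at runtime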


Actually the proper way to handle this is to use protocols. typing already provides a SupportsBytes type, and you can similarly create a custom SupportsStr, but IMO it is silly to do that. That approach increases the chance of bugs; it's easier to just use the str type and have the developer wrap such values in str(). I believe that's why SupportsStr was never introduced.
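
For illustration, the protocol in question would look something like this, and since every object inherits __str__ from object, it constrains almost nothing:

    from typing import Protocol

    class SupportsStr(Protocol):
        def __str__(self) -> str: ...

    def render(value: SupportsStr) -> str:
        # Every Python object defines __str__, so this accepts
        # practically any argument anyway.
        return str(value)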


You should check out the Literal type in Python. It provides for some of the stupider things in libraries, like return types that vary based on the value of an input parameter, as in Python's open() function, which might return a text stream or a byte stream based on the value of the second parameter.
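
A sketch of the pattern, using a hypothetical open_file wrapper: Literal plus @overload lets the declared return type track the mode argument.

    from typing import IO, Any, Literal, overload

    @overload
    def open_file(path: str, mode: Literal["r"]) -> IO[str]: ...
    @overload
    def open_file(path: str, mode: Literal["rb"]) -> IO[bytes]: ...

    def open_file(path: str, mode: str) -> IO[Any]:
        # The checker picks an overload from the literal value of `mode`.
        return open(path, mode)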


> And they're usually "stupid" bugs too, forgetting to sanitize inputs etc. Even worse is when a language tries to be "smart", so you end up with "1" + 2 = "12" and no errors at all.

The last one is no longer possible in Python 3 and will throw an error. That's another thing I am glad they fixed.


That's from 2015; his second complaint, about interfaces vs inheritance, is solved in recent versions with Protocols. Compatibility with 2.7 shouldn't be an argument today either.

The complex union is, as you say, more a sign of an overly flexible API than a problem with type hints; the type hints just bring the root issue to the surface. I mean, why would you accept either a mapping or a list of tuples as input? Just let the user call dict(..) on the tuple list first if they have it in that format? The documentation doesn't even mention that lists are OK for headers, only dicts: https://2.python-requests.org/en/master/api/#main-interface.

The file-tuple API with variable-length tuples is perhaps valid and the most convenient way to implement such options, but it's still an exceptionally unusual API which requires exceptional type hints; it can be made slightly simpler, as chc demonstrated above.


FastAPI creator here... if you use FastAPI, HTTPX would probably be the best match for sending requests, just saying... :D


Sometimes I find it hard to justify all the time I spend on HN. But discovering FastAPI is probably going to compensate for plenty of this time.

This is a really awesome library, thanks!


Thank you! :D



Incredible! Just started with python, this looks pretty good to use.


:D


This framework looks incredibly simple and powerful. I’ve coded a bunch of these features manually in the past. I think FastAPI is going to earn a spot in my next project!


Writing FastAPI code as I procrastinate right now!


Ha! I'm glad it's enjoyable enough as to do it as "procrastination" :p


Hey, FastAPI is great. Thank you.


:D


Thanks for the tip, we'll be sure to try it. And thanks for the framework!


:D


What does FastAPI actually add on top of Starlette which it’s built upon?


Autocompletion everywhere, and from that, data validation, serialization, documentation. Dependency injection, OAuth2 stuff. Web standards. https://fastapi.tiangolo.com/features/
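
A minimal example of how those fall out of plain type annotations (the model and route are made up):

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Item(BaseModel):
        name: str
        price: float

    @app.post("/items/")
    async def create_item(item: Item):
        # The request body is validated and parsed from the annotation;
        # interactive OpenAPI docs are generated automatically at /docs.
        return item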


Using it in production starting last month! Keep up the good work!


:D


I use FastAPI, it's awesome - thank you, just saying...


Hehe, thanks! :D


I'm really impressed by the work done by Encode and Tom Christie. IMO it feels like httpx is set to be the go-to http client in the near future.

A friend wrote respx https://github.com/lundberg/respx which is a utility to mock HTTP requests made by httpx. It works similarly to responses, the mocking library for requests: https://github.com/getsentry/responses


With all the alleged drama surrounding the requests module and the promised async support, I am pleased there is a drama-free alternative.

https://vorpus.org/blog/why-im-not-collaborating-with-kennet...


I don't understand why this is such a complicated category, or why so many platforms don't have solid HTTP clients in the standard library.

On every single project I do, it's just a bunch of posting JSON and getting a response synchronously. Over and over.


I blame the bloat.

In some high-level languages even BSD sockets (and many other POSIX functions) aren't in the standard library, and there are various wrappers to provide "ease of use" and integration with a language's runtime system; plenty of complexity (and alternatives) even at that point.

RFC 2616 (HTTP/1.1, 1999) may seem manageable, but it's much more than just posting data and getting a response, and IME many programmers working with HTTP aren't familiar with all of its functionality. Then add TLS with SNI, cookies, CORS, WebSocket protocol, various authentication methods, try to wrap it into a nice API in a given language and not introduce too many bugs, and it's rather far from trivial. But that's just HTTP/1.1 with common[ly expected] extensions.

Edit: Though I think it'd also be controversial to add support for particular higher-level protocols into standard libraries of general-purpose languages, even if it was easy to implement and to come up with a nice API.


> RFC 2616 (HTTP/1.1, 1999) may seem manageable

But it's probably not, as it's underspecified and ambiguous, which is part of why it's been replaced as the HTTP/1.1 spec by RFCs 7230-7237 (2014).


Plenty of reasons why it's hard to ship in the standard library. Here are some off the top of my mind:

- Should the library include its own CA store, or use the system's CA store? These kinds of libraries often include their own CA store (since it changes often), and httpx seems to use a 3rd party lib to handle that (certifi). This is hard to do in a standard library for a variety of reasons (users rarely update their Python installation, the system CA store is not always available/up to date, etc).

- While the HTTP protocol itself is pretty stable, some parts of it are still changing over time: things like compression types (brotli is gaining traction these days, and we might get new compression types in the future), new HTTP headers being added, etc. Security issues also show up all the time. Users will want a tighter release schedule than Python's so they can get this stuff sooner. The situation is even worse for users stuck on a particular version of Python for some reason, since they will never get access to these new updates.


> Should the library includes its own CA store, or use the system's CA store?

The CA store should be a configurable option, and one of the supported options should be the system CA store.

> The user will want tighter release schedule than python's so they can get these stuff sooner.

Ruby is moving stdlib to default and bundled gems, which addresses this. There's no reason that “delivered with the interpreter” needs to mean “frozen with the interpreter”.


> The CA store should be a configurable option, and one of the supported options should be the system CA store.

It's more complicated than that, especially if you aren't on Linux.

On both the other two big general purpose platforms (Mac OS, Windows) the vendor provides a library which implements their preferred trust rules as well as using their trust store.

On Linux what you usually get is the list of Mozilla trusted root CAs and you're left to your own devices. Mozilla's trusted list is IMNSHO a shorter more trustworthy list than supplied by Apple or Microsoft, but it misses nuance.

When Mozilla makes a nuanced trust decision for the Firefox browser that decision doesn't magically reflect in an OpenSSL setup on a Linux server. Say they decide that Safety Corp. CA #4 can be relied upon to put the right dates on things, but its DNS name checks are inadequate and no longer to be trusted after June 2019. Firefox can implement that rule, and distrust sites with an August 2019 cert from Safety Corp. CA #4, while still trusting say, the Safety Corp. CA #4 certificate on Italian eBay from March 2018. But there's no way for your Python code to achieve the same outcome relying on OpenSSL.

Python's key people seem to think that it's better for Python to try to mimic what a "native" web browser would do because that's least surprising. So on Windows a future Python will trust Microsoft's decisions, on macOS they'd be Apple's decisions and only on Linux will it be Mozilla decisions. Today it's Mozilla's trust store everywhere.

Hypothetically in the ideal case you'd have your own PKI and you'd have all the necessary diligence in place, hire your own auditors, maybe even have contractors red-teaming the CA you trust for you - but we don't live in a world anything like that, most people are implicitly reliant on the Web PKI and probably tools like these needs to accept that.


HTTP/2 was only standardized 4 years ago. HTTP/3 is being actively developed. That's actually not really stable on a language's standard library timespan, IMHO.


It's simple. Python was created before HTTP even existed. Since then a lot of things have changed, and once you create an API, it is hard to rewrite it when new usage patterns emerge.

It's easier for a 3rd party package to come up with a better API, because it can start brand new. Also, when there's a radical change, it is easier for a new 3rd party package to take over. Httpx is an example of such evolution, although this time not due to changes in HTTP but changes in Python: it makes use of new functionality that's harder to retrofit into requests, mainly async support and type annotations.


Python was created before json existed, and yet: https://docs.python.org/3/library/json.html

It's not a hard rule, sometimes things do end up in the standard library.


Of course it could be added later (there's urllib, which almost no one uses); I meant that Python is older than HTTP and HTTP evolved a lot during that time.

Once you create an API, it's hard to change it. HTTP was initially very simple and evolved over time. Things like REST, JSON-encoded messages, cookies, authentication (albeit rarely used), and keep-alive were added incrementally. Today's HTTP is used completely differently than 30 years ago. Python already had urllib, then urllib2 (which was renamed back to urllib in Python 3), but its API is still behind how HTTP is used right now.



Indeed :) —> https://golang.org/pkg/net/http/

> Package http provides HTTP client and server implementations.



That's good, but that's arguably lower-level than httpx.


httpx makes a pretty good first impression on me. The homepage provides examples, key selling points, install instructions, and links to any further reading one could hope for. However, I am missing one thing: The features this offers over the popular requests library do not seem to require httpx to be a competitor, but an extension or fork. There surely must be some major incompatibilities that allowed this library to do something fundamentally different, right?


From comments here, two already-big ones (for me) are type annotations and async support.

Will need to check how it compares with aiohttp which is quite good and also has these.


> The features this offers over the popular requests library do not seem to require httpx to be a competitor

Yes, it has to be. Requests is not as great as everyone thinks it is. Its API is simple, sure, but when you need more advanced features like timeouts or proper exception handling (which you cannot do well in requests), it actually sucks. It doesn't even support HTTP/2, AFAIK.

httpx is a far superior library already!


I know this is unrelated, but there are probably a lot of Pythonistas here and I've been wondering: what is the async web framework of choice for you guys today? As for DBs, still SQLAlchemy? And what about an alternative to prettier (the JS formatter)?


I've been using FastAPI https://github.com/tiangolo/fastapi which is built on top of Starlette as my main async framework and Uvicorn as my primary web server.

Starlette and Uvicorn are both made by Encode https://github.com/encode and in my experience they consistently put out quality stuff.


I used aiohttp and it is quite good and solid (I also like that it comes with type annotations, which enables autocomplete in PyCharm); I have not had a chance to compare it to FastAPI.

Regarding database access, my recommendation might not be popular, but I really like asyncpg[1]. They basically dropped DBAPI and instead created their own API that functionally matches PostgreSQL, giving you full control and increasing performance.

As a result of that you probably won't be able to use ORMs (there's a SQLAlchemy driver, but as I understand it, it is hacky and you lose the performance benefits).

Personally, I compared my code with and without an ORM and there's no change in the number of lines. If you use PyCharm and configure it to connect to your database (I think that might be a Pro feature), it will then automatically detect SQL in your strings, provide autocomplete (including table and column names), and provide highlighting, removing all the advantages that the ORM provided.

[1] https://github.com/MagicStack/asyncpg
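
For a flavor of the API (the connection string is a placeholder); note the PostgreSQL-native $n placeholders instead of DBAPI-style %s:

    import asyncio
    import asyncpg

    async def main() -> None:
        conn = await asyncpg.connect("postgresql://user:pass@localhost/mydb")
        try:
            rows = await conn.fetch(
                "SELECT id, name FROM users WHERE active = $1", True
            )
            print(rows)
        finally:
            await conn.close()

    asyncio.run(main())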


Went from Flask + FlaskRESTPlus to FastAPI. Can't recommend FastAPI enough.

https://github.com/tiangolo/fastapi



Starlette, by the same author as httpx.

https://www.starlette.io/


> async web framework

Not enough experience with async to comment.

> DB

peewee is good enough and more ergonomic compared to SQLA for a lot of use cases.

> formatter

black. To be clear it often produces truly horrendous code, but at least there’s no arguing and no fussing over options or details.


Yeah, it's weird that I find code formatted by black much less aesthetically pleasing than js code from prettier. But we're still using it.


fwiw:

I've moved to Trio over asyncio (I did plain old async for a couple years and Trio makes a ton of sense)

Quart-trio over Flask (just to get a Trio-friendly flask-a-like server) - plain old aiohttp worked really well too. It takes a bit more roll-your-own work, but you get exactly what you want.

peewee over SQLAlchemy (less committed to this change, but peewee has been fine so far and is much more streamlined). I'm mostly just using SQLite. The async version of the ORM looks pretty new; I'm not using it yet.


In the process of porting a Pyramid app to Starlette and very impressed. It's lightweight, well-documented, and a breeze to use.


Honestly I'm not so into async and more into Lambda with Flask. Here's what I'm using right now for that: https://spiegelmock.com/2020/01/04/python-2020-modern-best-p...


sanic seems nice.

also, sqlalchemy is an over-engineered system imo. i only go for it when i have no other choices. otherwise i use a database client directly.


I’ve been a very happy Tornado user for years.


Aiohttp + Aiopg which uses sqlalchemy


Aiohttp + sqlalchemy


Django + Channels


Looks like this does not include certifi [0] and loads system certificates by default. This is a breath of fresh air to see, because so many packages want to use their own certs and have a custom system to override it to use system certs.

Edit: Well, looks like it does use certifi. But my grumble still stands; I don't understand why everyone wants to mess with your certs.

[0] https://pypi.org/project/certifi/


It uses certifi by default, and it's down in the dependencies: https://www.python-httpx.org/advanced/#changing-the-verifica...


Bummer. Luckily, it doesn't invent a new way to override this; `SSL_CERT_FILE` is mentioned in the environment variables.
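
For anyone who'd rather do it in code, a sketch of overriding verification (the bundle path is just an example):

    import ssl
    import httpx

    # Point verification at a specific CA bundle file...
    client = httpx.Client(verify="/etc/ssl/certs/ca-certificates.crt")

    # ...or pass an ssl.SSLContext loaded with the OS default trust store.
    ctx = ssl.create_default_context()
    client = httpx.Client(verify=ctx)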


I want to mess with certs in Python so that my web crawler can actually access the whole web. If you don't talk to a wide variety of hosts, you probably haven't noticed that it's broken for 1% of them.


If you work in a corporate environment, you would probably notice that systems that insist on bundling their own certs without an easy to activate option of using system cert store are broken. (And even if the library has an easy to use option, if it's easy to not expose it, much software built on the library will still be broken.)

People should be empowered to substitute cert stores, but the system store should be the default.


A lot of corporate environments are fortunately legally bound and/or principled about not doing MITM and are using the happy path of internet CAs.


You don't need to be doing MITM to get value out of using an internal CA.


It's a pain, sometimes it's worth it, oftentimes not.


Why yes, my web crawler operates in a corporate environment. It's almost as if different companies have different needs.


Finally a client that mixes async support and a “requests-compatible” API (for those times you don’t want to go as low-level as, say, aiohttp or the like).

Looking forward to trying it out

Edit: maybe the homepage can include a very simple async example as well?


As an aiohttp user, what do you mean? The API of aiohttp is not low level and is very similar to the requests API.


What are the selling points for people already using aiohttp?


Maybe support for trio and HTTP/2?

Also I ran into hard-to-debug issues when there were lots of requests in the past. I'll check again with httpx soon.


One thing requests still has going for it is that it’s used under the covers by a lot of client libraries with a dynamic import, allowing you to pass in a session object to control things like connection pooling and custom headers.

I do look forward to httpx becoming the new “standard” though. Tom is a great developer, and his ecosystem of tools is going to have a really big impact on Python web dev over the coming years.


Too bad they carried over the biggest nuisance from requests, having to call raise_for_status() after every single request. One extra line everywhere and another thing that can be forgotten and cause strange errors when you least expect it.
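
Later httpx releases added event hooks, which can centralize this; a sketch assuming a version that supports them:

    import httpx

    def raise_on_error(response: httpx.Response) -> None:
        response.raise_for_status()

    # Every response from this client is checked automatically.
    client = httpx.Client(event_hooks={"response": [raise_on_error]})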


Thoughts on this one [0]? :-)

[0]: https://github.com/encode/httpx/issues/752


I like async support, but in async examples, doesn't the await keyword just block anyway?

I like to see the usage of async inside of a main loop, where other things can be processed while waiting for the async response to come in.

Further, not all applications have the idea of an event loop, so async may not be needed, but it's useful to have the option.

I use async operations to multiplex operations in a queue. The Linux scheduler can handle the execution time slices for me (I'm not going to build a scheduler); the queue controller's role is to accept jobs and handle timeouts and results for each async operation.


> doesn't the await keyword just block anyways?

The await keyword only blocks the execution of the coroutine in which it is used. It releases the event loop so that other coroutines can continue processing while the result of an awaitable is being fetched.
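
A small demonstration: three coroutines awaiting concurrently finish in about one second total, not three.

    import asyncio

    async def worker(name: str) -> None:
        print(f"{name} started")
        # Suspends only this coroutine; the event loop keeps
        # running the other workers in the meantime.
        await asyncio.sleep(1)
        print(f"{name} finished")

    async def main() -> None:
        await asyncio.gather(worker("a"), worker("b"), worker("c"))

    asyncio.run(main())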


One of the highlighted features is directly calling into WSGI applications[1], which uses flask as an example.

Question: is there any advantage of this over flask’s builtin werkzeug test client?

[1] https://www.python-httpx.org/advanced/#calling-into-python-w...

[2] https://flask.palletsprojects.com/en/1.1.x/testing/


Comparing the code samples, it looks like it's easier to spin up testing with Httpx than Flask, but that's a superficial conclusion. I don't have experience with Httpx so take this with a grain of salt. Based on the comparison, I'm going to play with Httpx for testing the next time I use Flask because the simplicity looks rad.

Based on similar experience with other tools, that possibly means that Httpx is great for simple testing but if you need to go deep it's better to use the framework provided. That's an assumption, though, so I'd love to hear more from others.


Flask’s sample code is doing a bunch of things, like spinning up a db and setting up a pytest fixture, so it’s not a fair comparison. You can replace

  with httpx.Client(app=app) as client:
      ...

in the httpx sample code with

  with app.test_client() as client:
      ...

for flask’s builtin test client, and the rest is basically the same. Now that’s a fair comparison, and neither is simpler than the other.

I guess one advantage of httpx is that developers might generally be more familiar with the requests response object API than the werkzeug response object API.


There is also quite nice httpx library for Ruby: https://gitlab.com/honeyryderchuck/httpx


Great! Sometimes you don't need huge magic libraries with tons of features, but solid, good plumbing.

Will definitely be checking it out and potentially replacing requests and aiohttp.


My project has a dozen or so libraries, many of which require different and conflicting requests, urllib3, etc. libraries. I have to set up the packages very carefully to ensure the right module gets the right version of its HTTP library, and it has generally turned me off Python HTTP libraries altogether. For my own code, I stick to the basic built-in libraries, regardless of how difficult they are to use.


I’ve used httpx to test server load issues for our most sensitive system and it works exactly how you’d expect it would. Highly recommend it.


I use requests-cache intensively; it adds local caching (in SQLite) for Python requests and is configurable (should it cache non-200 responses, should it cache POSTs, ...), and it's amazing when scraping. I didn't find a similar companion package for httpx.
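
For anyone curious, usage is about two lines (the cache name and expiry are arbitrary):

    import requests
    import requests_cache

    # Cache responses in a local SQLite file, expiring after an hour.
    requests_cache.install_cache("scrape_cache", expire_after=3600)

    requests.get("https://example.com/page")  # hits the network
    requests.get("https://example.com/page")  # served from the cache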



Is that another dead butterfly?

Great package though. Love the dual support for async and sync requests.


I immediately wondered the same, and I think not. The upper wings are not at a 90 degree angle to the body, so alive and well. Apart from being a black and white drawing of course.


I have been using httpx 0.95 in production. So far so good.


Do you mean 0.9.5?


looks nice, but appears to have quite a few third party deps that requests doesn't?


Requests simply vendors what it needs, afaik. Which is sub-optimal. At least this gives you visibility into those deps.


After a brief glance, I'm not sure what this offers beyond requests?


Proper async support was the big selling point for me.


ah, yes that's attractive.


Async support and http2 without patching


thanks.


> urllib3 - Sync client support.

Well that's going to work wonders for async now isn't it?



