Hacker News

On top of that, the author uses aiopg rather than asyncpg[1] for the async database operations, even though asyncpg is (allegedly) a whole lot faster.

1. https://github.com/MagicStack/asyncpg




asyncpg is not scalable. It can only use "session pooling", because it depends on session-level features (advisory locks, LISTEN/NOTIFY), and that ends up requiring a lot of PostgreSQL connections.


Can you share more information on this (articles, etc)?


There's no single article that explains it, but you can research each part:

1. Each PostgreSQL connection is a forked backend process, with memory overhead (~4 MB, IIRC) plus context-switching cost.

2. A connection can only execute one query at a time (no multiplexing).

3. To be fast, asyncpg uses the features I mentioned in my parent post, and those only work under session pooling: https://www.pgbouncer.org/features.html

The whole point of async is to do other work (e.g. a different query) while waiting on a query.

If you have 10 servers with 16 cores each, one Python process per vCore, and each process running 10 simultaneous queries, that's 10 * 16 * 10 = 1600 open connections.
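The arithmetic above can be sketched as a tiny helper (the function and parameter names are mine, just to make the multiplication explicit):

```python
def open_connections(servers: int, cores_per_server: int,
                     queries_per_process: int) -> int:
    """With one Python process per vCore and session pooling,
    every in-flight query pins its own PostgreSQL connection."""
    return servers * cores_per_server * queries_per_process

# 10 servers x 16 cores x 10 simultaneous queries per process
print(open_connections(10, 16, 10))  # 1600
```

At ~4 MB per backend, that's several gigabytes of server memory spent just on holding connections open.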

The best way, IMHO, is to use autocommit connections, so each transaction executes in one round trip. You can keep many client connections open with very light CPU cost, and pooling works best this way.

I've done 20K short-lived queries/second from one process with only ~20 connections open in PostgreSQL (using PgBouncer statement pooling).
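For reference, a minimal pgbouncer.ini sketch of that setup, with statement pooling and a small server-side pool (the database name and all numbers here are illustrative assumptions, not from the post):

```ini
[databases]
; assumed database name and host, purely illustrative
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; statement pooling: the server connection is released after every
; statement, which requires autocommit-style clients (no multi-
; statement transactions)
pool_mode = statement
; many client connections multiplexed onto a small server pool
max_client_conn = 2000
default_pool_size = 20
```

With this, thousands of client connections share a couple dozen actual PostgreSQL backends.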



