Of course that is true, but it undeniably helps. I wonder if postwar Britain watching its hegemony decline was anything like the current decline of the US.
Honestly it is mostly caused by people from the machine learning end of the ecosystem not understanding how cooperative multitasking works and trying to bolt a web framework onto their model.
That, coupled with the relative immaturity of the Python async ecosystem, leads to lots of rough edges, especially when they deploy these things into heavily abstracted cloud environments like Kubernetes.
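A minimal sketch of the failure mode being described, assuming a generic asyncio setup (the handler name, sleep duration, and request count are just illustrative): one blocking call inside an async function stalls the whole event loop, so "concurrent" requests end up fully serialized.

    import asyncio
    import time

    async def predict(payload):
        # Stand-in for a synchronous model call. Because it blocks, the
        # single event-loop thread makes no progress on anything else
        # for the full 2 seconds.
        time.sleep(2)
        return {"ok": True}

    async def main():
        start = time.perf_counter()
        # Ten "concurrent" requests still take ~20s, because each blocking
        # call monopolises the event loop in turn.
        await asyncio.gather(*(predict(i) for i in range(10)))
        print(f"elapsed: {time.perf_counter() - start:.1f}s")

    asyncio.run(main())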
FastAPI also tries to help by making it "easy" to run sync code, but that too is an abstraction that is not well documented and has its limitations.
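For what it's worth, a sketch of the two flavours of FastAPI route. My understanding is that plain def path functions are run in a capped worker threadpool (anyio's default limiter, roughly 40 threads), while async def ones run on the event loop itself; the handler bodies below are stand-ins, not real model code.

    import asyncio
    import time
    from fastapi import FastAPI

    app = FastAPI()

    def blocking_model_call() -> dict:
        # Stand-in for a synchronous model.predict()-style call.
        time.sleep(1)
        return {"ok": True}

    @app.get("/sync")
    def sync_endpoint():
        # A plain def route: FastAPI/Starlette hand it to a worker threadpool
        # so it does not block the event loop. The pool is capped, so enough
        # slow calls can still queue requests behind one another.
        return blocking_model_call()

    @app.get("/async")
    async def async_endpoint():
        # An async def route runs on the event loop itself; a blocking call
        # here would stall every in-flight request, so only await things.
        await asyncio.sleep(1)
        return {"ok": True}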
Everything you say here is true, but if you do the analysis and run benchmarks on non-toy projects you'll quickly find that async Python is a bad choice in virtually every use case. Even for use cases that are extremely IO-bound and use almost no compute, async Python ends up dramatically increasing the variance in your response times and lowering your overall throughput.

If you give me an async Python solution, I will bet you whatever you want that I can reimplement it using threads, reduce the LOC, make it far more debuggable and readable, make it far easier to reason about what a given change does to response times, and make it resistant to small, seemingly inconsequential code changes having dramatic, disastrous consequences when deployed, etc etc etc. Plus you won't have the stupid coloring problem of maintaining two copies of every function, and no more spending entire days trying to figure out why some callback never runs or why it fires twice.

Async Python is only for people who don't know what they are doing and therefore can't see how badly their solution is performing, how much more effort they are spending on building and debugging than they should have to, and how poorly it performs versus how it could perform.
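To make the comparison concrete, this is the kind of plain-threaded version that argument presumably has in mind: ordinary blocking functions fanned out over a ThreadPoolExecutor, with no second "coloured" copy of anything. The URLs, worker count, and timeout are purely illustrative.

    from concurrent.futures import ThreadPoolExecutor
    import urllib.request

    def fetch(url: str) -> int:
        # Ordinary blocking IO: normal stack traces, works in a debugger,
        # callable from anywhere without an event loop.
        with urllib.request.urlopen(url, timeout=10) as resp:
            return len(resp.read())

    urls = ["https://example.com"] * 8   # illustrative workload

    # Concurrency comes from the pool, not from rewriting fetch() as async.
    with ThreadPoolExecutor(max_workers=8) as pool:
        for url, size in zip(urls, pool.map(fetch, urls)):
            print(url, size)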
The main problem with those is that memory-controller performance degrades a lot once you load all 128 of those cores.
We were doing some testing and >96 cores at 100% caused a massive degradation in performance. We ended up going with dual 32C/64T EPYCs (which cost twice as much) as a result. If they fix it in the Altra One chips they will be back on the table though, because they were very good power-wise for our workload and quite price competitive in a Supermicro chassis.
It is bad in the sense that we need to reach for external libraries to avoid manually writing all the error-handling boilerplate, and any async runtime works, as long as it is tokio.
Probably don't even need to work that hard. The Saudis got a bunch of nuclear secrets the first time round, so I am sure F-35 info can be brought to Mar-a-Lago.
This is cheap compared to GB200, which has a street price of >$70k for just the chip alone if you can even get one. Also GB200 technically has only 192GB per GPU and access to more than that happens over NVLink/RDMA, whereas here it’s just one big flat pool of unified memory without any tiered access topology.
We finally encountered the situation where an Apple computer is cheaper than its competition ;-)
All joking aside, I don't think Apples are that expensive compared to similar high-end gear. I don't think there is any other compact desktop computer with half a terabyte of RAM accessible to the GPU.
I mean expensive relative to who, Nvidia? Both are enjoying little to no competition in their respective niche and are using that monopoly power to extract massive margins. I have no doubt it could be much cheaper if there was actual competition in the market.
Fortunately it seems like AMD is finally catching on and working towards producing a viable competitor to the M series chips.
Also I assume this extension is pretty obvious, so it won't take long for CF bot detection to flag it the same way as Playwright or whatever else.