The article is too light on details to tell whether trillions is impressive or not. For example, if my single-server system easily handles 100 million requests per day and the load is almost exclusively CPU bound (as with most AI tasks), then scaling to 1 trillion per day might be as easy as buying 10k servers, which is totally a thing that mid to large sized companies do to scale up.
What makes this Meta paper impressive is NOT scaling up to 1 trillion per day, it's that they manage to do so while keeping request latency low and CPU utilization high. Anyone who's been with Heroku long enough probably remembers when instances were suddenly 80% idle and yet requests were still slow. That was when Heroku changed its routing from intelligent to dumb (random). Meta is doing the opposite here, reducing overall deployment costs by squeezing more requests out of each instance than would have been possible with a simple random load balancer.
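To make the difference concrete, here's a minimal toy simulation (not Meta's or Heroku's actual system, all numbers made up) contrasting a "dumb" random router with a cheap load-aware one, the classic power-of-two-choices: sample two servers and send the request to the one with the shorter queue. At high utilization the random router piles requests onto unlucky instances while others sit idle, which is exactly the latency-vs-utilization trade-off being discussed.

```python
import random

def route_random(queues):
    # "Dumb" routing: pick any server uniformly, ignoring its current load.
    return random.randrange(len(queues))

def route_two_choices(queues):
    # Cheap "intelligent" routing: sample two servers, pick the shorter queue.
    a, b = random.sample(range(len(queues)), 2)
    return a if queues[a] <= queues[b] else b

def simulate(router, servers=100, ticks=5_000, load=0.95):
    queues = [0] * servers
    arrivals_per_tick = int(servers * load)   # ~95 new requests per tick -> ~95% utilization
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            queues[router(queues)] += 1       # enqueue on whichever server the router picks
        for i in range(servers):
            if queues[i] > 0:
                queues[i] -= 1                # each server completes one request per tick
    return max(queues), sum(queues) / servers

random.seed(0)
for name, router in [("random", route_random), ("two-choices", route_two_choices)]:
    worst, mean = simulate(router)
    print(f"{name:12s} max backlog = {worst:3d}, mean backlog = {mean:.1f}")
```

Run it and the random router shows a much deeper worst-case backlog (i.e. tail latency) at the same fleet size and the same average utilization; the load-aware router keeps queues short, which is the same lever that lets you serve more requests per instance before latency blows up.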