For your typical scaling org, I think data layers are often the main issue. Moving from a single postgres/mysql primary to something that isn't that represents the biggest hurdle.
Some companies are "lucky" and have either natural sharding keys for their primary business line, are an easily cacheable product, or can just scale reads on replicas. Others aren't, and that's where things get complicated.
Tbh, that's why for our new projects we've completely ignored relational databases. They're a pain in the ass to manage and scale poorly.
DynamoDB, on the other hand, trivially scales to thousands (and more!) of TPS and doesn't come with footguns. If it works, then it'll continue to work forever.
This is funny to me since modern relational databases can get thousands and more TPS in a single node. My dev machine reports 16k TPS on a table with 100M rows with 100 clients.
> pgbench -c 100 -s 100 -T 20 -n -U postgres
> number of transactions actually processed: 321524
> latency average = 6.253 ms
> tps = 15991.775957
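As a rough sanity check on those numbers (a sketch, assuming the run lasted the nominal 20 seconds from `-T 20` and that all 100 clients stayed busy the whole time), the reported latency is just clients divided by throughput:

```python
# Figures copied from the pgbench output above.
CLIENTS = 100
DURATION_S = 20          # nominal duration from -T 20 (assumption: actual elapsed time is close to this)
TRANSACTIONS = 321524
REPORTED_TPS = 15991.775957


def approx_tps(transactions: int, duration_s: float) -> float:
    """Throughput estimated from the transaction count and the nominal duration."""
    return transactions / duration_s


def avg_latency_ms(clients: int, tps: float) -> float:
    """Average per-transaction latency when `clients` connections are kept busy."""
    return clients / tps * 1000.0


if __name__ == "__main__":
    print(f"approx tps:  {approx_tps(TRANSACTIONS, DURATION_S):.0f}")
    print(f"latency ms:  {avg_latency_ms(CLIENTS, REPORTED_TPS):.3f}")
```

The estimate lands around 16k TPS and about 6.25 ms, in line with what pgbench printed. The small gap from the reported tps is presumably because the real elapsed time was not exactly 20 seconds.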
Yep. And 98% of software written today will never need to scale beyond 10k TPS at the database layer. Most software is small. And of the software that does need to go faster than that, most of the time you can get away with read replicas. Or there are obvious sharding keys.
Even when that's not the case, it usually ends up being a minority of the tables and collections that need additional speed.
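For the "obvious sharding key" case, the routing logic can be very small. A minimal sketch, assuming a hypothetical `tenant_id` key and made-up shard names (neither comes from any real setup):

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(tenant_id: str, shards=SHARDS) -> str:
    """Map a sharding key to one shard using a stable hash.

    hashlib is used instead of the built-in hash() because hash() of a str
    is randomized per process, so it would not route consistently.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for tenant in ("acme", "globex", "initech"):
        print(tenant, "->", shard_for(tenant))
```

Plain modulo hashing like this remaps most keys when the shard count changes, so it only fits if the number of shards is fairly stable.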
If you don't believe me, look through the HN hiring thread sometime and notice how few product names you recognise.
Most products will never need to scale like Gmail or Facebook.