It's a bold boast, but honestly, it doesn't mean much if the problem domain isn't similar. Reddit in particular runs into issues because so little of what they do is widely cachable. The site is heavily customized per user, and experiences an awful lot of cache churn. I don't disagree with your sentiment (design matters more than choice of language), but I could serve 16 million static pageviews/day off of a $20/month Linode. That doesn't mean that reddit could do the same.
Reddit has 8 million unique visitors, but only 300k subscribers to /r/pics, the top auto-subscribed subreddit. Generously assuming half of people manually unsubscribe, that's about 600k accounts, or at most 7.5% of their user base that is logged in. I know full well about cache churn with logged-in users, but that's a lot of lurkers.
I'm also not sure what makes a Facebook game's domain more "widely cachable". 100% of users are logged in. The vast majority of actions taken are changing state. Any app touchpoint is writing to the database. Page caching is nearly impossible.
I can't claim to know the specifics of your application, but consider that a single upvote invalidates however many thousand copies of a given page. An action in a Facebook game might invalidate a page for 10 or 20 or 50 users. An upvote (or a comment, or an edit, or an upvote on a comment) invalidates that page for upwards of 300k users. Combine that with the rate at which reddit's pages change, and you're talking an obscene amount of CPU time spent building fragments/pages to stick into your cache. Scope of visibility of a change means a lot.
You obviously know how to scale an app if you were pulling 16 million pageviews a day, and I don't intend to discount that at all. I just mean to point out that while fundamentally the same problem, reddit has to deal with a version of that particular problem that most applications don't begin to approach.
>a single upvote invalidates however many thousand copies of a given page
I never understood this. Does reddit really need to spend the capital making sure I see a stranger's upvote the moment it occurs? A 60-second delay to refresh pages in batches seems perfectly reasonable, perhaps with a client-side script to mark my own upvotes so the system doesn't look like it is losing my selections.
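To make the idea concrete, here is a rough sketch of what I mean, in Python. Everything in it is hypothetical: `BatchedPageCache`, `mark_own_votes`, and the `render` callback are names made up for illustration, not anything reddit actually runs. Pages are re-rendered at most once per TTL window no matter how many votes arrive, and the viewer's own votes are layered on top at display time (in practice that part would live in a client-side script).

```python
import time


class BatchedPageCache:
    """Serve a cached page and re-render it at most once per TTL window."""

    def __init__(self, render, ttl_seconds=60, clock=time.monotonic):
        self.render = render      # hypothetical callback: page_id -> {item: score}
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so the sketch can be tested
        self.pages = {}           # page_id -> (rendered_at, scores)

    def get(self, page_id):
        now = self.clock()
        entry = self.pages.get(page_id)
        # Re-render only when there is no copy or the copy has expired.
        if entry is None or now - entry[0] >= self.ttl:
            entry = (now, self.render(page_id))
            self.pages[page_id] = entry
        return entry[1]


def mark_own_votes(scores, own_votes):
    """Attach the viewer's own vote (+1, -1, or 0) to each cached item."""
    return [(item, score, own_votes.get(item, 0)) for item, score in scores.items()]
```

With this shape, a stranger's upvote costs nothing until the next refresh, while my own arrow still shows as clicked because it comes from `own_votes` rather than from the cached page.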
I'm not sure what you want to hear, so I'll elaborate a lot.
Warbook was a Facebook application written in Ruby on Rails that I ran by myself from late 2007 into 2008. It grew to over 16 million pageviews a day. At the time, that was more pageviews than Twitter.
I scaled it using the following stack: Perlbal for load balancing, lighttpd for static assets, Mongrel for dynamic requests, memcached for caching, and MySQL for relational data storage.
I used two medium instances for load balancing, one medium instance for asset hosting, 15 small instances for Mongrel, one XL instance for memcached, and one XL instance for MySQL.
I used memcached as a "write-through" cache. Everything in cache was considered fresh. Every write of a cachable object would write to both MySQL and memcached. Every read of a cachable object would try memcached first and fall back to MySQL on a miss. This reduced reads on the database by 95%.
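For anyone who hasn't seen the pattern, here is a minimal sketch of a write-through cache in Python. It is not the Warbook code (that was Rails); `WriteThroughStore` is a made-up name, and plain dicts stand in for memcached and MySQL so the example runs on its own.

```python
class WriteThroughStore:
    """Write-through cache sketch: dicts stand in for memcached and MySQL."""

    def __init__(self):
        self.cache = {}      # stand-in for memcached
        self.database = {}   # stand-in for MySQL
        self.db_reads = 0    # counts reads that reached the database

    def write(self, key, value):
        # Every write goes to both stores, so a cached copy is never stale.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key):
        # Try the cache first; fall back to the database only on a miss.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value  # repopulate after an eviction or restart
        return value

    def evict(self, key):
        # Simulates the cache dropping an entry under memory pressure.
        self.cache.pop(key, None)
```

Because writes keep the cache current, reads never need an expiry check; the database only sees a read when the cache has lost the entry.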
If you don't mind me asking, what happened to it? Facebook now reports about 5,500 monthly users. I'm guessing you were acquired by SGN, but I don't see why that would cause traffic to drop so much.
SGN switched their core focus from Facebook games to iPhone games in mid 2008. At that time, they dropped support for all of their Facebook properties except (fluff)Friends. Games on the Facebook platform are doomed to rapid traffic loss without constant adaptation and viral tuning. Even with it, user retention is hard.
Isn't every single subreddit cachable?
Only the starting page isn't 100%, but I'm sure that many users have the same combination of subreddits (e.g. the default ones).
Would be interesting to see if they have certain "user classes". Something like:
For logged in users, you're still going to need the voting status for the current user on every single submission as well as that submission's hidden status for the current user to decide whether or not a submission should be displayed in the listing.
I believe they almost never hit the DB directly, so these are probably recached immediately (or submitted to both the cache and the DB at the same time), but that still means quite a lot of traffic.
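A rough sketch of that split, in Python, assuming the listing itself is cached once and shared while the per-user bits are looked up separately. The function name and the data shapes (`cached_listing` as a list of dicts, `user_votes` as a dict, `user_hidden` as a set) are my own assumptions for illustration, not reddit's actual schema.

```python
def personalize_listing(cached_listing, user_votes, user_hidden):
    """Merge a shared cached listing with one user's votes and hidden set."""
    result = []
    for submission in cached_listing:
        sid = submission["id"]
        if sid in user_hidden:
            continue  # this user hid the submission, so leave it out
        row = dict(submission)                  # copy; the shared entry stays untouched
        row["my_vote"] = user_votes.get(sid, 0)  # +1, -1, or 0 for no vote
        result.append(row)
    return result
```

The shared listing stays cachable for everyone, but each logged-in pageview still costs two extra per-user lookups, which is the traffic being described.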
Scaling is caching and architecture, not writing your app in Java.