
Twitter got a huge performance and efficiency boost just from rewriting the front end on the JVM (in Scala).



Twitter is a one-in-a-thousand company, with traffic volumes most of us here can only dream of ever achieving. That goes even more for the people who don't take a specific interest in running large-scale deployments.

Even writing your frontend in Java from the start would (most likely) not have helped you cope with the new-found challenges at those heights.


Perhaps on the efficiency side. On the performance side it would be true even on a small site (unless it was all static, cached content).

http://www.techempower.com/benchmarks/


Twitter is a one-in-a-thousand company with traffic amounts that most of us here will dream of ever achieving.

And not every low-traffic site is simple CRUD. If you do image processing, machine learning, natural language processing (or a thousand other CPU-intensive things), it's nice that you can directly use one of the many high-performance libraries available in the Central Repository from your webservice.
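
For what it's worth, here's a minimal sketch of the kind of thing I mean, done entirely with the stock JDK (javax.imageio and AWT), so there isn't even a dependency to add. The Thumbnailer object is made up for illustration; you'd call it straight from a request handler:

  import java.awt.Image
  import java.awt.image.BufferedImage
  import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
  import javax.imageio.ImageIO

  // CPU-intensive work done in-process in the webservice, no separate
  // worker fleet: read an uploaded image, emit a fixed-width JPEG thumbnail.
  object Thumbnailer {
    def thumbnail(imageBytes: Array[Byte], width: Int): Array[Byte] = {
      val src: BufferedImage = ImageIO.read(new ByteArrayInputStream(imageBytes))
      val height = (src.getHeight * width.toDouble / src.getWidth).toInt
      val dst = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB)
      val g = dst.createGraphics()
      g.drawImage(src.getScaledInstance(width, height, Image.SCALE_SMOOTH), 0, 0, null)
      g.dispose()
      val out = new ByteArrayOutputStream()
      ImageIO.write(dst, "jpg", out)
      out.toByteArray
    }
  }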


You probably won't reasonably run most of these applications in your frontend framework either.

Also, the example was the Twitter frontend, so I'm responding about the Twitter frontend. Don't derail.


Twitter's original architecture was an absolute disaster.

I'm sure they could get a speedup by rewriting the front-end, but the fact that they kept trying and failing to scale a workload that, with anything remotely like a sensible architecture, is network IO bound rather than CPU bound - even with MRI, on the hardware available when they launched - was a good sign that their main problem had nothing to do with language.

Their "simple" case - updating n followers for every status update when n is small - is trivially distributable with a relatively simple message bus routing fabric. Their "hard" case, where n is excessively large, is simple-ish too: structure the follower lists as a tree of "virtual" follower lists that act as forwarders; you can do that with off-the-shelf components.
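
A rough sketch of the tree idea, in Scala since that's the language under discussion. Everything here (Node, VirtualList, the fan-out limit, the println standing in for a timeline write) is an illustrative stand-in, not anything Twitter actually ran:

  // A follower list entry is either a real follower or a "virtual" list
  // that forwards to at most FanoutLimit children, each possibly virtual.
  sealed trait Node
  final case class Follower(id: Long) extends Node
  final case class VirtualList(children: Vector[Node]) extends Node

  object FanOut {
    val FanoutLimit = 1000

    // Group a huge flat follower list into a bounded-width tree, so no
    // single node on the bus ever fans out to more than FanoutLimit targets.
    def buildTree(followers: Vector[Node]): Node =
      if (followers.size <= FanoutLimit) VirtualList(followers)
      else buildTree(followers.grouped(FanoutLimit).map(g => VirtualList(g)).toVector)

    // Each VirtualList hop is the unit of work you'd hand to a bus node;
    // here we just recurse in-process.
    def deliver(update: String, node: Node): Unit = node match {
      case Follower(id)          => println(s"deliver '$update' to $id")
      case VirtualList(children) => children.foreach(deliver(update, _))
    }
  }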

To illustrate just how basic this problem is: I've built message buses for scenarios like that using mail servers, even - though if you do that, at least with a standard mail server, you will be limited by disk IO on forwarding nodes unless you spend too much money (you don't need message updates to be durable "in transit" as long as the system can re-send them from the source on failure).

Another simple approach to the large-follower problem is to pull from "expensive" publishers and push from cheap ones. You can do that too with 25-30 year old software and some scripts to maintain the lists: push via SMTP, pull via NNTP, and make the "mailbox" that people with large follower lists post status updates to an SMTP->NNTP gateway.
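
In Scala instead of SMTP/NNTP, the split is just a threshold test at publish time. The threshold and the two println-stub delivery paths below are assumptions for illustration:

  final case class Author(id: Long, followerCount: Long)

  object HybridDelivery {
    val PushThreshold = 10000L

    // Cheap publishers: eagerly write into every follower's timeline
    // (the SMTP push path).
    def pushToFollowers(author: Author, update: String): Unit =
      println(s"fan '$update' out to ${author.followerCount} timelines")

    // Expensive publishers: write once to the author's own feed and let
    // followers pull and merge it at read time (the NNTP path).
    def appendToAuthorFeed(author: Author, update: String): Unit =
      println(s"append '$update' to the feed of author ${author.id}")

    def publish(author: Author, update: String): Unit =
      if (author.followerCount < PushThreshold) pushToFollowers(author, update)
      else appendToAuthorFeed(author, update)
  }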

No, I'm not suggesting they should actually do that, just pointing out that it is viable, and it was viable with the kind of data volumes they are dealing with well over a decade ago. Yet they had problems handling it years later with a much smaller data volume than they have now.

When you get to the size Twitter is now, it makes sense to sacrifice developer effort to save a few percent here and there on servers, but their complaints about language when they started their rewrites were distractions from the real issues, whether they realised it themselves or not - it's not like they were handling data volumes at the time that lots of people haven't managed to handle with much more modest resources.


This is a pretty horrible example. Twitter is a real-time communications platform, and even ignoring scale, Rails is not the right choice for it. I doubt there are many folks at all in the Rails community who would say that Rails is the right stack for Twitter.


The front end is just a bunch of calls to services and HTML rendering from templates. It really has nothing to do with real time communication.


Except that little thing that shows you new tweets immediately...


That "little thing" which maintains and generates the timeline should be IO bound whether it's written in C or Ruby, unless it's written by someone who does not understand the problem space, and it ought to be a separate service. We can agree Rails doesn't make much sense for it, but frankly, language does not matter. Understanding how to minimise context switches for a network service and how to handle fan-out does. It's not a hard problem - it's been solved dozens of times over by people writing message queueing servers, NNTP servers and SMTP servers, amongst others.
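
To make the "IO bound regardless of language" point concrete: as long as each delivery is an asynchronous write, the fan-out loop itself does almost no CPU work and throughput is bounded by the sockets. sendOverNetwork here is a hypothetical stand-in for a non-blocking client call:

  import scala.concurrent.Future
  import scala.concurrent.ExecutionContext.Implicits.global

  // Stand-in for a non-blocking network write to a timeline store.
  def sendOverNetwork(followerId: Long, update: String): Future[Unit] =
    Future(println(s"write '$update' to timeline of $followerId"))

  // The "loop" just schedules writes; the language it's written in is
  // nearly irrelevant to how fast the network drains them.
  def fanOut(update: String, followers: Seq[Long]): Future[Unit] =
    Future.traverse(followers)(sendOverNetwork(_, update)).map(_ => ())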


I don't disagree with any of your points. I just disagree with the assumption that the Twitter frontend is something static with no time-related components.


The frontend should be. It is folly to handle things like timeline updates there, where they become time sensitive, rather than in a backend process where they are not - nobody will notice if their timeline is a few seconds out of date. They will notice if the pageload takes longer to complete.
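
Sketched with an in-memory queue and map standing in for real infrastructure (both hypothetical): the worker may lag a few seconds, while the pageload path stays a single cheap read:

  import java.util.concurrent.LinkedBlockingQueue
  import scala.collection.concurrent.TrieMap

  final case class StatusUpdate(authorId: Long, text: String)

  object TimelineWorker {
    val queue = new LinkedBlockingQueue[StatusUpdate]()
    val timelines = TrieMap.empty[Long, List[String]] // followerId -> timeline

    // Backend process: drains updates and materialises follower timelines.
    // If it runs a few seconds behind, nobody notices.
    def run(followersOf: Long => Seq[Long]): Unit =
      while (true) {
        val u = queue.take()
        for (f <- followersOf(u.authorId))
          timelines.put(f, u.text :: timelines.getOrElse(f, Nil))
      }

    // Frontend pageload: one cheap read, no fan-out work in-line.
    def timelineFor(followerId: Long): List[String] =
      timelines.getOrElse(followerId, Nil)
  }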


That is just polling essentially the same endpoint that renders the page.



