Twitter: From Ruby on Rails to the JVM [video]

rrrazdan · on July 30, 2011

I was recently in a quandary over the choice of technology. I started with RoR and I really like it. However, I was concerned about the long-term implications of that choice. The thing that I am taking from this talk is that I shouldn't worry about that right now. If and when I need to scale, I will have enough resources to make a better choice. Resources that I don't have right now.

xal · on July 30, 2011

Shopify is still 100% ROR and we serve hundreds of millions of requests. You will be fine :-)

It's a competitive advantage for us: we move faster than the rest of the market.

lionheart · on July 30, 2011

Well, correct me if I'm wrong, but Shopify is a completely different scaling problem from Twitter. As I understand it, Shopify's individual hosted stores are pretty much self-contained. So you can pretty much stick each one on its own server with its own database and it'll be fine. Twitter accounts all have to be able to talk to each other in real time, so you can't do that.
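A minimal sketch of that self-contained layout, assuming a hypothetical registry that maps each store's hostname to its own database. The names (`StoreRegistry`, `db_for_host`, the example hosts and DSNs) are made up for illustration and say nothing about how Shopify is actually built.

```python
from dataclasses import dataclass, field


@dataclass
class StoreRegistry:
    """Maps each store's hostname to the database holding only that store's data."""
    dsn_by_host: dict = field(default_factory=dict)

    def add_store(self, host: str, dsn: str) -> None:
        self.dsn_by_host[host.lower()] = dsn

    def db_for_host(self, host: str) -> str:
        # A request only ever needs its own store's database, so there is
        # no cross-store query and no shared state to keep in sync.
        return self.dsn_by_host[host.lower()]


registry = StoreRegistry()
registry.add_store("shoes.example.com", "postgres://db-07/shoes")
registry.add_store("hats.example.com", "postgres://db-12/hats")
print(registry.db_for_host("hats.example.com"))
```

A Twitter-style timeline has no such routing key: one user's timeline reads from many other users' data, so the lookup cannot be pinned to a single database.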

My current startup has a Shopify-like architecture which is what I'm counting on to help me if I ever need to scale fast.

So I think the first question you have to ask yourself when considering scaling is: what is my architecture like?

jsavimbi · on July 30, 2011

> So you can pretty much stick each one on its own server with its own database and it'll be fine.

That is not a scalable solution. Yes, it'll get you up and running out of the box, but as you keep spinning up servers to host each store and its database you'll have to keep adding exponential resources (hardware, software, meatware) to the problem, which keeps you from achieving economies of scale.

psykotic · on July 30, 2011

> Yes, it'll get you up and running out of the box, but as you keep spinning up servers to host each store and its database you'll have to keep adding exponential resources (hardware, software, meatware) to the problem, which keeps you from achieving economies of scale.

No, you'd be adding resources at a _linear_ rate relative to the growth of the customer base. The point about economies of scale is true enough but has nothing to do with a lack of exponential growth in costs.
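To make the linear-versus-exponential distinction concrete, here is a small sketch. Both function names are hypothetical, and the exponential one exists only for contrast; nothing in the thread suggests any real system grows that way.

```python
def servers_linear(customers: int, customers_per_server: int = 1) -> int:
    """Servers needed when each server holds a fixed number of customers."""
    return -(-customers // customers_per_server)  # ceiling division


def servers_exponential(customers: int) -> int:
    """What genuinely exponential growth would look like, for contrast."""
    return 2 ** customers


for n in (10, 20, 40):
    print(n, servers_linear(n), servers_exponential(n))
```

Doubling the customer base doubles the linear figure, while the exponential figure gets squared.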

jsavimbi · on July 30, 2011

You're right; I'm becoming stupid in my old age.

lionheart · on July 30, 2011

Yes, it's not an ideal solution for the long term, but it's something easy you can do to scale quickly if you have this architecture. It'll make sure your site stays up.

Afterwards, you can spend the time doing it properly when things have calmed down a little.

That's a lot easier than scaling something like Twitter.

jsavimbi · on July 30, 2011

No, sites go down all the time and need to be rebooted, reconfigured, redeployed, strangled, etc. by a human who has to watch the servers 24 hours a day. I'm ignorant of the number of customers that Shopify is hosting these stores for, but let's say for example that one human can monitor 100 virtual machines during an eight-hour shift and you have 500 VMs running at the same time over a series of physical servers. That means, in HR terms, you'll need five people per shift or fifteen per day to monitor and act upon the VMs, along with at least two operations people, one of whom will be carrying the pager for the entire 24-hour period. That's seventeen people needed to run your operation, not including sick/vacation/leave time that needs to be covered. In salary costs alone, that's over $1 million/yr, not including recruiting and management salary, benefits and compensation. And that does not take into account any development, real estate, office essentials, hardware and software costs.
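The headcount arithmetic above can be written out as a short sketch. The function name and the average salary are assumptions added for illustration; the comment gives only the total of "over $1 million".

```python
import math


def ops_headcount(vms: int, vms_per_person: int, shifts_per_day: int, on_call: int) -> int:
    """People needed to watch `vms` around the clock, plus on-call operations staff."""
    per_shift = math.ceil(vms / vms_per_person)
    return per_shift * shifts_per_day + on_call


staff = ops_headcount(vms=500, vms_per_person=100, shifts_per_day=3, on_call=2)
ASSUMED_SALARY = 60_000  # hypothetical average salary, not a figure from the comment
print(staff, staff * ASSUMED_SALARY)  # 17 1020000
```

Under these assumptions the monitoring headcount grows in step with the number of VMs, which is the cost being objected to.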

> Afterwards, you can spend the time doing it properly when things have calmed down a little.

That'll cost you an extra million dollars to develop and deploy while simultaneously running your existing operations and migrating your clients over to the new solution.

It doesn't scale.

true_religion · on July 31, 2011

I doubt that every single customer has to be on their own VM. It's more likely that many people share the same VM, and only larger customers need their own VM.

Let's run some numbers though, using your assumptions:

500 VMs @ $179 per customer (business class licence) means a little over $1,000,000. At that point, you'd be right---it wouldn't scale.
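Spelled out, and assuming the $179 is a monthly price with one paying customer per VM (both are my reading of the figures, not something stated outright):

```python
def annual_revenue(customers: int, monthly_price: int) -> int:
    """Yearly revenue for a flat monthly subscription price."""
    return customers * monthly_price * 12


revenue = annual_revenue(customers=500, monthly_price=179)
print(revenue)  # 1074000
```

That is roughly the same size as the salary bill estimated upthread, which is why one customer per VM would not pay for itself.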

However, this is a business and not an HR exercise. So you could easily hire fewer people, and have less reliability. For a business like Shopify, I imagine their best customers are those who pay the $99 plan, accept a transaction cost, and other limitations. These customers can be piled onto the same VM together, and probably don't have such a huge throughput that a few minutes of downtime means money lost.

jsavimbi · on July 31, 2011

> you could easily hire fewer people, and have less reliability.

If your business has to be up and running 100% of the time, those are the minimal numbers needed to run the operation I've described. I cite a real-world example, with real people and real money (the "afterwards" part, if you will), which I'm currently working to replace, not an HR exercise. And yes, it doesn't scale and it's very costly to replace, and one of the main reasons it was built in such a manner is that the original architect never thought it would have to scale, and there are licensing considerations, something rarely mentioned around these parts. Licensing can really fuck things up.

imack · on July 30, 2011

I'm actually really glad to hear it. Though I wonder about the "rails doesn't scale" mantra: is that really more about ActiveRecord? In your experience, is ActiveRecord the biggest out-of-the-box bottleneck?

simonw · on July 30, 2011

The "rails doesn't scale" mantra was discredited 5 years ago, when people realised that it scales exactly the same way as PHP. Remember, scalability != performance.

caseyf2 · on July 30, 2011

We handle a lot less traffic than Shopify (~15-20 million reqs a day) but ActiveRecord isn't a bottleneck and I wouldn't expect it to be [...unless you are talking about something other than the performance of AR as a body of code?]

Instead of "Rails doesn't scale" we should say "Rails runs on Ruby which means that it will consume significantly more CPU and more memory* compared to something else"

In my case, 1 extra server (my estimate) was a small price to pay for developer happiness.

* unless you are running JRuby

akronim · on July 31, 2011

The issue they have doesn't seem to be scaling, in that RoR is scaling linearly. But if you have hundreds of machines, raw performance saves real money, which means RoR is maybe not ideal for massive deployments and the JVM languages give you more performance on the same hardware. Though unless you're working on a top 50 site I wouldn't worry so much.

inkaudio · on July 30, 2011

From what I read, "rails doesn't scale" is really a misnomer, because there are a number of things that can be done to scale Rails. I think Haiping Zhao of Facebook made it clear that it's really just an efficiency problem. Bottom line, Rails requires more computing power than the JVM for sites like Twitter, which are only a handful of sites.

JoelMcCracken · on July 30, 2011

That seems to be the case. The view/controller layers can be scaled via more machines. It's the database layer that requires synchronization. That's a universal problem, though.

gr3g · on July 30, 2011

Hi Tobi, I was wondering how you guys are doing multi-tenancy.

equalarrow · on July 30, 2011

I'm working at a company right now that has a very big Rails stack. Hundreds of models, probably thousands of files (not including plugins and gems) across the whole app. They are experiencing scaling issues, not in requests per second but in developer productivity.

I'm reading a great book called Service-Oriented Design with Ruby and Rails, by Paul Dix. He states a lot of the issues that our team is running into with a monolithic app. Seems also that these issues can apply to other frameworks outside of Rails (I've run into some of these same issues years ago with Java web apps).

The main point of the book is figuring out how to abstract out various layers of the app into standalone services. I've talked to people about this and a lot of the time they recoil at the fact that you would increase the complexity of the app by adding yet more parts (services) that require separate machines, deployment dependencies, and their own data stores. However, looking at big sites like Amazon or Twitter shows how one needs and can break apart their systems to make it easier for developers to focus on particular pieces as well as increase system performance.

I'm going to be joining a new startup this year as the technology lead and all these things are on the table. I'm evaluating node, some NoSQL DBs, some cloud PaaS options, etc. I'm also going to be designing for fast fail and services from the beginning, or at least an architecture that makes it fairly easy to start migrating parts of the system over to their own standalone services should the need arise. I don't think you can plan for everything, but with the way technology is moving forward and the tools available today, you need to keep SOA at least in the back of your mind.

stephth · on July 30, 2011

Like Matz, creator of Ruby, said [1] - and if I didn't paraphrase I think it would lose part of its charm - :

"If you can make up a website that has a higher traffic than Twitter ... it's a great success of business ... so you have money ... so you're safe to hire the Java programmer to replace it. [laughs]"

[1] http://ontwik.com/ruby/ruby-2-0-what-we-want-to-accomplish-i...

lucisferre · on July 30, 2011

Excellent. This is exactly what all developers need to realize about technology choices. Make decisions based on what will help you to deliver value quickly now; you won't really know what needs to scale, or how, until much later anyway.

jordibunster · on July 30, 2011

I'd agree. Yammer followed pretty much the same path: our main app is still a Rails app and we ship more Rails code every week.

We've migrated some key parts out to HTTP services written in Scala, Java, or node.js.

hello_moto · on July 30, 2011

When that day arrives, the next question will be whether you have the guts to do it. How many companies dare to rewrite (even assuming you can do it piecemeal)?

gr3g · on July 30, 2011

True that. Even LOLCODE can scale :) Twitter's problems are very specific to twitter - Mega throughput (~7000 Tweets per second) in realtime, sharded DBs requiring multiple connections etc.

Ruby will give you a significant time to market advantage especially if you are a startup.

jacques_chester · on July 31, 2011

My understanding is that the particular scaling problem of twitter is high fanout through subscriptions. Receiving 7k messages per second and storing them in a database is actually fairly straightforward.
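A toy fan-out-on-write sketch shows why the subscription side dominates. This is an illustration under my own assumptions (the `Timelines` class and its in-memory dictionaries are hypothetical), not a description of Twitter's actual delivery path.

```python
from collections import defaultdict


class Timelines:
    """Toy fan-out-on-write: each tweet is copied into every follower's mailbox."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> set of followers
        self.mailboxes = defaultdict(list)  # user -> append-only list of tweets
        self.deliveries = 0

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def post(self, author: str, text: str) -> None:
        # One inbound write becomes len(followers) outbound writes.
        for user in self.followers[author]:
            self.mailboxes[user].append((author, text))
            self.deliveries += 1


t = Timelines()
for i in range(1000):
    t.follow(f"user{i}", "celebrity")
t.post("celebrity", "hello")
print(t.deliveries)  # 1000
```

Storing the single inbound tweet is one write; delivering it costs one write per follower, so the work scales with follower counts rather than with the inbound rate.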

visava · on July 30, 2011

Try Grails (http://www.grails.org), a Java/Groovy version of RoR. It also has a Clojure plugin.

riobard · on July 30, 2011

He is worried about performance. Grails runs on the JVM, but AFAIK, it is not as fast as Java/Scala/Clojure.

angerman · on July 30, 2011

One thing that wasn't touched on was JRuby[1]; on their site they cite high performance and real threading as advantages.

If Twitter has (some of) the best Ruby developers (mentioned somewhere at the end of the video), why have they neglected JRuby? Why is it not an option? For legacy code with native extensions this makes sense. But is JRuby slower or more memory-hungry on the JVM than Scala or Clojure? I always thought that JRuby was one of the more performant languages on the JVM.

Apart from that it was an interesting talk.

[1]: http://www.jruby.org/

cageface · on July 30, 2011

JRuby is fast for a Ruby implementation, but it's still far, far slower than Scala or Java itself.

http://shootout.alioth.debian.org/u32/benchmark.php?test=all...

kennystone · on July 30, 2011

A lot of the pain a site like Twitter will have comes from GC (they wrote their own Ruby GC), and JRuby has a much better garbage collector than MRI's, and probably better than Twitter's.

vsync · on July 31, 2011

It seems like something like Twitter would benefit lots from a good generational GC. Any more info on the one they wrote and how it differs from the built-in one? I'd love to learn more.

bad_user · on July 30, 2011

I do get that a dynamic language like Ruby will always be slower than a language like Java, which has primitives and where many things, including static method calls, are solved at compile time.

But citing the Alioth.Debian benchmarks? Really?

Dude, take a look at the source code of those benchmarks sometime -- they are completely useless ;)

igouy · on July 30, 2011

Dude, "completely useless" is just "completely useless" name calling.

Say what you think is wrong with the source-code of particular programs.

netghost · on July 30, 2011

If you want a fast dynamic language, take a look at Lua. It's surprisingly fast, especially LuaJIT, and pretty straightforward (almost boring really). The main downside is that there isn't the breadth of community around it.

cppsnob · on July 30, 2011

Lua's threading is just as broken as Python or Ruby. Maybe even more so.

code_duck · on July 30, 2011

They're as useful as the code is well written for each language.

You think the Ruby code could be faster, and thus the comparison is skewed? Go ahead and help everyone understand by improving it.

cageface · on July 30, 2011

They show the right order-of-magnitude differences. Taking them more seriously than that is a mistake.

igouy · on July 31, 2011

That JRuby :: Scala URL shows an interquartile range from 6x to 31x so which order-of-magnitude were you going to pick? ;-)

etherael · on July 30, 2011

it's still slower than ruby 1.9 is it not? I was under the impression the current perf hierarchy was 1.9.2p290 w / gc cleanup patch (ree inspired) > jruby > 1.8.7 (ree) > all the other rubies?

jefft · on July 30, 2011

I recall reading an interview with Evan Weaver that the problem with moving to JRuby was in client libraries they were using, not JRuby itself.

kemiller · on July 30, 2011

That is indeed the issue if you want to convert an existing rails app to JRuby. But if you're just starting out, it's a different story: you'll end up much more JVM-focused since the best option is generally to use the java equivalent instead of a ruby gem, but there are options.

I've been using rails for almost 7 years now (holy crap) and run a medium-sized site on it. It works just fine, but if I were starting from scratch today, I would still use Rails, but I'd run it on JRuby.

larrywright · on July 30, 2011

Agreed. There are certainly some people who can't use the JVM for some reason (client libraries being a common one), but I think most people would be best served by using JRuby. There's a lot of innovation going on there, and platforms like Torquebox[1] have some compelling benefits.

1. http://torquebox.org/

petercooper · on July 30, 2011

Direct YouTube link: http://www.youtube.com/watch?v=ohHdZXnsNi8

rhizome · on July 30, 2011

OP is an ontwik posting bot. Thanks.

sscheper · on July 30, 2011

Surprised no other person brought this up.

petercooper · on July 30, 2011

Ontwik clearly provides some sort of service, if only to dredge up YouTube videos that we've otherwise missed.. but I can't help but feel there's a "better way" for these videos to be discovered than a site that just embeds and adds no editorial context.

rhizome · on July 31, 2011

Of course there is: one can post a link to the YouTube to HN with a title that describes the content. No separate website necessary.

gcampbell · on July 30, 2011

If any of this stuff sounds interesting to you, we're hiring for all sorts of positions: http://twitter.com/jobs

hkarthik · on July 31, 2011

Part of the problem is that any typical Web Application framework is ill suited to building a full real time system like Twitter.

I doubt their story would have been much rosier if they had gone with Spring MVC, Hibernate, and Oracle from the start.

The minute you start moving away from CRUD based application design and moving into SOA, you're already signing up for a significant rewrite, even if you stick with the same platform.

MrMcDowall · on July 30, 2011

No one will ever be in the position of handling so much real-time data as Twitter is. The rest of us can just get on with it and stop trying to pre-empt situations that will probably never happen to us.

xentronium · on July 30, 2011

Many companies in financial sector handle similar workloads. I know a couple of HFT shops, they're all java.

MrMcDowall · on July 31, 2011

For sure, I was meaning from the perspective of a startup. There's obviously existing industries where real-time data is big, but they don't tend to be web fronted serving hundreds of millions of users.

hello_moto · on July 30, 2011

Many people seem to refer only to the scalability (performance, that is) side of the argument, but only a few have actually pointed out the developer productivity of choosing Scala and Java for Twitter's situation.

http://www.infoq.com/articles/twitter-java-use

schumihan · on July 31, 2011

Agreed. You can achieve very high productivity if you use Scala and Java properly.

sahglie · on July 30, 2011

With the release of Java 7 (invokedynamic) the performance of these dynamic languages (like ruby and python) may become much less of a factor (JRuby and Jython). At least that's what the JRuby folks imply: http://www.engineyard.com/blog/2011/jruby-1-6-released-now-w...

Punch Line from Link:

"There’s a very real chance that invokedynamic could improve JRuby performance many times, putting us on par with our statically-typed brothers like Java and Scala. And that means you can write Ruby code without fear. Awesome."

mark_l_watson · on July 30, 2011

Right on! I was going to make the same comment until I saw yours. Charles Nutter has been very enthusiastic about invokedynamic based speed improvements - can't wait.

A little off topic: as a consultant it seems like the demand for Clojure has been tremendous. Clojure is a nice language and very performant. It will be really interesting to see how much large speed improvements in JRuby will cut into Java's, Clojure's and Scala's developer market share.

equark · on July 30, 2011

Is there any evidence this is actually true? Currently, IronPython and IronRuby do not have great performance on the CLR despite dynamic support.

igouy · on July 31, 2011

JRuby JVM 1.6.0_25 :: JRuby JVM 1.7.0

http://anonscm.debian.org/viewvc/shootout/shootout/website/w...

troymc · on July 30, 2011

I noticed that every time Twitter acquired a company, they also accrued a new language:

* Summize brought Scala

* BackType brought Clojure

Has anyone noticed that pattern elsewhere?

jorgeortiz85 · on July 30, 2011

Twitter started looking at Scala before the Summize acquisition. Also, to my knowledge, Scala was not being used at Summize.

skrebbel · on July 30, 2011

A bit boring, but Google comes to mind.

jingweno · on Aug 2, 2011

I think most people misunderstand the point made by Raffi... yes, the JVM has awesome performance, but never try to solve performance issues up front by sacrificing the agility offered by RoR. Not everyone is building Twitter; you don't even know whether you will hit the point where the VM is blocking your way. If you blame every performance issue on the VM, you are simply doing it wrong... In 90% of cases, MRI is fast enough to meet your requirements (GitHub, Groupon, Living Social and many others are using RoR, BTW). Twitter is very pragmatic in this respect: they tried every means of scaling the app on RoR before they moved to the JVM. Never ever try to solve a requirement that doesn't even exist in your own app...

jingweno · on Aug 2, 2011

The point is you probably never need that performance gain on the VM level (YAGNI) at all. Things are changing very fast in software development, your project could die or fail long before you really think about performance optimization on the VM level. But when you start a project and think that my project will fail because I use Rails and Rails doesn’t scale, you are doing it all wrong! What you really need is a tool that helps you iterate faster. Rails falls into this category. And that’s the productivity gain that I am thinking when starting a company.

As a pragmatic approach, a half century of software engineering says that you should write the code first and worry about making it faster only if it is too slow. Donald Knuth is right: premature optimization is the root of all evil. Don’t let the VM performance metric blind you to this fundamental truth. If you are chasing a performant language, Java/Scala is not your ultimate solution; C/C++ is, or even Erlang.

I actually don’t see a problem with “performance” emerging as a requirement in the Ruby world sooner than in others, say Java/Scala, because this “sooner” is very contextual and depends a lot on the implementation. To give you more info, GitHub has been on RoR since it started, and so far they haven’t hit the so-called “Rails does not scale” point ( http://teachmetocode.com/podca... ). The same goes for many other projects. Besides, think about Twitter: they only recently tried to port everything to the JVM, after Rails had served them for a couple of years. All these facts tell you that this “sooner” may never happen to your own app, and most importantly, Rails can scale, although it may not scale as well as others! But once you hit the point where Rails, or Ruby in general, doesn't meet your performance requirements (assuming you are lucky enough to build another Twitter), do what Twitter suggests in the video. Is that too late? Not at all. Because by then, you have the resources to do whatever you want, even inventing a VM that performs better than the JVM.

To summarize, the Ruby VM was fast yesterday, is still fast today, and will be faster tomorrow. In 90% of the cases, it's just fast enough. Do I need the performance gain by switching to JVM? Don't know yet. It'd be better to let the market drive you. Does Rails provide the agility I want to start a project? 100% hell yeah!

moe · on July 30, 2011

I'm still not getting the futz they're making over their "scale".

So your inbound load is 7000 tweets/sec or roughly 250 MBit/s (assuming 4k per tweet). Then you fan that out to (assuming) 20 append-only mailboxes on average.

Perhaps my assumptions are far off, but I'm only arriving at a couple GBit/s here and a low two digit number of terabytes/storage per year.

This sounds like "a couple racks" to me, not like "a couple datacenters".
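Here is that back-of-the-envelope arithmetic as a sketch, so the assumptions are easy to change. The 4000-byte and 20-mailbox figures are the ones above; the 140-byte, store-once case is an extra assumption added only to show how sensitive the storage figure is.

```python
TWEETS_PER_SEC = 7000
BYTES_PER_TWEET = 4000      # the "4k per tweet" above, taken as 4000 bytes
FANOUT = 20                 # assumed mailboxes per tweet
SECONDS_PER_YEAR = 365 * 24 * 3600

inbound_mbit = TWEETS_PER_SEC * BYTES_PER_TWEET * 8 / 1e6
fanout_gbit = inbound_mbit * FANOUT / 1e3

# Yearly storage if every tweet is kept once at 4000 bytes...
storage_tb_4k = TWEETS_PER_SEC * BYTES_PER_TWEET * SECONDS_PER_YEAR / 1e12
# ...versus kept once as bare 140-byte text (extra assumption, see lead-in).
storage_tb_text = TWEETS_PER_SEC * 140 * SECONDS_PER_YEAR / 1e12

print(round(inbound_mbit), round(fanout_gbit, 2))
print(round(storage_tb_4k), round(storage_tb_text, 1))
```

The bandwidth lines come out near the quoted 250 MBit/s and "a couple GBit/s". The storage figure only lands in the low two digits of terabytes under the 140-byte, store-once assumption; at 4000 bytes per tweet it is closer to 900 TB a year before any fan-out.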

lenn0x · on July 30, 2011

Here is an old presentation from a year ago.

http://www.slideshare.net/nkallen/q-con-3770885

Now, at the time they were doing a peak of 2000 tweets/s. The fan-out was 1.2M deliveries a second... So if we go with the current 600:1 ratio at 7000/s, that's about 4.2M/s. I actually know it's much higher now since I work there, but other things to consider are that we have a large data warehouse, search, API, pipelines to external parties for the firehose, logging at terabytes an hour, in-house metric collection doing 3M writes/s, etc.

http://www.scribd.com/doc/59830692/Cassandra-at-Twitter

It adds up very fast.
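The ratio calculation above, written out (variable names are mine; the numbers are the ones quoted from the presentation):

```python
peak_tweets_then = 2_000        # tweets/s at the time of the presentation
deliveries_then = 1_200_000     # deliveries/s at that peak

ratio = deliveries_then // peak_tweets_then   # deliveries per tweet
deliveries_now = ratio * 7_000                # same ratio at 7000 tweets/s
print(ratio, deliveries_now)  # 600 4200000
```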

gnuvince · on July 30, 2011

As I understand it -- and I might be wrong here -- the inbound traffic is not the real problem; it's the distribution to all the followers. If you have 7000 tweets coming in and you need to send it to the 120 followers of all these posters in real time, that makes a lot of data shuffling.

SonicSoul · on July 30, 2011

Very interesting talk. I was surprised that there was no mention of using a lower-level language, i.e. C or C++, in order to maximize CPU/RAM utilization. While the JVM is a clear winner over RoR, it does add some overhead. I guess it is a sweet spot between performance and code manageability.

neduma · on July 30, 2011

Any thoughts on solving scaling issues with node.js?

stock_toaster · on July 30, 2011

Until node gets some type of bind-fork-accept mechanism (built in) to utilize more than one cpu in a native and simple fashion (cluster/multi-node are close), I feel it will not gain the same level of traction that java has.
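For anyone unfamiliar with the pattern, here is a minimal bind-fork-accept sketch, written in Python rather than node and Unix-only because it relies on `os.fork`. The parent binds one listening socket, forks workers, and each worker calls `accept` on the inherited socket so the kernel hands connections to whichever process is free. Function names are hypothetical and error handling is left out.

```python
import os
import signal
import socket


def serve_prefork(listener: socket.socket, workers: int, handle) -> list:
    """Fork `workers` children that all accept() on the same listening socket."""
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:  # child: loop on accept() until killed
            try:
                while True:
                    conn, _addr = listener.accept()
                    with conn:
                        handle(conn)
            finally:
                os._exit(0)  # never fall back into the parent's code
        pids.append(pid)
    return pids


def stop_workers(pids: list) -> None:
    """Terminate the workers and reap them so no zombies are left behind."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
        os.waitpid(pid, 0)


def reply_with_pid(conn: socket.socket) -> None:
    """Toy handler: tell the client which worker process served it."""
    conn.sendall(str(os.getpid()).encode())


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(16)
port = listener.getsockname()[1]

pids = serve_prefork(listener, workers=2, handle=reply_with_pid)
seen = set()
for _ in range(10):
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        seen.add(int(client.recv(64)))
stop_workers(pids)
listener.close()
print(sorted(seen))  # PIDs of whichever workers answered
```

Each worker is a full process with its own interpreter, so two workers can run on two cores; that is the property the single-process event loop lacks on its own.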

People also have opinions about java(scala/clojure) vs javascript from a language preference standpoint. I think it is too early to tell what impact this will have.

However, many developers I know seem to have a strong distaste for Java, the JVM, and the ecosystem around both. I think several of those folks would look to node, erlang, or possibly even golang (if it gets faster) simply to avoid using java.

hello_moto · on July 30, 2011

I noticed that the people who have a strong distaste for Java are largely application-developers. In most cases, these developers usually just work with the available libraries or APIs to build a website backed by database (some of them are consultants that build similar apps over and over again).

Back-end developers seem to (maybe?) prefer to use Java.

stock_toaster · on July 31, 2011

I don't know if my anecdotal evidence bears this out, but I admit that it is just that...anecdotal.

People are still using C and C++ to write low level code. Databases, package managers, games, etc. Then there are the 'application developers' as you named them, writing http service endpoints, web apps, and the like.

It seems java still owns the colossal corporate stacks. I hear things like "it is easier to hire" and "java is faster/better for extreme large scale". If you think about all the languages and tools available, only the first makes much sense.

* C/C++/D is faster than java.

* statically compiled code is easier to deploy.

* Erlang is arguably more scalable than java.

* Haskell/Ada is 'safer' than java.

* I think several languages are more fun to write in than java. Ruby, python, golang, coffeescript, etc, etc.

So java may not be the best language for large scale, but maybe one of the best or good enough? When combined with the first point of ease of hiring, I can certainly see why large companies are attracted to it. If your language of choice lends itself to your workers being more easily replaceable, then as a company that is probably better/safer.

Other than that, I can't see why someone would prefer to use java. I don't work in/at/for huge companies though.

I admit that my own personal 'java bias' is based on dated interactions with java. Whenever I hear 'java' I think: good performance (vm), eats memory like candy, painful ecosystem of xml files and outdated/abandoned random libraries. I have tried dabbling in scala, and while I enjoyed the language to a fair extent, I still found myself wrestling with the JVM and the ecosystem (library version incompatibilities, obscure compiler errors, namespace wrangling, etc).

hello_moto · on Aug 1, 2011

My opinions are anecdotal at best as well and that's the reality of software development. There's no research that can state that X is better than Y whether it is programming languages, methodologies, architectures, patterns, etc.

I don't deny the reality that people are still writing C/C++ code in the field of embedded devices, games, something that requires fast performance with a very low memory usage. On the other hand, there are a few NoSQL solutions built using Java: HBase, Neo4J, Cassandra.

In some cases, JVM Hotspot optimizes code on-par with C/C++. I don't know much about D performance. If the speed improvement is not night-and-day for projects other than being mentioned above, and if writing readable code is much better in Java, I'm not sure if we should compare C/C++ vs Java. On the other hand, many people seem to come out and say that Ruby is _very_ slow. Is it heaven-and-earth slow?

There are advantages and disadvantages of compiled vs dynamic code when it comes to deployment. It all depends on the tools and ecosystem too sometimes.

How is Erlang more scalable than Java? In what area? horizontal vs vertical scaling? developer's productivity (or team performance) scale? performance? speed? Erlang seems to excel in a niche area (positively speaking).

What about Haskell/Ada, how are they safer than Java? Do they have better type-systems? handles NULL better than Java? Bulletproof from developers? detect more bugs?

Keep in mind that Java ecosystems have grown and matured a lot since 2004. The tools and libraries are staggering. Most of your concerns are no longer relevant except "eats memory like candy" in most Java desktop apps. Having said that, have you heard about Java ME? that thing runs in mobile devices albeit a different distribution of JVM.

Outdated/abandoned random libraries seem to happen in our neighboring ecosystems: Ruby (and Rails).

I have to admit that sometimes other languages are more "fun" to dabble with. I use Python and Ruby. I like Python because I don't have to argue when it comes to code style. Pythonic (PEP-8) or GTFO. It's not that I hate innovation or artistic coders, it's just that I'm a disciplined person. Best practices in most cases, pragmatism when needed, hacks when the world ends tomorrow.

Companies choose Java for a variety of reasons and yes, one of them is the available pool of talent. I'm sure we all have heard the old phrase "enterprise developers". Some of them are bad, while others are quite sharp when it comes to the typical enterprise stack. Some of them can design systems/libraries quite well. Spring Framework comes to mind. Google Guice, Google Guava, Android are next (yep, crazybob used to do EJB and enterprise Java stuff yet he's one of the sharpest Java dudes I've known).

I noticed that some of the well-run enterprise systems do have a better infrastructure planning thus forcing people involved around it to know better when it comes to certain technology choices. I'm sure there are web startups out there that just keep on hacking PHP code and use MySQL without having plans for backups, recoveries, etc.

Of course these are anecdotal experiences of mine.

stock_toaster · on Aug 1, 2011

    > On the other hand, many people seem to come out and say that Ruby is _very_ slow. 
    > Is it heaven-and-earth slow?

Comparatively, I would say yes. Granted, most of the time it won't matter because you are waiting on IO (disk/network), but if you are doing cpu intensive work, it is slow and you probably need to drop down to C or offload that work to another service.

    > How is Erlang more scalable than Java? In what area? horizontal vs vertical scaling?
    > developer's productivity (or team performance) scale? performance? speed? Erlang seems
    > to excel in a niche area (positively speaking).

My guess would be in single-server scalability. Erlang's write-once variables and actor model, combined with a good VM ("green processes"), make it very single-server-scalable (vertical). It also has good built-in node-to-node communication mechanisms (horizontal). Performance is probably slower than Java though. And I imagine the developer pool is much more limited than that of Java.

    > What about Haskell/Ada, how are they safer than Java? 
    > Do they have better type-systems? handles NULL better than Java? 
    > Bulletproof from developers? detect more bugs?

I meant safer in the type-safety sense, yes. There are also classes of static analysis tools for both. Granted, my knowledge of these languages is quite limited.

I certainly see your points (especially about liking the code hygiene of python), and agree that Java is not going anywhere soon. I guess I don't understand why a startup, or individual developer, would choose Java over other languages, even other languages on the JVM, for new projects.

Thanks for the good discussion. :)

hello_moto · on Aug 1, 2011

The Java ecosystem has a lot of static analysis tools that integrate with almost all popular IDEs and Continuous Integration systems: FindBugs, PMD, JDepend, Sonar (I recommend checking out Sonar).

Checkstyle is another tool that I use since I'm kind of the annoying dude when it comes to code-style. (Have you seen GWT API code? it's like written by one person as opposed to a few developers with different perceptions of "readable" code. I like that kind of thing).

There are a few reasons why a startup or individual dev would choose Java:

1) Previous experience in Java

2) Java fits better for the type of problem being solved (computationally intensive work that requires Hadoop-like infrastructure)

3) Emotional attachment to a static/compiled language with a nice IDE, so that one can navigate the source code easily whether the code base is large or small (sometimes not all decisions are rational, and I'm okay with that because developing software requires more than technical skill; it also requires passion).

4) Marketing (if you're targeting enterprises). Zimbra, Jive Software, Compiere, Alfresco, Day Software, Liferay, and Salesforce all used to be startups.

The Java ecosystem seems to be learning and growing at a much faster pace thanks to the following actors:

- Rails (Spring Roo, Spring MVC, JPA 2.0, and possibly MVC framework from the upcoming JEE releases)

- C# (Java 7 new features, Java 8 closures/lambdas. Yes, Lisp did this first, but I think C# is forcing Java to implement closures more than any of its other competitors).

- REST/JSON/WS (Check out the latest JAX-RS, supports REST, JSON, XML, Atom-Feed, and JAX-WS)

- I/P/SaaS + Cloud Computing (Targeted for Java EE7, deployment, infrastructure to support multi-tenant, etc).

NB: Just so that I don't sound like a Java fanboy: I use Java by day, but I also use Python and help promote and organize a Python community overseas (of course without comparing Python vs Java :)).

veemjeem · on July 30, 2011

Where are you pulling your hunch from? If anything, most of the Ruby & Node developers came from the back-end world of Java. I know I'm one of these people. Backend Java developers have to use configuration-heavy IoC frameworks like Spring, Guice, Hibernate, etc., whereas most of these ideas can be emulated in a more flexible language like Ruby.

hello_moto · on July 31, 2011

Guice is configuration heavy? That's the first time I've heard that. The last time I used Spring, I only had to provide one applicationContext.xml file that contains the XML header (XSD stuff) and 1 line of configuration to inject _every_single_injection_required_for_my_app_

Things changed.

Perhaps I was misusing the word "back-end".

When I refer to app-developers, I'm pointing toward people who build web-apps using Struts, Spring, JEE, EJB (I see where you think that back-end means completely EJB/Service/Hibernate).

The non-app-developers seem keen on building infrastructure around the Java ecosystem:

Hadoop, HBase, Cassandra, ZooKeeper, custom servers using Netty or Apache MINA. Or even building platforms such as GWT and Android.

jsavimbi · on July 30, 2011

Anyone pondering their technology stack should watch this video. It doesn't matter what you initially employ as a technology/framework/server to get your app up and running, but if you need to scale, the JVM is where it's at. I say that as a Rubyist.

mattdeboard · on July 30, 2011

I was surprised when I got some pushback on this concept at a local Django meetup last week. A lot of people believe that Python & Ruby-type languages are the backend languages of the future.

cageface · on July 30, 2011

To be fair, you're very unlikely to ever have to solve the kinds of scaling problems Twitter has had to solve. You'll get your app off the ground faster with Python or Ruby.

jsavimbi · on July 30, 2011

You will get your app off the ground faster, but you're selling yourself short if you're a technology-based company thinking that you'll never have scaling problems. Competing against Twitter or any other social-based app you'll probably never encounter that level of scale, but any financial application will need to be both fast and handle the complexities that only the JVM can address.

Like he states at the end of the video, when describing the 7000+ tpm during the WWC:

"...we do things like Forex spikes upon our standard baseline growth. So right now the JVM is really the only mechanism that we can build upon that gives us the flexibility to do something like that."

cageface · on July 30, 2011

Plenty of companies do huge transaction volumes on dynamic languages. Twitter is a freak outlier. If you try to solve problems long before you actually have them chances are you'll come up with the wrong solutions.

gcampbell · on July 30, 2011

I'm pretty sure he said "4x" rather than "Forex".

jsavimbi · on July 30, 2011

Good catch, and it makes sense given the context. My mind is stuck in HFT mode and constantly has me worried.

wwkeyboard · on July 30, 2011

It all depends on the problem. Dynamic languages like Ruby and Python are good for rapid development, but not for high-volume, soft-realtime problems like Twitter has. Just make sure to match your tools to your problems, and don't try to oddly hack a solution with a tool that is not best for the job.

mattdeboard · on July 30, 2011

Yeah my contention was that they're great for certain phases but once you start reaching scaling problems the JVM might be the better solution.

cdavid · on July 30, 2011

Also, Twitter is at a point where they can afford the best programmers, and where hardware efficiency becomes a big issue. In my experience, a lot of small web-based companies are more limited by skill and development practice than by language and hardware usage. Only once you've solved the former can you focus on the latter.

ibejoeb · on July 30, 2011

They are the languages of the present (and future, for a while I'm sure), but the JVM is the platform to write these languages to.

jsavimbi · on July 30, 2011

People are going to defend their language of choice, but I have to look up to see what the big dogs are doing, and try to understand why they're doing it and what type of influences they're under, when it comes time to choose a technology stack to solve a problem. Ruby, and Python to a certain extent with Django, suffers from the Rails attitude of opinionated development, where it's either all Rails or nothing, because that's what developers are used to and they feel uncomfortable outside of it, or worse, are under the impression that Rails is the end-all. That simply doesn't work in a scaling environment. I've found that Java developers are not only easier to recruit due to their sheer numbers, but are also more receptive to other avenues of approaching a problem, and have a better set of skills with which to approach it, or are at least able to migrate to other technologies, like Scala for example.

delambo · on July 30, 2011

Coming from a Java workshop, I have noticed the exact opposite - the majority of Java developers I have worked with are unwilling to touch or experiment with anything other than Java, even languages on the JVM like Scala; in contrast, I have noticed Python developers are much more open and agile when it comes to moving in and out of other languages.

kemiller · on July 30, 2011

Could that be because the experimental ones already moved to Ruby/Python years ago? The Rails world certainly has a lot of ex-Java programmers.

jsavimbi · on July 30, 2011

In retrospect, I should've said depending on the environment and the willingness of the devs involved, but yeah, they always find a way back to Java.

eurohacker · on July 31, 2011

Is there any good site that explains when to use which technology? For example:

when you need many concurrent users, don't use RoR but use the JVM, C++, or Scala instead;

if you need to build a fast prototype, build it on RoR or PHP, etc.

lhnn · on Aug 1, 2011

An ignoramus question: Aren't there faster languages than Java?

swiharta · on July 31, 2011

I'm pretty sure the project I'm working on will be the next Twitter, and this video's making me second-guess staying with RoR.