Correlation is not causation. A 600x performance difference doesn't suggest a language-level performance issue. Quite the opposite, in fact! Performance problems spanning multiple orders of magnitude are usually algorithmic in nature, or the result of a hard physical bottleneck.
Point taken; thinking about it more, as I understand it they had a one-request-per-thread model, whereas our architecture was (despite the old tech) more event-driven.
Still, what do you think is a typical ratio? I've read pieces suggesting language-level overhead of 60x-70x is normal.
For the exact same algorithm of moderate complexity, there should be less than an order of magnitude of performance difference between any two languages, from a high-level language like Ruby down to hand-coded assembly.
Performance issues come from blocking and bottlenecks, not programming language choice. Waiting on database queries, waiting on networks, waiting on disk - those hurt. Using O(n^2) algorithms when O(n log n) or O(1) would do hurts. Bad indexing on data sources hurts. Handling excessive or redundant data hurts. Threads that could be doing work getting blocked (or deadlocked) by other threads that, in a better architecture, would get out of the way - that hurts.
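To make the algorithmic point concrete, here's a small made-up Ruby sketch (not from the original discussion) showing the kind of gap that dwarfs interpreter overhead: calling Array#include? inside a loop is a linear scan per lookup (O(n^2) overall), while building a Set first makes each lookup roughly O(1).

    require 'set'
    require 'benchmark'

    ids     = (1..50_000).to_a
    lookups = ids.sample(10_000)

    # O(n^2)-ish: a linear scan of the array for every single lookup
    slow = Benchmark.realtime do
      lookups.count { |id| ids.include?(id) }
    end

    # Roughly O(n) total: build a Set once, then each lookup is ~O(1)
    id_set = ids.to_set
    fast = Benchmark.realtime do
      lookups.count { |id| id_set.include?(id) }
    end

    puts "linear scans: #{slow.round(2)}s, set lookups: #{fast.round(2)}s"

Run that in any Ruby and the ratio between the two numbers will typically be far larger than the ratio you'd see from switching interpreters.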
Programming languages, that's nothing. And in Ruby's case, even that limit can be addressed. Ditch the perceived inefficiencies of the Ruby interpreter in favor of JRuby running on the highly tuned JVM, and see if things are faster.
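If you want to test that, a minimal sketch (assuming both MRI and JRuby are installed; the script name and workload are just placeholders) is to run the same CPU-bound snippet under each and compare wall-clock time:

    # fib_bench.rb - run with `ruby fib_bench.rb` and `jruby fib_bench.rb`
    require 'benchmark'

    def fib(n)
      n < 2 ? n : fib(n - 1) + fib(n - 2)
    end

    # RUBY_ENGINE reports "ruby" under MRI and "jruby" under JRuby
    puts "#{RUBY_ENGINE}: #{Benchmark.realtime { fib(30) }.round(3)}s"

If that difference is small compared to your 600x gap, the problem isn't the interpreter.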