Rewriting the Ruby parser (railsatscale.com)
520 points by kddeisz on June 13, 2023 | 175 comments



With the mention of the parser's performance, I am reminded of Rich Kilmer's 2004 RubyConf presentation. Rich used Ruby to test reliability of a distributed system with hundreds of Java VMs. Some of the serialized data was stored as XML (over 1M lines!), which was slow to parse and load. Rich modified the program to serialize the data as Ruby code, which loaded much faster (https://www.infoq.com/news/2007/06/infoq-interview-rich-kilm...):

> Chad Fowler and I over basically 2 weeks, took the Java Debug Wire Protocol specification [...] turned it into a DSL in Ruby, and used that DSL to generate the packets for sending and receiving data. So we used the DSL in Ruby as a generator to generate Ruby code, as the whole protocol and then I used that at Darpa. They were trying to say “we could freeze the agent society from within the agent society will send messages, and it took about 7-8 minutes for all the messages to propagate and everything to freeze and then go quiet. I had a Ruby process that was running all 300 VM’s were underneath it, and I could freeze it in about a half of second. All 300 of them! And you actually could watch the CPU use because we had a monitor and the CPU use has dropped to zero. And it freaked them out. And what was great was you could turn it back on, and all the agents came back on. But time had been lost. And it was like alien abduction lost time, ten minutes went away, “what happened to us?” It was a bizarre thing because they were agents and they were planning on things and all of sudden 10 minutes just went away. But it was interesting to show how Ruby could actually be used as this kind of harnesses to wrap around things like systems.


Nice cover story to hide your ~totally existing~ time machine ;)


>Some of the serialized data was stored as XML (over 1M lines!), which was slow to parse and load. Rich modified the program to serialize the data as Ruby code, which loaded much faster

So he took a data format designed for human readability and converted it to a data format that's designed purely to be read by Ruby and people are surprised that it's faster?


I would argue that an XML file that's "over 1M lines!" is no longer human readable.


That's not the point though.


I think a bigger difference is that the XML is parsed by a Ruby program and the generated Ruby code is parsed by a C program.
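
To make the idea concrete, here is a minimal sketch of writing data out as Ruby source and loading it back through the interpreter's own parser. It is not Kilmer's actual code: the `records` data and its field names are made up for illustration, and `eval` should only be run on text you generated yourself.

```ruby
# Hypothetical data standing in for the serialized test state.
records = [
  { id: 1, name: "agent-1", peers: [2, 3] },
  { id: 2, name: "agent-2", peers: [1] }
]

# Serialize: `inspect` on arrays, hashes, strings, symbols and integers
# produces text that is itself valid Ruby source.
source = records.inspect

# Load: the interpreter's native parser rebuilds the objects directly,
# with no separate XML parsing layer written in Ruby.
# Only eval text you generated yourself; it runs arbitrary code.
loaded = eval(source)

puts loaded == records # prints "true"
```

In a real setup `source` would be written to a file and loaded later; the round trip is the same.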


Using something like protobuf would have required fewer steps and adhered to a standardized format.


It sounds like the novel idea was thinking to do that in the first place


"Use a more performant serialization format" is hardly novel. It's why things like protobuf exist.


lol what do you want here. "ok you're right, the approach that worked is bad we should have waited for the astute technical insight of an anonymous internet commenter two decades later." you got it bud A++ are you looking for work you seem delightful.


I'm not sure why you're so upset over this. All I've done is point out that some people put a ton of work into a great deal of engineering effort to solve a problem that would have been trivially solved by merely using a different choice of serialization format. Even 10 years ago the technology to do so was widely available and standardized.

But I mean sure, WOOHOO, what a great achievement!


> This is the final step before merging YARP into CRuby, and we’re very excited to see it come to fruition. This work will be done in the next couple of work days.

It continues to amaze me the pace at which kddnewton and co. are able to work. His tweet[1] from a few months ago is really relevant here. Being full-time employees and not just open-source volunteers makes a huge difference!

[1]: https://twitter.com/kddnewton/status/1639258413120073730


I love how pragmatic the approach is here. We still don't have parser generators which generate human understandable/readable recursive descent parser with good error recovery baked in. But i am guessing the ruby syntax is too complicated/irregular anyway.

On the macro side, i am always surprised by the lengths companies will go to in order to keep working with solutions designed and selected in their early days, and to scale them well beyond the breaking point. And i feel like most of this effort to keep ruby alive could have been used to slowly transition part of the infra to something JVM/.NET/Rust based, for much more bang for the engineering buck.


(Disclaimer: Shopify employee but not part of this team) Ruby is my favorite language so far to write "business logic" in, and Rails is a very developer-friendly web framework that makes it very easy to set up websites. So I think this is a pretty great strategy overall. At the end of the day I think the way to get tech to scale is for companies / academics / etc. to really just put the work in. Java was very slow on release, but now the JVM is blazingly fast, and that is because of the huge amount of work put in by Sun, Oracle, IBM, Red Hat, Google, Azul, etc. over the years.

For what it's worth though, we do also have Go, Rust, Scala, etc. in different places in our stack where they're needed.


Disclaimer : my career has mainly been on compiler/runtime and other infrastructure related stuff, so not much of a business logic/web dev kind of guy.

> At the end of the day I think the way to get tech to scale is for companies / academics / etc. to really just put the work in.

This is true in the abstract, but my point is mainly about the comparative effectiveness of different strategies when it comes to building scalable infrastructure (or scaling up infra): keep the bulk of the existing code and invest in making the underlying compiler/interpreter/tool chain better, or progressively migrate the code to a tool chain with better scaling capabilities from the get-go.

From experience, the nature and semantics of a language severely limit what is "reasonably" possible in the runtime in terms of safety (like the code loader in Java), performance (both CPU and memory) and tooling. Improvements are possible, but they tend to become exponentially expensive as time goes on.

Now, obviously there is always a tension between developer productivity and infra concerns, but i do believe that we have better compromise point on that line with newer language and framework.

I totally agree that in the 2000s the experience of most "enterprise frameworks" sucked very hard, and the emergence of languages such as Ruby, Python, etc. was a godsend: a response to the overly rigid and ceremonial ways of the past. But with time, we have come to understand better what makes a good programming experience, and distilled that into better designed languages and frameworks which offer "better" compromises. For example:

- Instead of dynamic vs. static typing, we have gradual typing and type inference
- Instead of GC vs. non-GC, we have Rust's borrow checker
- Instead of runtime metaprogramming, we have DSLs and macros

Even more important, i believe that a lot of the experience comes from the tooling around the language, and there again the "cargo/dotnet/go" CLI approach, with a single coherent entry point for both package management and framework scaffolding, eases a lot of the pain of the old way.

With all of that we now have languages which offer a better compromise on the dev. prod vs infra/performance...

> JVM is blazingly fast

I would say blazingly faster... But compared to C++ (or even rust) java is still quite slow. Especially for anything compute intensive.


> Disclaimer : my career has mainly been on compiler/runtime and other infrastructure related stuff, so not much of a business logic/web dev kind of guy.

That probably explains your surprise. This question comes up extremely often on HN. The answer is that even if you were to stop all feature development, re-platforming a large organization like Shopify would take years, not even considering all the re-training, lost knowledge, etc.

And you'd spend these years having to support multiple platforms so that would only pay off much later, and in the meantime your competitors continue iterating.

In general it's best to stick with the devil you know, unless the new platform interoperates very well with the old one.


> re-platforming a large organization like Shopify would take years

I am not sure what re-platforming means here, but "slowly transitioning" to a different stack is not the same as stopping and rewriting everything.

> That probably explain your surprise

While i don't write "business logic", i have been involved in a lot of projects to babysit, maintain, refactor and/or improve business code bases. Sometimes the runtime can only do so much and you have to adjust the user-level code. And from experience it's never as bad as some make it out to be.

> And you'd spend these years having to support multiple platforms so that would only pay off much later, and in the meantime your competitors continue iterating.

Again, i see this very often as well. Software engineering seems to overemphasize the cost of platform transitions while downplaying the operating burden of not modernizing.


> over emphasizes the cost of platform transition while downplaying the operating burden of not modernizing.

Given that I work on the Ruby Infrastructure team at Shopify, I think I have a well informed view on both the cost of transitioning and the operating burden.


When you reach the level of scale where you need to move to distributed systems, you pretty much need to re-platform anyways. That's usually the ideal time to make a decision like this.


I've worked in orgs with Cobol code written in the 70s and 80s still around. The strategy of moving your business over to a more performant / scalable tech in essence means you now have two sets of technologies that just end up sitting on top of each other. I worked on a business line that literally had Java, JavaScript, Cobol, VBScript, C#, and Smalltalk all working somewhere in the business process, counting only code written in-house! Businesses never actually modernize everything. They patch old code, write new stuff in new tech, and it grows in complexity. So investing in the core tech we already use is, I think, a great idea.


Hard to comment in the abstract, but i think this just shows that the organization did not have a defined strategy and things just grew in an ad hoc fashion. Seems like an orthogonal problem.


It is a lie to say java is faster than c/c++ due to runtime optimization

It is also a lie to say java is "quite slow" compared to c/c++


> It is also a lie to say java is "quite slow" compared to c/c++

Why ?


Since no one is replying to your actual question:

The JVM takes bytecode and generates assembly for execution. It also profiles that code and improves it over time.

Sometimes long-running Java bytecode will be faster than statically compiled C++. In many/most cases, long-running Java code on a good JVM will be reasonably close to C++, for some measure of "reasonable".

Garbage collection pauses, memory usage and JVM startup/warmup are all more detrimental to Java in the speed comparison.

But "quite slow" basically, to me, implies it is 10-100x slower than C++. Sure you can come up with various benchmarks (lies, damn lies, and benchmarks) to make some corner case.

In general, the JVM is probably about 2-3x slower than C++ compiled code. It certainly is not "quite slow".


Because it depends entirely on the use case. There are plenty of cases where Java will be faster than c++ purely due to the available libraries.


> Because it depends entirely on the use case.

Everything depends on the use case, and one can always find a special case as a counterexample to most engineering solutions/choices.

Generalizations are still useful and sometimes "true". C++ (and native languages) are faster than Java simply because they were designed that way. C++ chose speed at the price of complexity, safety and build time. Java, on the other hand, focused on simplicity and safety.

I understand wanting to bring nuance, but "it depends" can also become an excuse for bad tech choices.

> There are plenty of cases where Java will be faster than c++ purely due to the available libraries.

Hard to believe, but in that case i would say that you are comparing libraries, not the languages.


> C++ chose speed at the price of complexity

If that were true, C and Rust (which are typically faster) would be just as complex as C++ (but no language is).

It chose to jump into the OOP fad while staying low level ("what if we had C with classes?"). That's the cause of most of the complexity. The rest is age, backwards compatibility handcuffs, and kitchen sink stuff from other languages like move semantics and functional stuff.

> Java on the other hand focused on simplicity and safety.

Java chose to jump into the OOP fad while being high level. Ruby and Python did the same thing, and they're also simple as a result. That or low level and not OOP are the correct combos if you don't want to wind up with something as byzantine as C++.


> If that were true, C and Rust (which are typically faster) would be just as complex as C++ (but no language is).

I don't think that this assertion follows from my statement.

It's totally possible that there exist simpler languages which give exactly the same level of performance and expressiveness as C++. Nobody said that the C++ design was optimal with regard to the complexity/zero-cost-abstraction ratio. Bjarne himself thinks so (https://www.stroustrup.com/quotes.html), and both Carbon and cppfront are efforts in that direction.

The point was that when comparing (as of today) Java vs C++, we shouldn't be surprised that a language which has "zero cost abstraction" as a core principle, and which is willing to be arbitrarily complex, ended up being faster.

With regard to the language you mentioned, C is simpler because it does less... And rust is much more recent and completely rethinks the native language landscape. I am not sure that we had the understanding (or even the tech) necessary to create rust 20 years ago.

But more importantly, i don't really buy the premise, of "are typically faster". Some concrete example would be nice, otherwise from experience this statement is wildly incorrect.

> It chose to jump into the OOP fad while staying low level ("what if we had C with classes?"). That's the cause of most of the complexity

Disagree.

> The rest is age, backwards compatibility handcuffs

Very true, if we removed backwards compatibility, C++ would be simpler.

> like move semantics and functional

Adding features can paradoxically simplify a language by providing a coherent/unified version of previously distinct usage patterns. Uniform initialization is the canonical example. I would say that move semantics also simplify the language by folding resource reuse patterns into RAII.

Same for C++ lambdas, just simpler syntax ...

So you seem to be saying that choosing to forgo either low-level control (Ruby, Java, Python) or general abstraction structures (which is what OOP really is) would produce a simpler language.

But C++ is exploring a different design question: trying to have, in the same language, low-level control and abstractions that are general enough to express complex designs, and which can be efficiently deconstructed so as not to impact performance. And the C++ community seems to be willing to pay some level of complexity for that.


>Hard to believe, but in that case i would say that you are comparing libraries, not the languages.

No, you're not, because comparable C++ libraries for many activities, such as at-scale stream processing, simply do not exist.


> many activities

I do not doubt that there are domains where the best libraries are in a particular language, and this language might not be C++.

I am just not convinced that Java is better than C++ in this regard (as there are more domains where the best C++ library is faster than the best Java one).

But i think that's beside the point: we are comparing languages here, so the existence and quality of libraries shouldn't be the main focus (besides the stdlib, of course).

Now, there is more to choosing a tech stack than just the programming language, and that includes the libraries, the current knowledge of the team, etc. And i would even venture to say that those might be more important than the programming language per se. But i do believe we can still factor those concerns out and compare programming languages, keeping in mind it's only part of the story.


What value is there in that exercise? We never use a given programming language outside of a given task/context. Comparing them absent that is navel gazing.


The number of engineer years required for the parser rewrite was probably in the low double digits. Migrating a platform like Shopify to a new framework in a different language would take hundreds.


Where is the “more bang for the engineering buck”?


JRuby. TruffleRuby.


?


As a devoted rubyist, it seems clear to me in retrospect that ruby's "pretty complex to parse is just fine" approach has been a real challenge to the language's success. For reasons related to discussion in OP.

This project could be one approach to ameliorating that challenge. I'm very pleased to see Shopify willing to invest in this, and hope it ends up affecting the ecosystem as hoped.


I disagree.

There are many languages that are hard to parse that are very successful. C and C++ are the obvious examples: there are many dark corners around the preprocessor, type annotations, etc. C++ is an absolute nightmare to parse.

C# is pretty complex too. Python has its weirdness around indentation.

Honestly, most popular languages end up acquiring a decent amount of grammatical complexity over the years and it doesn't seem to significantly hinder adoption. Humans are quite good at reading complex text and syntax.

I think the real tax on Ruby is its pervasive use of runtime metaprogramming. It's Ruby's most exciting strength and enables much of the joy and excitement that Ruby is known for. But it makes static analysis so hard and becomes less and less valuable as team and codebase size increases.


I'd say that C and C++ were successful despite being hard to parse, and despite other unfortunate hacks like null-terminated strings and include files instead of modules. There was little alternative for the other things they offered (like Unix being written in C and bundling a C compiler), which were the compelling reason to use them.

JavaScript was a huge success not because it was a great language, but because everyone wanted to write for the browser.


I think you and I are saying essentially the same thing: grammatical complexity seems to have little negative impact on language success.

> There was little choice for other features they offered (like Unix being written in C and bundling a C compiler) which were the compelling reason to use them.

Pascal was around at the same time, is grammatically infinitely more elegant... and is dead.


Pascal evolved into Modula-2 and Oberon, which both live on, in significant chunks, fused into the highly successful Go language.


Shame, too; that was a really fun language.


It’s little appreciated how much current parser generators are holding the industry back. That is, LISP maintains a lead in metaprogramming because it bypasses Chomskyism altogether. It really should be a few lines of code to add an “unless(X) {}” statement as sugar for “if(!X) {}” to Java, Python, Ruby, but the very idea that you could patch an existing grammar with a separate file is like technology that fell off a UFO. Also, anything that you generate a parser for should automatically generate an unparser. Sphinx is a fraction of the framework it could be because it can’t process markup and turn it back to Sphinx.

I think a PEG framework with a few features (like an easy way to implement operator precedence just by stating it with either numeric values or X > Y statements) could be revolutionary. Python is almost there but not quite.
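
For comparison, stating precedence as plain numeric values already works in a small hand-written parser. The following is a minimal sketch using precedence climbing, which is a different technique from PEG and not what any existing framework does; the `PRECEDENCE` table and the `parse_expr` name are illustrative, and the input is assumed to be a pre-tokenized array of operands and binary operators.

```ruby
# Operator precedence stated as plain numbers: higher binds tighter.
PRECEDENCE = { "+" => 1, "-" => 1, "*" => 2, "/" => 2 }.freeze

# Precedence climbing over an array of tokens (consumed in place).
# Returns a nested [operator, left, right] tree.
def parse_expr(tokens, min_prec = 1)
  lhs = tokens.shift # assumed to be an operand
  while (op = tokens.first) && PRECEDENCE.key?(op) && PRECEDENCE[op] >= min_prec
    tokens.shift
    # Requiring strictly higher precedence on the right makes
    # every operator left-associative.
    rhs = parse_expr(tokens, PRECEDENCE[op] + 1)
    lhs = [op, lhs, rhs]
  end
  lhs
end

p parse_expr(%w[1 + 2 * 3 - 4])
# prints ["-", ["+", "1", ["*", "2", "3"]], "4"]
```

Adding an operator is a one-line change to the table, which is roughly the ergonomics the comment asks a grammar framework to provide.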


I used to strongly agree with this. This is an article I wrote on the subject 10 years ago: https://blog.reverberate.org/2013/09/ll-and-lr-in-context-wh...

I even had ideas like what you mentioned about encoding precedence and associativity explicitly: https://github.com/haberman/gazelle/blob/a12a123129dfb7e1f3e...

But I'm no longer as optimistic. The main problem is that nearly all languages have syntax that cannot be easily formalized using declarative abstractions like CFG or PEG, and must fall back to imperative code. And once you have a mix of imperative and declarative code, most of the benefits of a purely-declarative abstraction go away -- or at least the benefits that mattered to me.


If you’re into language parsing and how its patterns could be used in a modern programming language, you should take a look at Raku (fka Perl 6): it has native support for “Grammars” [1], which are basically specialized classes to put regexes in for defining tokens and also combinations of such tokens. Once you’re done, you can use a grammar object to parse text, returning an AST object. Raku’s own syntax is defined in Grammars, being a subset of Raku’s syntax.

So in a nutshell, Raku’s “Grammar” construct is like regexes on steroids and renders defining DSLs or other special-purpose languages so much easier than in any other modern language.

[1] https://docs.raku.org/language/grammars


I think in most cases metaprogramming makes applications much harder to read. When taking over someone else's codebase, I don't want to also have to understand how they modified the grammar of the language.

Obviously, this isn't a blanket statement - metaprogramming has some amazing success stories (homebrew and vagrant both have excellent DSLs), but more often than not I appreciate code that's just "one layer" of abstraction.


Classes and methods/functions alter the grammar just as much.


Custom grammars are a cool idea. They are terrible in practice and one of the biggest reasons why Lisp failed to take off, why it's almost no one's first choice of language for building something big.

The reason is that new grammar creates a new language, and the divergence of the new language from the base language creates a cultural barrier that inhibits communication. It's harder for new engineers to be productive, and it's harder to collaborate.

Customizing your grammar works best if you're a lone wolf, a one man band. You can lever up your productivity by custom-designing something perfectly suited to both you and your preferred problem domain. But nobody else will understand it.


>> It really should be a few lines of code to add an “unless(X) {}” statement to “if(!X) ()” to...

Or even one line :)

Factor...

  : unless ( ? quot -- res ) swap [ drop ] [ call ] if ; inline

Rebol/Red...

  unless: func [expr block] [either expr [] block]


The difficulty isn't `unless(x) {}`, it's `{} unless(x)` and `{} if(x)`.

Except you also don't have the brackets to keep things organized.


Do you mean semicolons and whitespace?


No, I mean that in the examples of difficult to parse lines there are no {} anywhere and no () on the condition.

  def save = File.write(name, self.to_yaml) unless invalid?

and you could also have

  def save = File.write(name, self.to_yaml) if valid?

If you had to bracket the expression that postfix if/unless applies to, it would be significantly easier to parse.
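
One way to see how the parser treats a trailing modifier is to ask Ripper, the parser interface that ships with CRuby. This is a rough sketch: the `top_level_node` helper is made up for illustration, and the snippets use a hypothetical `save(name)` call instead of the endless `def` lines above so that they also parse on older Rubies.

```ruby
require "ripper"

# Returns the node type of the first top-level statement,
# or nil if the source does not parse.
def top_level_node(source)
  sexp = Ripper.sexp(source) # [:program, [statement, ...]]
  sexp && sexp[1][0][0]
end

# The modifier forms produce their own node types: the parser only
# learns that the statement is conditional after reading all of it.
p top_level_node("save(name) unless invalid?")
p top_level_node("save(name) if valid?")

# The prefix form announces itself with the first token.
p top_level_node("unless invalid? then save(name) end")
```

The first two calls report a modifier node (`:unless_mod`, `:if_mod`) wrapping the statement, while the prefix form reports a plain `:unless`.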


Nice article! In this sentence

... open parenthesis character is ambiguous in this context. To get around it they made their grammar more ambiguous and then enforced that the actual grammar was enforced in their tree builder.

I'd change the second "more ambiguous" to "more lenient"

i.e. lenient meaning "grammar accepts more strings", ambiguous meaning "grammar is invalid"

I have seen this issue in Python's grammar as well, and it's mitigated by the new PEG parser


Ambiguous does not mean invalid


For people who see parsing as model checking, ambiguity means something vastly different (inferior) to leniency.

Fast parsers that use a schema are also doing model checking.


Yes, technically it means "there's more than one derivation"

But Python's pgen rejects such grammars as invalid ... The other strategy is just to pick an arbitrary interpretation
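
To illustrate "more than one derivation", here is a small sketch that counts the parse trees of a deliberately ambiguous toy grammar, `E -> E "-" E | NUM`. The grammar and the `count_parses` helper are made up for illustration and have nothing to do with the Ruby or Python grammars.

```ruby
# Counts distinct parse trees of tokens[lo..hi] under the ambiguous
# grammar  E -> E "-" E | NUM  by trying every "-" as the root operator.
def count_parses(tokens, lo = 0, hi = tokens.length - 1, memo = {})
  return (tokens[lo].match?(/\A\d+\z/) ? 1 : 0) if lo == hi

  memo[[lo, hi]] ||= ((lo + 1)...hi).sum { |i|
    next 0 unless tokens[i] == "-"

    count_parses(tokens, lo, i - 1, memo) * count_parses(tokens, i + 1, hi, memo)
  }
end

puts count_parses(%w[1 - 2])     # prints 1
puts count_parses(%w[1 - 2 - 3]) # prints 2: (1 - 2) - 3 and 1 - (2 - 3)
```

A generator can reject such a grammar outright, or pick one of the trees by rule, for example by declaring `-` left-associative.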


It feels to me like Shopify is single-handedly keeping Ruby alive. A little bit like Jane Street and O'Caml.


I think ‘is Ruby dying?’ is little more than a meme that has stuck around for longer than it deserves.

The continuing work on the language and its performance is impressive, but Ruby (and Rails) themselves have the honour of being stable, tried-and-tested solutions for rapid application development. Is it as exciting as the latest and greatest serverless lambda framework in Typescript? Not really. Is it a dependable workhorse? Absolutely.

At some point you might be successful enough to justify a rewrite into something else, but a simple Rails app will take you a long way with little effort.


I don't deny the utility of Rails for building a certain class of web site. But I think there is a difference between Rails and Ruby, and a difference between use of a technology and improvements to that technology.

I think a technology must continue to change and adapt to remain relevant. I see Shopify as the main driver of current improvements to Ruby (the core features of the most recent release, and the upcoming release, were mostly due to Shopify as I understand it). I don't see smaller companies doing these improvements and I think without them Ruby will (slowly, because it's used in some big places) fade away.


Smaller companies can be happy with Ruby as it is while enjoying the incremental improvements. They're hardly going to invest in hiring C-proficient engineers to bolster the language - it's simply good enough to start a business from and the performance metric is low priority.

They're in a different class to heavy hitters like Shopify and Github, who will gain a lot more from investing in gradual improvements to Ruby's runtime at the scale they operate at.

I'll contrast it with JavaScript, which has tried to assimilate every language pattern under the sun over the past decade and is intensely difficult to maintain a stable stack with, even if it's better now than it used to be.


I'm a bit surprised that Github isn't doing work on Ruby, since they're probably the largest RoR deployment


A lot of ex-Hubbers went to Shopify. Additionally, a lot of upstream commits to Rails occurred from things pulled out of GitHub. GitHub has (had?) an entire team dedicated to Ruby and Rails and performance and optimizations/improvements


This surprised me, as I knew of several prominent GitHub folks contributing to Rails, but when I looked [1], they're all at Shopify. TIL!

[1] https://github.com/rails/rails/milestones/7.1.0


I used to get invites for jobs and see listings for jobs for Ruby and Rails a lot. Now I don't see any at all. There's most definitely a "dead" feeling to the platform.

I hardly hear about new projects being started with Rails as well...


As a Rails person, I get a lot of those requests.

It may be that after years of working on other technologies, you just aren't passing the filters anymore. When we were aggressively hiring Rails devs in 2021, we specifically searched for the seasoned Rails devs that could hop in and get going. I've found less appetite outside of the big shops (Shopify/Github/etc) to pick up junior Rails devs which isn't great.


Probably depends where you’re based. Ruby is still lucrative in London and there is a healthy market for it both in startups and more established businesses. While I’ve branched out to other languages (not just JS as a full stack engineer) my career is still boosted by my Ruby experience.

I don’t think this makes it dead or dying though. It’s stable and entrenched while JS has taken the place of the golden child.

One complaint I’ll grant myself is that library development is a little less prolific these days. Again, there are well-established solutions to a lot of problems in Ruby so you’ll have a go-to collection of gems, but it’s more often the case these days that something doesn’t have much library support and you’ve got to roll your own.


My co-founder and I are using RoR at our new company! It's definitely not dead and allows you to build SAAS very quickly.


It might just be the economy. I was getting about 20 inquiries a week from Rails positions until around December 2022.

It's used in many companies, new and legacy.


Yeah same story here. Until about 6-7 months ago I was hearing from recruiters for Rails jobs daily, both old and new projects.

I'd be very surprised if the decrease since then is out of line with the rest of the industry, regardless of tech stack.


I don't know why, there are tons of smaller companies using it. I have a lot of languages and experience on my resume and the one that consistently gets me the most inquiries at the highest pay grades is still Ruby.

Anecdotal I know, but from the moment that it appeared on my resume in 2012 it's been non-stop. Probably 80% of everything I hear about.

Ruby and its ecosystem bring the closest thing to natural Aspect Oriented Programming that I've seen in the wild, which is why it's so much more productive than everything else I've tried.


> I don't know why, there are tons of smaller companies using it.

Longtime rails dev here. The reason smaller companies are using it is because you can move much faster working on a full stack rails app that gives you SPA-like functionality without needing to incur the performance or operational cost of a dedicated front end.

I've worked for both rails shops and JS shops, and the productivity achieved with Rails is staggering compared to React in a small team environment. Guillermo Rauch tweeted a few months back that SPAs were a zero interest-rate phenomenon and I completely agree. Just because a bunch of companies jumped on the JS hype train doesn't mean that they were all making the right decision.


> Guillermo Rauch tweeted a few months back that SPAs were a zero interest-rate phenomenon

Guillermo Rauch is selling Vercel, whose strategy includes first-class tooling for SSR, convincing people to move their SPA to SSR, and then locking them in on their platform.


Except my preference for SSR as the default dates as far back as 2014 and is rooted in the laws of physics[1]. Downloading an empty shell, downloading code to render a bunch of spinners, and then incurring a bunch of waterfalls of data and more code to the server is not gonna make it.

[1] https://rauchg.com/2014/7-principles-of-rich-web-application...


I mean those are fair points, though respectfully I think there are cases for SPAs where SSR won't do. But that's not what we're talking about.

Referring to SPAs as a "zero-interest-rate phenomenon" implies that SSG/SSR models are more efficient in terms of financial cost of deployment. I don't agree this is necessarily the case, and I think SPAs can be developed and deployed sanely and cheaply also.

Vercel is doing some amazing things, but it's also innovating in ways that occasionally lead to "lock-in", in the sense that moving away from Vercel would involve a lot of friction. So I think it's fair to point out that you have a financial interest in convincing people to adopt delivery models that your business streamlines.


Sure, he's trying to build a business that I'm not particularly interested in being tied to, either. That doesn't make him wrong about SPAs.


i think if rails continues to push hotwire (turbo + stimulus, and perhaps strada?) and get a coherent story on view components, it will continue to take mindshare from the js hype of the last decade. mobile dev is in decline, and browser makers just released web notifications, web app support, webtransport, page transitions, etc., so the backend has largely reached parity for cross-platform development. no longer is json the natural data exchange medium for apps, but rather chunks of html that can be plopped right into the dom without js having to massage the response into shape on the frontend. js can return to being a frontend scripting language, its natural habitat, rather than being shoehorned into being a do-it-all platform language.


I've been building a webapp with Rails on and off over weekends. Several times over this process, I thought through some of the architectural decisions and naturally realized that the "Rails way" was the best option to pursue. It's not just because I'm using Rails - my most recent webdev experience was with a SPA driven by a Java backend. I'm sure there are tradeoffs involved (what doesn't?) but with every passing day I use Rails the more I appreciate the decisions it makes for you.


I’ve had the exact same experience! I’m getting to use some new JS SSR frameworks at work (Remix) but I keep using rails for my own things. Gets out of my way.


Remix is pretty nice too, and I appreciate it for bringing Rails-style SSR into style in the React world before RSC, but the power of Rails is from having a complete platform to build your apps on. I don't need to string together five different libraries to implement authentication and e-mails, that comes more or less built into Rails.


Can you recall some of these architectural decisions? I think that would be interesting.


Two decisions off the top of my head:

- I wanted to make an index page where a user could make edits to the items being displayed and make regular show/edit pages for the item so if a user was on that page they could edit it. This is actually really useful: a user can bookmark the page for a specific item, or open it in a new tab, or in general do things browsers let you do but apps struggle with. Making two different edit components would be stupid, and I thought this would be one of those things that I could do better in React than in Rails. After looking around a bit I quickly found that if I used standard REST-y routes and wrapped the key parts of the view with turbo frames I could get exactly what I wanted, and it worked out seamlessly. In general, I've found that Rails' heavy emphasis on REST was a good architectural choice, and every time I disagreed with it and went another way I ended up regretting that choice and reverting to REST.

- In the Java world, you typically have thin models + a service layer which has the business logic. This is apart from other layers such as Repo/DAO etc, which I was already replacing with ActiveRecord, but I was initially resistant to putting all of my business logic in the models, especially if it involved logic across them. But a separate service layer also hurt discoverability. I was working on a project I was expecting other people to work on with me, and I wanted to make it easy for them to figure out what they could do with a model object. The solution to this came from DHH himself. I've lost the link for this, but he said that if you make service objects, you should add a method on the model that acts as a way to reach the service. This keeps the model "fat" while also separating out logic into simple, unit-testable classes.
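To make the second point concrete, here is a minimal sketch of "a method on the model that reaches the service". It is plain Ruby with no ActiveRecord, and every name in it (`Invoice`, `ReminderBuilder`, `reminder`) is made up for illustration rather than taken from any real codebase:

```ruby
# Hypothetical model: in a Rails app this would inherit from ApplicationRecord.
class Invoice
  attr_reader :total, :email

  def initialize(total:, email:)
    @total = total
    @email = email
  end

  # The discoverable entry point: anyone holding an Invoice can find this
  # method, while the actual logic lives in a small separate class.
  def reminder
    ReminderBuilder.new(self).call
  end

  # Hypothetical service object: easy to unit test on its own.
  class ReminderBuilder
    def initialize(invoice)
      @invoice = invoice
    end

    def call
      "To #{@invoice.email}: you owe #{@invoice.total}"
    end
  end
end

puts Invoice.new(total: 40, email: "a@example.com").reminder
```

The model stays the place you look first, and the service class can still be tested by handing it any object that responds to `email` and `total`.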

One of the smaller things I appreciate is keeping all of the routing details in rails routes. I know other frameworks like Django also do this, but I really didn't like this about most Java frameworks and microframeworks in other languages.

In general, if you're interested in seeing how to architect Rails apps, I say study how 37Signals does it. Playing around with Basecamp convinced me of the practicality of the Rails way and taught me a lot of interesting and useful patterns.


> mobile dev is in decline

This is news to me. Is this really true?


i don't have rigorous data at hand, but browsers and mobile hardware are now good enough that most apps are just fine being developed using cross-platform web tech rather than java or swift. plus, apps no longer attract and retain users as they once did (we're past the hype cycle). native mobile, as a result, has retreated to domains where the tighter integration has benefits, like games, health, IoT, etc. even apple is loosening its iron grip over its app store as the tradeoff between integration and flexibility no longer makes sense for most apps. apple now focuses more on content and on integrating more functionality into iOS itself because of that.


Github is built with Ruby on Rails and they are heavily investing in it, too. See this recent blog post [1] for more details.

[1] https://github.blog/2023-04-06-building-github-with-ruby-and...


Github employees also send a lot of commits to Ruby and Rails. But probably not as many as Shopify.

There might be one or two additional big companies that are similarly funding employees to contribute to core infrastructure in significant ways.

But not too many more than a handful, I agree. And they are carrying a lot of weight in the Ruby ecosystem for sure.

I feel like this era of open source in general is one of very shifting patterns of contribution for sustainability. One or a few big companies paying people to keep the thing alive is definitely one that seems to be increasing.

Perhaps since the company(ies) in question didn't originate the product(s) in this case, it doesn't feel like they "own" them exactly (not like "open source" products originated and developed by only one company where the product is their business itself -- not sure which category O'Caml fits in), but the risk in depending on only one or two companies (where the product is _not_ their actual business itself) is that if the company decides it no longer wants to make the investment, it can definitely be disastrous for the product. Shopify just did a bunch of layoffs -- I don't think they hit the people contributing to ruby too hard, but they easily could have, except perhaps Shopify too realizes that if they stopped keeping Ruby alive it would be disastrous for their own business.


This is wholly irrelevant, but I love the spelling of O'Caml like an Irish last name. I'd definitely frequent a pub called O'Caml's.


OCaml was originally a contraction of Objective Caml, and spelled O'Caml. At some point this changed and the spelling without the apostrophe was adopted. What comes out of my fingers hasn't caught up, though.


Given the early logo for the language was Joe Camel, it would need a smoking room ...


This couldn't be further from the truth. There are so many people using Ruby, so many modern companies deciding to use it and so many large companies that continue to use it.

The demand for Ruby developers is higher than the supply.


Your parent is talking about investing in Ruby the language and its ecosystem. Not just using it.

The new GC and JIT work, along with a few Ruby core committers, all come from Shopify employees.


stripe seemed to be doing quite a few interesting things (e.g. Sorbet, and there was some work adopting TUF for rubygems IIRC) but it seems to have dialled down things a bit.


People forget Ruby is also massively popular in Japan to this day


>People forget Ruby is also massively popular in Japan to this day

So are fax machines.


This is spoken like someone who knows nothing about Ruby.


Github is doing some as well. Although with resources from Microsoft I do wish they did more.


Github: "Am I a joke to you?"


You're going to incur the wrath for saying the silent part, but you're not wrong. People have the same 5 rails example corps every time someone says one of these two things:

- without rails nobody would use ruby

- without X corp, ruby's dead

Fact is, we're seeing less and less usage, and more and more distillation of the current userbase along the golden paths laid out by DHH.

Is it wrong for rails and thus ruby adoption to slow down? Not at all, people should use what they like, however I think Rails and Ruby are in this negative spiral where:

- rails is mostly needed for prototyping and crud apps

- this work is typically done by juniors

- rails devs are at this point largely seniors, not juniors

- rails devs pay the bills with other tech or by maintaining legacy rails apps.

There are startlingly few deviations from this, and either everyone majors in rails with a minor in javascript and C++ or they just get happy with their current gig and settle. I wish it weren't the case, but Ruby just hasn't done enough to differentiate itself from Rails, and when compared to neo-PHP or JS there's just not a lot of attractive parts of the golden path Rails provides. It's off-tune for this generation of choice and ubiquity.

We don't need that level of scaffolding anymore and in many cases there are other tools that handle that with more versatility.


Looking for a job right now, not limited to any specific language, and I see Ruby mentioned plenty of times. Based on that, it seems the reports of Ruby's decline are not as bad as sometimes claimed.

If you look at PostgreSQL then a lot of dev work comes from EnterpriseDB and a handful of companies too.

The thing with Rails is that it doesn't scale terribly well to "Twitter scale" (if I'm not mistaken Twitter has dropped all usage of Rails) so there aren't that many well-known companies running it, but the overwhelming majority of companies are not "Twitter scale" and it's not really an issue for them. There's a long list of smaller outfits that are not in the "top 50" using Rails quite happily.

People focus too much on "What is {Twitter,Facebook,Google,Amazon,Netflix,...} doing?" Who cares? You're never going to have the same problems they have. And whatever they are doing is not necessarily representative for the entire industry.


> The thing with Rails is that it doesn't scale terribly well to "Twitter scale"

It's true that Twitter switched to JVM langs, but it's not true that Ruby doesn't scale (or couldn't have to Twitter's level if they'd kept it). Twitter was early days for Ruby and things have improved a lot, but the only scaling challenge with Ruby is the cost of app instances. I use Elixir/Phoenix now and run 1/4 of the app instances I used to and with much less memory required per instance. (in one app it's 1/10 the ruby instances!) It's traditionally opex cost that hurt Ruby scalability, not technical, and very few companies will ever see the level of success where the cost of servers gets prohibitively high (compared to dev cost).


Isn't "it's comparatively slow" what people usually mean with "doesn't scale"? You can scale anything with enough hardware, but as you mentioned at some point it just becomes very expensive.


Twitter's architecture at the time was a textbook example of how not to build a large scale many-to-many social network. Maybe switching would've been worth it for them anyway, but the big thing they needed was fixing architectural choices they never should've made to start with.


> Isn't "it's comparatively slow" what people usually mean with "doesn't scale"?

If that's the case, then they are misguided.

> You can scale anything with enough hardware

No, some architectures or implementations can give you diminishing returns or a hard cap. Not everything can scale horizontally ad infinitum.


> Looking for a job right now, not limited to any specific language, and I see Ruby mentioned plenty of times.

This furthers the point. The fact that a lot of companies are looking for Ruby (something that doesn't match my experience when looking) is not a testament that it is hot and in-demand; it is a testament that those roles are not being filled. Senior devs don't make senior dev money doing junior dev work. My assertion is that the majority of Rails is CRUD development that only gets difficult when you step off the golden path; ergo, those positions go unfilled and outnumber their statistical representation in what would be called 'production Rails applications'.


What does "hot and in-demand" even mean, exactly? All I'm saying is that based on my (admittedly limited and vague) dataset there seem to be plenty of companies happily running on Ruby (some with Rails, some without) and that "FAANG-type companies we all heard of aren't using it that much any more" doesn't actually mean all that much.

I'm not really sure what your point about senior/junior devs or "roles are not being filled" is.

(aside: please don't delete your post and post exactly the same identical post again to clear the downvotes on it).


twitter was never a good fit for rails anyway. twitter is basically a human pubsub at scale. Rails' bread and butter is simple crud based apps with nonlinear use patterns.


> - rails is mostly needed for prototyping and crud apps

> - this work is typically done by juniors

> - rails devs are at this point largely seniors, not juniors

> - rails devs pay the bills with other tech or by maintaining legacy rails apps.

- and those prototyping apps have very often gone through multiple teams, some or all of them probably outsourced. Potentially true of any codebase, but it's true of an exceptionally high proportion of Rails codebases.

I'm at the point where I'd want a stupid premium to come in on an existing Rails codebase, and I'd want a day or two with it before saying "yes" even at that. They're great if they've been maintained by professional, expert teams their whole life, but god-awful messes remarkably resistant to analysis, otherwise.

I like Ruby a lot but most of the jobs are in Rails, and after initial infatuation followed by repeated exposure over 15ish years, I've come around to pretty much hating Rails. Too much implicit magic, too much memorization, too opaque to tools that might help overcome those first two problems.


> Too much implicit magic, too much memorization, too opaque to tools that might help overcome those first two problems.

If you are a senior level Rails developer none of this is true except maybe memorization and that seems to be a pre-requisite for any senior engineer in any language. It's trivial to debug rails applications with debugger and reading the underlying source code. Everything you need to solve problems is a binding.pry or `bundle open` away.
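For anyone who hasn't used that workflow: besides `binding.pry` and `bundle open`, plain Ruby reflection can answer "where does this method come from?" via `Method#owner` and `Method#source_location`. A small sketch, where the `Auditable` module and `Order` class are hypothetical stand-ins for a gem that defines methods dynamically:

```ruby
# Hypothetical mixin that generates methods at load time, the way many gems do.
module Auditable
  %w[created updated].each do |event|
    define_method("#{event}_by") { "system" }
  end
end

class Order
  include Auditable
end

m = Order.new.method(:created_by)

# owner: the module or class that actually defines the method.
puts m.owner

# source_location: [file, line] of the definition, or nil for methods
# implemented in C.
file, line = m.source_location
puts "#{file}:#{line}"
```

In a Rails console the same two calls work on any object, which is often quicker than grepping through gems.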


If you're having to poke around in running code to find out WTF some symbol even is and where it comes from, and that's not a very uncommon thing to have to do, that's unacceptable IMO.

> that seems to be a pre-requisite for any senior engineer in any language

In many languages and language-ecosystems, there's little point to memorizing e.g. method names and signatures that you're not using so often that memorization happens naturally, because your tools can remind you, fairly seamlessly, when you need to know. A lot less memorization goes a lot farther in those worlds, than it does in Rails, and the pain of encountering something one is not familiar with is near-zero. Coming back to them after a year or two (or five) away's not a big deal. The brain-space required for Rails is unusually large, and the rate of rot in Rails skill is high. Ramp-up time in an unfamiliar Rails codebase is rough, and requires assistance from those already "read in" to avoid a bunch of wasted time tracking down which gem provides such-and-such dynamically-named object or method or what-have-you. "Which library is this even from?" is not a question that ever reaches the level of conscious thought, in many other languages & frameworks.

Getting up-to-speed on an unfamiliar Rails codebase is full of little side-quests that simply aren't needed elsewhere, and you have to hold a lot more in your head to remain productive in it, than other systems require. This is obviously not impossible, but... oof, why?

All that written out... there's a chance I'd still pick it for a new, solo project, depending on the task. It's fine as long as you are very-familiar with the entire codebase, and some of its gems are major time-savers. I get why companies, and especially move-fast prototyping startups, end up with it, I'm just very done onboarding to existing Rails codebases, personally, without some serious pain & suffering compensation.


> If you're having to poke around in running code to find out WTF some symbol even is and where it comes from

I see people complain about this but I don't understand why. First of all I've seen highly competent engineers complain about methods in Rails that exist from basic inheritance in Ruby. A concept that they probably learned when they were 10 years old. This is how object-oriented code is written. Pretty much every game is written the same way. Yes the dynamic methods that are generated can be annoying, but that's not what I see people complain about.

Second, they're trying to code a language like Ruby in a text editor and complain. If you tried to write Java or Scala in a text editor you would also have a bad time. So, yea I don't get it to be honest.


Are you suggesting php and JS are stealing from what would've otherwise been ruby projects? I've been a ruby dev for a decade and I've never heard of a ruby shop migrating to using php for new work and maybe one or two moving to JS for new work. It's golang and elixir that are taking the place of ruby for new work in my experience. I suspect some python too, but I haven't seen that.


I think they're suggesting that neo-PHP and JS adopted the best parts and patterns of Rails (eg. Laravel).


I've spent the majority of my time on Ruby since 2005, and only about 2 of those years have involved any Rails at all. There are plenty of opportunities to use Ruby in other contexts, but it's often less flashy. Most of my Ruby use has been e.g. in devops behind the scenes where the job description might not list a language.


I mentally put Ruby / Rails and Python with Flask / Django (and also “neo” PHP) in the same category vs all of the 70 million JS frameworks. When you talk about alternatives, what are the major ones you have in mind?


Admittedly I haven't written any ruby code in a very long time, but you only briefly touched on the reason we ran away from it after giving it a serious try:

> rails devs pay the bills with other tech or by maintaining legacy rails apps

That "maintaining legacy rails apps" job just doesn't exist with our PHP apps. Once properly tested and deployed PHP will generally work perfectly and smoothly for years with zero maintenance unless someone finds a bug that was missed in testing. You can pretty much setup a cron job to apply security patches and that's it. Maintenance done.

The only downtime I can recall was when our datacenter installed buggy firmware on their storage array and brought down every virtual machine we had with them... which was a pretty rare and unusual event (by the time they fixed it, we had already moved everything to another datacenter... and it seems we weren't alone because they went bankrupt).

With ruby that wasn't our experience at all. We found our production code would regularly just stop working and resources had to be pulled off other tasks with zero notice to figure out why/fix the issue. It was a productivity nightmare. Maybe that's improved now, but from reading the Shopify and GitHub blog posts they both seem to be taking on mammoth amounts of work to ensure their systems are reliable.

At the companies I work for (more than one) we expect anyone qualified to do work like that to be dedicating all of their time to other things. All of the apps we built in ruby were rewritten from scratch in PHP and we haven't had any regrets.

I've never really liked PHP and I actively hate JavaScript, but I've come to accept those two languages are just more practical than anything else. I'm definitely keeping an eye out for that to change though - Swift is looking promising for example.


The story you're telling sounds like a you guys problem, not a Ruby or Rails problem. I've been building and running Rails apps since 2007 and have not experienced what you've described.

My current app runs on AWS ECS. Upgrades are largely just updating my Dockerfile or merging a pull request from Dependabot. We have pagerduty and it only goes off when a 3rd party API is down, usually resolving itself.


Sounds like you know how to build PHP apps and didn't know how to build Ruby apps. That's fine, and a valid reason for you to pick what worked for you. I've had plenty of Ruby apps just run for years without additional work.


while rails is great, its comparative advantage has waned a lot over the years since it first came out. Its biggest competitors were spring and zend framework (also hunchentoot if you're THAT kind of guy). compared to those, ruby was a breath of fresh air.

Nowadays it has to compete with nodejs, phoenix framework, django and laravel. all of which are within 80% of developer productivity while being vastly more performant. I use phoenix framework myself and while I wish there were more packages available, developer productivity is good enough and we can get away with far less machines to do the job.


It's been really great watching Ruby over the last few years. I had the privilege of having dinner with Matz several years ago near the beginning of Ruby 3.0 work. I was giving a talk about Elixir at a Ruby conference, and he was interested in things I (as a self-professed fanboy of both languages) liked better about Elixir than Ruby. We talked for quite some time, and it became very clear to me that he knew that Ruby was stagnating and wanted to make some changes to keep it relevant (without wrecking the language). He was incredibly open-minded (and nice!) and was willing to listen to anyone who had things to say.

Flash forward several years, and the amount of changes in Ruby are huge! Ruby is as fit as ever for modern development, and I'm really happy about that.


Pattern matching alone is a huge feature and an absolute delight to use when the opportunity presents itself.
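For readers who haven't tried it yet, a minimal sketch of `case`/`in` on a hash, assuming Ruby 3.0 or newer. The `describe` method and the response shape are made up for illustration:

```ruby
# Hypothetical helper: classify a response hash using pattern matching.
def describe(response)
  case response
  in { status: 200, body: String => body }
    # Matches only when :status is 200 and :body is a String; binds body.
    "ok: #{body}"
  in { status: 400..599 => code }
    # Ranges work as patterns; the matched value is bound to code.
    "error #{code}"
  else
    "unknown"
  end
end

puts describe({ status: 200, body: "hi" })
puts describe({ status: 404 })
puts describe({ status: 302 })
```

Hash patterns match on a subset of keys, so extra keys in the response don't prevent a match.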

I’d love to see where Ractor goes but I worry it will remain niche, like with Refinements.


Agreed. Pattern matching and pipe operator were the two that I hoped for most.

The pipe operator implementation proposal that the team came up with was wrong though and I'm glad they didn't do it. We don't need alternate syntax for `.`, we need ability to chain arguments together in a syntactically pleasant way in a functional style so we don't have to write `first(second(third(arg)))` we can write `arg |> third() |> second() |> first` which is much cleaner and reads left to right like it should


Ruby-idiomatic pipe operator is Object#then. After a lot of design proposals and discussions it is more or less evident no solution other than method would integrate naturally with the rest of the code. So it is just `arg.then{ third(_1)}.then{ second(_1)}.then{ first(_1)}`

Would've been a bit more concise with method references, almost introduced in 2.7, but alas.

(But, well, people tend to want "something that looks like operator, preferably something that looks exactly like |>" and reject every other possibility)
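A runnable version of that chain, as a sketch assuming Ruby 3.0+ (endless methods and the `_1` numbered parameter). The three helper methods are hypothetical stand-ins for `third`, `second` and `first`:

```ruby
# Hypothetical stand-ins for third / second / first from the discussion.
def trim(x)    = x.strip
def shout(x)   = x.upcase
def exclaim(x) = "#{x}!"

arg = "  hello "

# Nested calls read inside-out.
nested = exclaim(shout(trim(arg)))

# Object#then passes the receiver to the block and returns the block's
# result, so the same steps read left to right.
piped = arg.then { trim(_1) }.then { shout(_1) }.then { exclaim(_1) }

puts nested
puts piped
```

Both expressions produce the same string; only the reading order differs.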


This would be more elegant if there was a better way to do `&object.method(:method_name, …)`. Unbound method support has been around for a long time but converting a method call with arguments to a proc is still not simple to do… unless you start currying methods and writing obscure code.


Yes, that's what I referred to. Before 2.7, object&.:method was almost merged (or rather merged and reverted) because Matz had second thoughts about its ugliness... Which is not completely untrue, but not having a concise way for referring to a method is irritating.

That still wouldn't have solved passing additional args, so maybe { object.method_name(_1, args)} is the next best thing. Though it feels non-atomic due to the wrapping block.
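To illustrate the two options being compared, here is a small sketch with a hypothetical two-argument method `scale`: the wrapping block versus currying a `Method` object and passing it with `&`.

```ruby
# Hypothetical two-argument method used for illustration.
def scale(factor, value)
  factor * value
end

# Option 1: a wrapping block that supplies the extra argument.
with_block = [1, 2, 3].map { scale(10, _1) }

# Option 2: Method#curry returns a curried lambda; applying the first
# argument leaves a one-argument lambda that & can turn into a block.
with_curry = [1, 2, 3].map(&method(:scale).curry[10])

p with_block
p with_curry
```

Both give the same result. The curried form avoids the block but arguably reads as the "obscure code" mentioned above.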


This already exists to some extent, with the >> operator and lambdas.

```
first = ->(x){some code...}
second = ->(x){some code...}
third = ->(x){some code...}

(third >> second >> first).call(arg)
```

I agree it's not as clean as what you propose, but much better imo than traditional nested calls (and Haskell's `.`).
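A filled-in version of the same idea, as a sketch assuming Ruby 2.6+ where `Proc#>>` and `Method#>>` are available. The lambda bodies are arbitrary examples:

```ruby
# Arbitrary example steps; the names mirror the snippet above.
third  = ->(x) { x.strip }
second = ->(x) { x.upcase }
first  = ->(x) { "#{x}!" }

# f >> g builds a new callable that runs f first, then g on its result.
PIPELINE = third >> second >> first
puts PIPELINE.call("  hello ")

# Method objects compose the same way, here Kernel#Integer then a lambda.
PARSE_AND_DOUBLE = method(:Integer) >> ->(n) { n * 2 }
puts PARSE_AND_DOUBLE.call("21")
```

There is also `<<`, which composes in the opposite order, closer to Haskell's `.`.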


I’ve had a lot of fun with Ruby’s support for functional paradigms but that stuff is unlikely to get through code review when the status quo tends towards ‘Clean Code’ style OOP over-abstraction.

Ruby being a type 2 lisp is a fun one - creating a class and a factory function with the same name, with argument forwarding:

    class Animal; …; end
    def Animal(...); Animal.new(...); end
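Spelled out as something you can paste into irb, assuming Ruby 2.7+ for `...` argument forwarding. The `name` attribute is only there so the example has something to show:

```ruby
class Animal
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Methods and constants live in separate namespaces, so a method can share
# the class's name. This is the same trick as Kernel#Integer or Kernel#Array.
def Animal(...)
  Animal.new(...)
end

rex = Animal("Rex")   # parentheses make Ruby treat this as a method call
puts rex.class
puts rex.name
```

Without parentheses or arguments, `Animal` still resolves to the constant, so the class stays usable as normal.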


Refinements would be more useful if you could expose the refinements, but currently you can't.

    module HashExts
      refine Hash do
        def symbolize_values = transform_values { _1.to_sym }
      end
    end
      
    module Test
      using HashExts
      
      def self.new_h = Hash.new
    end
      
    puts Test.new_h.symbolize_values
    # => undefined method `symbolize_values' for {}:Hash (NoMethodError)
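For contrast, a sketch of where the refinement is visible: only in code that is lexically inside the scope that called `using`. This assumes Ruby 3.0+, and the `Scoped` module and `symbolized` helper are hypothetical names:

```ruby
module HashExts
  refine Hash do
    def symbolize_values = transform_values { _1.to_sym }
  end
end

module Scoped
  using HashExts

  # This call site is lexically inside the `using` scope, so it works.
  def self.symbolized(hash) = hash.symbolize_values
end

p Scoped.symbolized({ "a" => "b" })

# Outside that scope the method does not exist on Hash.
begin
  { "a" => "b" }.symbolize_values
rescue NoMethodError => e
  puts "outside the scope: #{e.class}"
end
```

So a library can use its own refinements internally, but callers who want the refined methods still have to write `using` themselves.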


Yeah, I feel like there was a bit of an expectation mismatch around that. `using` makes a lot more semantic sense than `include` or `extend` in a lot of cases but it didn’t play out that way and we’re still living with the unusual convention of making ‘x-able’s and writing ‘concerns’. Not to mention that they were file-scoped and not lexical in their first version so had limited utility for library devs.

As far as I know the teething issues around refinements are ironed out but they remain an obscurity.


I use ruby/rails for professional consulting work. It helps get startups off the ground and into a MVP stage quite quickly, and there is high productivity.

You can also use whatever React/Vue on the frontend if you really need to.

I would like to use some Typescript framework but none are "there yet" with regard to productivity.

Scaling rails in some scenarios is quite challenging, but in most cases you can leverage caching to solve performance challenges.


with the improvements to hotwire you might not even need to get to a heavy frontend like react


Nest.Js seems to be promising, but like you said, it has a long way to go.


Tangential:

> Over the years, processors and C compilers have gotten much better using a couple of techniques. These include pipelining, inlining functions, and branch prediction. Unfortunately, the parsers generated by most parser generators make it difficult for any of these techniques to apply. Most generated parsers operate with a combination of jump tables and gotos, rendering some of the more advanced optimization techniques impotent. Because of this, generated parsers have a maximum performance cliff that is extremely difficult to overcome without significant effort.

Although generating parsers, and finite automata in general, using a table-based approach is common, it has long been recognized that using tables/data for this purpose (as opposed to generating executable source code directly) is not a good idea, precisely because it inhibits compiler optimizations. I think the current situation is simply a consequence of the fact that parsing abruptly stopped being sexy multiple decades ago.
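To show the structural difference only, here is a toy sketch of one tiny automaton (accepting non-empty strings of digits) written both ways. It is in Ruby for readability, so it says nothing about the C compiler optimizations themselves, and all names are made up:

```ruby
# Table-driven: transitions are data that is looked up at run time.
TABLE = {
  start: { digit: :int },
  int:   { digit: :int }
}.freeze
ACCEPTING = [:int].freeze

def classify(ch) = ch.match?(/\d/) ? :digit : :other

def table_accepts?(input)
  state = :start
  input.each_char do |ch|
    state = TABLE.dig(state, classify(ch))
    return false if state.nil?   # no transition: reject
  end
  ACCEPTING.include?(state)
end

# Direct-coded: the same automaton as ordinary control flow, which is the
# shape a compiler can inline and predict branches for.
def direct_accepts?(input)
  return false if input.empty?

  input.each_char.all? { |ch| ch.match?(/\d/) }
end

%w[123 12a].each do |s|
  puts "#{s}: table=#{table_accepts?(s)} direct=#{direct_accepts?(s)}"
end
```

In the table version the control flow is hidden inside a lookup, while in the direct version every transition is visible code.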

Much better parser generators are possible, and LR, specifically, has much untapped potential left.


Will this change impact Solargraph / Rubocop? They are painfully slow / unusable in their current state on large-ish projects


Depends on what you mean by Solargraph, because it's kind of a large project. If you mean their typechecking, then definitely not, because it's not using a Ruby parser. If you mean the general feedback, then yeah potentially.

Rubocop yes if it ends up using YARP as a new backend.

Either way, I would suggest you check out ruby-lsp, which is definitely going to benefit from this speed, and soon.


The article mentions that they are building a compatibility layer around YARP that tools can use to transform its new tree format into the legacy Ripper format. They don't call out Rubocop specifically but I can't think of another OSS tool that so prominently uses the parser APIs.


rubocop uses rubocop-ast uses parser, so it will eventually make its way down to rubocop once we finish the compat layer for the parser gem.


I made a Ruby LSP because of this problem. It's not perfect, but in case it's helpful for you: it can parse a large project with all of its gems in a few minutes. That data is indexed in an in-memory db with Tantivy. https://github.com/pheen/fuzzy_ruby_server


It's not true "the best chance you have is reading the 14 thousand-line parse.y file and trying to understand it", there are several tools to navigate yacc/bnf style grammars like https://www.bottlecaps.de/convert/ and its companion https://www.bottlecaps.de/rr/ui that make it relatively easy to understand/document/debug/compare the grammar.

Just added several ruby grammars here https://github.com/mingodad/plgh/tree/main/ruby they are converted (mainly using https://www.bottlecaps.de/convert/) to an EBNF understood by https://bottlecaps.de/rr/ui to generate navigable railroad diagrams.

Copy and paste the EBNF on https://www.bottlecaps.de/rr/ui on the tab "Edit Grammar" then click on the tab "View Diagram" to see/download a navigable railroad diagram.


Interestingly, the actual syntax tree and related structures/functions are generated from the config.yml file and the templates inside the bin directory. They are using a custom template language written in Ruby, here's an example of how they do enum stringification: https://github.com/ruby/yarp/blob/main/bin/templates/src/tok.... Obviously this isn't a novel idea, but IMO this kind of design goes a long way to support their maintainability argument, especially in C.

Also,

> CRuby actually ships with 90 encodings (as of 3.3)

This is asinine.


The templating is definitely an effort to keep the maintainability in check. It's also because we're planning on keeping a grammar around to generate test cases for us with fuzzing/other algorithms.

The encodings are a bit historical; to my knowledge Ruby is the only major language that was developed by someone whose language did not work with ASCII. So the first encodings written in Ruby were the old Windows pages.


> Interestingly, the actual syntax tree and related structures/functions are generated from the config.yml file and the templates inside the bin directory.

That makes a lot of sense. ASTs tend to be a very dumb but fairly verbose data structure. And, in C in particular where everything is more verbose, there ends up being a ton of boilerplate in the AST nodes.

In a language like Ruby with a very rich syntax, you end up needing a ton of different AST nodes.

Generating those from a simple declarative format makes maintaining that much easier.


>> CRuby actually ships with 90 encodings (as of 3.3)

>This is asinine.

Overall there are even more: 103. But then again 1.9.2 only has 85 and 95 overall. Also one of those new 'overall' ones is the EBCDIC code page for US/Canada.


> a custom template language written in Ruby

This makes it sound like it's some template language specific to this project, but ERB is the dominant templating language for Ruby because an ERB implementation is in the standard library.


This is big. Ruby syntax errors are some of the most frustrating things to track down. So much worse than C-like languages like JavaScript.

I'm glad to see big steps towards this.


While on the subject, does anybody have a good reference on parsing error recovery?


If you find it, i am interested. But it seems to be a topic that is still open to research and niche to a small group of devs


I suspect that most of the knowledge on this topic is embedded in the source code of the most prominent compiler toolchains and in the heads of their devs. :(

I think there might be an interesting intersection here with ML, where one could learn the common mistake patterns made by real users and either error-correct better, or at least provide pinpoint-accurate error messages.


Not really no.


I think Ruby is a pretty good language and I spent many years using it professionally but I can't imagine having enough motivation to work on the language implementation when it's so difficult to even parse the code. The amount of engineering time spent on this topic is bonkers!


Maybe I'm a masochist, but to be honest it's a dream job. Working on such a complex language and history is really really fun.


GraalVM?

What’s the progress on Ruby/GraalVM that Shopify has heavily invested in?


GraalVM Ruby, more commonly known as TruffleRuby, is actively developed still. As the article states, it adopted YARP already.


My understanding is that Shopify still hasn’t been able to migrate to Graal/TruffleRuby, even after 10 years of its development.


Your phrasing could be misconstrued by those not familiar with the history. Shopify hasn't been developing TruffleRuby for 10 years. The entire TruffleRuby project is 10 years old, including the first 12 - 18 months where it was a humble intern project. TruffleRuby (previously called JRuby+Truffle) was initially a research project for testing out optimizations, not a production-quality implementation suitable for deployment. I think it's now a viable deployment target, but that wasn't true 10 years ago.

As you might imagine, there's a lot more to production deployments than simple Ruby compatibility. At Shopify, we've recently modified our CI system to support TruffleRuby and are actively running projects against TruffleRuby in CI now [1]. The TruffleRuby 23.0.0 release coming out in the next day or so includes fixes for quite a few compatibility issues we worked out while getting a large Rails app booting. That project is on the order of months, not decades.

YARP will make adoption of TruffleRuby easier. Absent a language specification, implementations like JRuby and TruffleRuby have to match what CRuby is shipping and that by necessity means they lag after CRuby releases. Parser changes are amongst the hardest things to port over. YARP eliminates most of the challenges there.

[1] -- https://railsatscale.com/2023-06-12-truffleruby-in-shopify-c...

Disclaimer: I'm on the TruffleRuby team at Shopify.


> getting a large Rails app booting

What does ‘booting’ exactly mean in the context of a web app (Rails)?


The Rails initialization process is commonly called "booting", probably owing to the presence of the config/boot.rb file. It's the loading and execution of code necessary before a request can be processed. Beyond the code that Rails executes, it generally includes the loading of most, if not all, of your application's dependencies and any initialization they may require.

In my experience, the Rails boot stage accounts for most of the compatibility work. If you can't boot the app, you won't be able to serve requests and likely can't run tests. E.g., it was at this step that we learned we needed to support more of the native extension API to get memcached running (†). Once booting we did run into some other compatibility issues, but the lion's share came up during the Rails boot process.

† - As an aside, the memcached gem hasn't seen a release since 2014. Its C code attempts to detect the Ruby version and alter how it sets things up. It predates the availability of TruffleRuby so there isn't any TruffleRuby-specific detection logic in it. Our extension compatibility made it look like we supported CRuby's internal object model and that was causing the extension to try to allocate objects in a CRuby-specific way. A small change to a macro fixed the problem and the rest of the gem ran fine. It's one of those things that we can't fix until we see, but once we do we can fix it permanently.

[1] -- https://github.com/oracle/truffleruby/pull/2871


When do you think Shopify will be running its Rails app in production on Truffle?

Also, any thoughts on the recent license change of GraalVM?


I can't commit to a timeline, but can say the work is progressing nicely. I plan to blog more about it as we have details to share.

The GraalVM commercial license change seems like a positive change to me. But, I'm not a lawyer and I'm not speaking on behalf of Shopify. To me, it looks to be in keeping with how the Oracle JVM is licensed and I think that simplifies a lot. The commercial version has some really nice improvements for the native binary and that's been mostly closed off to Rubyists.


Hack?

I wonder if Shopify will ever give up on trying to fix Ruby and instead, do what FB did with PHP and just create their own derivative (Ruby) language that addresses their needs better.


Why? Shopify was able to work with the core team and make the whole language better.

Facebook has far more employees and wasn’t going to get PHP improved to their liking.


The irony is that PHP has improved massively in perf vs the relative gains of Ruby.

https://onlinephp.io/benchmarks

https://serpapi.com/blog/benchmarking-ruby-3-1-yjit-ruby-2-7...


Apart from measuring very different things, the PHP page compares PHP releases over something like 25 years vs. about 3 years of Ruby releases.

Look at PHP 7.2 onwards, which is a similar timeframe, and the improvements are fairly modest.


What impressive work & effort! Well done.


Great read. I’ve never written a single line of code in Ruby but I love reading stuff related to compilers.


A lot of cool work! Congrats!


Well described. Also I found this on github.

https://github.com/whitequark/parser


That is one of the many parsers listed in the article.


Somewhat fitting xkcd: https://xkcd.com/927/


It’s not fitting at all.

> We recently got approval to merge this work into CRuby, and are very excited to share our work with the community.


While this xkcd is frequently germane, in this case it is not particularly apt. The core CRuby team, who maintain the current “standard”, are in agreement that YARP — the new “standard” — will one day replace it.


Even putting the word "standard" in quotes is buying too much into the argument.

This is a new implementation of an existing standard.


So Shopify spends millions of dollars in engineering salary to get a substandard performing language up to par with a dozen other contemporary better choices? The choice to stick so hard with Ruby just doesn't seem reasonable at all with their scale. It must be something dogmatic from the top.


Facebook did the same when they developed HipHop for PHP. I imagine the reasoning is the same in both cases: spend a few million dollars and a handful of engineers improving the language and tools, or spend hundreds of millions and years of time rewriting everything into a more performant language. I honestly think the math pencils out and makes sense.


People always say language choice doesn't matter, but how many companies get stuck with a language where they resort to rewriting the parser for it?


No one is "stuck" or "resorting" to anything. It's an effort to improve tooling, performance, error reporting, and working with Ruby in general. We could maintain the status quo (which does indeed work) or we could improve it. If we don't improve the status quo, people call the language stagnant and outdated. If we do, it's considered a criticism of its maturity.

How many languages would even entertain such a contribution? How many languages are so tightly bound to their parser that writing a new one wouldn't even be workable? How many languages have user tooling that works extensively with parsed code snippets? How many languages have multiple implementations? Of those, how many are working together on an effort to share a common parser?

There's plenty of valid reasons to use something other than Ruby. An open source project that grew organically over 25 years deciding that it's time to pay down some technical debt and improve the ecosystem at the same time is probably not one of them.


"Rails at scale" is a funny title. Shopify had to rebuild the core parser. Github has a department dedicated to working off the bleeding edge of Rails (and has core Ruby and Rails maintainers on staff, Shopify probably does too). Both Shopify and Github spent literal engineering years upgrading Rails, and still for some reason think it's worthy of boasting about in engineering blog posts, despite the burnout, not-shipping-features, and turnover they suffered because of it.

If your company has enough engineering overhead and capacity to stop delivering features and try to fix your core language for a year, then you're in a different ballpark from most other tech companies. If you're NOT in that boat, then "Rails scaled for Github and Shopify" is not a message you should take home.


If you're not in that boat, odds are you don't need to care.

And they didn't stop delivering features for a year.



