Hacker News
"DCI" in Ruby is completely broken (tonyarcieri.com)
83 points by bascule on Jan 2, 2013 | 66 comments



I was curious what Tony's graph would look like using a SimpleDelegator as he suggested.

    (using ruby 1.9.3p194)
    Calculating -------------------------------------
             without dci     68507 i/100ms
                with dci     24409 i/100ms
          with delegator     46945 i/100ms
    -------------------------------------------------
             without dci  2240202.4 (±3.2%) i/s -   11235148 in   5.020463s
                with dci   412445.3 (±3.7%) i/s -    2074765 in   5.037113s
          with delegator  1018928.4 (±1.6%) i/s -    5117005 in   5.023310s

RUBY:

    require 'rubygems'
    require 'benchmark/ips'
    require 'delegate'

    class ExampleClass
      def foo; 42; end
    end

    module ExampleMixin
      def foo; 43; end
    end

    class ExampleProvisioner < SimpleDelegator
      def foo; 44; end
    end

    Benchmark.ips do |bm|
      bm.report("without dci") { ExampleClass.new.foo }
      bm.report("with dci") do
        obj = ExampleClass.new
        obj.extend(ExampleMixin)
        obj.foo
      end
      bm.report("with delegator") do
        ExampleProvisioner.new(ExampleClass.new).foo
      end
    end


Which is pretty much what you'd expect... making two objects instead of one cuts the speed approximately in half.


I took a stab at separating out the method dispatch costs from the cost of the #extend call here https://gist.github.com/4436640. I also tried making a call against an unmodified instance of a completely different class to demonstrate that the global method cache is indeed flushed in MRI by doing a runtime #extend on any object instance.

Subtracting the results of different runs like this is pretty questionable, though (the errors accumulate), so the above is to be taken with a pinch of salt.


The example in the OP was somewhat misleading in that he was extending the object within its initialize method. I think you'd more often want to do a per-object extend dynamically at runtime to simulate something like Smalltalk's #become, where the object would be modifying its capabilities and 'type' based on some external input conditions.
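A minimal sketch of that pattern (the Subscriber/Premium names are made up for illustration):

```ruby
# An object changes its behavior ("type") at runtime based on external
# conditions, roughly in the spirit of Smalltalk's #become.
module Premium
  def rate
    0.5
  end
end

class Subscriber
  def rate
    1.0
  end

  def upgrade!
    # per-object extend: only this instance gains Premium's methods
    extend(Premium)
  end
end

s = Subscriber.new
s.rate               # => 1.0
s.upgrade!
s.rate               # => 0.5, Premium#rate now shadows Subscriber#rate
Subscriber.new.rate  # => 1.0, other instances are unaffected
```

The extend inserts Premium into just that one object's singleton class, which is exactly the operation that flushes MRI's global method cache.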


I've had similar experiences profiling Ruby code, replacing dynamic constructions with static ones, and seeing orders of magnitude bumps in performance.

One thing I've learned never to do is to "customize" short-lived objects at runtime with new methods. If you create an object in response to user input, it should be born with all the methods it's going to need.
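In Ruby terms, the difference looks roughly like this (class and module names are hypothetical). The first form does a per-object extend in the constructor, paying the method-cache-flush cost on every instantiation; the second mixes the module in once, at class-definition time:

```ruby
module Auditing
  def audit_trail
    []
  end
end

# "Customized" at runtime: every .new does a per-object extend,
# which invalidates MRI's global method cache.
class DynamicOrder
  def initialize
    extend(Auditing)
  end
end

# Born with all its methods: the include happens once, when the
# class body is evaluated.
class StaticOrder
  include Auditing
end

DynamicOrder.new.audit_trail  # works, but busts the cache each time
StaticOrder.new.audit_trail   # same behavior, no runtime modification
```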


You know, I've been thinking about this kind of thing lately. All of this mixin, traits, and monkey-patching hoopla is completely nonexistent in functional programming languages. To add a new method to an object you just write a function that takes that piece of data.

Imagine a language where a call looked like this:

    user.doSomething(1, 2, 3)
But it's just syntax sugar for this function application:

    doSomething(user, 1, 2, 3)
("user" would be just the data, like a struct type thing)

That way you get the flexibility of loose functions but you get the (imo) superior aesthetics of oop languages.


In my language Magpie, function calls can have which ever appearance makes the most sense for the operation:

    prefix(arg)         // for function-like things
    arg postfix         // for "getters"
    arg infix(otherarg) // for method-like things
It's purely a syntactic distinction. Semantically, methods are always multimethods and are not tied to any receiver class. You can freely define new methods that dispatch on existing classes.

I find it works really well, though it brings in some unanticipated complexity. You'll really want multimethods or some form of argument-based polymorphism. Also, scoping gets a bit more confusing if you want "overriding" to work like you expect across modules.

See: http://magpie-lang.org/multimethods.html


Magpie is so lovely, I want to give it flowers. Do you think it'll ever reach something near production quality, or will it "always" be a "toy"?


Note that the two examples are only the same if all subroutines are global.

The difference is that (usually) methods are looked up where the classes are defined, but subroutines are looked up where they are called. So visibility is different in the case of method invocation.


That's pretty much exactly how it's worked in Perl for a very long time. It's very useful but it can bring in some ugliness if you don't have a strong typing system.


That's exactly how perl5 objects work:

  use v5.14;   # for "say" and the package BLOCK syntax

  package Foo {
    sub new { 
      my ($class, $bar) = @_; 
      bless {bar => $bar}, $class
    }

    sub bar { 
      my ($self, $x) = @_; 
      say $self->{bar} . $x
    }
  }

  my $foo = Foo->new( 'Hello...' );

  # method call
  $foo->bar( 'baz' );       # => Hello...baz

  # function call
  Foo::bar( $foo, 'baz' );  # => Hello...baz


How is that not what Python already is:

    >>> class Foo(object):
    ...     def __init__(self):
    ...         self.x = 1
    ...         self.y = 2
    ...     def bar(self):
    ...         return self.x + self.y
    ... 
    >>> Foo().bar()
    3
    >>> f = Foo()
    >>> Foo.bar(f)
    3
    >>>


But you cannot do it the other way, eg:

    def foo(bar, baz):
        print bar.x + baz

    f = Foo()
    f.foo(3) # error


Claiming "f.foo" should work when "foo" is a freestanding function strikes me as nonsensical, as the "." operator is a namespace operator, suggesting names that are within the namespace of "Foo". Though PostgreSQL supports this pattern (http://www.postgresql.org/docs/9.2/static/xfunc-sql.html#XFU...). I think it's awkward there too, and it has the caveat that "it's unwise to give a function that takes a single composite-type argument the same name as any of the fields of that composite type" - basically it breaks namespacing.


In static languages this really isn't a big deal, since you can dispatch based on type. Things definitely get more awkward if you can do things like define new types at runtime, though.


D calls this UFCS ("uniform function call syntax"): http://www.drdobbs.com/cpp/uniform-function-call-syntax/2327...


That's exactly how these OOP languages get translated into C, so the difference would be having a method dispatch table per class or having a global system for multiple dispatch.


That's how it works in D.


In my experience the main benefit comes from aligning particular scenarios with contexts, the result being a single place to reason about the algorithm(s) behind each scenario.

I see very little reason for the dogma behind the way role injection is typically handled, yet everyone seems to be attached to doing this in Ruby. This is odd because there are DCI examples in almost every language; each one has its awkwardness when it comes to roles, but they all manage to get it done with some workaround. I don't see why Ruby should be any different; if it weren't possible to call #extend, we'd just do it another way.

Either way, I wouldn't dismiss DCI just because the canonical approach some academics cooked up is completely senseless in practice. In fact the author seems to agree in the end so the title is a bit exaggerated.


I'm late to the party, but I read all the comments and I think yours is the best.

DCI is a powerful design concept in any class-based language. When behavior gets complex, DCI is the cure for the "kingdom of nouns" that results from a beginner-level understanding of how to apply OOP. In short, DCI is a way to elevate business tasks to first-class citizens.

The singleton method business is purely an aesthetic concern. It definitely makes for more appealing code, but at the cost of an order of magnitude performance hit. All else being equal I love elegant code as much as the next rubyist, but this is simply too high a price to pay and I'm shocked that so many people find this acceptable.


I've been "thinking in DCI" for two years now.

At GameBoxed (http://gameboxed.com/) we've got a Rails app that uses a lot of .extend on every request. It's been in production for 1.5 years now.

It's a backend for multiple social games (one backend app for many CoffeeScript frontends).

It's not huge in terms of traffic and performance needs. We implemented it with a DCI-like architecture in mind. It was helpful at the beginning. DCI wasn't the only way to implement it, but it was the simplest way for us at that time.

We've had some performance problems, but they were never related to the usage of .extend. If it became a performance problem, we would probably consider removing the .extends (not a big deal), switching to Java, or doing more caching; there are so many options.

Again, we don't have huge performance needs; we see maybe 40 req/s at peak times.

DCI is much more than .extend, it's a huge shift in thinking about the OOP architecture.

If we (as a Ruby community) agree that it's an architecture worth trying, then maybe it makes sense to introduce better optimisation techniques to the current Ruby implementations?

Anyway, it's great to see the discussion happening. DCI may not be ready to use right now, but it's more of a change in thinking than in the implementation.

I definitely wouldn't call it completely broken in Ruby.

Here are some of my posts on DCI in Ruby:

http://andrzejonsoftware.blogspot.com/2011/02/dci-and-rails....

http://andrzejonsoftware.blogspot.com/2012/01/dci-and-rails-...

http://andrzejonsoftware.blogspot.com/2011/08/dci-patterns-h...


What does extend give you that delegation doesn't?


At a code level - no need to write the delegation methods/macros.

At the design level - flat hierarchy

At an abstract level - thinking in terms of objects, not classes.


> At a code level - no need to write the delegation methods/macros.

You don't have to. They're built into SimpleDelegator. Syntactically the difference is you'd write:

    class MyThing < SimpleDelegator
instead of:

    module MyThing
And do:

    MyThing.new(obj)
instead of:

    obj.extend(MyThing)
> At an abstract level - thinking in terms of objects, not classes.

It seems to me like you're trying to use a module to solve a problem which can be solved with an object. I don't think that's thinking in objects at all... it's thinking in modules.


With DCI do you ever need to extend/include multiple modules? Would this single-inheritance approach work?


As long as the interactions you're working with don't implement the same methods, it would be trivially easy:

  class Account; …; end
  class Withdrawer < SimpleDelegator; …; end
  class Depositor < SimpleDelegator; …; end

  acc = Depositor.new(Withdrawer.new(Account.new))
  acc.withdraw(40)
  acc.deposit(100)
At least that's how I understand it. I'm not convinced that DCI is worth it, but…I don't yet have a need for it in the application that I'm working with.
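Filling in the elided bodies, a runnable sketch of that stacking (the balance arithmetic is made up for illustration):

```ruby
require 'delegate'

class Account
  attr_accessor :balance
  def initialize(balance)
    @balance = balance
  end
end

class Withdrawer < SimpleDelegator
  def withdraw(amount)
    # `balance` and `balance=` fall through to the wrapped Account
    self.balance -= amount
  end
end

class Depositor < SimpleDelegator
  def deposit(amount)
    self.balance += amount
  end
end

acc = Depositor.new(Withdrawer.new(Account.new(100)))
acc.withdraw(40)   # handled by the inner Withdrawer via delegation
acc.deposit(100)   # handled directly by the outer Depositor
acc.balance        # => 160
```

Each unknown message falls through the delegator chain until something answers it, which is why the stacking order doesn't matter as long as the role methods don't collide.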


I'm not a Ruby programmer, so maybe I'm misunderstanding the article -- are we talking about changing an object's class hierarchy after it's already been instantiated and then complaining that it's the performance aspect that is pathological? That sounds like a completely insane way of writing software to me.


It isn't so odd in Ruby. Ruby has always been a duck typing land. If it quacks, it is a duck. There's even a method_missing handler so you can respond to any method call whatsoever. People can and do add methods directly to even built-in classes as well, the equivalent of modifying the class prototype in Javascript. So method calls are treated more like messages in Ruby, and saying an object will respond differently to a message after it has more data about what it is (a certain blog, for example) isn't so strange to Ruby programmers.


>It isn't so odd in Ruby. Ruby has always been a duck typing land. If it quacks, it is a duck. There's even a method missing handler so you can respond to any method call whatsoever.

What's the benefit of this kind of typing, though? I'm pretty much just a C hacker these days, so I'm pretty ignorant about Ruby, but does this apply to the object or to the class?

It seems simpler, especially on a conceptual level, to just initialize the duck with everything it could possibly need to be a duck, rather than adding things dynamically.


>It seems simpler, especially on a conceptual level, to just initialize the duck with everything it could possibly need to be a duck, rather than adding things dynamically.

It is simpler, on a conceptual level. On an implementation level however, it quickly becomes non-simple.

One of the fundamental modularity concepts is 'separation of concerns' - this isn't something you always need to do, but when a concern (a scoped set of functionality) grows large, it should be implemented separately, to keep the conceptual complexity of individual abstractions and implementations minimal.

If I implement all of the logic for handling conditions, importing data, extracting reports, managing permissions, serialization, and resource handling in the same object, it's a very complicated object. I have no way, without getting a full mental model of the thing in my head, to know that changing X about it won't break Y, or what parts of the code depend on the structure of the results of calling Z.

There are plenty of ways to skin that cat, and different ones are more appropriate in different places. Often it's correct to extract the logic into a generalizable mixin, which in Ruby would be a module, and then include it into the class. Sometimes it's better to extract the logic and the concept it represents into another class that has a relationship with the original one. And sometimes it's better to separate the logic and code into a 'context' as DCI describes - I generally prefer Decorators for this, but the Rails community seems to lean toward using modules here also, largely on the weight of DHH's opinion (app/concerns/).


>It seems simpler, especially on a conceptual level, to just initialize the duck with everything it could possibly need to be a duck, rather than adding things dynamically.

Yes, it is simpler conceptually. But, in Rails especially, where there is a tradition of moving all your application functionality into the "model" layer, you could end up with a single class that is thousands of lines of code long. For example, a User class with a load of code about resetting passwords that is only used in one relatively rare circumstance.

So instead, DCI says you should separate the password-reset stuff into a separate module and only add it in when needed. Both User and PasswordReset are simpler and easier to understand and only come together when needed.
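A minimal sketch of that separation (the User and PasswordReset classes here are toy stand-ins, not real Rails code):

```ruby
require 'securerandom'

class User
  attr_accessor :password
end

# All password-reset logic lives in its own role module...
module PasswordReset
  def reset_password!
    self.password = SecureRandom.hex(8)
  end
end

# ...and is only mixed in for the one request that needs it.
user = User.new
user.extend(PasswordReset)
user.reset_password!
user.password  # a fresh 16-character hex string
```

A freshly created User doesn't even respond to reset_password!, which is the point: the class stays small and the role only appears in the context that uses it.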

(As I said elsewhere, I prefer to wrap a decorator around the User to achieve the same thing).


You could do it on the object level or the class level.

Metaprogramming like this generally is most useful for powering "DSLs" and other ways of expressing not-quite-imperative-code in Ruby.


Ruby decouples the method that was requested (the "message" in Smalltalk OO parlance) from the one that is executed. This allows for a number of neat patterns, like forward invocation, where one object can specify a set of messages to be sent to another object (or method!), or delegation, where an object can opt to handle a subset of messages and pass all others on to an underlying object (or method!).

Delegation (especially with the extreme simplicity of SimpleDelegator) seems like the obvious solution here, at least to me, but some people prefer to use cache-busting mixins instead.
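The "handle a subset of messages, pass the rest on" style can also be done explicitly with the stdlib's Forwardable (the LoggedStack wrapper here is made up for illustration):

```ruby
require 'forwardable'

# A wrapper that intercepts one message itself and forwards a chosen
# subset of others to the underlying object.
class LoggedStack
  extend Forwardable

  # these messages are forwarded untouched to @stack
  def_delegators :@stack, :push, :pop

  attr_reader :calls

  def initialize(stack)
    @stack = stack
    @calls = 0
  end

  # intercepted message: count the call, then consult the wrapped object
  def size
    @calls += 1
    @stack.size
  end
end

stack = LoggedStack.new([])
stack.push(1)
stack.push(2)
stack.size   # => 2
stack.calls  # => 1
```

Unlike SimpleDelegator, nothing is forwarded implicitly here, so the wrapper's interface is exactly the set of messages you chose to expose.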


Side note, duck typing is not inherently slow. Some people assume since many duck-typed languages are slow that there is some causation going on.

Go is way faster than Ruby and extensively uses duck typing everywhere (but is also statically typed and not interpreted). My point is that duck typing is not the reason Ruby is so slow.


Interesting point. I never considered whether duck typing was possible in a statically-typed language. I've always used it in concert with interpreted languages. I imagine it could take a significant amount of pain away from systems programming (I'm looking at you, COM).


It's usually called structural typing in statically typed languages like Go or OCaml.


It depends.

Is your app CPU-bound? If so, then DCI probably isn't a great idea for you. In fact, building it in Ruby may not be the best idea.

But if your app is IO-bound (like most Rails apps), huge and it takes new developers weeks to get up to speed, then the gains from having a simple, modular code-base should save you developer-time (and hence salaries), which are much more expensive than CPUs.

Having said that, I tend to use SimpleDelegator instead of dynamically injecting stuff into objects.

EDIT: added 'ruby not being the best idea if CPU bound'


This DCI stuff sounds like a quick way to make an app CPU-bound, if it wasn't already.


Ruby may be slow but my apps still spend more time waiting on complex database queries than anything else (unless I'm using the Jbuilder gem to build JSON outputs, in which case I might as well make a cup of tea waiting for a call to return)


This also has a precedent in Lisp, see change-class: http://clhs.lisp.se/Body/f_chg_cl.htm

There's even a protocol for notifying instances that their class has changed so they can make any necessary adjustments: http://clhs.lisp.se/Body/07_bb.htm


Yes, that is what the author is talking about. I'm not sure what his point is though. This fact has been known for years and if you care about performance, you simply don't modify the class hierarchy.


It doesn't seem to be widely acknowledged amongst people promoting this particular style of DCI as a good way to write Rails app, so I'd say the point is valid, even if the title is overstated. There are also quite a few other very popular gems that make heavy use of #extend at runtime for non-DCI purposes, including Haml and Paperclip, so I don't think it's fair to say that this is common knowledge (at least in the Ruby community).


That's something that happens a lot in the JS/nodejs world too. It has the potential to be disastrous if you're not rigorous, that's for sure. But it still enables very concise and maintainable code.


In practice, we've been using DCI (with object.extend) in a production Rails app and have seen no worse performance degradation than in other apps with comparable complexity (at least subjectively; of course it's very difficult to measure objectively). Typically plenty of production apps suffer more from simple things like missing an index on a db column, N+1 queries, iterating over objects in Ruby, or poorly-thought-out design. DCI in Rails is a pretty good trade-off for when you need to use it.


The title of the post is unnecessarily inflammatory. It should be "Using extend to do DCI in Ruby is completely broken." The author even admits this in the article, and points out other ways of doing DCI in Ruby.


What titles are NOT unnecessarily inflammatory nowadays?


Erm what? Changing the class hierarchy invalidates all method caches? That sounds a bit brutal, and totally unnecessary.

Can't you just have a method cache per metaclass (or class; I'm not very familiar with how Ruby factors this stuff)? Since mixing a role into an instance creates a new metaclass anyway, there's no need to affect instances of the original class.


It's because MRI's implementation is simple. They haven't bothered caching per class because it takes more code. However JRuby caches per class.


If you have to go and look up the method again (even if cached) because the callsite cache is invalid then you already lost. Constantly invalidating the callsites is the real problem with this pattern.


DCI is a paradigm related to design and architecture. Computers are so fast today that does this benchmarking business really matter?

If DCI helps to keep your code maintainable, readable, and less buggy, is that a better tradeoff? I would think that 95-99% of the time, the answer is yes.

On a side note, if speed is mission critical, then would ruby really be the language of choice to use?


"Computers are so fast today that does this benchmarking business really matter?"

Famous last words.

Computers are much faster than they were years ago, but look at how much crap they have to deal with... So much crap in fact that we barely notice they are indeed much faster. So much crap that hardware can barely keep up.

Performance is never mission critical until it becomes mission critical, and that usually happens all of a sudden. It's that moment a few days into production when the system becomes unbearably slow and the rest of the company is breathing down your neck asking "WTF is going on?! Fix it, NOW!".

It's problematic enough when you are just creating too many objects and making processor caches irrelevant, but if your performance issues are baked directly into your coding style... good luck with that.


We're not talking micro-optimization to squeeze an extra 0.5% performance out of a system. We're talking about an order of magnitude (or more) slowdown, in a language already not exactly known for its blazing speed.

My question of the day: why isn't more time spent on building abstractions that aren't just convenient but that are also fast? Why does readable, maintainable code have to mean code so slow it effectively turns Core i7s into Pentium 90s?

I don't think it has to mean that at all, but that's exactly the choice a lot of programmers seem to think they're making.


There's more than one way to skin a cat. Besides mixins, you don't even have to use SimpleDelegator to do delegation in Ruby. For example, here is a demonstration of another approach to DCI in Ruby (http://github.com/rubyworks/dci).


Ruby is a language, not an implementation. DCI in Ruby seems just fine, but the current implementations are not friendly to the dynamic mixin hell hole.


Completely agreed, the title is a red herring and serves to dissuade people from giving DCI a try. "It's too slow" is not an argument against DCI, it's an argument for fixing a slow implementation of #extend or using a different method.


Why not do DCI with delegation instead of mixins?


Because in general, delegation tends to lead to object schizophrenia. Specifically, SimpleDelegator will leak the original delegatee object when a delegated method returns `self`.

     s = "foo"
     
     d = SimpleDelegator.new s

     s2 = d << "bar"

     s2.object_id == s.object_id # => true
I think SimpleDelegator is really neat but totally dangerous. Doesn't this ring alarm bells with you?

  # delegate.rb L117
  #
  # Returns true if two objects are considered of equal value.
  #
  def ==(obj)
    return true if obj.equal?(self)
    self.__getobj__ == obj
  end


While this is true, it currently has pathological performance on all major Ruby implementations.


I just wonder if fixing this (by caching the results of extend calls) will end up making Unicorn eat way too much memory.


The perl equivalent (run-time role application) normally caches the extended class by default and re-uses it. Provided you don't have a pathological number of combinations, that seems to work out pretty well.


When you say "it" you mean using extend, not DCI in general, right?


Yes, which is why I was constantly putting "DCI" in quotes.


I thought the same thing when I came across the DCI pattern. It is very useful for beautifying and simplifying use cases that span lots of objects.

I do something slightly different. I create a class directly for the context and then fire an object into it.

  class TheContext
    def initialize
      # ...
    end

    def execute(the_object)
      # ...
    end
  end


Just to note that NetBeans RCP uses DCI as well, quite successfully.

http://wiki.apidesign.org/wiki/DCI

http://www.antonioshome.net/kitchen/nbdci/


Extending is expensive.

Delegating is usually cheaper, just be careful how you do it.

https://gist.github.com/4441557

I guess the upside of `extend` is that you can do it at the call site; delegation has to be configured upfront.


You can do DCI without singleton methods in Ruby.




