Five-minute Multimethods in Python (artima.com)
113 points by llambda on Dec 7, 2011 | 48 comments



The definition of multimethods presented here sounds suspiciously like ad-hoc polymorphism (or just function overloading). It seems like the real difficulty is not in the concept but in the fact that I've now seen it under three different names: multiple dispatch, ad-hoc polymorphism and overloading. All but the last name sound complex; since I initially encountered the concept in Java as overloading, I never thought much about it.

Polymorphism based on types like this is something that statically typed languages are naturally good at. Looking at his example code, the trick to having this sort of code in dynamically typed Python is to introduce type annotations via decorators. You're doing the work of static typing for a fraction of the benefits.
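A minimal sketch of what such a decorator-based scheme could look like (the registry and all names here are illustrative, not the article's actual implementation):

```python
# Hypothetical sketch: a decorator records each implementation in a
# table keyed by (function name, argument types), and the returned
# dispatcher looks up the runtime types of the arguments.
_registry = {}

def multimethod(*types):
    def register(func):
        name = func.__name__
        _registry[(name,) + types] = func
        def dispatcher(*args):
            key = (name,) + tuple(type(a) for a in args)
            return _registry[key](*args)
        return dispatcher
    return register

@multimethod(int, int)
def add(a, b):
    return a + b

@multimethod(str, str)
def add(a, b):
    return a + " " + b

print(add(1, 2))      # dispatches on (int, int)
print(add("a", "b"))  # dispatches on (str, str)
```

Note that the types in the decorator are exactly the annotations you'd write for free in a statically typed language.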

Additionally, a statically typed language can take this concept even further. Haskell, for example, supports polymorphic functions like this very well using type classes. It even allows functions polymorphic in their return types--you can have a function `read` which takes a String and returns whatever type the context demands. This means you can even have polymorphic constants--maxBound, for example, is the maximum value of whatever bounded type it happens to be.

Overall, I think features like this make more sense in a statically typed language. They fit in--you're already specifying types everywhere--and can do more. However, it is interesting to see how you could accomplish something like this in a dynamically typed language.


I think you're getting distracted by the simplistic implementation presented here. Multimethods are plenty powerful outside of static types - see CLOS and Clojure.

That said, multimethods are a halfway step to predicate dispatch (this is mentioned in the article comments). With predicate dispatch you can get efficient dispatch on any interesting predicate - not just types - for example even?, odd?, structural matching, etc.

I think dynamic languages could benefit a lot from predicate dispatch.
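To make the idea concrete, here's a toy sketch of predicate dispatch in Python. All names (`predicate_dispatch`, `when`, `describe`) are invented for illustration; real implementations do much cleverer clause ordering:

```python
# Toy predicate dispatch: each clause is guarded by an arbitrary
# predicate on the runtime value, not just a type check.
def predicate_dispatch():
    clauses = []
    def when(pred):
        def register(func):
            clauses.append((pred, func))
            return dispatch
        return register
    def dispatch(x):
        for pred, func in clauses:
            if pred(x):
                return func(x)
        raise TypeError("no matching clause for %r" % (x,))
    dispatch.when = when
    return dispatch

describe = predicate_dispatch()

@describe.when(lambda n: n % 2 == 0)
def _(n):
    return "even"

@describe.when(lambda n: n % 2 == 1)
def _(n):
    return "odd"

print(describe(4))  # even
print(describe(7))  # odd
```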


Multimethods and overloading are completely different beasts, despite the syntactic similarity. Overloading is resolved at compile time by looking at the compile time types. Multimethods are dispatched at run time, by looking at the run time values. This makes overloading much less powerful. Multimethods are in fact problematic in statically typed languages; they are difficult (but not impossible) to type check.


They are only completely different beasts if you think the distinction between compile time and run time is very important. That's not necessarily the case. There is the text of the program at one end, and pushing electrons at the other. At exactly which point this thing we call "compiling" occurred is just a detail that may or may not help us in other - typically human factor - goals.

Multimethods, to a first approximation, are overload resolution at run time rather than compile time. They are not just syntactically similar; they are semantically similar too. Multimethods are the dynamic language analogue to overloading in static languages.


> Multimethods are the dynamic language analogue to overloading in static languages.

This implies that multimethods are not possible in statically typed languages, which is trivially disproven by stuff like the Nice programming language.

The difference between compile and run time is a red herring, since the actual distinction is between a statically _declared_ type and a dynamic type (which FWIW could be determined statically within AOT compilation).


No, it doesn't imply that, because static and dynamic are not mutually exclusive. A static language with polymorphism[1] has both static and dynamic aspects. A purely static language does not need (and thus won't have) runtime type tags of any kind, so any kind of dynamic dispatch (including multimethods) would be impossible at the level of the language (i.e. you'd have to implement it all explicitly as a library).

[1] It shouldn't need to be said, but this is HN, so it does: I mean OO-style runtime polymorphism here (or, with a stretch, sum types), not parametric polymorphism.


Why haven't you told us what you mean by the notoriously vague terms "static" and "dynamic"?


Because defining every word one uses in a forum post is incredibly tedious, and even when you try to do so preemptively, someone still comes in and asks you for definitions of more words.

I'm using them in a colloquial rather than rigorous sense, because that's the lowest friction way of communicating in a casual setting.

But at core, when I say 'static', I mean something determined before execution, by means of static scopes and compile-time types; when I say 'dynamic', I mean something determined at the point of execution, by means of dynamic scopes and run-time values. When I say 'static scopes', I mean scopes determined through lexical nesting and static types. When I say 'static type', I mean a type determined without any execution of the target program. When I say 'dynamic scopes', I mean scopes determined through dictionary lookup at runtime (vtables are dictionaries too), possibly even including dynamic scoping proper (cf. Lisp, Javascript eval, etc.).

Most languages have both dynamic and static aspects, but the mixes vary greatly. Languages designed to be compiled typically use far more statically resolved features, because if you can afford the time to put into analysis, and you have all the source immediately available, you can find a lot of common bugs and generate faster code. Languages designed to be interpreted typically use far more dynamically resolved features, because that gives less latency to execution, more expressiveness, more flexibility with modularity and program composition, and is also easier to implement.


The difficulty is not with understanding that "when I say 'static', I mean something determined before execution" but that you only say 'static' - you don't tell us what is being determined before execution, and that turns your comments into meaningless mush ;-)


You could view it that way, but the fact remains that they dispatch on completely different information. Overloading dispatches on compile-time types; multimethods dispatch on run-time type tags. This distinction is important. Are you saying that you are going to ship, say, a C++ compiler with your program so that you can dispatch on run-time information with compile-time overloading?


Overloaded functions are only partly resolved at compile time. The dispatch itself happens at run time using only one function parameter (single dispatch), which can be of any compatible type. Single dispatch turns out to be super fast to implement with vtables; multiple dispatch tends to be slower, but it's sometimes worth it for the added expressiveness.


Overload resolution is entirely a compile-time task. But the method resolved may itself be virtual, and so use dynamic dispatch in its invocation.

(Overload resolution is a pattern matching task, selecting a signature from a list of signatures, given a tuple of candidate value types[1]. It's not just used for invocation, but also when e.g. determining which overload of a function to assign to a function pointer.)

[1] The set of value types is usually larger than the set of types available for use in signatures. For example, most static languages don't have a name for the type of the 'null' or 'nil' value, so you usually can't refer to it in a signature (and nor would it be particularly useful to do so).


Haskell is incompatible with OOP precisely because in Haskell everything is solved at compile time. It's a different way of thinking that has both advantages and disadvantages; however, you cannot move these concepts around easily (i.e. not many features in Haskell are compatible with Java or vice versa - for instance, Microsoft tried implementing Haskell on top of .NET and simply gave up).

    ad-hoc polymorphism
Yes, it is - however, the difference between multimethods and the polymorphism you get in Java (and Python) is that Java does runtime single dispatch on the implicit parameter only, whereas a multimethod can do runtime dispatch on all parameters.

Plus, because it is at runtime, you can also give it conditions that aren't necessarily related to the types involved (depending on implementation of course).
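A toy Python sketch of that difference - dispatching on the runtime types of all arguments rather than just the receiver. The type-keyed table and the class names are hypothetical:

```python
# Ordinary methods dispatch only on the first (implicit) argument;
# a multimethod picks a clause based on the runtime types of BOTH.
class Asteroid: pass
class Ship: pass

_collisions = {
    (Asteroid, Ship): lambda a, b: "asteroid hits ship",
    (Ship, Asteroid): lambda a, b: "ship hits asteroid",
    (Ship, Ship):     lambda a, b: "ships collide",
}

def collide(a, b):
    # look up the runtime types of both arguments
    return _collisions[type(a), type(b)](a, b)

print(collide(Asteroid(), Ship()))  # asteroid hits ship
```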

    Overall, I think features like this make more sense 
    in a statically typed language
You've got it backwards - a lot of the features in statically typed languages like Haskell or Java (e.g. overloading, since you mentioned it) are there as a replacement for multimethods. But the feature is so powerful that it hardly has any substitute (just like macros).


pytyp will also do this, if anyone is interested - http://acooke.org/pytyp/pytyp.spec.dispatch.html

(In addition, it can be used to add type checking, and type-guided translation between JSON and Python objects. It will also dispatch on "compound" types, like a list of integers.)


http://en.wikipedia.org/wiki/Multiple_dispatch#Common_Lisp

I tend to prefer the (defmethod) form in CL. You get generic functions which can retrieve the function object specialized for the parameters with (find-method). You also get the auxiliary method goodness...

But the great thing about Python that the article shows is that there's probably a way you can get those things in Python too if you need them.

So does Python have multiple dispatch or can you just implement it?


The challenge with expressive features like multimethods is: does your language give you the tools not only to implement them, but to make them fast? This is yet another benefit of macros - the ability to add features like this and get good, perhaps even great, performance.


Since people are throwing out similar kinds of implementations, here's my pattern matching library (which is kind of the same thing as multimethods, but allowing more arbitrary matching than just based on type):

http://svn.colorstudy.com/home/ianb/recipes/patmatch.py


This is kind of crazy. I had written pypolymorph independently a few months ago that does this same thing: https://github.com/amoffat/pypolymorph


I did something similar and tried to implement functional language style pattern matching in python:

https://gist.github.com/1320421

Not to actually use, of course.


"a function that has multiple versions, distinguished by the type of the arguments"

Ooh, C# style overloading.


Yes, but not quite. Consider the following example:

  static void Main()
  {
    Bar( 12, 15 );
  }

  static void Bar( object o1, object o2 )
  {
    Foo( o1, o2 ); // error
  }

  static void Foo( int x, double y )
  {
    Console.WriteLine( "int double" );
  }

  static void Foo( double x, int y )
  {
    Console.WriteLine( "double int" );
  }

  static void Foo( double x, double y )
  {
    Console.WriteLine( "double double" );
  }

  static void Foo( int x, int y )
  {
    Console.WriteLine( "int int" );
  }
This does not compile because the compiler does not know which of the four versions it should call. As jules said, overloading is resolved at compile time by looking at the compile-time types. Multimethods are dispatched at run time, by looking at the run-time values. In this case that would be the "int int" version.
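For contrast, here is a hypothetical runtime-dispatched version of the same example, sketched in Python with an invented type-keyed table: the values 12 and 15 unambiguously select the "int int" entry at run time, where the C# overload resolution had to give up:

```python
# Runtime multimethod dispatch on the same four cases. Python floats
# stand in for C# doubles; the table and names are illustrative.
_foo_table = {
    (int, float):   lambda x, y: "int double",
    (float, int):   lambda x, y: "double int",
    (float, float): lambda x, y: "double double",
    (int, int):     lambda x, y: "int int",
}

def foo(x, y):
    return _foo_table[type(x), type(y)](x, y)

def bar(o1, o2):
    # no compile-time types here: dispatch looks at the runtime values
    return foo(o1, o2)

print(bar(12, 15))  # int int
```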


Thanks for the explanation.

That's cool and all, but why wouldn't I just define an interface with method foo(), then override foo on my classes? This feels like class-based OO turned inside out.


"At run time" is the key. Consider Java.

If Dog and Cat are subclasses of Animal (or implement an Animal interface), for example, I might want to use a "breed" method to attempt to breed a hybrid animal with the loyalty of a cat and the stealth of a dog.

  public Animal breed(Animal a, Animal b)
  {
  // return generic Animal
  }

  public Animal breed(Dog a, Cat b)
  {
  //return Cog
  }
But at compile time, if all I know is that I have two Animals to breed, Java will ALWAYS pick the breed() method with the Animal arguments, even if the run-time types are Dog and Cat.

To solve this problem in Java you have to delve into the hideous Java reflection API and litter your code with junk that you wish the language would figure out for you.

When you've used a language that is aware of types at run time and supports multi-method dispatch, anything else seems like it's half-baked OO.


And referring back to the original question on C#, you can do this using the "dynamic" keyword -

http://achoiusa.wordpress.com/2009/08/27/exploring-c-4-0-mul...


I'm not really up on this stuff, but that strikes me as an implementation detail. Does it really need a new name?


[deleted]


The difference between multimethods and pattern matching is that pattern matching is usually "closed" (you can't add new clauses somewhere else), while multimethods are "open": you can extend the method with additional clauses from anywhere, even another module.
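A small sketch of that "open" property in Python - the multimethod keeps a mutable clause table, so new cases can be registered long after the original definition. All names here are invented for illustration:

```python
# An open dispatch table: clients can register new clauses later.
_area_impls = {}

def register_area(cls):
    def add(func):
        _area_impls[cls] = func
        return func
    return add

def area(shape):
    return _area_impls[type(shape)](shape)

class Square:
    def __init__(self, s): self.s = s

@register_area(Square)
def _(sq):
    return sq.s ** 2

# ...much later, possibly in client code, a new clause is added
# without touching the original definitions:
class Circle:
    def __init__(self, r): self.r = r

@register_area(Circle)
def _(c):
    return 3.14159 * c.r ** 2

print(area(Square(3)))  # 9
```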


The power of static typing in a dynamically typed language.. that's great.. and elegant too! GO BDFL!


Actually it's just checking the types at runtime, nothing much 'static' about it. This is pretty much the antithesis of duck typing, in fact. I could see it being useful, but I could also see it leading to confusion for people who expect Python's usual duck-typing behaviour when they pass their own types into a function.

Whoops you subclassed from a type for which this function has a multimethod defined? Gonna do something different.

On the other hand, in non-trivial programs there _are_ certainly occasions when you need to check the type, so using an elegant method like this isn't a bad compromise.


Except that you have to statically specify the types using decorators. So it's more like a subset of static typing in a dynamically typed language.


It's not a subset of static typing at all.

If multimethods were a subset of static typing, then regular methods would be too. Methods are dynamic by definition: the type of the method's target is inspected at runtime and used to dispatch to the correct function entry. In Python, it's explicitly stated because the first argument of all object methods is "self". Multimethods extend this to do dynamic dispatch on all argument types, not just the target object.

Most Python object methods are declared at compile time; this doesn't make method declaration a "subset of static typing". You can add new methods to classes and objects at runtime using Python's reflection and introspection; you can just as easily add new multimethods at runtime using this framework.

Finally, the @decorator(...) syntax at compile time is just syntactic sugar for something like the following:

  def foo(self, arg):
    ...
  foo = decorator(...)(foo)
So you can apply decorators at runtime whenever you want, too.
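To show that desugaring concretely with a real decorator factory (functools.lru_cache here stands in for any decorator; the function name is made up):

```python
# A decorator is just a callable applied to a function, so it can
# equally be applied at runtime instead of with @ syntax.
import functools

def slow_square(n):
    return n * n

# equivalent to writing @functools.lru_cache(maxsize=None) above the def:
slow_square = functools.lru_cache(maxsize=None)(slow_square)

print(slow_square(12))  # 144
```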


Not the main point, but if the if-else pattern gets tedious, why is there no switch in Python?


Because a switch can usually be replaced by a dictionary?

    { 1: runOne, 2: runTwo, 3:runThree }[n]()
is equivalent to

    switch(n) { 
        case 1: runOne(); break; 
        case 2: runTwo(); break;
        case 3: runThree(); break;
    }


This forces the programmer to invent dummy names. And Python doesn't support nontrivial lambdas.


I've seen the argument about why that's more "pythonic", but it's not compelling. Allowing some syntactic sugar would make the meaning a lot more obvious.


Also, Smalltalk (which influenced Python) has no switch statement either. On the other hand, Smalltalk has no if, while, or for statements either (they are all methods).


While Smalltalk doesn't have case statements, a) it's trivial to add them, and b) the way you get around it is totally different: rather than use a Dictionary, you instead generally make a pile of tiny classes, each with a fooBar: method, and then simply make sure you've got the right object type and call fooBar: on it indiscriminately--in other words, basically your classic visitor pattern.

That's totally possible in Python, too, but I don't really ever see it done.
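A sketch of that Smalltalk-ish style in Python - a pile of tiny classes, each implementing the same method, called indiscriminately instead of switching on a tag. The class and method names are invented:

```python
# No switch: each node type knows its own behaviour, and the caller
# just invokes evaluate() on whatever node it holds.
class Addition:
    def evaluate(self):
        return 1 + 2

class Negation:
    def evaluate(self):
        return -3

nodes = [Addition(), Negation()]
print([n.evaluate() for n in nodes])  # [3, -3]
```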


I like to use defaultdict for this. A strategy dictionary, with a default strategy.

dispatch = defaultdict(lambda: default_strategy)

dispatch['one'] = new_strategy

print dispatch['two'] => default_strategy.

Beats case statements and elif chains.
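A runnable version of that strategy dictionary. One subtlety: defaultdict wants a zero-argument factory that returns the default value, so to get the strategy function itself back you wrap it in a lambda. The strategy names are illustrative:

```python
from collections import defaultdict

def default_strategy():
    return "default"

def new_strategy():
    return "new"

# the lambda makes the missing-key default the function itself,
# not the result of calling it
dispatch = defaultdict(lambda: default_strategy)
dispatch['one'] = new_strategy

print(dispatch['one']())  # new
print(dispatch['two']())  # default
```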


I usually just provide a default argument to get:

dispatch.get('two', default_strategy)


Consing for control flow is not normal, but in Python it is. Python - not even once.


In this case, a switch statement wouldn't help because you're dealing with a bunch of boolean functions rather than a single value. If you were using something like typeof and switched on the type, a switch statement would make more sense. However, even that would be pretty tedious for code like this, I think.

In general though, I have no idea why Python doesn't have a switch or case statement of some sort.


Python doesn't need a switch or case statement because you can write a suite of ad-hoc functions, stick them in a dictionary with your values as keys, and dispatch on the values:

    >>> def first():
    ...     print "a"
    ...
    >>> def second():
    ...     print "beta"
    ...
    >>> caseof = {1: first, 2: second}
    >>> caseof[1]()
    a
    >>> caseof[2]()
    beta
    >>>


Now it's a single value:

    case [typeof(x) for x in X] of:
        [int, int]:
            ...
        [int, float]:
            ...


    { (int, int): methodWithTwoInts,
      (int, float): methodsWithIntAndFloat,
      ...
    }[tuple(typeof(x) for x in X)]()
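A runnable version of that sketch, with Python's built-in type() standing in for the hypothetical typeof. The method names are illustrative:

```python
# Dispatch on the tuple of runtime argument types via a dict lookup.
def method_with_two_ints(x, y):
    return "two ints"

def method_with_int_and_float(x, y):
    return "int and float"

def dispatch(*args):
    table = {
        (int, int):   method_with_two_ints,
        (int, float): method_with_int_and_float,
    }
    return table[tuple(type(x) for x in args)](*args)

print(dispatch(1, 2))    # two ints
print(dispatch(1, 2.0))  # int and float
```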


Actually this is better than switch (or whatever).

Some advanced dispatch code can notice that some tuples are dispatched more often than the rest and check for them first.

Ditto for dict lookup.

But this would be unexpected for switch code.


It's not necessary. If there is a long if/else chain in your code, something is probably wrong - at least in Python (and other OO, dynamically typed languages).

But my opinion is that it wouldn't hurt to have it there.


COND, man, COND.


This obfuscates code, and I remove it wherever I see it. Sometimes people are too clever for their own good. Use an object. Override a method, FFS.


Any trivial example demonstrating how something like this works will seem obfuscatory, because examples are usually trivially solved with other approaches. But see the Expression Problem[1] for the deep problem.

Whether it's an appropriate solution depends on whether introducing new operations or new data types is more common; and where the modularity boundaries are. For example, what if the base class in the hierarchy doesn't have a method for the task you want to perform, and you can't add one because it belongs to third party code, and nor can you implement it appropriately for every descendant?

[1] http://en.wikipedia.org/wiki/Expression_problem



