Hacker News new | past | comments | ask | show | jobs | submit login
Python Best Practice Patterns (stevenloria.com)
311 points by abeinstein on Feb 28, 2014 | hide | past | favorite | 94 comments



Several of those patterns are incomplete or frowned upon:

* if a method does not use the object's state (no `self` usage) make it a `class-` or `staticmethod`.

* Some magic methods are presented. There's more to them[0].

* one should not write `class MyClass:` but `class MyClass(object):` (new style class[1])

* the last one (`return None`) make me very dubious

* Cascading methods: that's a big no. The idiom is that if a method may change the state of the object then it should return None (eg `set.add`)

0: well-written and comprehensive guide: http://www.rafekettler.com/magicmethods.html

1: http://www.python.org/doc/newstyle/


For certain use-cases (like constructing queries for an ORM) or other things where you're effectively passing around curried ideas to eventually be executed, I think cascading methods is a huge win.


What you are referring to is a design pattern called fluent interfaces[1]. They do make for very usable APIs when used to represent pipelines and filters. They are also used heavily in creating domain specific language features. In your SQL example it works very well such as in SQLAlchemy. But in that example, the chained methods are building a query as opposed to mutating the actual data. Splitting hairs.

[1]http://en.m.wikipedia.org/wiki/Fluent_interface


> Splitting hairs.

Severely, since what you're doing is mutating the actual state-data of the query object.


In the case of SQLAlchemy you are not. The cascading methods on query objects create new query objects as it should be. Mutating objects with cascading methods is horrible API design as it suggests immutability where there is none.


the API in some ways comes from that of Hibernate, which does actually mutate in place (see http://docs.jboss.org/hibernate/orm/3.3/reference/en-US/html...). I felt that keeping the existing object un-mutated is a lot more intuitive. I think the names that you choose for the methods do make a difference, e.g. Hibernate is saying "add()" which for whatever reason seems to imply "mutation" to me, not sure why.


Having used hibernate extensively (and considering many of it's features remain completely unmatched in any other ORM I've encountered), I can say that I strongly dislike the idea of mutating the query in place. There's huge advantages to each chain returning an independent and valid query that can be executed, referenced, and added to. I don't see any upside to the Hibernate approach here, and it violates "Prefer Immutability".


Oh right, that's fair. Same in Django (and you're right, mutation in this case would be horrible design), I suppose I was mixing the metaphorical with the actual.


Yeah these things aren't as black and white as both the blogger and the original commenter make it seem. Software design is very subjective.


Showed this to a senior dev at work and he pointed out another (subtle) error

    for each in enumerate(items):
        if each.match(self):
            # This fails because enumerate yields an (index, value) tuple.
            # Match isn't a function defined on tuples.


Do you have to show the code to someone before your allowed to run it? Can't see how you wouldn't have spotted this yourself


a pretty basic error :(


> Cascading methods: that's a big no. The idiom is that if a method may change the state of the object then it should return None (eg `set.add`)

So methods that mutate state, and don't return None are un-Pythonic?

What about this example:

  a = [1, 2, 3]
  b = a.pop()
pop() mutates the state of a and returns a value that is not None. Is the Python code language itself un-Pythonic?


> * the last one (`return None`) make me very dubious

I'm curious, as someone who uses this pattern quite a bit, how would you improve on this?


Just speculating here, but a func that just ends without returning, effectively returns None, so that idiom may be considered redundant. IMHO, it's often worth the additional clarity to be redundant in this way.


Yes, but the "Pythonic" way to do things is that "explicit is better than implicit," so I'm not sure what the original poster considers wrong with `return None`.


    In [1]: def foo(): pass
    
    In [2]: foo() is None
    Out[2]: True


I was speculating about what the original complainer meant, not about what python does... b^)


I do this too. I dislike when functions in languages don't explicitly return something even if that something is nothing.


That line and this one lead me to believe he's full of hot air:

> if a method does not use the object's state (no `self` usage) make it a `class-` or `staticmethod`.

Uh, that's a little too "cookbook" for my tastes. The decision of whether to make something a class / static method or not should be decided based on what the function is doing, rather than some pointless dogma. I'm assuming he means this method:

> def alarm(self): > with open(BUZZER_MP3_FILE) as f: > play_sound(f.read())

It's a design decision, and one that is particularly trivial at that. Let's say the effects an alarm ultimately change, resulting in an internal state read / write on the object. Now you have to go back and update the class despite the fact that it at a higher philosophical level one could argue that an alarm should only be triggered on an individual boiler.

Point being: it's nit-pickery for nit-pickery's sake. I don't personally use `return None`, but it's certainly more explicit and is fine to use if your judgment has found it to make sense.


There are no procedures, only functions that return something. A function either returns something explicitly, or returns None. Easy.


It seems he's using Python 3 (using print as a function), so no need to inherit object.


There is a print function in Python 2 either!

http://docs.python.org/2/library/functions.html#print


These things turn subjective pretty quickly. That, and, "this is the way most people do it, so just do it this way."


Sometimes I find guidelines like these counterproductive.

For example, when looking through someone elses code, if it does not have deep levels of nesting I prefer longer methods you can read like a script rather than having to jump around the place to see what is each method.

Likewise with the 80 character guideline in PEP. It can take a lot longer to find the closing brace if it is hidden in columns that look like a newspaper article.


>Likewise with the 80 character guideline in PEP

Definitely. The PEP8 was actually revised 6 months ago[0]. Nothing crazy, but a bit more pragmatic.

>For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the nominal line length from 80 to 100 characters (effectively increasing the maximum length to 99 characters)

0: text: http://legacy.python.org/dev/peps/pep-0008/#maximum-line-len...

commit : http://hg.python.org/peps/rev/fb24c80e9afb


I think this is due to poor naming, not due to short functions.

Compare:

  def makeMove(self, target):
    enemy_piece = target.piece
    if enemy_piece:
      self.takeEnemyPiece(enemy_piece)
    target.piece = self
    self.position = target
And

  def changeCoords(self, foo):
    if foo.standing_on_it:
      self.nowItsMine(foo.standing_on_it)
    foo.standing_on_it = self
    self.coords = foo
In the former case, you mostly don't need to consult the source of functions called. In the latter, you're never sure.


Yes. I look at these things and pick out what I think are the good parts.


Cascading methods aka fluid interface are OK in many cases, e.g. various builders. OTOH immutable objects are preferable in many cases.

What I really disliked is storing intermediate results of a multi-stage computation in instance variables. It is so easy to return multiple values from Python functions. Also, thinking what you need and need not pass and return between interconnected functions helps structure them much better, into smaller, clearer pieces (as #1 correctly advises).


What is your reasoning for not making methods chainable?

http://en.wikipedia.org/wiki/Fluent_interface#Java


Chainable mutating methods are not idiomatic Python; returning None is idiomatic for mutating methods. ISTR that the reasons here regard readability and clearly distinguishing mutations from queries, but in any case if you don't follow the idiom in libraries you create, they won't behave like Python's builtins and standard library, which, whatever you think about chaining on its own, will cause some context switching for most Python users when trying to work with it, increasing the cognitive load.


The builtins are broken, that is why there are new builtins liked `sorted()`, list.sort() returning None is not an improvement. The standard libraries are a _mess_, there is absolutely know cohesion in their api design, or even naming conventions (go peruse the standard lib). Much of what is Pythonic is someones opinion on how a nanny-language should behave.

At this point I think calling something idiomatic or Pythonic is a thought terminating statement. I need to see refutation.


sorted returns a new object, not mutates the original object, so it follows the convention GP is talking about. sort() also follows that rule, since it mutates, it returns None.

In terms of refutation, some things are just conventions, and breaking them has to be done with the understanding that the benefits outweigh the costs. If python were a clean slate, totally new language, you might have a discussion about the merits of mutating chaining, but as it is, the convention is not to do that, and the benefits don't buy enough to violate everyone's expectations.


Counterpoint, object.next() was removed from python3 and replaced with next(object). It both modifies state and returns data. I really prefered next() as a method since it does change the state of a generator.


> * the last one (`return None`) make me very dubious

simply `return` would be enough


> * if a method does not use the object's state (no `self` usage) make it a `class-` or `staticmethod`.

Is this really true?


Having gone through a couple thousand lines of Javascript that adhered to the "keep methods small", I call bollocks on that. Make methods as big as they need to be.

    function doFooOnList(l) {
        for (var i=0; i<l.length; i++) {
            doFoo(l[i]);
        }
    }
    function doFoo(i) {
        i.foo();
    }
gets old, very quickly.

After designing and building code for 20+ years, I can comfortably say that there are no arbitrary rules of software design, and some of the worst code I've seen has been a result of following "best practices" instead of thinking for oneself.

Write code like it is meant to be read, because that's what happens most often.


> Make methods as big as they need to be.

Correct, and if you do that, methods will invariably end up small. What you cite above is small no matter how you write it, so that's not the point of the advice to keep methods small. Methods pretty much never need to be long; they're long because they're poorly written code. Well written code tends towards small methods.


Correct, and if you do that, methods will invariably end up small.

If you'd said "usually", I'd have agreed with you, but "invariably" is far too strong. Sometimes the logic you need to implement is fundamentally complicated, and so the code you write to implement it must inevitably be at least that complicated. If that means writing a 100 line function, but the function really is doing one job, at one level of abstraction, in a cohesive way, then so be it.


We agree, I'm fine with usually.


"Small" obviously depends on your language, but well-named methods which follow the "do one thing and do it well" principle always win over wall-of-text code. It's all a matter of decomposing the issue, finding the level of abstraction your method will work at, and naming things.


We agree completely.


some of the worst code I've seen has been a result of following "best practices" instead of thinking for oneself.

Very very true. This phrase suggests there is no better way to do something, and that it should always be used because it is "best", and the dogmatic application of rules replaces actual thought and understanding. I am not against a "recommendation" or "suggestion", but to call it "best practice" is cargo-cult-like.


`for (var i=0; i<l.length; i++)` is so 1970. In Javascript, you have array.map and $.each, and can define functions in-place.

Same applies to Python, BTW; get used to list comprehensions and generators and you cannot look back.


These aren't supported everywhere in javascript.


jQuery or underscore provide them on every sane JS platform. These are not hard to write by hand, too.


I'd say methods should be small enough so that their intent is clear. Usually it means that methods should have exactly one responsibility, otherwise they seem bloated.


You are calling bollocks on a claim he doesn't make.


Agree with all of these but two. The first is the example of doing:

    class Foo(object):
        highlight = reverse
No, that is not clearer. Now I have no idea what this method does. Making it explicit requires more keystrokes, but allows you to properly document the method. Also, when I run help(Foo.highlight) I won't get the generic documentation for `reverse`.

Second, using `each` for a generic iteration variable. This is an opinion, not a best practice. I would argue that either the loop is a one liner, at which point use whatever you want (x works well), or it's more than one line and then I want a proper name for the thing you are iterating over.


`highlight = reverse` also flies in the face of TOOWTDI (from PEP 20).

Which is a shame, because this means that convenience methods that Ruby has, e.g. ary.first → ary[0], ary.compact → ary.reject{|x| x.nil? }, ary.map → ary.collect are pruned out of the stdlib and frowned on in contributed libraries. This chilling effect that descends from PEP20 is one of the worse aspects of Python.

They increase readability and should be encouraged. Even if ary.last is one more character, it uses less of my brain to read than ary[-1]. ary.map might be more readable if other code uses ary.reduce, while ary.collect is more readable if other code uses ary.inject, ary.detect, etc.

The OP gave a perfect example with this---in an event handler for a drag operation within an editor, I'd rather communicate that text is being .highlight()-ed, even if the underlying view methods are reversing the pixels. If I used .reverse(), it might confuse a coder into thinking the text itself is being reversed when I drag.

Perhaps if more Pythonistas consider this a "best practice," it will swing favor for amending the Zen. But I wouldn't bet on it.

Also, you're incorrect about help(). help(Foo.highlight) will provide the docstring for Foo.reverse if Foo.highlight = Foo.reverse.


> Even if ary.last is one more character, it uses less of my brain to read than ary[-1]

This has nothing to do with whether it's a good idea. If you read a lot of python code, the latter is easier to grok. And when you have separate ways of doing things, it takes a longer and longer time to absorb those idioms and internalize them to the point where you don't need to think. Adding a bunch of "convenience" methods that do minor permutations on common operations might read better when a line of code is given as an isolated example, because it can read more like an english sentence. But when you're reading over code, the resemblance to english only helps when you're looking at application code which isn't a part of the language or standard library. Having the language and standard library present a single way to do things makes it easier to get to a base level of familiarity.

Endless variety in doing simple things doesn't buy you much.

That being said, the python stdlib is full of stuff built up over many years, so it doesn't follow that idea everywhere. Also, higher level design is never going to fit into the TOOWTDI concept because things at that level are more subjective.


Personally, I do agree with having ary[0] as the "one true way". Doing ary.first will lead to ary.second, which will lead to ary.slippery_slope :). I mean some conveniences are good, while having too many is evil.

> Also, you're incorrect about help(). help(Foo.highlight) will provide the docstring for Foo.reverse if Foo.highlight = Foo.reverse.

That's what I was trying to say. When doing help(Foo.highlight) I want it to say something like "Event handler for highlighting text", not "Reverse pixel color for the given rectangle."


The whole "use __iter__ whenever you can" thing is dubious too. It may be fine sometimes but the example is poorly chosen. If the department gains a name and a manager, using __iter__ makes much less sense. And now you also need to implement __len__ if you want to count your employees, etc.


And possibly allow indexing: `department[3]`. Yeah, I don't like that. It's not clear whether it's a generator or a full sequence. Generators when you don't expect them are evil (you cannot iterate over them twice). I would much rather see something like department.get_employees() and department.iget_employees() that return a tuple and a generator respectively.


That's the sort of things people used(?) to complain about C++: gratuitous operator overloading. It sometimes makes sense, but you really need to think long and hard about the semantics.


Author here. Must give credit where it's due:

These patterns come from a talk by Vladimir Keleshev, author of docopt and excellent Pythonista. These are NOT my original work.



You couldn't say anything to motivates me more about reading the linked article. docopt is a gem ;)


Several of these are generally applicable to programming:

* Keep functions small and composable

* Keep functions at a consistent level of abstraction

* Use constructors to ensure objects always exist in a complete, usable state

* Use meaningful method names in place of comments

It is nice to see that other people struggle with functions with lots of parameters + lots of partial state variables. I don't suppose anyone here has a better solution?


Comments should explain why you are doing something while method names are a guide to what they are doing. I see them as for different purposes and one should not replace the other.


Absolutely, unless you're writing end-user documentation.


I was playing about in Django recently, having written previous views as functions, I wrote some new ones as class based views. I felt that the OOP approach kept it cleaner and inheritance could be used in the same way as currying could for the function based views (but less complex - maybe because I understand OOP concepts better).


I never knew you could use __enter__ and __exit__ to code your own things that work with the 'with' statement. Well worth the read!


For simple context managers, an easier method is to use contextlib.contextmanager (http://docs.python.org/library/contextlib.html).

  from contextlib import contextmanager
  
  @contextmanager
  def tag(name):
      print "<%s>" % name
      yield
      print "</%s>" % name
  
  with tag("h1"):
      print "foo"
  """
  <h1>
  foo
  </h1>
  """


You don't even have to create a full object if there's no need to:

    @contextlib.contextmanager
    def manager(*args):
        object = initialize(args)
        try:
            yield object
        finally:
            # cleanup
            object.close()


You might be interested in the docs on the Data Model [0] to learn more about the various "dunder" (__foo__) methods. There are a ton of cool things you can do with python objects.

[0] http://docs.python.org/2/reference/datamodel.html


For the 'method object' one, I use nested functions. Python is not Java or Smalltalk.

The @classmethod example I'd write with an ordinary function also, outside the class.


Not enough explanation as to what problems are being solved or why your solutions are "best practices".


Many of these tips address the idea that good naming improves readability. I couldn't agree more!

If you're looking for more on this topic, Brandon Rhodes gave an excellent talk on this at PyCon US last year [0].

[0] http://pyvideo.org/video/1676/the-naming-of-ducks-where-dyna....



except no tools use it, yet.

I would absolutely love a version of shedskin that moved to Python3 syntax and used optional typing.


PyCharm will read simple type annotations like

  def get_error_message(error_code: int) -> string:
      ...
and use them for autocompletion hints and type warnings.


This is the guy who authored TextBlob! https://github.com/sloria/TextBlob


I'm skeptical about the last one return None


I saw this code yesterday:

    def is_file_for(is_nagyker, type):
        if type == KIS_ES_NAGYKER:
            return True
        elif type == KISKER and not is_nagyker:
            return True
        elif type == NAGYKER and is_nagyker:
            return True
At the first glance, I thought it always return True. Would have been more clear an explicit return False at the end!


May I suggest `any` here?

    def is_file_for(is_nagyker, type):
        return any([
            type == KIS_ES_NAGYKER,
            type == KISKER and not is_nagyker,
            type == NAGYKER and is_nagyker])
I know unsolicited code improvements from strangers isn't the coolest thing in the world, but `any` (and `all`) can really improve clarity for stuff like this. I know I use them quite a bit.


I saw that used in the article too - not something I'd thought to do myself. Thanks for pointing it out, I find it really readable and I know there are places I could use this.


You create a list, so this expression won't shortcut like or. Or is definitely the right way to write an expression like this. Any has its place, but mostly when it's input is a generator, e.g.,

    any(k in obj for k in other)


Seems rude to return None instead of False for an is_. I think I would have written:

    def is_file_for(is_nagyker, type):
         return type == KIS_ES_NAGYKER or \
   	    	(type == KISKER and not is_nagyker) or \
    	        (type == NAGYKER and is_nagyker)


As another change, it is recommended to avoid explicit line continuations by using parentheses instead.

    def is_file_for(is_nagyker, type):
         return ((type == KIS_ES_NAGYKER) or
                 (type == KISKER and not is_nagyker) or
                 (type == NAGYKER and is_nagyker))
This way, it doesn't break if there is extra whitespace at the end of the line.


As yet another change, PEP8 recommends placing comparison operators (and dots, when method chaining) at the beginning of each line to make things a little clearer.

    def is_file_for(is_nagyker, type):
         return ((type == KIS_ES_NAGYKER)
                 or (type == KISKER and not is_nagyker)
                 or (type == NAGYKER and is_nagyker))


See, I don't find that clearer; quite the reverse. With trailing 'or' the type variable is nicely lined up, and the pattern matching that is going on is a lot clearer to my eye. With things misaligned, I have to much more carefully parse the text to see what is going on. I sometimes even do something like this (probably overkill in this example, but my aim is to illustrate a concept, not to bicker about the best expression of that particular statement):

   return (
    (type == KIS_ES_NAGYKER                   ) or
    (type == KISKER         and not is_nagyker) or
    (type == NAGYKER        and     is_nagyker))


Yours is more aesthetically pleasing, but being able to see the "or" quickly helps one understand the nature of the full conditional at first glance.

The other reply to you seems to be the best of both worlds.


    def is_file_for(is_nagyker, type):
         return (   (type == KIS_ES_NAGYKER)
                 or (type == KISKER and not is_nagyker)
                 or (type == NAGYKER and is_nagyker))


Or maybe

    def is_file_for(is_nagyker, type):
         return (
                    (type == KIS_ES_NAGYKER)
                 or (type == KISKER and not is_nagyker)
                 or (type == NAGYKER and is_nagyker)
                )


This is incorrect. “The preferred place to break around a binary operator is after the operator, not before it.”


You're right. It must've been another Python style guide I somehow remembered that from.

From what I can tell this is kind of a point of contention among many developers; there are lots of debates when Googling that particular line.


Thanks! I've wondered about this.. I do like the parens better, and it looks like PEP-8 agrees with you: http://legacy.python.org/dev/peps/pep-0008/#maximum-line-len...


Me too. As blasphemous as it is, the "Explicit versus implicit" edict ends up used as defense for a lot of stuff that I consider pretty severely pedantic...


Is there something like this for Ruby?


Very learnsome, Thanks!


Commenting to save for later reading. :)


If you go to your user page:

https://news.ycombinator.com/user?id=binarysolo

One of the links is 'Saved stories'. That's every story you vote up.


You should name a classmethod first variable "cls", not "class_".




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: