Programming like a Pirate (Alt+Shift+M) (kranglefant.tumblr.com)
59 points by magnars on Dec 4, 2012 | 41 comments



I'm not sure I agree with this. If I'm looking at the implementation of a function, it's because it's done something wrong. For this function, I would fire up the debugger and say:

"Does isTestPage() return what I expect? Yup, so the problem is going to be in surroundPageWithSetupsAndTeardowns()."

A well named function becomes a black-box; you can tell if it's working just based on its inputs and outputs. Why open black boxes that are working?


I think I'd go as far as to say that I respectfully disagree with the article and agree with you.


> Why open black boxes that are working?

Maybe to modify them?


Or even to understand how they are working and to check if they are really working correctly?

I really don't like this attitude of ignoring inner workings of things just because they seem to work. Sometimes we really are working under constraints that make digging through the code impractical, but in my experience these constraints tend to be in the code itself - namely, someone wrote it as a 'black box' with tons of kludges and `if False:` style comments.

I don't want to exaggerate and say that every programmer should know his every tool down to its last bit, but taking black boxes apart is what makes one a hacker, and what allows one to improve one's skills.


Fair enough. Curiosity is a very reasonable excuse for opening a working black box, and is definitely what makes one a hacker.

On the other hand, most systems these days are so complex that you have to have some black boxes. I know more about how my linux kernel works than the vast majority of people I work with, and yet I would classify the majority of it as a black box.

I have a general idea of various trade-offs in efficiency and fragmentation resistance in malloc implementations, and if I have to I can dig into one. On the other hand, jemalloc works great and I just don't have the time to take it apart, so I just use it.

Ultimately to get things done you need to trust that functions do what they say they do until you get evidence to the contrary. Taking functions apart can make you better at doing your job, but rarely fixes the problem you have Right Now.


Well, modifying a distinct method is much easier than modifying something in the middle of a 20-page scroll of code.

There are simply fewer things that can go wrong.


Hear, hear - I've been marking undergrad coding assignments all term, and the ones where they make a new method for absolutely every block of their code are nearly as unreadable as the ones that don't know how to indent or pick good variable names.


Mind showing an example? Does the new method have a meaningful name? Are they smartly structured in a hierarchical order? I find it hard to believe that too many methods are harder to read than bad variable names or indentation.


Keep in mind that these are undergrad projects, so a few hundred LOC tops. But you'll see something like:

  bool noMoreElves() {
    return elfCount == 0;
  }

  void fireAllReindeer() {
    for (int i = 0; i < reindeerCount; ++i) {
      delete reindeerArray[i];
    }
    delete[] reindeerArray;
  }

  void checkCloseWorkshop() {
    if ( noMoreElves() ) {
      fireAllReindeer();
    }
  }

The above is certainly readable, but takes forever to parse. The below is, I would argue, just as clear, and much quicker to read (the point the OP made):

  void checkCloseWorkshop() {
    if ( elfCount == 0 ) {
      // fire all reindeer
      for (int i = 0; i < reindeerCount; ++i) {
        delete reindeerArray[i];
      }
      delete[] reindeerArray;
    }
  }

The comment above the loop is possibly superfluous, but helpful.


That goes against many recommendations I've read about refactoring and writing cleaner code:

"Good code invariably has small methods and small objects. I get lots of resistance to this idea, especially from experienced developers, but no one thing I do to systems provides as much help as breaking it into more pieces."

– Kent Beck


The little research I've seen has not backed up that opinion. While that view is shared by many - especially those who come from a Smalltalk heritage - the studies I've read about do not show a clear association between small pieces and more readable/maintainable code, at least not for code blocks under 100 lines or so.

I dislike reading Smalltalk influenced code with many small pieces. They tend to use instance state as a way to pass parameters, and that makes me uncomfortable - I have a leaning towards functional programming.

I also distrust function names. A function like "noMoreElves" might actually be "no more elves!" and implemented as "elfCount = 100". I exaggerate a little, but an inline "elfCount == 0" is much easier to verify than having to jump elsewhere to see the code.


http://stackoverflow.com/questions/153234/how-deep-are-your-...

Kent Beck also gave this amazing (and surprising) answer on SO about "how far to go with your tests?" He is, in my eyes, a true pragmatist and gives some sound advice.

Uncle Bob, on the other hand, seems to have drunk too much of his own Kool-Aid. I saw him give a talk 2 years ago at a Java conference which made him seem more like a Steve Ballmer-esque sales type, not an active developer.


Both of your examples are fine, as it is pretty clear what each method does.

In such cases, the golden rule is "if you do it more than once, create a new function". I don't know what kind of things you program if you have 100-line functions with no code sharing.


As far as I understand it, it's more of a "Bottom/Top" versus "Top/Bottom" thinking. Personally, looking at:

  void checkCloseWorkshop() {
    if ( noMoreElves() ) {
      fireAllReindeer();
    }
  }
is enough, I don't need to look more precisely. But, for someone who is more "Bottom/Top", the second version is a headache because they need to jump everywhere and keep 10 different files/functions open all the time.

An anecdote on this: I once worked with a programmer who was "Bottom/Top" and really hated functions (a 1500+ line function was the way to go for him). He would then refactor my code like this:

  void checkCloseWorkshop() {
    bool noMoreElves() {
      return elfCount == 0;
    }

    void fireAllReindeer() {
      for (int i = 0; i < reindeerCount; ++i) {
        delete reindeerArray[i];
      }
      delete[] reindeerArray;
    }

    if ( noMoreElves() ) {
      fireAllReindeer();
    }
  }
 
I.e. pasting my functions into the other functions he needed, rather than just calling mine or moving them into another file. Obviously, there were no tests or frameworks used by this company. This is a true example of "Bottom/Top" thinking.

Last note: I understand your point, but there's a way to mix both strategies by using higher-level abstractions. For instance, fireAllReindeer is deleting an array while noMoreElves is checking for emptiness. Good C++ programmers would use Boost or the standard library's functors and smart pointers rather than manually reinventing the wheel every time.
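
As a rough sketch of what that could look like with only the standard library: the Workshop class, the Reindeer struct and the hire/dismiss helpers below are all made up for illustration, since the example above never shows how elfCount or reindeerArray are declared. The reindeer live in a std::vector of std::unique_ptr, so the container does the deleting and there is no separate count to keep in sync.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type: the thread never says what a reindeer actually is.
struct Reindeer {
    int id;
};

// Hypothetical class wrapping the elf count and reindeer array from the example.
class Workshop {
public:
    void hireElf() { ++elfCount_; }

    void dismissElf() {
        assert(elfCount_ > 0);  // dismissing an elf we don't have is a caller bug
        --elfCount_;
    }

    void addReindeer(int id) {
        reindeer_.push_back(std::unique_ptr<Reindeer>(new Reindeer{id}));
    }

    // The vector tracks its own size, so there is no reindeerCount to reset.
    std::size_t reindeerCount() const { return reindeer_.size(); }

    void checkCloseWorkshop() {
        if (elfCount_ == 0) {
            // clear() destroys every unique_ptr, which deletes each Reindeer.
            reindeer_.clear();
        }
    }

private:
    int elfCount_ = 0;
    std::vector<std::unique_ptr<Reindeer>> reindeer_;
};
```

Whether the emptiness check and the clear() deserve their own named methods is still the same judgment call being argued about here; the sketch only removes the manual memory management.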


The main problem appears when you realise you forgot to set reindeerCount = 0. If closing the workshop is the only situation in which you fire all the reindeer, it's probably fine. If you need to fire the reindeer in other situations, you now need to manually fix the code everywhere.


Thanks, good catch (my code probably would have been better if I had an actual use-case instead of a contrived example).


To more directly answer your question, it's more a problem of needing to remember an unnecessarily large number of names - in addition to the class variables elfCount and reindeerArray from my example, I now need to keep track of twice as many names. Also, admittedly, after the first 5 or 10 repetitions of the same assignment, I know what all the valid solutions look like, so obfuscating the code by putting little chunks of code in separate methods makes it harder to read (but that's a readability problem specific to marking assignments).


"a large number of names" is a metric that differs for different people. Some find it much easier to parse text when everything is named, some prefer to have things unnamed and just spelled out. Think of having a hash from name->function vs a hash essence->function. The first people can see a name, remember it, look it up and form a connection. The latter can see a logical block of code, parse it and then recognise it when they see it again.

Personally, I'm in the latter camp. There are people whose names I've been told more than 5 times, but still don't remember them. I remember the person, but not the name. In school, I almost never knew any teacher's name and it took me several years to learn the names of most of the other children.

This also means that you can't have a "One True Most Readable Style". You only have several local maximums for different classes of people.


I agree that unit tests for internal (non public) functionality are not particularly useful, and can hamper your willingness to change an internal implementation for the better.

I also agree it's painful to trace through dozens of classes/methods to determine what the actual implementation is. However, I don't agree the culprit is small methods. Small, interdependent methods in a single class (or package) are great and can be easy to follow if they are consistent. Usually the culprit is overuse of inheritance and design patterns.

However, one case where I find myself being less "fancy" is functions that run/generate SQL statements. Although there are many cases where you can re-use certain portions of SQL statements and build up a query through smaller functions, this ends up being a nightmare to debug. I find it is better to be as plain and explicit as possible.


Programming like a pirate? I was expecting this to be about R.


> Did you ever learn how to read code at university? Were you shown loads of different code, in different languages and different styles, and then asked to explain what the code did? I don’t think code comprehension is a normal part of the curriculum for computing science classes. We’re taught how to write, not how to read. Maybe that’s a mistake.

Dear lazyweb,

I want to bridge that gap in my education. Where do I go to read loads of code in different styles? With the emphasis on plurality of styles.


Reading code without a purpose can cause your brain to tune out. You think you're learning but you aren't. I would rather suggest you find an open-source project and fix a bug in it. You will have to read lots of code in order to do so, and, so long as there is a repeatable test-case, fixing the bug shouldn't be too hard.


I'll augment indiecore's answer. What kind of code are you interested in? Suppose you are, like me, interested in cheminformatics software. Find a few different packages (Open Babel, RDKit, Indigo, and others), download them, and compare how they implement a given task, like SMILES parsing or fingerprint generation.

If you're interested in text compression, then download a few compression toolkits and see how they work. Also read a few articles on the topic. Regular expression engines? Dozens are available. Interested in networking? Take a look at ZeroMQ and other networking packages.

Databases? That's a big topic, but you don't need to understand the entire package. Pick some aspect which interests you and see if you can figure out how SQLite, PostgreSQL, and MySQL handle it.

Here's an easy one to start off with - how do Lua, Python, and C++ implement hash table-like data structures?


I would read code on codereview.stackexchange.com. If you're not familiar, it's a site where you can get help making your code more readable and more idiomatic. I'd browse the questions for questions you're interested in and look at the suggestions for improvement.

Personally, I don't like traditional OOP, so Go and D are interesting to me (read the standard libraries if interested). Go tries to restrict you to their "idiomatic" style, and D tries to let you program in whatever style is appropriate for the problem.

My best advice is to solve a real problem in a new language. Try to make that solution as idiomatic as possible for that language. Trying a functional language like Haskell or Racket will also do you a lot of good.


Rosetta Code is a great resource (http://rosettacode.org/wiki/Rosetta_Code). Examples from many languages are provided for a given topic (e.g. http://rosettacode.org/wiki/Sieve_of_Eratosthenes).


Github?


I would recommend the above, along with chapters from http://www.aosabook.org/en/pypy.html as a guide


> Think about this, in your experience - why do you go have a look at the implementation of a method? Either you need to change it somehow, or it isn’t behaving as you thought it would.

When I have been left stranded without API documentation, I look at method implementations to figure out which method does the thing I want, if any. This is normally at least somewhat painful--but it's occasionally downright pleasant, when the author of the original code has written it in this "every method is written as if it is exactly as deep in the abstraction hierarchy as you'll need to go" style.

Which is to say, when you code like this, it's pleasant enough to read the source for that purpose that docs become somewhat redundant. I could imagine an editor designed to interact with code written like this, just displaying in a sidebar the implementation of any function you scroll past in autocomplete--because that's exactly as much, and as little, as you need to know to decide whether that function is the one you want.


Is Alt+Shift+M significant to the article? I don't normally code in IDEs.


Alt+Shift+M extracts text to a method in Eclipse.


The problem with

  if (page.hasAttribute("Test")) {…

is when you then decide to change boolean attribute Test to an enum attribute Environment with values Development, Test and Production.

Because now you have to hunt down all the page.hasAttribute("Test") in your code and replace them with something different. And if your code is creepy enough you will miss one or two and something will break.

So you have to look very carefully whether you repeat the same checks. If you repeat a check in different places and it has some meaning, you should abstract it away.
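
A minimal sketch of that point, written in C++ (the article's snippet looks like Java, and the Page class and Environment enum here are assumptions based only on the names in this comment). Once the check lives in one method, the switch from a boolean attribute to an enum touches a single line and the callers don't change.

```cpp
#include <cassert>

// Assumed shape of the new attribute: the three values named above.
enum class Environment { Development, Test, Production };

// Hypothetical Page class; the real one is not shown in the article snippet.
class Page {
public:
    explicit Page(Environment environment) : environment_(environment) {}

    // The one place that knows how "is this a test page?" is stored.
    // Before the change this might have been a hasAttribute("Test") lookup.
    bool isTestPage() const { return environment_ == Environment::Test; }

private:
    Environment environment_;
};

// Usage sketch: callers never mention the attribute or the enum comparison.
inline void exampleIsTestPage() {
    assert(Page(Environment::Test).isTestPage());
    assert(!Page(Environment::Production).isTestPage());
}
```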


The bigger problem I had with that snippet is the exposed string. I haven't seen the original video this is from, but I can't help wondering whether removing this hardcoded string was the bigger reason.

Hardcoded strings like this are a maintenance problem lying in wait.

How do you grep/search for it? These are all equivalent:

   if (page.hasAttribute("Test")) {…

   if (page.hasAttribute('Test')) {…

   string test = "Test";
   if (page.hasAttribute(test)) {…

   string cheese = "Test";
   if (page.hasAttribute(cheese)) {…
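
One way to make that searchable, sketched in C++ under assumptions: WikiPage is a made-up stand-in for the article's page class, and kTestAttribute is a name invented here. The literal then appears exactly once, and every use can be found by searching for the constant's name.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the article's page class.
class WikiPage {
public:
    void addAttribute(const std::string& name) { attributes_.insert(name); }

    bool hasAttribute(const std::string& name) const {
        return attributes_.count(name) > 0;
    }

private:
    std::set<std::string> attributes_;
};

// The only place the literal appears (the constant's name is an assumption).
const char* const kTestAttribute = "Test";

// Callers ask the question by name instead of repeating the string.
bool isTestPage(const WikiPage& page) {
    return page.hasAttribute(kTestAttribute);
}

// Usage sketch.
inline void exampleNamedAttribute() {
    WikiPage page;
    assert(!isTestPage(page));
    page.addAttribute(kTestAttribute);
    assert(isTestPage(page));
}
```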


This is exactly what I thought. The purpose of the "isTestPage" method is to hide the implementation. It's a basic OO principle, not some controversy stirred up by Robert C Martin.

I hated seeing somebody argue against it because we'd be better off as developers if more people understood this and made a habit of hiding implementation to this extent.


That's simple - change the method so all those invocations are invalid now.

If you're using a compiled language, the compiler will tell you all the lines that need changing.

If you're using a dynamic language, you should have test coverage for all those methods and your tests will blow up with "no such method".

Or maybe, you could special-case the 'hasAttribute(name)' method so it logs a debug warning and checks the enum attribute.

Really, there are enough ways to deal with that and you're far from the first person to have had that problem.


"you could special-case the 'hasAttribute(name)' method"

Please don't ever do this.


I believe "temporarily" was implied.


And when are you going to take that out? If you leave, who will take care of this hack?


This isn't my argument -- I've never taken this approach -- but if the need came up, it would never go into source control; it'd only be there long enough to run test suites locally and find the broken code that needed fixing.


The core problem is that it moves part of your OO interface into data.

With "isTestPage()" you have a method with well-specified outputs.

With "page.hasAttribute('Test')," a load of things become part of the permanent interface of the class: hasAttribute(), the datatype of 'Test,' the content of the field in the constructed argument etc...


This is part of the art of our work as software craftspeople; choosing when to refactor and when not to. There's no great joy in reading 400-line functions, and no great joy in hunting down 10 levels of the call stack to find out what is going on.


on CleanCoder: I know $1 is just $1... but putting a sample video that I can just click to see would help a lot



