In the paper No Silver Bullet [1], Brooks makes a distinction between accidental complexity and essential complexity. He claims that most complexity remaining in contemporary systems is essential.
Moseley and Marks disagree with his premise in the paper Out of the Tar Pit [2]. They propose an approach based on functional programming and the relational model to minimize accidental complexity.
Even if you don't agree with the conclusions or some of the premises in these papers, both provide an excellent read on the subject.
I got about halfway through Out of the Tar Pit. I found their claims implausible. The way to refute my reaction is to build a great system using their approach. Anyone done it?
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Rails itself is not that difficult to debug after it was refactored to use less magic in 3.x. What's difficult to debug is all the plugins written in a brittle-DSL-with-lots-of-magic style.
That's not necessarily true. I've found more edge cases and odd behaviour from Rails in a project that's always been 3.0+ and never used Rails 2-style conventions than I ever found in, say, the .NET libraries.
"Magic" is definitely a problem when it comes to debugging, but eliminating magic isn't a silver bullet for making code easier to reason about, particularly when there's errors. Rails still, almost without exception, throws obscure errors that give no indication to what actually may have gone wrong, and there's still quite a lot of magic left that can behave in odd ways in ActiveRecord.
This is probably true; Rails is more 'community-developed' than the aforementioned plugins. I'd say the average developer is more inclined to open and debug these plugins than Rails itself. Which is what the grandparent comment is referring to (plugins and apps used interchangeably).
The problem with these sentiments is that complexity is relative: what one developer finds simple another can quite reasonably find complex. So yes, excess complexity is bad, but what defines excess? Are two nested Python for-loops significantly "more complex" than, say, foldM? I think so, but others obviously disagree; van Rossum thinks even ordinary folds are more complex than loops!
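To make that concrete, here's one small computation written both ways in Python (an illustrative sketch, not anyone's canonical example); which version reads as "simpler" depends a lot on the reader:

from functools import reduce

nums = [[1, 2], [3, 4]]

# Imperative: nested for-loops summing the squares of the even numbers
total = 0
for row in nums:
    for x in row:
        if x % 2 == 0:
            total += x * x

# The same computation as a fold
total = reduce(lambda acc, x: acc + x * x if x % 2 == 0 else acc,
               (x for row in nums for x in row), 0)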
The really interesting question is not whether you should strive to make your code complex (clearly not) but rather what "less complex" actually entails. Does it mean fewer moving parts? More parts that are individually simpler? More verbose code that is more explicit? Less verbose code that is easier to read at a glance? More math? Less math? Code that looks uniform? Code specialized to the given domain (and therefore less uniform)?
I have strong opinions on these questions, but I have no answers. I've met others with strong opinions exactly the opposite of mine. They don't have the answers either. It is, ultimately, not an easy question.
An interesting assertion in Rich Hickey's Simple Made Easy talk is that simple and complex are almost objective qualities, at least when you accept the definitions he puts forth (simple: one fold/braid, with no interleaving). I think he's right, but I'm not sure I agree entirely, only because I haven't spent enough time thinking about it and trying to apply the ideas to real-world problems. I do think that most programmers can call out complexity without a lot of disagreement. What you describe regarding fold vs for-loops touches on his definition of easy, that is to say "close at hand". Fold is easy for a functional programmer, for-loops for a Python programmer. Their simplicity might be a different matter.
The answer actually IS objective. The limit of complexity is set by the fact that our short-term memory can simultaneously hold only about five entities, plus or minus two. So, if the code requires one to hold more than seven elements in mind simultaneously, no one will be able to comprehend it, not even the programmer who wrote it.
I once inherited a module written by a more junior developer. He indiscriminately used several dozen global variables that were read and modified in various parts of the program without much of a system. I spent about two months trying to understand how the module worked, without much success. Ultimately, I had to completely rewrite it (with great difficulty) to get rid of the global state and bring the number of entities affected by each function down to under five to seven.
This biological limitation shapes our development tools. That's why we have object-oriented programming: the goal is to keep the number of items we have to deal with at any moment under that magical limit of five to seven. Encapsulation is a key component of OOP used for this purpose. Without it, all would be lost.
Sometimes I think of an alien race capable of keeping track of 100 items simultaneously, and try to imagine what their programming languages look like.. :)
That limitation is not exactly biological. I'm pretty sure an alien mind (or an artificial one) would have a pretty similar limitation. It could be "10 entities" instead of "5 entities", but the number of simultaneous entities in the "processing model" would not be much higher.
The real reason for such a limitation is that the number of possible connections between entities grows very quickly.
For example, 2 entities are joined by 1 connection, 3 entities by up to 3, and 4 by up to 6; in general, n entities allow n(n-1)/2 pairwise connections, so the count grows quadratically.
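A quick sanity check of that growth, sketched in Python:

# Maximum pairwise connections among n entities: n * (n - 1) / 2
for n in (2, 3, 4, 5, 7, 10):
    print(n, "entities ->", n * (n - 1) // 2, "connections")
# 2 -> 1, 3 -> 3, 4 -> 6, 5 -> 10, 7 -> 21, 10 -> 45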
Good point. Connections make the situation even worse (for aliens). Interestingly, it's about the same for humans: 3-4 entities produce 3-6 connections. Both are in the same neighbourhood. The limitation is biological though (http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_...).
I think what Dennis means is that there is a philosophical reason why working memory is limited, and it's not just an idiosyncrasy of organic meat-brains.
Of course the specific number is highly affected by human biology.
The question that comes to mind for me: how do we define "entities" in the context of writing software?
Especially given gorelik's observation that complexity really derives from the combinatorial explosion of connections, are objects with internal state the best way to manage that complexity? If every call of a public method from another object is a connection in the graph, then our entities are (object, message) pairs, and a situation involving 3 objects with 3 public methods each already exceeds our limit.
Contrast this with a pure-functional approach. If you can express your algorithm with immutable data and operations that create new transformations of that data, you can (hypothetically) have fewer things in mind at any one time: the data being operated on, and the single operation being performed. It doesn't break down quite so cleanly in reality, but the use of pure functions where possible largely obviates the need for the state encapsulation provided by most modern OO languages.
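A minimal sketch of that style in Python (all the names are invented for the example): each operation returns new data instead of mutating the old, so at any moment you hold two things in mind, the data and the current operation:

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Order:
    subtotal: float
    tax: float = 0.0

def with_tax(order: Order, rate: float) -> Order:
    # Pure function: builds a new Order, never touches the old one.
    return replace(order, tax=order.subtotal * rate)

order = Order(subtotal=100.0)
taxed = with_tax(order, 0.08)  # `order` itself is unchanged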
I think we only need to count entities we need to keep in mind _simultaneously_. For example, a hypothetical File object can have dozens of methods, but to open it I only need one: File.open("..."). Internally, File may contain a dozen variables, all of which may be updated by this call, but I don't need to know about any of them, thanks to encapsulation. If I did, I'd be unable to write this code.
OOP (when done properly) enables us to focus at any time on a sub-part of the system containing five entities or less, so that we can understand that particular slice of the system. The slice can be very low-level (a private helper function of the smallest class), or very high-level (invoking a large subsystem through a facade interface, http://en.wikipedia.org/wiki/Facade_pattern). A typical REST API would be a good example of the latter.
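To illustrate (the subsystem and every name here are invented): a minimal facade means callers keep one entity in mind instead of the handful behind it:

class BillingFacade:
    # Single entry point hiding a multi-part subsystem.
    def __init__(self, gateway, ledger, mailer):
        self._gateway = gateway   # payment processing
        self._ledger = ledger     # bookkeeping
        self._mailer = mailer     # notifications

    def charge(self, customer, amount):
        # The caller sees one call; the facade coordinates the rest.
        txn = self._gateway.charge(customer.card, amount)
        self._ledger.record(customer.id, txn)
        self._mailer.send_receipt(customer.email, txn)
        return txn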
I like the benefits of functional programming too, but I think functional and OOP work together rather than being in contradiction to each other.
"OOP (when done properly) enables us to focus at any time on a sub-part of the system containing five entities or less"
...
"I like benefits of the functional programming too, but ..."
Your response makes it sound as though you believe OOP is the magic that allows encapsulation or modularization.
Functional programming also enables us to focus at any time on a sub-part of the system. You don't need OOP for that. In fact, you don't even need functional programming for that. For example, you could get by with nothing more than subroutines and functions (whether they be first class or not).
That's true, functions can work quite well for smaller apps, and/or apps specifically suited for that style.
But there are many apps that have complex internal states. Subroutines and functions don't say anything about how to deal with it. This state ends up being global variables, which is a big issue.
I once worked on a complex desktop app written with functions and global data. It was nearly impossible to modify it, or understand what it does. A click on a button could execute 1,000 lines of code modifying dozens of variables, resulting in effects throughout the application. Debugging a seemingly simple issue could take an inordinate amount of time. We had to re-factor the entire app into OOP and model-view design to stabilize it and move forward.
The app had about 30,000 lines of code. So, based on this experience, I'd say it's too large for the functional style. When I start a new project I often don't know how large it'll get, so the safest approach is to default to OOP. If the functional approach will prove more appropriate down the road, it should be relatively easy to integrate functional design in those parts of the app where it's required.
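A before/after sketch of the kind of refactoring described (simplified and hypothetical, not the actual app): the state that used to be global gets a single owner with a small public surface:

# Before: module-level globals, mutated from anywhere
cart_items = []

def add_item(item):
    cart_items.append(item)  # any code in the app can do this

# After: one owner for the state, a bounded set of entities per function
class Cart:
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def total(self):
        return sum(item.price for item in self._items)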
I think there really is no silver bullet here. In many cases, a particular problem has a simple solution with a lot of state. To fully specify a pure function on that much state, you may need a lot of working memory.
In certain cases, you may be able to reduce the size of working memory through recursion or related techniques: "The red-black tree T without node X is the subtree on the left without node X if ... possibly rotated based on the color of the top node in subtree T' ..."
But sometimes, it really is simpler to just do things imperatively with mutable state: "First recurse into subtrees until you find node X. Then, unlink it from its parent. ... Rotate if its color is red. Repeat with parent node ..."
There are many things that are strictly simpler with mutable state. Most prominently I think are hash tables, but there are others. Not all programs are best-described as a composition of maps, folds, and filters.
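A small Python illustration of that point, with a word count done both ways; the mutable dict version is arguably the simpler one:

from functools import reduce

words = ["to", "be", "or", "not", "to", "be"]

# Mutable: one dict, updated in place
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Pure fold building a fresh dict at every step: same result, more machinery
counts = reduce(lambda acc, w: {**acc, w: acc.get(w, 0) + 1}, words, {})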
There's a tricky problem that comes up in teams. A "bad" developer can hack and slash and appear to implement new features in a certain amount of time, and a "good" developer who takes the time to do everything right may actually produce less of obvious value in the short term. When approaching software restructuring, you have to make damn sure that your management chain understands the importance of the refactoring and can evaluate the time you spent for its actual worth. If your organization uses "metrics" or other weasel words to examine productivity, tread carefully.
It's also worth noting that even if you believe you know how to improve/simplify/whatever, it may be worse than you think and sometimes you don't figure out the extent of the complexity until days have gone by.
This, like many things, is a trade-off and not a set rule. In brand new code, by all means make it small and pretty and documented and whatever else would win it an award; but in established code bases be smart about what you "fix".
At my last company we had a co-founder like this. He would implement stuff extremely quickly, often at odd hours to fix a pressing bug. However, the code was just garbage from a maintenance standpoint.
It was hard to criticize, though, because he pretty much single-handedly got the company off the ground by quickly implementing features people were willing to pay for. This code was good enough to get the company up to $2 million annually.
The downside is we spent about a year re-writing the software to get it to a state where the company could scale to higher levels. During that year we had to stop adding new features.
I remember getting very frustrated with the code quality, but I often wondered whether that was just a price we had to pay to get cash flow positive.
Beauty is more important in computing than anywhere else in technology because software is so complicated. Beauty is the ultimate defence against complexity.
— David Gelernter
This resonates with me. The article makes a fine point about removing unnecessary complexity, but in the real world you can't be perfect in that regard - not for a large system anyways. So maybe we should allow a little more complex code if we can use the beauty of the product to hide it.
You can't be perfect but you can at least try to reduce complexity. Saying "just paint over it with pretty colors when you're done" invites unchecked complication. That attitude is what drove all the monkey/ninja patching that we saw with some early Ruby modules. They were beautiful, yes, but it made some code impossible to reason about.
What I take from this is we should strive for the most elegant solution to a particular problem. Be it algorithm or software architecture.
For example, if I somehow had the catalog of all possible solutions to problem X ready in my mind, I'd pick the most elegant one. Not necessarily the shortest, but the most elegant, the one where you can spot the _solution's outlines_ most easily.
I think seeing and appreciating these outlines would make you a connoisseur of fine software. You would strive for beauty and elegance, and distance yourself from unnecessary complexity.
quoth the article: "It’s only by diligently trying to avoid all complexity that one can in fact avoid unnecessary complexity."
Amusingly, you seem to support the author's argument that running around making changes when you don't really understand what's going on leads to lower quality code.
It's ONE of the nice principles to remember; however, it's not a golden rule. Often it's a matter of tradeoffs. Extra complexity up front can buy simplification later down the road. E.g. defining a DSL adds complexity up front but simplifies and streamlines work later. A good developer knows how to make the judgement call to balance that tradeoff.
That's not what I'd call a tradeoff. Problem complexity isn't the same thing as incidental complexity. If a DSL is useful for the problem domain, it's not "extra complexity", it's a simple (albeit hard to implement, simple != easy) and elegant solution.
A DSL is additional functionality, adding more components to the system and thus increasing its complexity. A DSL also incurs an extra learning cost: new developers coming onboard have to spend more effort learning it.
You can have a simple library or util module with a standard function API instead of building a DSL. They work just as well in many designs.
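As a sketch of that contrast (both the DSL line and the function are invented for the example): a plain function does the same job without a mini-language to learn:

# Hypothetical DSL every new hire has to learn:
#   rule "discount" when order.total > 100 then apply 0.1

# Plain-function equivalent: just the host language, discoverable with normal tools
def discounted_total(order, threshold=100, rate=0.1):
    # Apply a discount when the order total exceeds the threshold.
    if order.total > threshold:
        return order.total * (1 - rate)
    return order.total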
Python has a tendency to produce non-complex programs, in my experience. Today I rewrote a shell script in Python and now it's much, much more elegant and easy to understand.
I guess what I'm saying is that the programmer needs some help from the language to make the software non-complex. Even trivial things often become complex in low-level languages.
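For a flavor of that kind of rewrite (a made-up example, not the poster's script), here's a grep/cut/sort/uniq-style pipeline done explicitly in Python:

# Roughly: grep ERROR app.log | cut -d' ' -f2 | sort | uniq -c | sort -rn
from collections import Counter

with open("app.log") as f:
    # Assumes each matching line has at least two whitespace-separated fields.
    fields = (line.split()[1] for line in f if "ERROR" in line)
    for value, count in Counter(fields).most_common():
        print(count, value)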
Agreed, till you hit a problem that requires 10,000+ lines of code. I love Python to bits, but for this one large website I am working on I wish it was written in C#, Java, or some other statically typed language.
On a large project I will always take compile-time checking, which tends to resolve a lot of bugs, over unit tests that verify the same things. Especially when it comes to large refactoring efforts.
One of the most important things programmers can learn is that you don't have to choose one language per project. Binding, say, Python and C++ in the same program can be very valuable (though it is not trivial, even with SWIG). Some code benefits from the simplicity, clarity, testability and huge standard library of the scripting language, and the remaining code gains performance and expressiveness to handle its greater complexity.
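The lightest-weight version of this doesn't even need SWIG. As a rough sketch, Python's standard-library ctypes can call straight into a compiled C library (C++ code would need an extern "C" wrapper, and library names vary by platform):

import ctypes
import ctypes.util

# Locate and load the C math library; find_library can return None on some systems.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0 -- keep the hot loops in C/C++, the orchestration in Python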
I've also met bright people, mostly coders, who have told me they adore complexity. Even one young man who advocated hyper compact self-modifying code.
Don't we all do that to some degree? There isn't a fixed drop off point for "too much complexity" that's fit for everyone. Consider programming languages, are APL or Prolog "wrong"? Or, compared to something like Oberon, are Ruby, Python and Java?
You're implying that APL is complicated. It isn't. It was designed for regularity and simplicity. Iverson and his group worked on that for 7 or 8 years (can you imagine!) before they began implementing it, and what they came up with remains one of the great high water marks of software design.
This brings out a problem with the OP and the discussion. We readily dismiss things as complicated that are merely unfamiliar. The converse is also true: we don't see complexity once we've become habituated to it. And we become habituated surprisingly quickly.
So how do we distinguish between complexity and unfamiliarity? Without some reliable distinction, this is all in the eye of the beholder.
I think you're throwing out the baby with the bath water. The fact that unfamiliarity can be mistaken for complexity does not mean that complexity is impossible to recognize or avoid.
There are some cases where you can, for example, radically simplify a complex algorithm by using an unfamiliar toolkit (Feynman diagrams, Lisp, etc.) but there are also cases where a solution is obviously more complex than it needs to be even within its existing framework.
Oh, I don't claim that complexity is impossible to recognize. It's really important to do so. But how?
Most judgments of complexity we currently make have a lot to do with familiarity. (Is the 40-line spreadsheet program at http://www.nsl.com/k/s/s.k complicated or simple?) You assume something similar when you say "obviously more complex than it needs to be". When is complexity obvious? When it exceeds what we habitually consider ok.
What we really need is an objective measurement. Do we have one? We sort of do, insofar as the research on this seems to point in one direction: shorter programs are simpler. That doesn't answer every question - you still need a way of measuring program length and defining what a program is - but it's the beginning of an objective answer.
For evaluating a refactoring or using a library/framework, you can compare the number of statements that would be used to add a given feature before vs. after the refactoring.
What's harder is to evaluate the effect that adopting a certain pattern or framework will have on the cost of future refactorings. I don't think people do even a back-of-the-envelope estimate of the first number. The number of statements put in can probably be used to estimate the cost in man-hours.
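Even the crudest version of that first number is easy to automate. A sketch using Python's ast module (the file names are hypothetical):

import ast

def count_statements(source: str) -> int:
    # Rough size metric: number of statement nodes in the parsed module.
    return sum(isinstance(node, ast.stmt)
               for node in ast.walk(ast.parse(source)))

before = open("feature_before.py").read()
after = open("feature_after.py").read()
print(count_statements(before), "->", count_statements(after))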
My preference would be to put it in when it pays for itself (i.e. makes the code smaller) and not before. Otherwise it's too easy to fall victim to one's own spin.
Good point. There are multiple axes of difficulty: amount of things to remember, interconnectedness of said things, potential drawbacks, unfamiliarity, idioms, etc.
Lots of musicians go through a stage where they are trying to see what they are capable of and make music as involved/complicated/ornate as possible. Many of them mature into musicians who try to find a few elements that work together.
And then there are others that try to come up with a highly complex Magnum Opus. Punk is simpler than prog rock, does that make King Crimson listeners bad people?
There (probably) are areas at the extreme ends of the spectrum where things are too complex/simple, but there's a big range in between that might do it for certain people. I don't believe that just liking some form of complexity is objectively bad. It's not all just complexity for complexity's sake.
>> And then there are others that try to come up with a highly complex Magnum Opus.
In terms of artistic works that people still actively experience, the tendency is for the classics to be simpler works. Complex works tend to be classics in the Mark Twain sense: something everybody wants to have read and nobody wants to read.
>> ...does that make King Crimson listeners bad people?...I don't believe that just liking some form of complexity is objectively bad. It's not all just complexity for complexity's sake.
I don't believe that just liking some form of complexity is objectively bad either, in general. In the case of coding projects, though, I have to note that the scarcest resource is often team-lead bandwidth, and complexity eats into it. Exactly what this means is highly contextual. The tricky part here is in the "but no simpler" phrase of the Einstein quote.
You might say that I like minimalism because I'm lazy. Minimalist cuisine means fewer ingredients to prep, less cleaning later. Minimalist syntax means fewer things to keep track of and lower barriers to tool-smithing. I don't think that makes me a bad person either. The question to ask is, what is the cost/benefit now, and what's the cost of changing your mind? Sometimes the more complicated system reduces the latter cost. YMMV
Sure, teams complicate the complexity issue. Which is why things like Java are so popular: the moving parts of the language itself are few enough.
It's jam sessions all over again. Much easier if the basics of the musical form aren't too complicated (e.g. 12 bar blues) or everybody knows the same songs already (e.g. Irish folk). Getting a random bunch of people together for some Free Jazz is a bit harder.
Bach had some complicated, almost math-like themes, but is it really more complex than Wagnerian leitmotifs and huge orchestral settings or Schoenbergian modern music where the traditional score notation can't keep up anymore?
But yeah, the analogy is easy to abuse, due to the multitude of artistic styles, and active rebellion against the "mainstream". For every Piet Mondrian out there, you can find a Chuck Close.
Let me guess... they were addressing hypothetical scaling issues for a site with < 100 users? Instead of optimizing the conversion rate on their homepage and mailers?
- you might not have seen enough cases to be able to characterise the true problem, e.g. in some situation a constraint exists, but you've only seen one instance of that situation
- reusing standardized components can enable you to solve the problem more quickly (even though one-size-fits-all doesn't really fit)
- Kolmogorov complexity doesn't account for efficiency in either space or time, only that the eventual answer is correct. Efficiency often requires ugly hacks
There is a resource bounded version of Kolmogorov complexity. It might even be a more useful measure for certain things. For example, a simple enumeration can prove any arithmetical statement that can be proved, but it can take time exponential in the size of the solution. With some resource weightings you might get a better idea of how hard the problem is in actual practice.
I think that a good developer is one who can strike a balance between correctness (solving the problem at hand), elegance (the system is no more or less complex than it must be) and consistency (the "culture" of the program is the same in all of its components).
I would put an aversion to complexity in the same box as an attraction to it. I.e. atheists are fundamentalist in the same way that Christians are. Agnostics know what's up.
Another book on refactoring that I'd highly recommend is "Working Effectively with Legacy Code" by Michael Feathers. This book gives you lots of practical hints about how to start refactoring a large, messy code-base. While the task may seem overwhelming at first, you can start by getting small chunks of the code refactored and testable.
Wow, you got me there. "Working Effectively with Legacy Code" would be even more directly in-line with what I'm looking for. Way too much Java EE stuff in the code base I'm working with right now.
That book changed the way I look at programming. For me, it was the first book I found that actually answered questions that I didn't have the vocabulary to ask yet, rather than just talking around them.
Disclaimer: I'm the author of the post, so that's not actually a different recommendation, just a more elaborate one. :)
It's one of the classics of our trade; even if you know everything there is to know about refactoring you should really read it, just to have read it. So consider that my recommendation :)
Something I have started doing: whenever I create anything, whether it is a program, website copy, an email or whatever, I take a step back and ask "How can I make there be less of this?" rather than "How can I make this better?" or, worse, "How can I make this cleverer?"
I think education in general rewards complexity: getting bonuses in English class for using long passages of flowery words with many synonyms, or extra credit in CS for adding 200 lines but reducing O(n^2) to O(n log n).
>> getting bonuses in English class for using long passage of flowery words
That depends on who your teacher is. Mine taught that good English should be understood by as many people as possible while retaining all of its meaning. The goal of English is to communicate.
He was a good teacher because the top 65% of our class got into the top 35% of the state for English, while our school was below average in just about every other subject.
Refactoring. You have to move this logic from A to B, thus adding some complexity to B but reducing the overall complexity, because this logic really belongs in B.
Note that during refactoring you will have twice the logic, which is the worst state, so you can't stop in the middle and say you'll finish later.
I'll bet some developers are literally "embarrassed" to write and share "simple" code.
Maybe "trivial" is good.
If you can "write something in a weekend", it's easy to maintain. If you lose the code, you can rewrite it. And maybe it will be better the second time.
When a project becomes so complex you cannot easily rewrite it, is that a good thing?
From the post: """It’s always bothered me that you can simplify “Simplify, simplify, simplify”"""
No, you can't really. You can just say "simplify" but it does not convey the same message.
That is, it conveys a suggestion to simplify (the first part), but it doesn't convey that simplification is of such extreme importance that it's worth repeating the message (the second part).
If you have to be a pedant and tell people to use a multiplication symbol, then use a multiplication symbol. The symbol is ×, not the letter x.
If you want to be a better pedant, you better start paying attention, like for example to the fact that I alluded to x not being the multiplication symbol by a) putting it in quotes and b) writing ~ "x", where ~ is the well known "approximately equal to" math symbol.
Regular expressions are just like everything in programming. Used properly they are a huge boon. Used improperly, they're a disaster.
Here's a practical example. I spent half a day coding this algorithm (ASCII string in STEP format to an array of Unicode code points) in 70 lines of C++ without regular expressions. It took me 15 minutes to recode it in Perl 6 using regular expressions:
while $step.chars > 0 {
    given $step {
        # \S\c : character in the upper half of the current code page (add 0x80)
        when /^ '\\S\\' (.)/ {
            @unicode.push(ISO8859toUnicode($page, 128 +| $0.ord));
        }
        # \Pc\ : switch the current ISO 8859 code page (A = part 1, B = part 2, ...)
        when /^ '\\P' (.) '\\'/ {
            $page = $0.ord - 'A'.ord + 1;
        }
        # \X2\...\X0\ or \X4\...\X0\ : hex-encoded Unicode code point
        when /^ '\\X' [ '2' | '4' ] '\\' (<xdigit>+) '\\X0\\'/ {
            @unicode.push(:16(~$0));
        }
        # anything else: a plain ASCII character
        when /^ (.)/ {
            @unicode.push($0.ord);
        }
    }
    # consume whatever the successful match ate
    $step = $/.postmatch;
}
This (admittedly completely untested) version is drastically shorter and much less brittle.
I use them constantly but almost always BRE. And I'm still discovering new tricks using only BRE. I guess I'm too dumb to use anything more clever. I'll never be a Larry Wall. I still feel I have more to learn just on the level of BRE. I still think there are more hidden gems.
A lot of developers seem to have a serious disdain for anything that might be construed as the "lowest common denominator". I don't get it.
Somehow I find a tremendous versatility in the LCD. It works in so many environments. I can rely on it to work, no matter what the size of the input.
But then, if you go buy a book on REs, what do you get? An explanation of the most ridiculously complex and difficult-to-maintain REs, as if the author is just showing off.
Mastery of the basics is too boring I guess. Too simple.
The worst feeling is working in someone else's overly complicated, messy code. It's like having to work covered in shit. The ick feeling is constant and overpowering.
[1] http://www.cs.nott.ac.uk/~cah/G51ISS/Documents/NoSilverBulle...
[2] http://web.mac.com/ben_moseley/frp/paper-v1_01.pdf