100x better approach to software? (johndcook.com)
99 points by ColinWright on April 23, 2012 | 76 comments



In my relatively short time (~20 years) in the software industry (in the "enterprise software" corporate development world on the MS platform), my personal perception is that it now takes nearly 10 times the LOC and/or classes and/or complexity in general to implement the same functionality, using modern "improved" software architecture and design patterns, as it did writing code the "old" way we did things when I started out.

I'm deliberately very conscious of keeping an open mind and trying to discover just what it is that I don't "get" about the way things are done these days. I look at some of the incredibly complex software implementations of really very simple functionality these days, and for the life of me I just can't understand why people write things to be so complex.

Now and then I just can't restrain myself and I'll ask "why didn't you just do it this way" (often with a proof of concept implementation) and the normal response is blank stares, and a changing of the subject shortly thereafter. This is not just with individuals, this happens in groups, and no one (so it seems) feels anything is out of order. Not surprisingly, implementing new features takes far longer than things used to, at least according to my perception.

So, unless I am out of my mind, a 10x improvement could easily be realized (at least in the world of "enterprise" software), just by unadopting some of the practices adopted in the last decade or so.


I've experienced this myself in the "religion" of OOD (in which I admit I used to be a fervent believer).

I once showed my brother a Ruby script I'd written with lots of clean encapsulation and a pile of reusable code -- and then he showed me how he would have hacked it together in 1/10 the code by NOT trying to design it to be "extensible."

There's obviously a balance that you can strive for -- sometimes it IS best to design code to be reused. But in this case, unless my code were reused something like 20 times, I would have ended up with NET fewer lines of code by simply taking the NON OOD approach. And fewer lines of code means fewer opportunities for bugs.
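To make the flavour of that difference concrete, here is a minimal, invented Python sketch (a toy, not the Ruby script in question): an "extensible" formatter class next to the direct version that does the same job in two lines.

    # Toy example only: an OO formatter designed for reuse...
    class ReportFormatter:
        def __init__(self, records):
            self.records = records

        def format_record(self, record):
            return f"{record['name']}: {record['total']}"

        def render(self):
            return "\n".join(self.format_record(r) for r in self.records)

    records = [{"name": "Acme", "total": 100}, {"name": "Initech", "total": 250}]
    print(ReportFormatter(records).render())

    # ...versus the "hacked together" version, which does the same thing directly.
    print("\n".join(f"{r['name']}: {r['total']}" for r in records))

Until the formatter is actually reused, the class buys nothing but extra lines.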

At this point I try to mix OOD with other paradigms, taking advantage of the strengths of each. Sometimes OOD IS the right answer still, but honestly most of the time it isn't. And OOD is verbose compared to other paradigms, especially when you're using a paradigm-agnostic language. (I suspect if you tried to code in Java this way it would fight you at every step, for instance.)


>>and then he showed me how he would have hacked it together in 1/10 the code by NOT trying to design it to be "extensible."

Hacks like this do come back and bite you in the rear big time later on, when you want to extend the code. Unless you are planning to throw it away or never modify it, it is a really bad idea.


Programs have inertia roughly proportional to their size.

Extending a small program is almost always easier than extending a large one -- except for the special case where the large program was designed with the specific extension you're making in mind.


You sound like a person who's never had to debug a hairball of a Perl program that looks more like line noise than source code.

You can do just as much damage by being too concise as being too verbose.


> You can do just as much damage by being too concise as being too verbose.

Very true in my experience. Why do something in 3 lines of easy to understand code whose intention is obvious, when you can do the same thing with a clever and obscure one-line hack that will impress your colleagues?
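A tiny, invented Python illustration of that trade-off (an aside, separate from the Ruby comparison upthread):

    from functools import reduce

    numbers = [3, 4, 7, 10, 15]

    # A few lines whose intention is obvious:
    evens = []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)

    # The same thing as a "clever" one-liner that takes a minute to decode:
    evens = reduce(lambda acc, n: acc + [n] if n % 2 == 0 else acc, numbers, [])

Both produce [4, 10]; only one of them can be read at a glance.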


FYI, in the comparison above, there were no clever one-line hacks that were completely unreadable. OO code just takes more lines.


Can you give an example of how a lack of consideration of extensibility might come back to bite one?

And please, not a contrived straw-man example that is obvious to everyone reading, but something that one might legitimately do in the real world for reasons of simplicity and expediency, and that could turn out to be genuinely painful when it becomes necessary to extend. (Obviously I don't expect an essay, but I'd appreciate it if you could try to give us an idea in a reasonable amount of text.)


The point is that there's a time and place for OOD, but small scripts that may or may not even be reusable isn't often one of them.

But you can write as much code as you feel comfortable with, I won't stop you. I'm instead using the insights I learned above to write less code.

And sometimes awkward things result, and I refactor. But a lot of the time the "awkward thing" is trying to get OO code to conform to a problem that doesn't benefit from it.


I'd echo that view, also with twenty years of (Unix) programming behind me. The start of the slippery slope seemed to be object orientation; yes, it can be ideal in some areas and helps re-use if done well, but since the majority of source is of average quality or worse, it often seems to just add obfuscation, layers to wade through, contortions to fit the solution to the OO model the language provides, and poor names, because so many need to be created that naming becomes tedious. Contrast with Go's interfaces.

Design patterns have also had an overall negative effect IMHO. Useful as a lingua franca for theoretical discussion, they've again been over-adopted, and the vocabulary has become so large, and sometimes so imprecise, that it lessens their usefulness; it sometimes seems like a geek version of the management-speak for which we criticise the pointy-haired.

"Controlling complexity is the essence of computer programming" — Brian Kernighan. I often find the above add more complexity than they're worth.


Do the added LOC/complexity/classes make it easier to change the software later? As I understand it, all of those things aren't there to make it easier to start; they're there to make software easier to maintain.


I'm pretty certain that more code is more difficult to maintain, period.

Before you modify anything, you have to find the relevant piece of code, understand it, and understand all the dependencies' interfaces (or even part of their implementation, if the abstractions leak). I can't recall a specific experimental result, but it seems, well, obvious that all of the above will be more difficult if there is more code to examine.

Now you could argue that you can shrink the amount of code one has to examine before making a modification by increasing total code size. I don't believe it. Increased modularity means more code re-use in the first place, and therefore a smaller program.


Agreed. I imagine most unnecessary code is justified by saying that it will make the code easier to maintain, even though it usually has the opposite effect. It pays to be very skeptical of your ability to predict the future.


That is typically how the design is "sold", but in practice I've found that either the anticipated changes are never required, or are such that the "flexible" design doesn't work for the changes that are eventually needed.

Meanwhile, you have to live with this same complexity everywhere in your system: wrapping your head around the simplest implementations is very time-consuming, onboarding new developers is unnecessarily slow, and making minor tweaks to production apps no one has touched for 2 years is slow and dangerous.


Can you provide a concrete example?


Ok, here's one extremely simple example....

Say you have a screen where you have to show sales for an individual customer.

One way of doing this is, within the screen that requires this data, querying sales by that customer using the dynamic sql feature provided by the ORM you are already using. One line of code, and it exists in the source code exactly where it is consumed at runtime.
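For concreteness, here's a rough, self-contained sketch of that first approach, using SQLAlchemy 2.x and an in-memory SQLite database purely as stand-ins for whatever ORM and database are already in use; the table and column names are invented:

    from sqlalchemy import create_engine, select
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

    class Base(DeclarativeBase):
        pass

    class Sale(Base):
        __tablename__ = "sales"
        id: Mapped[int] = mapped_column(primary_key=True)
        customer: Mapped[str]
        amount: Mapped[float]

    engine = create_engine("sqlite://")   # in-memory database for the demo
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        session.add_all([Sale(customer="Acme", amount=100.0),
                         Sale(customer="Initech", amount=250.0)])
        session.commit()

        # The screen's data need, expressed where it is consumed: one query.
        sales = session.scalars(select(Sale).where(Sale.customer == "Acme")).all()
        print([(s.customer, s.amount) for s in sales])

Everything above the query is one-time setup; the point is that the screen's own requirement is a single line.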

Another way of doing this is writing a stored procedure that returns the same data and accepts the customer name as an argument. This SP then has to be exposed through <n> application layers before the UI can ultimately call it, so you end up with a very specific (and obscure) stored procedure surfaced in every one of those application tiers.

Now imagine you need sales for an individual customer, but now only for products from a particular manufacturer, and imagine the two ways that could be implemented. Rinse and repeat for every other distinct data retrieval use case you have.

This is just one type of implementation style that can be made excessively complex. Now throw some new design patterns that you just read about into the mix: on top of your <n> app tiers, you get several additional wrapper classes, so a simple select from your database ends up with a call stack 12 levels deep and 6 levels of indirection.

And if you suggest perhaps there's an alternate way, you're an outcast. At the very least, one would hope you could have an open-minded discussion about the relative merits of the two approaches. But this is just my experience, YMMV.


That's a good example and it does suck. Sounds like enterprisey java-land that we HNers love to hate on so much.


I know exactly what you mean. The only time I ever got an answer, I was briefly working at a large bank and it was something along the lines of reducing 'touch-points'. The theory being that the more you have to alter the code later, the more likely you are to introduce new bugs.

I think it's kind of bullshitty to write 1000 lines of code with the expectation that they'll be better written now than later, when you'd have a clearer view of the requirements, but there it is.

The whole mess with design patterns and their indirection machine gun reminds me of this image: http://1.bp.blogspot.com/_zzQgqacqhjI/Sf-oiHCviUI/AAAAAAAAAT...

Look, only one touchpoint! Much more reliable now.


See Donald Knuth's thoughts on the "menace" of reusable code: http://www.informit.com/articles/article.aspx?p=1193856.


The difference is in creep of functionality and expectations. So much stuff that wasn't even on the table 10 or 20 years ago is now expected so inherently that it's not even set out in the requirements. All those extra classes, complexity, and layers of abstraction do have purposes. We can continue talking about the Microsoft stack; my experience is there too. Yes, in older days you could knock together a report with ten lines of database code and a form in Visual Basic or even Access.

But expectations change. Your fat client VB Win32 app isn't good enough in a world where everyone expects to have every piece of functionality accessible with no client installation, using a web browser. Now you're rewriting it in ASP. Now you have to deal with sessions (which VB gave you for free just by virtue of the fact that the program ran continuously in memory), concurrency, logging, IIS hosting, and loads of other complexity. The functionality gains are huge over your VB program: easy access for multiple distributed users. It's just not so visible since it's just an expectation of the platform.

Five years later, now your standalone ASP pages are losing out to slicker competitors. Now every good site has on every page a banner and breadcrumb trail and sidebar menu and search box. That's not something that ASP does, unless you fake it with piles of server-side includes inserted into every page. So you move to ASP.NET with its concepts of master pages and HTTP pipeline modules and hierarchical controls. These give loads of functionality over and above older ways of doing things, even if you use only a small part of it. Multipliers of effort now exist in places and degrees that weren't previously possible.

Now your business wants federated single-sign-on with a partner web site and company. This wasn't even conceivable for a desktop Win32 app twenty years ago. But now you can implement it by invoking a few WIF library calls from the correct layers of abstraction. Sure, it'll take longer and more complexity than your single login box in your VB desktop application. But it's flexible and scalable in ways previously unimagined.

Or think about the back end. Twenty years ago, that was an Access or Sybase database where 5000 rows was a lot. Now modern database platforms scale past five billion rows, on any number of servers across a farm, with all sorts of automated procedures for live backups and encryption and failover and sharding and whatever else.

Maybe you're thinking of SQL Server Reporting Services? Yeah, it seems like it takes a long time to implement that compared to a one-off report with some charting control. But you get whopping loads of functionality along for the ride: pagination, filtering, printer formatting, linkage to CRUD actions, ability to clone the report for editing. You don't think about the functionality because it just comes as expected with the platform. And yes, some of the complexity will leak out into your application code ("no, don't page-break between the transactions for stock sale and cash credit.") But paying the price on an apparently slower or overengineered implementation really can give you highly leveraged multipliers of effort.

Recalibrate your notion of "same functionality" and you'll see why the software industry has gone where it has.


I don't really disagree with anything you've said here at all...yes, the software as a whole that we're deploying today is different and more complex in many ways...or, perhaps a more correct statement is its ecosystem is more complex?

So yes, there is a lot more overhead now on top of strict functionality, but this isn't what I was referring to - I'm talking more about once you've taken the hit of all these now-necessary environment/ecosystem issues and you need to implement a very specific piece of functionality. There are an infinite number of ways to implement something, and I find it to be increasingly common that the most complicated implementations seem to be preferred. It seems once someone learns some design patterns, from that point forward they are determined to use them.

Rare is the programmer in the "enterprise" world who will acknowledge the fact that there are multiple ways, of varying complexity, to implement something.

A good example, in my opinion, of the two sides of this argument, both far more qualified than me, was when Robert Martin (Uncle Bob) was on the StackOverflow podcast to discuss some of the criticisms (attacks) Joel had made of Bob's advocacy of SOLID principles, test coverage, etc. My take on the discussion was that Joel was genuinely trying to understand why Bob considered the complexity of his advocated designs absolutely critical to developing software, such that without it you will fail (or should be fired). However, Bob's argument was unconvincing to me. If these principles are so important, then why can't you make a compelling argument for them that experienced practitioners can understand? And again, if they are so important that you ignore them at your own peril, how is so much software written successfully while ignoring them?

I think that side of the camp would do itself a big favor by becoming a little humbler and more intellectually honest.


How much have those 'most complicated' options created the ability to meet such a vast increase in expectations with relatively small amounts of work? Without a few several-thousand-LOC gems, my Rails code would take longer and be more error-prone, and wouldn't play well with gems A through Z (excepting maybe Q). The fact that I can grab 5 random interesting gems, try them out, and then discard them with a very small amount of work on my part is phenomenally valuable, and wouldn't be possible if they weren't as well encapsulated as they often are.


This would be an example of where putting significant thought into design and extensibility would be well worthwhile, because the target audience is so massive.

The typical enterprise code will never be reused, it will never be "extended", but there's a decent chance it will have to undergo slight modifications.

Most enterprise programmers, perhaps contrary to their image of themselves and their projects, are not working on the next ROR, are not sending a rocket to the moon, are not building an app that requires google-class scalability.


Many enterprise programmers have to deal with lots of other programmers, who they may not talk to, who are of vastly varying skills. They might literally need to protect their code from someone else's misuse, so it can be fixed later if needed without far-reaching complications.

Which sounds a lot like an open source project, except that those involved may not be interested in what they're doing. I'd argue that makes them more dangerous, so a greater level of encapsulation may be warranted.


Haven't watched Kay's lecture yet, but I'll mention that he and some colleagues are about five years into a project to write an entire computer system, including OS, GUI, networking, and programming tools, in 20K lines of code, using some very innovative techniques. Compare that to the millions of LOC in other systems and it's easily a two-orders-of-magnitude improvement.

One example: a TCP stack is about 20K LOC in C. In their system it's 160 LOC: http://www.itsoftwarecommunity.com/author.asp?section_id=162...

Their language allows easy syntax modification within scopes, so they defined a syntax that matches the IETF ascii diagrams, and simply pasted in the diagrams from the standards.
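This isn't VPRI's OMeta code, but a rough Python sketch of the underlying idea: write the TCP header down as a declarative field list (essentially the RFC 793 diagram) and derive the parser from that description, rather than hand-coding the extraction of each field.

    # (name, width in bits), in wire order, per RFC 793
    TCP_HEADER_FIELDS = [
        ("src_port", 16), ("dst_port", 16),
        ("seq", 32),
        ("ack", 32),
        ("data_offset", 4), ("reserved", 6), ("flags", 6), ("window", 16),
        ("checksum", 16), ("urgent_ptr", 16),
    ]

    def parse_bits(data, fields):
        """Extract fixed-width big-endian bit fields from the start of data."""
        value = int.from_bytes(data, "big")
        total_bits = len(data) * 8
        out, pos = {}, 0
        for name, width in fields:
            shift = total_bits - pos - width
            out[name] = (value >> shift) & ((1 << width) - 1)
            pos += width
        return out

    # 20-byte header: ports 80 -> 443, seq 1, ack 2, SYN flag set
    sample = bytes.fromhex("005001bb00000001000000025002721091a50000")
    print(parse_bits(sample, TCP_HEADER_FIELDS))

Their version goes much further than parsing, but the shape of the win is the same: the specification largely is the code.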

You can read their papers here: http://www.vpri.org/html/writings.php


Meanwhile, in five years, Linus had Linux 2.0

Obviously the goals are a bit different, but if this group really has a 100x better software development process, why aren't we using it right now?

My own theory is that the distance from "this pile of prototype code I just wrote looks promising!" and "here's a robust, maintainable system" is where the 100x is hiding.


It's not like they started five years ago with that 100x process in place. They've spent most of their time inventing it.

Keep in mind, too, that Linus built a kernel, but was able to use a lot of already-written gnu software for the rest.


If I recall correctly, the Linux kernel is about one million lines of code, while KDE + LibreOffice + Firefox is more like 200 million. Meaning, we could expect the kernel to be about 200 times cheaper than the rest of the software stack.

Even if they do use their miraculous 100× process (they partly do), building the whole stack would still be a bit slower than building a kernel in C.

My guess is, in about a year or two, we will see languages and programs based on their stack, and if it delivers, I think there's a good chance of total disruption before 2022, provided we push for it.


But how many lines of code does it take to implement the language that allows them to implement TCP in 160 LOC?


That language is OMeta. There are several implementations; the latest JavaScript compiler is just a few hundred lines. http://www.tinlizzie.org/ometa/

The compiler is self-hosting; much of that JS code is actually compiled from OMeta source. For another example, the Common Lisp version includes fewer than 200 lines of OMeta source and fewer than 300 lines of Lisp source.

OMeta was a very eye-opening language for me because of the great expressiveness it gets from its simple but novel approach. (I've written code in Haskell, Prolog, Scheme, OCaml, and plenty of other languages, but OMeta is not like any of them.) If you are interested in programming languages and/or parsing, Warth's thesis is a great read.


Thanks for this! I'm still reading the dissertation and haven't gotten to the end yet, but so far it seems to me like what it would be like to "program in ANTLR", so to speak; except that OMeta looks easier to understand and use.


The compilers, down to machine code, are included in the 20K budget. The special language -> intermediate form tool is only a few hundred lines. There are a couple of intermediate -> machine code tools, also a few hundred lines. And of course, together they are self-hosting.

Check out the top paper in that list http://www.vpri.org/pdf/tr2011004_steps11.pdf The pdf was produced in their publishing software, running on their OS, compiled with their compilers. And, they are still under budget.


100 lines:

"Ometa can be made in itself and optimized in about 100 lines of code" -- http://www.vpri.org/pdf/tr2010004_steps10.pdf

And then about 100-200 lines to compile the intermediate language to machine code according to this paper: http://www.vpri.org/pdf/tr2010003_PEG.pdf


They tend to use layers of DSLs (sometimes four or five deep), but even when you add up the code for all the layers it's smaller than C.


Much smaller (at most 2000 lines of compiling tools). And of course, we don't count the number of lines in GCC for the C program.


Is any of their source published? I'm interested to see this.


Links to sourcecode here: http://www.vpri.org/vp_wiki/index.php/Main_Page

OMeta is the configurable syntax project, and there's a lot of other stuff.


It seems to me that a huge amount of developer effort tends to be expended in mapping data between different data structures as required by different algorithms or APIs.

If a way could be found to automatically discover a set of feasible and approximately optimal data transforms to meet the requirements of the operations that need to be performed, it seems like very large productivity gains might be obtained.
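A contrived Python sketch of the kind of mapping work in question, with all record shapes and field names invented: the same customer record reshaped by hand for each API it passes through.

    crm_record = {"Id": 42, "Name": "Ada Lovelace", "EMail": "ada@example.com"}

    def to_billing_format(rec):
        """The (hypothetical) billing API wants snake_case keys and a split name."""
        first, last = rec["Name"].split(" ", 1)
        return {"customer_id": rec["Id"], "first_name": first,
                "last_name": last, "email": rec["EMail"]}

    def to_analytics_row(rec):
        """The (hypothetical) analytics pipeline wants a flat (id, name, email) tuple."""
        return (rec["Id"], rec["Name"], rec["EMail"])

    print(to_billing_format(crm_record))
    print(to_analytics_row(crm_record))

None of it is interesting code, but it is exactly the effort described above.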


> a huge amount of developer effort tends to be expended in mapping data between different data structures as required by different algorithms or APIs

This is indeed one of the biggest sources of complexity. A programmer I used to work with called it "meat grinding".


I call that "plumbing", since it's basically just getting different sizes and shapes of pipes to connect. But "meat grinding" is good too.


When I hear people say "plumbing" it seems to mean not data conversion but the internal workings of a system, especially anything infrastructurey. So "meat grinding" seems more specific. But I like it because it's appropriately disdainful. Who wants to grind meat all day?

Dostoevsky said that if you want to drive a man insane all you need to do is sentence him to hard labor moving a pile of dirt from one side of a prison yard to another – and then back again.


Fitting data in whatever form it has taken at point X in a program into the form that API Y expects is often more like meat ungrinding.


Kay's assertion seems incredible -- he really has to prove it.

It does not square with at least some research and what Brooks most perspicaciously noted in 'No Silver Bullet' -- that there is essential complexity in software, and it is the dominant factor. That really seems very opposed to anything anywhere near 100-fold general improvements.

Looking at the VPRI Steps report of 2011-04, prima facie, overall, they seem to be producing a 'second-draft' of current practice, not a revolutionary general technique.

People regularly say they translated their system from language A to language B and got 10-100 times code-size reductions. This seems essentially what STEPS is doing: when you look at something again, you can see how to improve it and do a much better job. But the improvement comes not from the general aspect -- the language B -- but from understanding better the particular project because you already had a first-draft.

This can still be valuable work though, since it is about general things we all use. Second-drafts are valuable, and ones of important things even more so.

But we should probably not infer that they are coming up with a revolutionary general way to write new software.

Substantially because large scale software cannot be designed: it must be evolved. We only know what to focus a second-draft around because we muddled together the first draft and it evolved and survived. You cannot write a second-draft first.

Having said that, if they find particular new abstractions, they could be very valuable. If, for example, VPRI came up with a neat way to express implicit concurrency, maybe, that might possibly be comparable to inventing/developing garbage collection in common programming languages -- certainly useful and valuable.


> Kay's assertion seems incredible -- he really has to prove it.

What assertion, exactly? That the personal computing stack can be implemented in 20K LOC? They've already achieved the bulk of it, so that claim no longer qualifies as incredible.

Pooh-poohing this staggering achievement because they didn't reinvent personal computing while they were at it really takes the cake for moving the bar. If someone had gotten all the world's recorded music on a thumb drive in 1970, would you have complained that they didn't write the music themselves?


> What assertion, exactly? That the personal computing stack can be implemented in 20K LOC?

No, the assertion from the article that "99% or even 99.9% of the effort that goes into creating a large software system is not productive."

While it's very interesting and frankly astonishing that they've been able to reimplement things that are fairly well understood in 100x less code, this is not at all the same as proving the assertion that these things could have been originally done in 100x less code. I hope that it's true that they could have been -- it's exciting to think that it might be true -- but it doesn't seem to be a claim that STEPS proves, or tries to, from the explicit goal of reusing concepts and designs from other OSes.


First, Kay didn't write the article.

Second, sorry, but I'm not buying this response at all. These guys have done something that all conventional thinking about software complexity would say was impossible. They've refuted something fundamental about universal practice. Seems to me the only open-minded response is to be flabbergasted and ask what else we need to question.

To say (I'm exaggerating how it sounds) "Oh, well, sure, they did that impossible, but this other impossible over here, that's the one that really matters" just seems like rationalization to me.

Oh and also, it's at least three orders of magnitude, not two.


The supposition was attributed to Kay:

"Alan Kay speculates in this talk that 99% or even 99.9% of the effort that goes into creating a large software system is not productive."

This is the claim.

Without in any way diminishing the astonishing accomplishments of the STEPS team, let's remember that they've spent (are spending) five years to write 20K lines of code. This doesn't sound like the level of effort is 1000 (or 100, or possibly even 10) times less for what they've accomplished than it would have been had they implemented exactly the same functionality in, say, Java. Would the Java code have been 1000 times bigger? It seems quite likely that it would have! But the bulkiness of the end product's source code doesn't determine the amount of effort that went into building it.

Could it be that they spent four and a half years figuring this new system out, and a few weeks writing the 20K lines of source? It could; I haven't read most of their papers. If that was the case, then that's what needs to be pointed out to refute the assumption that they're spending as much or almost as much effort on the end product (but with astonishingly little source code required to specify it).


You do have a good point that one should consider not just the code size but the effort to get there (though let's not forget the effort to maintain and extend it over time). But are you sure you're framing it correctly? It seems unfair to compare the effort it takes to pioneer a fundamentally new approach to the effort it takes to implement a common one. That's like comparing the guys who have to cut a tunnel through mountain rock to the ones who get to drive through it later. We have no idea how easy it might get to build new systems their way once the relevant techniques were absorbed by industry. The founding project is no basis for predicting that, other than that historically these things get far easier as the consequences are worked out over time.

What we do know from general research (at least, I've read this many times - anyone know the source?) is that programmer effort tends to be about the same per line of code. That's one reason everyone always cites for higher-level languages being better. Maybe that result doesn't hold up here, but it seems like a reasonable baseline.


To build that personal computing environment, they had to solve various problems, including compilation, drawings, text editing, and networking. To me, the set is sufficiently diverse to make full generalization probable. It may take a genius to apply this to other domains, but I'm confident it can be done. Once it is, your average programmer can begin to be 10 times more productive, or more.


Yes, it is incredible. (Edit: ah okay, not the assertion I thought of) And they mostly did it. Conclusion: there is a silver bullet.

If you can do a second draft with 10 times less code, then at least 90% of the first draft's complexity was accidental. By definition, so to speak. The question of avoiding that accidental complexity at the first try is a separate question.

The VPRI did find particular abstractions, embodied in their various Domain Specific Languages. Implicit parallelism (not concurrency) was tested in one of their languages (Nile), for 2.5D drawing. The problem was embarrassingly parallel to begin with, but they only had to modify the runtime to parallelize it (30 times faster with 40 cores). I also believe they have some form of implicit concurrency.

Also, it doesn't matter if they didn't invent some new ground-breaking abstraction, because they just showed that applying the right old ideas in a systematic way can have a tremendous effect. Now the industry just has to wake up.


> The question of avoiding that accidental complexity at the first try is a separate question.

But is that not the more important question -- how much effort it takes to design software (which surely is the gist of Brooks' observation)?

If the final code is much smaller, yet it took as long to design/write, what has been gained? Something, certainly (easier to manipulate, reuse), and furthermore there must be a fairly good relation between size and effort, yet it is the effort to create that is the key matter here.


Is the effort to create really the key matter? Typically most of the cost is incurred in other activities, such as maintenance of the software after the initial release.


Not always, but there is software that you write once, and then forget. For some shops (proprietary and custom software, mostly), the cost of the first try is what counts most. Nevertheless, I addressed that effort to create: geniuses write languages, and the rest of us use them. And with sufficiently good compilation languages, I bet we wouldn't need a genius to implement a worthwhile language.


But if, given some set of tasks and a program A which performs them, one can show that there exists a program B which performs the same tasks and is 100x smaller than program A (taking into account language/framework implementations), doesn't that show that at least 99% of the complexity of program A was not essential?


That's incoherent.

If you accept the claims of

> People regularly say they translated their system from language A to language B and got 10-100 times code-size reductions.

then it's clearly false at least for first systems that

> there is essential complexity in software, and it is the dominant factor.


Well, it should really say something more like 'essential complexity in software development'. The size of the software does not simply, directly represent the effort -- making more compact code might take more effort, and it is only because of the extra work of design iteration that improvements were possible.

But there are various things involved so I am certainly not sure.


> If the same team of developers could rewrite the system from scratch using what they learned from the original system?

We're doing this right now at work - completely rewriting one of our apps from the ground up. It's been an awesome experience to look at what we did wrong in the old code base and discuss how to fix that with the new one.

I definitely don't advocate doing this on every project, but unless there's been due diligence (in refactoring, consolidating duplicated code, etc) I think oftentimes we reach a point where this becomes somewhat inevitable for progress to be made.

A forest fire now and again is a good thing to clear underbrush and replenish nutrients and whatnot.


The issue, as always, is that you trade your "verbose" (by comparison) imperative programming language for something that lets you specify or model the behaviour of a system in very few lines of code. Impressive, indeed, until you realise that knowing how to specify what it is you want is as much a black art as it is a science. Tools like that are hard to make understandable enough to a large enough subset of developers that the investment in learning them fully will pay off.

Back in university we had to specify a minicomputer (PDP-11) in exactly this way, in a metaprogramming language called "Maude", using its complex, Prolog-like rewriting functionality. Needless to say, the entire computer (CPU, RAM bank, store) only took up maybe a hundred or so lines, but boy oh boy did it take me ages of fiddling to get it just right.

Languages like that are just too difficult to work with for a lot of things, and that's setting aside the inductive and provable nature of something as simple as building a minicomputer out of logic.


Most coding tends to add a significant amount of accidental complexity. Take that into account across every layer of development from the requirements and design phase onward and you get an exponential explosion of complexity, and thus an explosion of code size, of bugs, etc. More so when you consider that the typical method of dealing with poor or leaky abstractions is to add another layer of abstraction.


From Alan Kay's interview:

The problem with the Cs, as you probably know if you’ve fooled around in detail with them, is that they’re not quite kosher as far as their arithmetic is concerned. They are supposed to be, but they’re not quite up to the IEEE standards.

Does anyone know what he's referring to?


I wouldn't be surprised if he's talking about integer overflow, which has bitten people in surprising ways for decades.


Adding more programmers doesn't make it faster to develop things. It just means more time in meetings.


Some software systems are too large for one person, however talented, to write. There's a reason not every line in the Linux kernel is written by Mr Torvalds.


But that's maybe not the best example; wasn't it originally just him? I think GNU+Linux might better demonstrate what you're trying to say.


It was originally, but I don't think it's controversial to say it wouldn't be in its current state as a 1-man project.


I believe Linus has said that lately, he doesn't write code so much as review and discuss patches.


Philosophical question: Is time in meetings wasted?


Short answer: no. Long answer: it depends.

Some meetings are a complete waste of time, but they're one way of bringing people's thinking closer together.

Meetings probably don't get everyone on the same page, but if they can get people reading from the same book, they're probably worthwhile.

As an example, in our organisation, we use meetings as a way of getting information from development to our sales/support teams and vice-versa (what features are selling, where are the pain-points).

Whoever is chairing the meeting will make efforts to excuse people who don't need to be there, e.g. if a meeting moves into technical discussion, we may ask the sales representatives if they want to leave to avoid being bored.


Actually there used to be a 100x better approach to software than this blog post. Yes, that meant 10000x better software.

All you needed to do was break down what you were doing into steps, and pipe one step to the next with this character: |

Unfortunately, one thing led to another and, well, we are 10,000 times slower now than then.

whats your email address | confirm your email | get a link | pay with my payment processor | get my service

Hahaha, setting this up today takes 8 hours. Approximately 10,000 times slower than it should be. Actually, someone who can set up a complete billing solution in a day is considered a hero.

Amazon is the only company that even comes close to doing for the web what we had twenty years ago for local sysadmin tasks. Only, one lets you manage loads of complicated files (kind of cool) and the other lets you provide a service to hundreds or thousands of people (kind of awesome).

Where did we go wrong?


I wish someone would reply instead of just downvoting, so I can at least see if I've made myself clear.

It used to be that for most tasks a superuser, a real ace of an admin, would NOT write new programs: instead they would stitch together old ones.

I wish I could find it (I'm having trouble just now), but there was an experiment done among various programming groups completing the same task. You had people doing it in various scripting languages, C++, Java, whatever.

The way I recall it (again, having trouble finding this), the team or person/approach that won handily was the one using Unix from the standard command-line interface (e.g. bash) - no scripting or programming at all! Instead of writing a program to do it all, the person or group simply used standard Unix tools, piping them together etc., until the problem was solved. This approach was by far the fastest.

I'm saying, these days on the web we don't really have the same thing when you develop a new application. We don't have a "Unix of the web" - though, again, AWS and Amazon's ambitions on e.g. payment, database, etc, seem to be vaguely in that direction - which is far more productive than writing a piece of software.

No matter how productive -- HOW PRODUCTIVE! - you are at writing a script to bill a user ten dollars, you can never -- NEVER! -- be as fast as typing "| bill 10.00" where "bill" takes an email address on standard input. That Unix program does not exist.

The way the web is developing, it does not look like it will exist. This was my point. I guarantee you that typing "| bill 10.00" is nearly ten thousand times as fast as writing any program in any programming language that does that.

Unix works because someone took the time to write programs that can be stitched together at the command prompt (or from a script). The Internet just doesn't work that way.

The blog post I'm referring to ends with: "Is that just the way things must be, that geniuses are in short supply and ordinary intelligence doesn’t scale efficiently? How much better could we do with available talent if we took a better approach to developing software?"

I say, the problem is that the geniuses are no longer creating the "Unix programs" of the web. They are writing software, i.e. lines of code, they are not writing web "utilities".

If all the geniuses got together, took the top thousand things you need an API for, and worked on making them a set of small, modular Unix commands instead, then I would be literally ten thousand times more productive than I am now.

I could literally do in 20 seconds of typing what I can do in two days.


You know, I like grep and sed and all that too, but what the hell are you talking about here?

The bill command just takes an email on stdin? How's it know which account that email belongs to? From a database? With what credentials? Does it bill to paypal or visa? To which merchant account number?

The thing with "unix style programming" fetishism is that, yeah, pipes are great, but now you're writing incredibly complicated options parsers to configure all your little standalone programs. Isn't there a point at which a simple method call is easier? We've had method calls for a long time.

The reason you were downvoted (not by me) is probably that people thought this was obvious and you were being obtuse and ideological.


Instead of doing "| bill 10.00", I'd do "from payment import bill; bill(10.00)".

Now the issue is implementing the payment library, just like you'd have to implement the 'bill' program.

Your argument is for composability, and while UNIX utilities are certainly a great example of that (which I use every day!), they're hardly the only. Perl, Python, Ruby, etc all have plenty of small libraries to solve specific problems.
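For instance, a stubbed-out sketch of the hypothetical `bill` from the comment above, as a small Python function rather than a Unix filter (the function and its behaviour are invented, not a real payment API):

    def bill(amount, email):
        """Pretend to charge `amount` to the account registered under `email`."""
        print(f"charging ${amount:.2f} to {email}")

    # The shell's `cat emails | bill 10.00` becomes an ordinary loop:
    for email in ["ada@example.com", "grace@example.com"]:
        bill(10.00, email)

The composability is the same; what Unix adds is a universal calling convention (text streams), which is both its charm and, as the sibling comments note, its limitation.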


The problem is this begins to break down when you want to do more complicated things. A small gripe that still illustrates the point:

> be as fast as typing "| bill 10.00" where "bill" takes an email address on standard input.

What happens, then, when "bill" takes an email address on stdin but I need it to take an amount on stdin and an email address on the command line? This is the sort of thing that Unix starts to break down on: you spend far too much time trying to munge the data into the delicate format that the various programs expect.

Secondly you say:

> No matter how productive -- HOW PRODUCTIVE! - you are at writing a script to bill a user ten dollars, you can never -- NEVER! -- be as fast as typing "| bill 10.00" where "bill" takes an email address on standard input. That Unix program does not exist.

But you have little evidence to back that up, and with a little inspection it seems that it holds very little ground against something such as "bill(10.00, email)", as one would do in a modern environment.

If you want to know why you are being downvoted, it's because you have remarkable claims (10000x improvement is very remarkable) with little to zero evidence to back those claims up.


Look at the video he linked, here: http://tele-task.de/archive/video/flash/14029/ (tl;dr: 20,000 lines seems to suffice to make a running OS with "personal computing" apps.)

I think you will mostly agree with the video, but if you don't know of the Viewpoints Research Institute, it will be worth your while.


100x better is a bit much. 10x better, maybe.


In the large projects that I have seen, when I think about how much of the work being done is the core product vs frameworky stuff and/or integration, I think 100-1000x is more accurate.

The pure business logic constitutes a very small part in comparison, but the type of environment/company also matters. Startups tend to have less fluff than large enterprises.




