Software Development Has Diseconomies of Scale (allankelly.blogspot.com)
138 points by gatsby on Dec 15, 2015 | 87 comments



Normally EVERYTHING has both economies and diseconomies of scale.

You model the price per unit as the sum of different curves.

Complexity increases not only in software, but also when you design a thermal engine, or a plane, or a car.

When I worked making something as simple as fiberglass, we had something like 100 components, such as surfactants (tensioactives). For most of them we had no idea what they were for, as they had been added decades ago by someone who knew.

Nobody wanted to remove a given component and be responsible for the fiber breaking and stopping the line, incurring tens of thousands of dollars in penalties, so new complexity was added but never removed.

In my experience, software is the thing in which YOU CAN get the greatest economies of scale, because you do not depend on the physics of the world. But you need to control complexity as you develop.

In the real world, you create a box because it is the only way of doing something, and the box automatically encloses everything that is inside. You can't see inside, nor do you want to. It is a black box that abstracts your problems away.

In software you have to create the boxes yourself. Most people don't do it, with dire consequences.


Your experience with fiberglass reminds me of http://c2.com/cgi/wiki?OnionInTheVarnish


"But you need to control complexity as you develop."

One way to control the complexity is by consistently developing automated regression tests which cover >90% of functionality. Then you are much less afraid to remove something or make a big change in production-grade software used by many customers.


This is one of those seductive statements that is true, but it's also such incredibly bad advice that just this one piece of theory has the potential to completely destroy your company if you follow it.

But it's seductive for a reason: it's true. Doing regression tests on this scale will protect you from quite a sizable class of bugs. BUT there is no way to write lots of tests without locking down the implementation of the program. If you limit yourself to black-box testing (meaning either no unit tests, or breaking and replacing unit tests is encouraged) you can retain some flexibility for a while. You want to be in a situation where, if I go into an application and replace one algorithm with an equivalent one at any point in the app, none of the tests fail. I have never once seen any test suite do that correctly. In most cases tests complicate an already complex codebase.
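
To make that "equivalent algorithm swap" criterion concrete, here is a minimal sketch of a black-box test in that spirit; the `myapp.sort_records` module and API are hypothetical, purely for illustration. It pins down only observable inputs and outputs, so replacing the implementation underneath with an equivalent algorithm should leave it green:

  # Minimal black-box test sketch; the myapp.sort_records API is hypothetical.
  # Only input/output behaviour is asserted, so swapping the implementation
  # for an equivalent algorithm should not break the test.
  import unittest
  from myapp import sort_records  # hypothetical module under test

  class SortRecordsBlackBoxTest(unittest.TestCase):
      def test_sorts_by_key_and_keeps_all_records(self):
          records = [{"id": 3}, {"id": 1}, {"id": 2}]
          result = sort_records(records, key="id")
          self.assertEqual([r["id"] for r in result], [1, 2, 3])
          # No assertions about which algorithm ran or how it is implemented.

  if __name__ == "__main__":
      unittest.main()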

This leads to having to choose between two approaches: either you ignore test results (which of course does result in bugs), or you always fix tests (which of course leads to the tests becoming tautological over time as different people "fix" them, and results in both more effort AND bugs).

I don't disagree that for some software locking the functionality down is appropriate. But not for line-of-business software, apps with large UIs, essentially anything consumer-facing ... It is only appropriate for standardized infrastructure software, things that manage accounts, process payments, regulate a power plant, avionics, ...


I've been thinking about this a lot, and I think a way out might be to drop the phrase 'black box' in your comment. I've been exploring white-box testing by making assertions on the log emitted by a program. Not only does it allow greater flexibility in large-scale reorganizations without needing to mess with tests, it also allows us to write tests for things that we don't currently write tests for, like performance (make sure the number of swaps in this sort function doesn't exceed this threshold), fault tolerance, race conditions, and so on.

More details: http://akkartik.name/about. An early write-up on white-box testing: http://akkartik.name/post/tracing-tests
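
As a rough illustration of the idea (a toy sketch, not the framework from the links above): the code under test emits trace events, and the test asserts a property of the emitted trace, such as the swap count of a sort, rather than poking at internals:

  # Toy sketch of white-box/trace-based testing; not the framework from the
  # linked posts. The sort appends events to a trace, and the test asserts on
  # the emitted trace rather than on implementation details.
  trace = []

  def bubble_sort(xs):
      xs = list(xs)
      for i in range(len(xs)):
          for j in range(len(xs) - 1 - i):
              if xs[j] > xs[j + 1]:
                  xs[j], xs[j + 1] = xs[j + 1], xs[j]
                  trace.append(("swap", j))
      return xs

  def test_swap_count_stays_within_threshold():
      trace.clear()
      assert bubble_sort([3, 2, 1]) == [1, 2, 3]
      swaps = [e for e in trace if e[0] == "swap"]
      assert len(swaps) <= 3  # a performance-style assertion on the trace

The point of the pattern is that the trace is part of the program's observable behaviour, so tests about performance or ordering can survive a reorganization as long as the trace contract holds.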


Following your fiberglass manufacturing example: it sounds like you've identified technical debt as it applies to physical engineering.

In all likelihood, somebody on the 'business end' made some very bad decisions that were penny-wise/pound-foolish.

Long-term sustainability for mechanical systems requires paying extra to maintain the design.

A lot of business decisions involving engineering are completely reactionary. Changes are usually required when some unforeseen event happens (e.g. something breaks, the market changes, etc.). Poor budgeting will lack a contingency plan to cover these costs.

Even with a full-time staff of engineers, their role is usually limited to daily operations. Future updates usually require contracting outside support, at a premium cost.

When it comes time to do the work, it's in the interest of the business decision makers to reduce the impact on the 'bottom line' as much as physically possible.

The work will need to be done on a short schedule to reduce the impact on production. As soon as the changes are complete, it's in the business' best interest to send the outside support home as soon as possible. As a result, design/documentation updates are left undone.

Technical Debt:

Over time, physical changes accumulate whereas the design stays the same. When the design/documentation is unreliable, the cost of future changes grows dramatically with each additional update.

Technical Bankruptcy:

At some point it gets so bad that the design/documentation isn't worth the paper it's printed on, nobody knows how the system works, the implementation becomes fragile. Nobody is willing to accept the risk/cost of a significant engineering redesign. The system runs until it fails and the business writes it off as a loss.

There are some parallels that can be drawn. Complexity is inherently difficult to maintain over the long term.

You could choose to favor:

- Traditional Design:

Monolithic architecture with deep coupling in software vs. monolithic engineering designed for production, not maintainability.

Everything is delivered as one big ball of mud. Even with good test coverage, the sheer complexity makes it difficult to make changes.

- Black Boxes:

External libraries in software vs COTS components in hardware.

The internals are opaque; if it breaks, the only option is to replace it. If it becomes obsolete and/or long-term support is cut, good luck.

- White Boxes:

Internal well-tested libraries/modules in software vs modular/pluggable well-tested designs in hardware.

White boxes solve the long-term support issue but require significant resources to maintain/develop.

- Gray Boxes:

Open source libraries/modules in software; open hardware designs.

May not be feature-complete or include sufficient test coverage. The upfront cost is dramatically reduced but there may be license/communication barriers that prevent changes from being made.

There are trade-offs and risks that come with each approach. Monolithic designs require better language-level architecture and internal separation of concerns. Modular designs require better system-level architecture and external separation of concerns. The only constant is: the greater the complexity, the greater the cost to maintain.


Bootloaders are small, but very important software.

k/q is small but a very useful interpreter.

There are so many examples, but it appears that to "the market" the most valued software development is large scale.

The sentiment is create and contribute to large projects or go home. Stupid, but true.

"Do one thing well" is more than just a UNIX philosophy. It is an essential truth. Most programs are lucky if they can do one thing "well". How many so-called "engineers" are afraid to write small, trivial programs lest they be laughed at?

Large programs often become liabilities. Can we say the same for small programs? If it happens, write a new one.

Maybe a user with an unmet need would rather have a program that does the one thing they want as opposed to one program that can allegedly do everything... whereby they are granted their wish through the addition of "features". More internal complexity. And the majority of users only use a fraction of the program's feature set. Waste.


> "Do one thing well" is more than just a UNIX philosophy. It is an essential truth.

Yes, thanks. My eye detects light and passes the signal on to my brain. It does it quite well. My liver removes toxins from my blood. It does it quite well. The big ball o' mud made by BigCo is a hammer that reads the ambient temperature, weighs my nails, tells me the time, and predicts the outcome of futures markets. Is it any wonder it doesn't work worth a damn?


Oh please. Your eyes are a system rife with bugs. You have a blind spot that exists for absolutely no good reason - the bundle of nerves could just as well connect to the back of the retina rather than the front. About 50% of people have a defective lens, which can only be corrected by putting yet another lens in front of it (in frames, directly on the eyeball, or by ablating the cornea in order to turn it into a corrective lens). You can be near-sighted AND far-sighted at the same time. Your eyeballs contain imperfections called floaters that your body has no way of removing, so your brain learns to filter them out. You constantly see your own nose. Some people's eyes are misaligned and they have no depth perception. The colour violet doesn't exist; your red cones are actually activating in response to your blue cones detecting high-frequency blue light. And I don't mean they activate in response to the light; physically the red cones don't detect any of the wavelengths of violet light - they're just wired up in such a way that a low activation of the blue cones triggers activation of the red cones. Some people have a defect that makes their red cones activate in response to wavelengths too close to green light, so they can't tell the two apart. Some are just completely colour-blind. Oh, and the image you get in your eyes is rotated 180 degrees.

Your eyes most definitely don't do just one thing and they most certainly don't do it all that well. But they do an important job reasonably well, which is what we strive for in any large software system.


You're right. It's only one of the most complex and intricate pieces of machinery in the universe, enabling billions of organisms around the planet to build a realistic, 3-dimensional picture of the world around them by picking up hundreds of millions of bits of information from the fastest moving particles known to man.

It's about on par with Microsoft Word.


It's the intercommunication with the web that becomes so painful. Users have no way to connect a stream of web applications, and the web disincentivized solving the corporate distribution problem.


How so? The web is already standardized on REST as the 'universal' API.

Publicly accessible microservices are becoming more and more common over time and some of them are being used to create integrations.

Have you heard of Zapier? https://zapier.com/how-it-works/

Slack bots also interact with other services: https://slack.com/apps/category/At0EFT67V3-new-noteworthy

There's also the Huginn project: https://github.com/cantino/huginn


I am thinking more at the internal-software level.

Consider an enterprise that has a simple case management app -- essentially, a help desk app for one portion of the business.

There is another app within the system that was developed 5 years before to help choose when to start a case. It's maintained by a different team, so there's no natural coordination.

If these were two Unix tools built the way programmers would build them, someone could connect the output of one through a filter in the middle to the input of the next. However, in the real organization, there is an individual who spends 20 hours a month manually copying the information from app 1 to app 2, because the work to create a proper hypermedia interface has never been done.
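
To illustrate what that glue could look like, here's a hypothetical sketch; the export format, the filter condition, and the app 2 endpoint are all made up:

  # Hypothetical glue between the two apps: read cases exported by app 1 on
  # stdin (one JSON object per line), apply a filter, and forward matches to
  # an imaginary import endpoint on app 2. All names here are invented.
  import json
  import sys
  import urllib.request

  for line in sys.stdin:                         # e.g. piped from app 1's export
      case = json.loads(line)
      if case.get("status") == "ready":          # the "filter in the middle"
          req = urllib.request.Request(
              "http://app2.internal/api/cases",  # imaginary app 2 endpoint
              data=json.dumps(case).encode("utf-8"),
              headers={"Content-Type": "application/json"},
          )
          urllib.request.urlopen(req)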

The combination of a corporate-aware Huginn (single sign-on, etc.) and a revolution in corporate apps that treated hypermedia as a first-class citizen could make it happen. It's not happening now, however.


The problem with 'enterprisey' corporate applications has more to do with the culture of the developers who build those applications.

Instead of breaking applications apart into distinct services, the devs of corporate apps prefer to use OOP to create complex, interdependent monolithic architectures. APIs for such apps are usually exposed by using a library, so a lot of emphasis is placed on access privileges at the language level (i.e. private/internal/public) instead of at the interface level (i.e. pipes/REST).

The problem isn't that the capability doesn't exist, it's that corporations are either unaware or unwilling to experiment with alternative approaches.

For example, one employee could spend half their time collecting data in Excel files from multiple members of a team, organizing it in a presentation-ready format, and emailing it up the chain.

Alternatively, the team members could each have a Google Spreadsheet where they input the data and a timed trigger fires off a bot that processes the data, saves it to a new file, and emails the link up the chain.

Both accomplish the same result. The second is a lot less prone to error/bias, and is also a lot more cost effective. But, to get an organization to adopt the latter approach requires convincing the organization to give up Excel as their canonical tool for organizing data.

I have fought that battle and it's a lost cause. Even when adequate proof of cost/time savings is presented in a business-friendly format, inertia trumps reason.

A company could theoretically hire a developer to deploy a digital assistant to streamline/automate the mundane and repetitive business processes. Except devs are expensive, in short supply, and it's a really hard sell to get business leaders to adopt processes/tools that they don't fully understand.


> How many so-called "engineers" are afraid to write small, trivial programs lest they be laughed at?

Very few. Engineers love simple and reliable stuff. You might be thinking of CS graduates.


Indeed. I think the problem is that "engineer" has come to have two meanings: "engineer" and "computer programmer."

The defining trait of an engineer is that he or she can't make innocent mistakes. If engineers sign off on a structure and it collapses, it's their fault; anyone who does that is civilly liable (i.e., able to be sued for damages), and, if memory serves, criminally liable as well if the mistake was negligent enough.

That standard doesn't apply to computer programmers; programmers who want to be called engineers should try thinking about what their lives would be like if they, too, could be sued or imprisoned if they messed up.

(They should also remember Paul Graham's point, at http://paulgraham.com/love.html , that prestige is the after-effect of doing something well. Done well, anything -- even jazz or novel-writing -- can become prestigious. Done badly, anything can become disgraceful, and then it starts changing its name in order to pretend to be something less dishonorable. If you don't want to be called a programmer, it's because programmers are incompetent and personally disagreeable; do what you can to change that.)


KISS has been a cherished principle for most of the past century and for good reason.


Indeed, the history of the development of very large software systems has been littered with disasters. Don't people read The Mythical Man-Month any more?

http://www.amazon.com/Mythical-Man-Month-Software-Engineerin...


Apparently not. Communication between team members has a huge cost factor ... not only in terms of dollars/time but also in terms of the quality of the final product.


Maybe within a small bubble of people who understand. I thought the idea was self-evident, but after working with people whose instinctive response to every problem is to throw code at it, I can assure you it is not.


Not all engineers. Doesn't emacs represent a different philosophy than the unix philosophy? I'd find the reference, but I have a pile of software to deal with--what a predicament.

I suspect that some engineers even make mistakes so they can heroically correct them later. I might be that engineer subconsciously.


While some small, simple programs are great, the big economic contribution is mainly in large software: manufacturing control, ERP, air-traffic control, power-plant control, etc. Small software is great, but the big economic impact belongs to big software.


There is no reason why 'big software' cannot be made up of small pieces of software interoperating. In fact those are the strongest and most maintainable systems.


Big software that's made up of small pieces is still big software. In fact, most big software systems in the last twenty years or so are built that way, but that doesn't eliminate the complexity. It makes it manageable -- not cheap.


Big software is made up of inter-operating small pieces of software: subroutines and objects ;)

Breaking something into multiple processes doesn't magically reduce complexity. You can have the same architecture within one process or spread out among 50, and in the latter case you've actually added the complexity of some IPC layer and lifetime management for all of the different processes.

That's obviously not what you meant to say. You meant: break things down into the right abstract components such as to decrease coupling and increase cohesion. You meant: do it so well that Component A can be maintained by a team that doesn't even know about the existence of Component B. And that routine changes in the requirements for one don't require changes in the other. That's really hard, especially if your codebase is a coupled mess because you wanted to SHIP IT fast. It's very hard to make a business case for big refactoring ("So we're going to spend all of this money and at the end it's going to work the same way it does now, but maybe a bit better? Can't we just spend that money fixing bugs instead?"). Also, strong boundaries between components can ironically make it harder to refactor, when it turns out that a different breakdown makes more sense. And God-forbid that other projects have taken dependencies on the components, further tying your hands.
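
A toy sketch of the kind of boundary meant here (all names invented): Component A depends only on a narrow interface, so the team behind the concrete implementation can change it freely as long as the interface holds.

  # Toy illustration of low coupling across a component boundary; all names
  # are invented. OrderService ("Component A") depends only on the interface,
  # never on the concrete gateway ("Component B").
  from typing import Protocol

  class PaymentGateway(Protocol):          # the agreed boundary
      def charge(self, account_id: str, cents: int) -> bool: ...

  class OrderService:                      # Component A
      def __init__(self, gateway: PaymentGateway) -> None:
          self.gateway = gateway

      def place_order(self, account_id: str, total_cents: int) -> bool:
          return self.gateway.charge(account_id, total_cents)

  class LoggingGateway:                    # Component B (any implementation works)
      def charge(self, account_id: str, cents: int) -> bool:
          print(f"charging {cents} cents to {account_id}")
          return True

  if __name__ == "__main__":
      print(OrderService(LoggingGateway()).place_order("acct-42", 1999))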

If this was like traditional engineering, you'd have some massive upfront planning effort, dozens of committees, reviews by regulators, massive fault tolerance, etc. You'd then have a (hopefully) mini version of this same process every time requirements change or technical limitations get in the way of your original plan. Five years later you release your shiny piece of modular enterprise software. But your competitor shipped their big monolithic alternative that GetsTheJobDone in a quarter of the time and price. A good example of this is that all of the successful operating systems have monolithic kernels, even though academics and researchers favor microkernels.

So you're right, there's no reason why big software can't be made up of a bunch of small software, but there are a lot of reasons why it sometimes isn't.


> Breaking something into multiple processes doesn't magically reduce complexity.

Actually, it does.

If you break up a complex system into communicating processes then for each and every process you have a clearly defined set of inputs and outputs, memory protection, the ability to run multiple instances seamlessly without mucking around with threads and a very much reduced scope.

Decomposition into multiple communicating processes is an extremely powerful tool to reduce complexity.
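
A minimal sketch of that decomposition, using only the standard library: each worker process sees nothing but a queue of inputs and a queue of outputs, and you can run as many instances as you like.

  # Minimal sketch of decomposition into communicating processes. Each worker
  # has a clearly defined input (jobs queue) and output (results queue), its
  # own memory space, and can be run in as many instances as needed.
  from multiprocessing import Process, Queue

  def worker(jobs: Queue, results: Queue) -> None:
      while True:
          item = jobs.get()
          if item is None:              # sentinel: no more work
              break
          results.put(item * item)      # the process's single, well-defined job

  if __name__ == "__main__":
      jobs, results = Queue(), Queue()
      procs = [Process(target=worker, args=(jobs, results)) for _ in range(4)]
      for p in procs:
          p.start()
      for n in range(10):
          jobs.put(n)
      for _ in procs:
          jobs.put(None)                # one sentinel per worker
      squares = sorted(results.get() for _ in range(10))
      for p in procs:
          p.join()
      print(squares)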


Yes, it reduces complexity considerably, but nearly all big systems are already built this way and are still very complex. You cannot use this approach to eliminate complexity, i.e. make the cost linear with the system size.

Some domains where software is used to yield a tremendous economic impact are so complex that even with all complexity-management approaches at our disposal, the cost of creating them still grows super-linearly with their size.

If there are n abstract features to the system, all dependent on almost all others, you have an order of n^2 interactions. Whether you make every function its own simple communicating process, or bundle a few of them together (shifting complexity into a process) the abstract complexity of the system is still n^2.


> Yes, it reduces complexity considerably, but nearly all big systems are already built this way and are still very complex.

complexity is a function of interactions. multiple processes won't do much for you if every one communicates with everybody else.

> If there are n abstract features to the system, all dependent on almost all others [...].

the discussion has been about using process separation as an aid in preventing accidental complexity arising from improper coupling introduced as a shortcut so this looks a little disingenuous. my response is that if you separate every function into its own simple process, a misbehaving function need not take down the whole group.

on the improper-coupling-reduction front, process separation is apparently paramount to success, see eg. sendmail vs postfix.


Let me put it another way. Many of the software systems that make the biggest economic contributions have a very high essential complexity, and so their cost grows super-linearly with their size.


As soon as you've got a few hundred small pieces interoperating, you're not talking about something small any more.


That's true, but if you architect it well you should still be able to have an overview with every program as a functional block and inside the programs the scope should be limited enough that they are easy to understand. It's all a matter of balance. Good examples of such systems: message switches, telephone installations, routers, large scale web applications and so on. These are all very suitable to such decomposition into communicating processes.


> These are all very suitable to such decomposition into communicating processes.

That doesn't make them cheap to build; only possible.


Better expensive than not for sale.


Sure, but I think the author's point about "diseconomy" of scale still stands even when the large system is composed of small parts (and pretty much all large software systems in at least the last 20 years have been composed of small parts).


You could make the exact same claim about any bespoke piece of technology. What does it cost to make one new car? one aircraft? one piece of electronics? one moon rocket? one satellite?

It's all economies of scale that come into play as soon as you can ramp up the volume. And that's where software wins hands down, in spite of all the up-front costs.


I think that all the author was trying to say is that the cost of a single software system grows super-linearly with its size.


Software has economies of scale in distribution. In fact the economies of scale of software are the key point of how software businesses are causing disruption. A single software program can be replicated infinitely at zero cost and allow anybody who has 1 liter of milk to have 1000 liters of milk at no additional cost. So in the author's example, software would be the same price for both 1 and 2 liters.

Complexity is something completely different and is well known in all products. I can design a calculator that adds numbers very easily. A calculator that does fractions is much harder to design and costs more. A car with a more complicated engine is much harder to build than a simple engine. This has nothing to do with the actual economies of scale of the calculator or car, or you could say that cars have diseconomies of scale too - and obviously they don't. They're the poster child for economies of scale.

Building a truck that is 10km long is worse than building 100 trucks that are each 100m long, but this has nothing to do with 'diseconomies of scale' inherent in trucks.


You're responding to an argument the author doesn't make with a fact he is aware of. That's why he wrote: "once the software is developed then economies of scale are rampant". You can't distribute something before it is developed.

Most of this HN thread seems merely a reaction to the title—specifically the failure to say "development" in the title—which is a shame, because the point it makes about staying small is a critical one. If we don't even understand this here, the odds of any large group ever understanding it seem negligible. Another diseconomy of scale I guess.


Not at all. Let me quote the article then if you can't see it.

> Finally, I increasingly wonder where else diseconomies of scale rule? They can’t be unique to software development. In my more fanciful moments I wonder if diseconomies of scale are the norm in all knowledge work.

They apply to every known form of manufacturing, such as vehicles or genetic engineering of vegetables (the more parts of a vegetable you try to engineer, the more difficult it gets). The author seems to have missed this, which is what I was pointing out.

And on the idea that even developing software does not have economies of scale, I present the following: Linux. Linux is what we build everything on top of. Without Linux, we'd have to start from scratch. Linux is just code. Databases are the same way. We build our applications on top of thousands of lines of code and we utilize the economies of scale of everyone building off the same code to make our code easier to create.

So I'd say that not only does software have economies of scale in distribution, it must have economies of scale in production or we would not have operating systems or databases. The issue he is dealing with is a simpler one - complexity is hard to manage.

If you manage your complexity well and create a Linux or an Apache web server, then you have economies of scale. If you create a ball of mud, you have diseconomies of scale. If you build your 1km long truck out of well designed modular blocks, you have economies of scale in transporting massive amounts of goods along flat roads. If you try to custom build the whole 1km of truck, it will be a disaster. Nothing unique to software here, and diseconomies of scale are a failure of software design, not an inherent property.


For what it's worth, the actual term for this is the 'marginal product of labor'. It is a type of economy of scale, so I wouldn't say it's confusing, but it is a little weird to describe this way.


Perfect, bug-free software has economies of scale.

There is no such thing as perfect, bug-free software.

Once you start getting into any sort of maintenance, bug-fixing, feature creep- you lose all of the economies of scale of perfect software and replace that with the expenses the author discusses in the article.


You've possibly missed my point. The author made the statement that software differs from other items. It does not. Try making a 1km long truck and see if it's perfectly bug free. That doesn't mean trucks don't enjoy economies of scale.

Let me try another way to explain this: if software had diseconomies of scale, then selling each successive copy of your software would make all the software more expensive. E.g., if I made a computer game and sold it to 1 person, I could do it for $10. If I sold it to 10 people, I would no longer be able to afford selling it for $10 and would need to sell it for $20 (x10 = $200 total). Makes no sense, right? That's because software enjoys economies of scale. If I sell 1 copy of my game and want to make a profit, I need to charge $500,000 to 1 person. If I sell it to 500,000 people, I can get away with charging each person $0.99. The more software I sell, the better my economies of scale become.
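
In textbook terms, the average cost per copy is the fixed development cost spread over the copies sold plus a near-zero marginal cost, which is the arithmetic behind the example above:

  \text{average cost per copy} = \frac{F}{n} + c,
  \quad F = \$500{,}000,\ c \approx 0
  \;\Rightarrow\; n = 1 \to \$500{,}000,\quad n = 500{,}000 \to \$1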

abrgr explains it nicely in a sibling post here.


In software you pay for complexity. Big software is more complex than small software (by definition!) so it's more expensive.

However, managing systems of small software also incurs complexity: the smaller the software components, the harder you have to work to make them play together.

It's often not clear a priori whether it's worth paying a lot more up front to get a monolithic solution or to try and glue together many simple tools.


Well, of course it's easier to have a single off-the-shelf solution which solves N things than have to cook up the same solution using separate independent pieces of software.

But I would argue it's easier to debug and modify each piece separately if the divide between them is clear and the interface layer is at least somehow contained in one place than to debug/extend monolithic software solution.


This diseconomy of scale argument rings true for the size of releases for sure. From an operational standpoint, I agree - the calculus is a lot less clear.


As RyanZAG says, all production has diseconomies of scale in unit complexity. The production process has economies of scale, meaning that churning out more units of equivalent complexity reduces the marginal cost of churning out the next unit of equivalent complexity.

Software exhibits this same economy of scale in production. Take Google's machine learning platform. They allow multiple functional teams to churn out roughly-equivalently-complex machine-learning-powered widgets in less and less time. Contrast that with a startup building a single machine-learning-powered widget: the marginal cost to Google is significantly lower.


It's a difficult analogy to make, in particular when you forget to consider the ocean of milk underneath you from all the libraries and frameworks you are using. Then the difference between 4 small bottles or 1 big bottle seems less significant.


The best part of the article is the concrete image of the milk cartons. On first seeing the image, your mind is going to tend to think things ought to be one way. Then it comes out and says, "No, it is the opposite." That creates a bit of cognitive dissonance and makes one ask: "Wait, why?" This is good as far as software goes, because it is so abstract that often the brain is not fully engaged when talking about it. It is too easy to know something in the abstract, but then not know it enough to apply it in the concrete.


Software works like this too; the author is conflating different aspects.

To see how software is like milk, you have to compare writing custom software for each and every client vs. writing one product and deploying this same program to all clients.

The analogy of mass production for software is when the same operating system runs on millions of machines.

However, the author of the article suggests that the equivalent of mass production is when a piece of software takes on a lot of responsibility and gets bloated with features. But that's a totally different issue. When you compare ten cartons of 1 pint of milk vs. one carton of 10 pints, you are dealing with ten indistinguishable cartons.

If you want a real-world metaphor that's like software complexity, you could instead use recipes. Elaborate, complex recipes don't scale, just like software. There it's not trivial to put the distinct ingredients together. With milk, you just pour all the milk into one big container. There is no additional thinking required.

TL;DR author compares apples and oranges.


Every time software economies of scale come up, I can't help but be reminded of Jira's pricing model (https://www.atlassian.com/software/jira/pricing?tab=host-in-...):

  (Per month:)

  Users   Total    Per user
  -----   ------   --------
      1      $10       $10
      5      $10        $2
     10      $10        $1
     15      $75        $5
     25     $150        $6
     50     $300        $6
    100     $450        $5
    500     $750        $2
   2000    $1500        $1


It is certainly odd, but I'm guessing this is more about price discrimination. I think price must be greater than or equal to cost at some level, but certainly not proportional to it.


Price discrimination is an important aspect to consider but I don't think it's directly relevant to the post.

From a business perspective, price discrimination probably has more to do with accounting, support, and long-term sustainability.

A transaction that includes 1000 seats requires 1 account to manage them and will likely have in-house support staff to help solve technical issues.

A thousand transactions for 1 seat each require 1000 accounts to manage and create 1000 potential points of contact that need to be supported when things go wrong.

In addition, companies that have the resources to buy a lot of seats up front are more likely to be more 'stable'. Those are the customers who will standardize their internal processes to use your software, thereby guaranteeing consistent payment over the long term, and higher likelihood of return sales in the future.

That's probably why a lot of SaaS sites offer free accounts for individual/small-team usage. Early adopters tend to be more technically inclined, will evangelize the product if they like it, will be more sympathetic when minor bugs are encountered, and the 'free' aspect means there's no guarantee of support. They're essentially buying a community of QA testers with free seats.

The 1000 seat customers cover the bills (ie CAPEX+Fixed-Costs). The lesser customers cover the rest (ie OPEX+Variable Costs). The latter keeps things in motion, the former signals it's safe to scale up.

To remain profitable it's important to minimize CAPEX+Fixed-Costs as much as physically possible, because when feast turns to famine they're the hardest to reduce.

In accounting terms, favoring financial liquidity means favoring flexibility to adjust to changes in the market. This is a key deciding factor behind companies shifting to 'the cloud'. Even if the upfront costs are higher than maintaining bare metal servers, it takes little/no effort to ditch redundant infrastructure when it comes time to reduce costs.


Diseconomies of scale apply in wholesale finance too. Put a big order on a limit order book, and you'll exhaust the liquidity and move prices against yourself. Dealers usually offer wider spreads for large trades as they'll have to work off a large position afterwards, and they need compensating for taking the risk of being on the wrong side of a major price move while they have the position.


Diseconomies of scale certainly aren't unique to software, and the author sensibly notes "individual knowledge".

One of the main effects of protectionist and interventionist policies has been related to them. A domestic firm starts to rot, unemployment prospects rise, and a sense of national preservation starts to set in. Thus, in the short term, tariffs are levied, subsidies are made, and some macro notion of "stability" or "optimality" is reached. The long-term cost is the artificial delaying of the onset of diseconomies of scale, with state and business expansion leading to symbiotic interests. Then people complain about Big Business fucking them over.

(The fact that the author quotes Keynes makes this all the more ironic. Keynes-the-man wasn't objectionable, but the neoclassical synthesis/"pop Keynesianism" of his disciples Paul Samuelson and John Hicks did influence government policy in a negative way, as noted in James M. Buchanan's Democracy in Deficit.)


People don't think software has economies of scale. The many articles I've seen about the "mythical man month" and such all talk about how hard software is to scale.

What people do think is that the marginal cost of reproducing software is basically zero, regardless of size. This means that choosing between two products, if product 1 has n amount of features, and product 2 has those same exact n features plus an additional feature, all consumers will rationally choose product 2 (lots of assumptions, i know).

This is why companies try to get bigger: if they can offer more features, then all the consumers will choose them and they get all the sales. One could argue that this is the reason why the "power law" effect that's been talked about on HN recently happens.


I get the point the article is trying to convey. Scale increases organizational complexity and overhead relatively more than the added manpower contributes. However, the analogy with the 'milk' is far off. First of all, since the 'duplication' cost of software is near zero, buying a single seat or license is more expensive than buying in bulk. With a few exceptions, this can be established by browsing any product or SaaS website. Second, but this is minor, in retail vendors often abuse this expectation pattern and have now started to charge more per volume for the larger packages. The production side of software is more like R&D, and there you find the diminishing returns, as exemplified in Brooks's 'The Mythical Man-Month'.


There are many kinds of scale.

Poor performance on military projects is often an issue of huge development costs spread out over a tiny number of units.

Apple spends as much to develop an iPhone as it costs to develop a new weapon system, except they sell millions of the phones so the unit cost works out ok.


I'd argue that Apple spends considerably more on the development of an iPhone than is spent on a typical defence project.

For any product which is manufactured in large quantities, the BOM will account for the vast majority of the overall cost, so limits on the cost of the BOM will significantly constrain design decisions.

These constraints are less significant for products which are sold in more limited quantities. Here, design & NRE costs predominate, leading to (very) different constraints on design decisions. In this situation, for example, it might make sense to buy more expensive materials to make the system easier and faster to design.

Software itself isn't homogeneous. Different software products may scale in different ways. Embedded Software is different from Enterprise Software which is different from Application Software ... and so on.

Some commonality can be seen however. For example, product line engineering can be used to find commonalities and amortize design costs over a set of similar-but-distinct products ... although this is easier to achieve when you are doing bespoke work for clients than when you are producing shrink-wrap applications.

Here, you are scaling up over multiple different customisations of a design (or multiple different members of a design family) rather than individual product elements -- but it is still an economy of scale nonetheless, and it is still software.


It depends on your point of view.

The point of software is to deliver value to the business. There's overhead with supporting and integrating each system -- to borrow an analogy from the article, each milk carton needs cardboard, a date stamp, etc. Even if software development productivity drops 75% and delivery cost increases, having one big carton of milk may be more cost effective than supporting 50 smaller, more nimble cartons.

If you want evidence that this exists, consider that SAP and PeopleSoft exist and are thriving businesses. Or that the general ledgers of most big financial institutions are running on mainframes with code that's been in production for 30 or more years.


If you had to build a windows GUI from assembly code, almost all software projects would be too expensive. Instead we reuse high level languages and frameworks to start with the basics a program needs.

To extend the metaphor to milk, what if the milk industry had to invent the glass industry in order to make the bottles which it comes delivered in? Consumers would have cows not refrigerators.

The diseconomies-of-scale software consists of programs where normal glass simply can't be used to hold the milk. A whole new custom type of glass has to be developed. And this is usually for a type of milk that only about 1,000 people even drink.


From a "Computer Science" perspective: "Economies of scale" is another word for "sublinear growth". Software is, fundamentally, a graph. And the number of edges in a connected graph grows quadratically.

Pretty much any strategy to improve making software at scale, whether code organization or organizational design, is finding ways to limit the complexity of the graph to a constant multiplier of the number of nodes, and keeping that constant small, rather than allowing things to grow quadratically.
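
In symbols, with n modules the worst case is the complete graph, and the strategies above amount to staying near the linear bound:

  \underbrace{\binom{n}{2} = \frac{n(n-1)}{2} = O(n^2)}_{\text{every module talks to every other}}
  \qquad \text{vs.} \qquad
  \underbrace{E \le k\,n = O(n)}_{\text{each module talks to at most } k \text{ others}}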


I feel conned. But an honest headline -- Software Development Has Diseconomies of Scale -- wouldn't have sounded controversial....


"Much of our market economy operates on the assumption that when you buy/spend more you get more per unit of spending."

Supply and Demand says the opposite. The supply curve slopes upward, meaning that a higher per-unit price is required when the aggregate supply is higher.

Economies of scale apply in some situations, but people generally place way too much weight on them.


> And if you don’t know, the UK is a proudly bi-measurement country. Countries like Canada, The Netherlands and Switzerland teach their people to speak two languages. In the UK we teach our people to use two systems of measurement!

The Netherlands? We only speak Dutch here. :-)

I guess the author means Belgium, where they speak (at least) two languages: Flemish (Vlaams) and French.


> The Netherlands? We only speak Dutch here. :-)

... he wrote in perfect, neutrally-accented English :-)


Tell that to the folks in Friesland (they speak West Frisian).


1 (British pound per liter) = 5.69500061 U.S. dollars per US gallon

Milk is a lot cheaper in the US. I usually pay about $4/gallon.


You'd have to be buying organic goat milk to pay that much. From the figures in the article:

Buy it by the pint: 0.49 (British pounds per Imperial pint) = 4.90836249 U.S. dollars per US gallon

Buy it by the 2 pints: (0.85 British pounds) per (2 Imperial pints) = 4.25725318 U.S. dollars per US gallon

Buy it by the 4 pints: (1 British pound) per (4 Imperial pints) = 2.50426658 U.S. dollars per US gallon
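
For anyone redoing the arithmetic, the conversion behind these figures is roughly the following; the ~1.50 USD/GBP rate is an assumption inferred from the numbers above, not quoted anywhere, so the results differ slightly from the figures in the thread:

  # Rough reconstruction of the unit conversion above. The exchange rate is an
  # assumption (GBP was about 1.50 USD in late 2015).
  IMPERIAL_PINT_LITRES = 0.568261
  US_GALLON_LITRES = 3.785412
  USD_PER_GBP = 1.50  # assumption

  def usd_per_us_gallon(price_gbp: float, imperial_pints: float) -> float:
      gbp_per_litre = price_gbp / (imperial_pints * IMPERIAL_PINT_LITRES)
      return gbp_per_litre * US_GALLON_LITRES * USD_PER_GBP

  for price, pints in [(0.49, 1), (0.85, 2), (1.00, 4)]:
      print(f"{pints} pint(s) at GBP {price:.2f}: "
            f"${usd_per_us_gallon(price, pints):.2f} per US gallon")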


Software has diseconomies of scale, but also has economies of scale.

For example, because of avoided context switching: when a developer makes one change it can be pretty easy for them to add another change (everything is already "open").

Other comments here mention distribution and combining small simple tools for something larger.


>Finally, I increasingly wonder where else diseconomies of scale rule? They can’t be unique to software development. In my more fanciful moments I wonder if diseconomies of scale are the norm in all knowledge work.

The pop music industry seems to fit the bill.


The article seems to be conflating fixed costs with variable costs; development with production. It takes more labor to set up a plant to manufacture a million Model-Ts than to build one horseless carriage in a barn.


It is actually a lot worse than that article suggests. There is a history of big government software projects which proved practically impossible to complete on time and on budget or to get them working at all.


> Four, get good at working in the small, optimise your processes, tool, approaches to do lots of small things rather than a few big things.

Why, I think I've heard that before...

"Do One Thing and Do It Well" from https://en.wikipedia.org/wiki/Unix_philosophy

Edit: Strange post to be down voted on. It was an interesting connection to me.


Well, I made a command-line Go program that processed a 10+ GB XML file, made some computations, and put the results into a JSON structure in Redis. It had zero bugs, after months of development. I made it myself, so you can have zero bugs if you do things yourself. It's the point where you have to work with others that results in bugs.


To be fair, the first few developers are often more than 1x multipliers. But you definitely reach a team size where additional developers have decreasing marginal value pretty quickly.


Unix command-line tools


In my experience, many if not most command-line tools and programs we use on UNIX are not great examples of simple, small, focused design. Lots of Swiss army knives packed with options. And the interface is limited in frustrating ways.

One example: I started using email with mh, which used command-line tools to slice and dice your mailbox. Each of these tools was relatively easy to use. But who uses this approach anymore to read email?

It's a more interesting design question I think to study when UNIX tools are subject to the same diseconomies of scale as everything else...


"Loose coupling" is one of the basic principles they tell you in beginner's courses. This is common knowledge to the point of being cliche. This encourages coming up with good boundaries, interfaces, whose both sides can independently vary.

Choosing these separation points is not trivial, by the way. It usually requires many iterations and a good gut feeling for what future requirements may arise. One example that is frequently mentioned is content vs. form, for example website content and its presentation. It's useful, but still strongly depends on the actual uses. What works on screens doesn't necessarily work on paper. Not all forms can accommodate any kind of content. In other words the content also needs to fit the form. Or see how the layered model of the TCP/IP or ISO-OSI protocols actually isn't that principled in practice. A high-level protocol may want to adjust its workings depending on what lower-level protocol it is running over.

So decoupling is not trivial. It often reduces performance too. Say you want to chain two tools sequentially. What if you end up using a specific combination 90% of the time, but there is some intermediate step in the first component that calculates something that is thrown away and the second component needs to recompute it because the general interface between the two doesn't include passing that data along? Then you can either accept it, or tightly couple them, OR even worse: you may try to find the real pure elegant reason for why this piece of data is actually logically required there and then you end up with abstractions that only have one instantiation, just for the sake of not "hard-coding" things.

But anyway, back to the core issue. There is not much to be surprised about.

Complexity of system = Complexity of individual components + Complexity of their interactions

This is analogous to the decomposition of the variance of a set of numbers: total variance = variance inside groups + variance between different groups. There is also some physical analogy for calculating energies, where you need to take couplings into account.
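
That decomposition is the law of total variance, if you want the analogy spelled out:

  \operatorname{Var}(X) =
  \underbrace{\mathbb{E}\big[\operatorname{Var}(X \mid G)\big]}_{\text{variance inside groups}}
  + \underbrace{\operatorname{Var}\big(\mathbb{E}[X \mid G]\big)}_{\text{variance between groups}}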

If you have a big system you need to take care of lots of interactions. If you choose to use small independent tools, then you need to write the code for their interaction.

Should the components come from the same vendor or different ones? There are arguments for both. Use the same: they tested all sorts of interactions and fixed any bugs that they found (Apple supporters often point to this). Use different: each component only cares about its job, so fewer bugs in them, clear responsibilities, etc. Certainly, if the same vendor develops all components of the system, they don't have to be so conceptually clean, and they may make up for bugs in one subsystem with quirky code that tries to cancel out the bug's effects in some other subsystem, leading to fragile software.


The argument the author makes is really that software development and maintenance has diseconomies with the scale of projects and releases (basically that the development and maintenance effort needed scales superlinearly with complexity, and output scales sublinearly with team size), which seem to be fairly widely accepted observations in the field.

There is some effort to portray this as unusual compared to other industries through a direct comparison to retail costs of larger grocery goods and manufacturing economies of scale, but that's somewhat missing the point. Product development and engineering probably face similar diseconomies in non-software domains (the same complexity issues and human-factors issues that affect software development are present) and, OTOH, actually delivering units of identical software (or services provided via software in the SaaS world) has similar (perhaps more extreme in some cases) economies of scale as are seen in many areas of manufacturing, as the marginal costs are low and more units means that the fixed cost divided by units sold goes down.


Yeah. Software development is a kind of design activity. It's the compiler that does the "construction" bit.


The body of the article refers to software development, so we put "development" in the title. Also I don't think he's so keen to distinguish software from other kinds of design work—in fact he makes the opposite point at the end.


It's not a brilliant article.

Software is not like milk. That analogy is facile and stupid.

Software should be more like civil engineering, where it's normal to unleash a big team on a big infrastructure project and still have some hope that costs and deadlines stay under control. Or maybe like movie making where there's a cast of thousands, the time is huge, and the costs are epic, but some projects stay under control - while others don't.

It's maybe more interesting to wonder what's different about software than to look for enlightenment on supermarket shelves. Because the problems stated - multiple communication channels, mistakes in modelling and testing - are handled just fine in other industries.

The crippling issues are that you can't model software, and there's not much of a culture of formal specification.

So you can't test software until you build it, requirements may change iteratively, the latest technical "solutions" often turn out to be short-lived fads, and you're always balancing between Shiny New Thing and Tarpit of Technical Debt. That's why it's hard to build. You have to build your cathedral to see if it stays up when it rains. You can't simulate it first. And even if it stays up it may be the wrong shape, or in the wrong place.

It doesn't help that management often sees software as a cost centre instead of an engine room, and doesn't want to pay a realistic rate for quality, maintainability, and good internal documentation.

Having too many people on a project is not the problem. The problem is more usually having no idea what you're doing, why you're doing it, or how you want it done - but believing that you can throw Agile or Six Sigma (etc) at it to make it work anyway, because Management Theory.


"Software is not like milk" is exactly the point: the author's observation is that (some) people treat software as though it had scaling economies like a commodity product, when in fact it does not.

Software engineering would be like civil engineering if civil engineers had teleporters, matter-replicators, and armies of construction robots capable of building anything from a set of blueprints, without supervision.

Software engineering is its own practice and the fact that it is unlike other forms of engineering is a feature, not a bug.


> Software should be more like civil engineering, where it's normal to unleash a big team on a big infrastructure project and still have some hope that costs and deadlines stay under control.

Since when do giant infrastructure projects meet their budget and calendar goals?


There are examples of it happening. The Gotthard Base Tunnel, the longest in the world, was completed ahead of schedule and under budget.


There are examples of it with software, too. It doesn't seem to be the norm in either domain, however, so I'm not sure that there is a good case to be made that they should be similar in this regard but are not.


The software industry is what construction would be like if architects were allowed to dream up whatever contrived designs they fancied that week, without those pesky engineers telling them: no, no matter how much they want it, steel and concrete and stone do not behave that way, and Nature does not feel any obligation to comply with the customer's requirements.



