In most of the cases discussed in the article, it's not the software that rots but instead the users and/or organization that decays. When a system is created, it is generally designed to solve a known set of problems encountered by an existing population of users. Over time, the tool's creators, its original users, and the problem domains all change, but often the old system is modified to cope with those changes. A flexible system may be adaptable to a certain amount of change, but only if the current populace of creators/maintainers as well as users understand the limits. If they do not, the software often ends up less useful than if it had never been touched.
"...it's not the software that rots but instead the users and/or organization that decays."
Orgs decay, sure.
But I believe the OC's point was that orgs become more brittle over time as they specialize. Conway's Law would suggest that software architecture mirrors org structure, so it too would become more brittle.
Applicable cliches: "victim of one's own success", "the complexity catastrophe".
Going the full meta here, I'm okay with the death-rebirth cycle for orgs. Yes, a lot of knowledge (experience) is lost. But forgetfulness is also crucial for learning, adaptation.
It was nice up to this: "But forgetfulness is also crucial for learning, adaptation."
In IT, it's the opposite: history repeats itself endlessly, with the same flaws, the same missed opportunities, and the same techniques recreated due to often-willful ignorance of the past. One of the things I do here is get old or even current work to people it might benefit. The number of times the old stuff applies to current problems, but was never handed down to those people by predecessors, shows the problem we need to work on is maintaining, packaging, and delivering prior wisdom. Interestingly, the other engineering disciplines already do that a lot better, with IT remaining pretty stubborn. To put the loss into perspective, it took some groups 50+ years to reinvent the benefits of the Burroughs B5000 and ALGOL. Just one example among many.
By "forgetfulness", I also mean the stubborn clinging to the old ways, because reasons. The necessary dying of the old guard, a la Kuhn's Structure of Scientific Revolutions.
But I also mean that each new cohort apparently needs to rediscover the deep truths (first principles) for themselves, to earn the wisdom vs blindly accepting received knowledge. I have no idea why this seems to be the case.
When I was a boy of fourteen, my father was so ignorant I could
hardly stand to have the old man around. But when I got to be
twenty-one, I was astonished at how much the old man had learned
in seven years.
-- Mark Twain (disputed)
Perhaps forgetfulness is the wrong word. I don't yet have another. Some trite phrase for the culling of the herd, gouging out the dry rot, burning off the grasslands, aggressive RIFing of the tenured... one that also conveys that we must give people the room and opportunity to learn things for themselves.
Maybe what I'm advocating is the Socratic Method. Versus telling people "just because".
"By "forgetfulness", I also mean the stubborn clinging to the old ways, because reasons. The necessary dying of the old guard, a la Kuhn's Structure of Scientific Revolutions."
Well, that's true. I think we might need to distinguish between ideas that are lost and ideas that are simply not what a new group is clinging to. That often comes with new people, young or old, coming into an organization. Maybe also distinguish between "current thoughts on what's the best method" vs "known methods along with their benefits and problems."
"Maybe what I'm advocating is the Socratic Method. Versus telling people "just because"."
Well, there's how we publish the knowledge and how they learn it. I'm for traditional reporting with an empirical focus, fewer silos, easier search, and easier distribution. As far as learning goes, the Socratic Method is one among many methods that might have potential. It's too big a topic for me outside the brief work I did on using various parts of the brain simultaneously to increase recall, incremental problem solving of increasing complexity, matching training to realistic scenarios, and side-projects that are open-ended to match fun & curiosity. I don't think I did anything else in research on the topic. Well, a M.C. test generator and some expert system prototypes, but they were honestly crap. :)
Note, the article is not talking about bit rot; I think that's confusing a lot of people.
The point might be clearer as:
"Adaptability and efficiency are opposing priorities."
Ecosystems face this too. A stable environment will lead to adaptations that improve efficiency, while creating new dependencies on everything staying the same. In a sense, species are constantly competing to make the ecosystem more fragile.
If this holds, there are broad impacts on information systems outside of software. We like to fantasize about the mind being immortal. Maybe we could fix the telomere thing, figure out cancer, hop our brain to a clone, or upload our consciousness to some cloud.
But in my experience, being mentally alive involves some mix of plasticity and progressive refinement. You can't have both forever.
Software rots if you can no longer set up the exact tool chain and environment (compiler, build tool, OS version, external database, etc.) required to build and run it. Even interpreted languages suffer from this problem as features are subtly changed - by design or accident. It doesn't matter if your project is using an ancient version of VC++ or a two-year-old version of NodeJS. If it doesn't keep up with the latest releases of the tools, then it's already got one foot in the grave.
I think this is an interesting argument for something like scheme (r5rs or r7rs-small) where you could hypothetically implement an interpreter later for the same code base with much less trouble.
Mind you, it probably won't perform well or do everything, but you can imagine it'd be easier than implementing an entire VM.
Unless it's video games.
People will write mountains of software to play an old video game.
Your claim does not really follow the standard meaning.
Bit Rot is usually defined as all the ways that software can appear to stop serving its function despite remaining the same, in the sense of being the same bits (or bytecode or interpreted code) run by the same computer.
"the software does not actually decay, but rather suffers from a lack of being responsive and updated with respect to the changing environment in which it resides."
Edit: Not being able to recreate or modify a given piece of software would be an example of the bit rot of your tools/environment, and this is one (but not the only) source of bit rot in a piece of code.
In a nutshell, we can only safely use it if we somehow know that the source string fits into the destination buffer. Those situations are few: basically, they involve fixed-length string literals being moved into buffers that are "obviously" larger. E.g. some widget_t object has a char name[256] field, and we initialize it by default with strcpy(new_widget->name, "<unnamed widget>"), or some bullshit like that. The literal is way shorter than 256 and nobody in their right mind will ever make it that long, or shorten [256] in widget_t so that this literal doesn't fit.
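A minimal sketch of that "obviously safe" case (widget_t and the name field come from the example above; the function name is assumed for illustration):

    #include <string.h>

    typedef struct {
        char name[256];
        /* ... other fields ... */
    } widget_t;

    /* Safe by inspection: the literal is far shorter than 256 bytes,
       and nobody in their right mind will lengthen it or shrink the
       buffer enough to break that. */
    void widget_set_default_name(widget_t *w)
    {
        strcpy(w->name, "<unnamed widget>");
    }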
In situations when we don't know the length of the source string, there are two cases: it is going into some buffer that is already allocated, or else we will be allocating one. In either situation, we must measure the length of the source string, to test whether it fits or to determine how much to allocate for it.
In the case when the string doesn't fit into the target buffer, we cannot use strcpy, obviously, since it copies the entire string. We use some truncating-copying function, or we handle the situation in some other way (diagnose the problem and abort or whatever).
In the case when we allocate, because we have calculated the length, we still should not use strcpy to finally copy the string. We should use memcpy, because memcpy doesn't wastefully examine every byte to see whether it is null.
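A minimal sketch of the allocate-and-copy case (the function name is an assumption; it is essentially what the non-standard strdup does):

    #include <stdlib.h>
    #include <string.h>

    /* We must measure the string anyway to know how much to allocate,
       so memcpy can copy len + 1 bytes (terminator included) without
       re-scanning every byte for the null the way strcpy would. */
    char *copy_string(const char *src)
    {
        size_t len = strlen(src);
        char *dst = malloc(len + 1);

        if (dst != NULL)
            memcpy(dst, src, len + 1);
        return dst;
    }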
So the opportunities for a correct, non-wasteful use of strcpy are few.
You're correct about the problems, and I would always recommend strncpy over strcpy for safety, but it's not quite as limited as you portray: You only need to ensure an upper bound on the string length that will fit into your destination buffer. In addition to literals, this can also be guaranteed by copying a string that's already stored in a container that's smaller than the target, such as a fixed-size buffer or database field.
strncpy is strcpy's daft cousin that wastefully writes extra null characters---except when there is no room; then it neglects to null-terminate at all.
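A small sketch of both quirks (buffer size and strings chosen arbitrarily for illustration):

    #include <string.h>

    int main(void)
    {
        char buf[8];

        /* Source shorter than the buffer: strncpy copies 'h', 'i', and a
           terminator, then wastefully fills the rest of the 8 bytes with
           more nulls. */
        strncpy(buf, "hi", sizeof buf);

        /* Source as long as the buffer: all 8 bytes hold characters and no
           null terminator is written, so buf is not a valid C string. */
        strncpy(buf, "12345678", sizeof buf);

        return 0;
    }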
strncpy was not needed even in C90, because of sprintf, which is less clumsy to use, doing everything in one step:
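A minimal sketch (the 256-byte destination and the names are assumed for illustration); the precision caps how many bytes of src are copied, and sprintf always null-terminates:

    #include <stdio.h>

    void set_name(char dst[256], const char *src)
    {
        /* Truncating copy in one step: at most 255 bytes of src are
           written, followed by the terminator. With a runtime limit,
           "%.*s" and an int argument do the same job. */
        sprintf(dst, "%.255s", src);
    }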
People say that gets is fundamentally broken as an API, beyond repair.
However, gets could have been "rescued" in C by requiring every implementation to publish a constant, say in <limits.h>, which indicates a maximum of how many bytes (including null terminator) the gets function will place into the target buffer, and ensuring that gets observes this constant.
Indeed, evidently, there was talk of this, and supposedly Doug Gwyn proposed that simply BUFSIZ be re-used for this purpose.
With such a constant in place, programs using gets could be wrenched out of the jaws of undefined behavior by sizing the input array according to that constant, ensuring that there cannot be overflow.
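A sketch of how that rescued gets might have looked in use (GETS_MAX is hypothetical and was never standardized; gets itself was later removed, so this is illustration only):

    #include <stdio.h>

    /* Hypothetical constant the implementation would publish: the most
       bytes, null terminator included, that gets will ever store. The
       reported suggestion was to simply reuse BUFSIZ for this. */
    #define GETS_MAX BUFSIZ

    int main(void)
    {
        char line[GETS_MAX];          /* sized so gets cannot overflow it */

        while (gets(line) != NULL)    /* pre-C11 only */
            puts(line);
        return 0;
    }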
Rather than introducing this requirement, in the end the function was simply removed from the language.
As jussaskin said, it can happen to them too. For example, VM/370 was among the first virtualization schemes that could do all kinds of things PC/server virtualization bragged about later. Many key apps/OSes might be put in it. Yet you'd be hard-pressed to run one of those with today's IT budgets if you couldn't afford a mainframe. Likewise, quite a lot of PC virtualization vendors came and went. The survivors occasionally have compatibility issues, or corporations start going in different directions with less support for the older product. The product itself is often tied to a specific OS or HW combination that might go away.
One can use virtualization to help future-proof apps. IBM's System/38 (later AS/400, iSeries, and IBM i) did this for decades. However, to be fully future-proof (esp. at the vendor level), the virtualization SW/HW solution itself needs to be immune to the problem. That means legally and easily cloned HW, with the same for the software and virtualization layer. The closest thing is maybe the NOVA microhypervisor on a Loongson-style OSS CPU with x86 emulation if it's x86 apps, or RISC-V w/ virtualization support otherwise.
Guess what future-proof virtualization all the mainstream approaches aren't using, though? ;)
If devs have to download a VM running an old OS with old tooling, the barrier to entry is getting quite high. I wouldn't touch an OSS project that made me do that, I would assume it was effectively dead.
Even though the VM image and tooling may still be runnable, the world around it with which it must interact continues to evolve and can leave your software obsolete.
I think the big question with the storage space we've come into is... why don't we have all the toolchains/environments available? Why isn't having multiple versions of every dependency just part of how we do things now?
Unless you're sandboxed in a self-contained hardware and software environment (such as an embedded system), you will eventually be screwed. (Rot sounds like a gradual degradation; in my opinion it's generally not.)
Standards changes and OS updates are the biggest culprits. The Vista update broke a couple of my old programs (written for Windows 95) due to the user access control changes; another broke because it interfaced with a piece of hardware on the parallel port, and both the parallel port and the manufacturer went the way of the dodo.
I have seen a stand-alone DOS 6 program running a machine at a factory. The PC has been replaced three times and is now only a few years old, but the operator says the program still does the job. I also have an 8051-powered clock I built in 1987 that still happily ticks along if I plug it in.
This is because there's little glory in simply making things work. There is glory in making things that look shiny and new. As they say, "Absolutem Obsoletum".
And congratulations on understanding exactly why I prefer embedded systems - you don't have to keep chasing the dragon of the fashionable new thing (which is invariably old wine in new skins). It is not that there is no new wine; it's just that, proportionally, it's less than all the new things on offer.
FWIW, if you look carefully there will always be a motherboard available with a working parallel port. They're not cheap, and at some point the thing plugged into that parallel port wears out.
Embedded systems seem to have all the same problem but worse - the difficulties to get an obsolete software environment working (possibly in a VM) to maintain a legacy program are insignificant compared to getting replacement hardware to keep a legacy embedded system working if some of these particular components haven't been manufactured for a decade, and you either have to port the system, throw it out and replace it, or enter the wonderful and pricey world of custom hardware manufacturing.
semi-related: a couple months ago I found a large industrial shipping scale in my parents basement. I grabbed an FTDI USB -> RS232 cable and a DB25 -> DB9 cable, plugged it into a Raspberry Pi, and hacked this together:
There are some nasty hacks in there and probably a few bugs that I could fix up at some point, but it works just fine!
I love how in a little more than 100 lines of Go code, I can so easily have a program with a dedicated thread to reading data from the serial port, a dedicated thread to serving up the initial http page to clients, a dedicated thread for sending data to websocket clients, and all of it works together so cleanly with messaging through channels.
I copy-pasted some d3.js code from around the web, retrofitted it, and in maybe 2 hours of work I have something pretty cool.
My Mom runs an ecommerce website and uses another scale for shipping, I wonder how hard it would be to write a driver for Windows that consumes data over the network and would interface with her shipping label software.
In a way, isn't this a form of make-work for software engineers? Plus, I'm not sure what is worse - having to remake things every 5 years or so slightly differently, or not having jobs.
In a way. It's hard to stay focused; they used to say the ultimate for a CS grad was to write the Great American Compiler ( as for an English major it may have been the Great American Novel ). As per Orwell, the hardest thing anyone does is to see what's right under their nose - to stay focused and commit to the most relevant work.
1. The code is the same but the people using it forget / never knew the reasons it was built that way - and so it looks rotten for the job
2. Because requirements change and people try to make the old code do new things without cleaning up / refactoring correctly - so it now does neither job well, and looks rotten.
3. Because the environment / platform changes, the FTP server is moved to a new data center and the timeouts kill the jobs etc. It looks rotten.
"Rot" more accurately is just not keeping the code inside the code base up with entropy outside the code base
Awesome! Comparing the two, I merged your 2 and 3 into a single reason, considering both as cases of environmental change. In its place I have a new one: changing personnel causing evaporation in system knowledge. Did you consider this? Now I think about it, maybe it overlaps with your 1 (my C).
I think that point 1 is caused by the fact that the industry tends to divide programmers into two categories: the most competent ones become architects and lead developers and are hired when you need to build a new software system from scratch.
After the system is built, you let the people who created it go, because you can't afford their salaries, and hire younger, less talented people to maintain it.
So, instead of "software rots" I would say "software evolves to reflect the competency level of its current maintainers"
An interesting simile here is that software rots in the same way that Encyclopedias do.
(This is admittedly a fitting simile partly because the simile itself is being rotted by software like Wikipedia.)
In the world where new Encyclopedias (and Almanacs and Recipe Books) were printed and sold on an annual basis, the question was often why we need "this year's Encyclopedia" when the old one is still perfectly valid. Books in general decay pretty slowly and have a long shelf life, but the facts and the views of the world inside them are frozen and possibly out of date. Changes from year to year of an Encyclopedia are somewhat hard to notice, but in Middle School in the 90s I recall having to compare articles from a tobacco-yellowed Encyclopedia set from the 70s with the same articles in very early predecessors of Wikipedia. The worlds contained in those two sorts of Encyclopedias were very interestingly diverging. The yellowed Encyclopedia's facts were almost all still valid and "worked", but there were things that didn't hold up and lots of new facts that needed to be inserted in various places. If I were to edit an Encyclopedia, I'm not sure I would start from the version in that yellowed Encyclopedia if I could find a more recent set. Some of the predecessors to Wikipedia were direct descendants of that yellowed Encyclopedia, and yet for various reasons, historical and technical, Wikipedia itself did not inherit directly from that set in any meaningful way.
(It's interesting to note too that the physical media of software to date has a much shorter shelf life than the pulp medium of books, tobacco-smoke-filled library aging included, so an argument exists that software rots worse than Encyclopedias physically, at least.)
It's a nice analogy but there are important differences here.
The thing is that the process of encyclopedias rotting/going-out-of-date is obvious because we understand that an encyclopedia is a collection of assertions about the world. It is not as obvious that software is also a set of assertions about the world - and it's not as easy to determine which assertions about the world the software depends on.
Just as much, an encyclopedia is a static list of supposed facts, and its change is somewhat predictable in terms of how our understanding of the world changes. We have some idea of what will be valid or invalid in an encyclopedia X years old.
Software depends on obscure facts about OSes, about networks, about UI expectations, about time, and so forth; moreover, software involves chains of dependencies, so it's harder to predict what would or wouldn't break software X years old.
Thanks Joe. I think the differences you point out only strengthen the analogy, which I tried to colorfully explore in the original post but will expand on here because it is interesting.
«It is not as obvious that software is also a set of assertions about the world - and it's not as easy to determine which assertions about the world the software depends on.»
I like that definition: "Software is a set of assertions about the world". It may be less obvious than an Encyclopedia, but again, perhaps that only makes it a better analogy.
«An encyclopedia is a static list of supposed facts and its change is somewhat predictable in terms of how our understanding of the world changes.»
Ah. This is where Wikipedia actually strengthens the analogy. An Encyclopedia has always been loosely hyperlinked with cross-references. Entire complex webs of assumptions and assertions about the state of the world. Refactors of Encyclopedia content always could have interesting complex relationships with other content.
Certainly there is an extent to which Encyclopedias are known to predictably change: new elected officials each cycle, for example. But there are still plenty of unpredictable changes: new red letter days in history, new evolutionary scientific shifts in consensus, new lenses in which a culture views itself.
An example of the lattermost, in that yellowed 70s Encyclopedia you could still spot fragments in the process of being excised of what we now consider to be racist views of culture and cultural history. That among other things was a source of fascination to this teen in the 90s.
The often unspoken truth is that as much as Encyclopedias try very hard to be objective, they are still a product of their culture. (So too do we often want to think that Software is objective and nothing but mathematical assertions about facts about our world, but here too they are products of our [corporate] cultures and moored in ways both obvious and not obvious to the culture that built them.)
Wikipedia shows us that complexity of Encyclopedias much more directly, but those bones were always there in Encyclopedias even when we couldn't see them. A link to click is much faster than shuffling between volumes of an encyclopedia to follow cross-references, but the cross-references have always been necessary to build up a view of the assertions about the world.
More interestingly, Wikipedia strives to give a better, more transparent view, into the cultural biases of its contents; all of the [citation needed] tags and "More Research Required" templates and Talk pages of endless discussion. The novelty here is that they've opened up all of that to the common user, but the essence of that has always been in Encyclopedias, just locked away by the librarians and researchers in their own notes.
I find that integrated tests in software projects go a long way to reducing rot. When something eventually invariably breaks due to some external factor, the test suite greatly reduces the time to identify and fix the problem.
If you can build a piece of software and run unit and integration tests on it, it is more or less rot proof. That's because all those tests let one delete or replace components of the system without having to worry too much that the whole thing is going to break in some unknown way that won't be caught until it's in production.
Without a good suite, the larger the system gets, the more people get scared of doing radical things to it because they are worried about breaking some critical functionality. As far as the not-being-able-to-build aspect goes, the main threat is relying too much on closed-source software and/or libraries that go out of support, for which no one can find the license or critical documentation, or which, worse, manifest some critical unfixable bug.
I'd love to agree with this, but my experience differs on modern stacks. We so often mock or stub APIs, use a web driver that's pinned to a certain version of WebKit, test against a virtual DOM, etc., that it's hard to get real, reliable test results that don't break all the time.
But I suppose that's the problem; we don't have the time to fix bugs related to the environment that crop up when no code changes so we insulate our tests from the environment by design.
1. Do what's already been done, not even knowing it's already been done.
2. Declare "rot" or some similar claim of obsolescence and proceed to redo what's already been done.
There's nothing necessarily awful about this unless they fail to do a better job than the earlier effort.
Alas, this is too often the case. For a variety of reasons.
In the early days, portability was a higher priority. Not to mention longevity. Because everything was expensive.
Today's software "rots" a lot faster than the software from the early days of computing, IMO.
And so the younger programmers have lots of "work" to do.
Yet I do not see much progress being made. Because I do not measure progress by productivity alone.
Programmers who can churn out code in a dozen different languages to do the same old things are a dime a dozen.
As a user, I do not want software that needs to be updated every week. Poorly written software and gratuitous use of network bandwidth.
But I can see how programmers who love writing code would enjoy this state of affairs.
I think rot is not quite the correct metaphor. In my experience it's more likely to ossify, become sclerotic, build up scar tissue. As features are added or performance is tweaked, individual pieces become more complex and the connections between them multiply. If specific action (refactoring) isn't taken to fight this tendency, later developers will react to one piece being unmaintainable by making even more spurious connections and workarounds in adjacent pieces. That fixes the immediate problem, but makes things worse overall in the long term. Ultimately everything turns into the kind of tangled mess that everyone who has worked on an old multi-person project can recognize.
Unfortunately, a good refactoring requires understanding greater than the original author's[1], and therein lies another whole essay. ;)
I feel like "rot" is the wrong term, and the right term might come from that old parable about the man who built his house on the sand rather than the rock.
> Apache, the most important web server software today, is an old piece of technology whose name is a play on words (“a patched server”) indicating that it has been massively patched.
Is this true? I have never heard that Apache was a play on words.
There is an analogy to be drawn between software and societies, and the way their early adaptations to one environment block their later adaptations to another.
As far as I can tell, he means "Why is complex software hard to change" which is a reasonable, though fairly easy to answer, question.
Software doesn't rot. New features or refactors screw it up. New needs or technologies may make it obsolete. But it doesn't rot, and people who talk that way are often busy-body rewriters who want to pitch the existing implementations, with all their Chesterton Fences[1], and begin anew.
Changes in the environment screw it up, too. As an obvious case, if your software is screen-scraping a website and the website's layout changes, your software's just rotted even though your code hasn't changed one bit.
Another example is software which interfaces with hardware: Swapping one specific piece of hardware out for "the same" hardware made by a different manufacturer can cause rot, because "the same" hardware isn't necessarily the same in every respect. If your software tickles it the right way, it can expose differences which can cause your software to fail in new and exciting ways. Nothing changed, yet everything's different.
As you expand the universe of things your code has to interact with, this kind of change becomes inevitable. As always, the more points of attachment, the more potential for future pain.
I think you're taking the phrasing too literally. When we say software rots, we're not just talking about the bits that make up a program, but also its interface with its environment, and the knowledge about it in people's brains. Both those connections are indeed subject to drift/evaporation/entropy. The two failure modes you mentioned are symptoms of that underlying evaporation.
I spend a lot of time thinking about this (http://akkartik.name/about), but lately I try to steer clear of metaphors, because they're so often clear in my mind but utterly opaque to someone else's. So, I don't care what we call it, as long as more people try to fix the problem.
On the face of it, that's true. But the reality is that the environment software runs in is constantly evolving, and software that doesn't also change slowly starts working worse until it stops working entirely. We have plenty of software at my work that hasn't changed fundamentally in 10 years and can no longer be used.
Lots of really old software requires equally unchanged old environments and old hardware and that hardware slowly rots too.
... add to this that user expectations change. Applications that don't match them feel rotted. Just look at websites from 10 years back to get an idea, and that's only at the very surface level.
It rots in the sense that it becomes harder to change (or fix) over time, even if the code stays the same. This happens via loss of institutional knowledge as people forget how it works, loss of knowledge of the technologies used as they're replaced, etc. It's not just the complexity. Old code is usually harder to change than new code, even if they're equally complex.
It's a metaphor, not an equivalence. It doesn't even have to be in the same field, as long as the meaning is conveyed. Bit rot is an excellent metaphor, even though it's nothing like physical rot if you get pedantic about it.
You're the first person I've heard say that the metaphor doesn't make any sense; meanwhile, a lot of people regularly use it, which implies that it makes sense to them. You're welcome to not like the metaphor, but you don't really get to declare that it's objectively wrong… it's a metaphor.
Software doesn't exist in isolation. It interacts with its host OS, shared libraries, other programs, protocols, file formats and external systems. If it can't keep up then it effectively rots even though not a single byte of it has changed.
> Newer programming languages are often interesting, but they are typically less flexible at first than older languages. Everything else being equal, older languages perform better and are faster
At the cost of more difficulty in writing, or other tradeoffs (e.g. security).
> Programmers, especially young programmers, often prefer to start from scratch. .. In part because it is much more fun to write code than to read code, while both are equally hard.