Things You Should Never Do, Part I (2000)

bguthrie · on Sept 4, 2013

If you absolutely must rewrite, consider isolating and replacing components piecemeal rather than scrapping the whole thing and starting over. You don't eat the whole cost at once, you can keep moving forward with new features if needed, and you get more modular architecture to boot.

It's called the Strangler Pattern. http://www.martinfowler.com/bliki/StranglerApplication.html

arghnoname · on Sept 4, 2013

I just want to second this.

Earlier in my career I was put on a project from a recently fired developer that was some of most convoluted and buggy code I've ever seen. I and the other developers that were put on the project to save it all lobbied _hard_ to rewrite it from scratch.

We were denied, so we tried to trudge along and ended up rewriting the worst component of the bunch, as it was so buggy that rewriting it and then making facades to maintain the old interface was the only conceivable way to make it work.

It went so well we started doing it to other components, and when two components on each side of the facade had both been rewritten, we found we could get rid of the facade. Without putting the project on stall, over time, we had essentially rewritten the entire thing.

It's a really good way to do things. I've been on rewrites before and they didn't go as well (second syndrome effect, etc).

moron4hire · on Sept 4, 2013

This is exactly how it should be done.

gaius · on Sept 4, 2013

The second time this has come up today: http://en.wikipedia.org/wiki/Ship_of_theseus

gadders · on Sept 5, 2013

Known in the UK as "Trigger's Broom" :-) http://en.wikipedia.org/wiki/Trigger_(Only_Fools_and_Horses)

prawks · on Sept 4, 2013

As a note, you pay for this in regression testing. Just something to consider when you employ this pattern, not necessarily a deal-breaker. If your system does not have adequate automated tests in place it can make regression tests rather ugly.

mcv · on Sept 5, 2013

This is really where unit tests will save your hide. I never did any truly dramatic rewrite, but there was one project (fairly new, even) that had one very badly written, very badly understood module.

I started out writing unit tests for it. There weren't any, so I wrote them until I had unit tests that tested everything. Those unit tests also document the behaviour of the module. So then I started asking people what it was really supposed to do. Then I refactored it into something more maintainable. Only after that did I change the unit tests to reflect the behaviour the module was supposed to have, and based on that I finally wrote the code we needed.

The unit tests were my anchor in the storm. They keep me grounded. They mean I can do anything to the code, while keeping tabs on the impact my changes have.

waps · on Sept 7, 2013

I see you've read "working with legacy code".

The trouble with unit tests is their small scale. You write 3 Mb of source code filled with unit tests, does that give you any guarantee the program will even start when you call it ? (A point of frustration that I have had with other developers is that they deliver "a product", which immediately sigsevs upon execution, and then claim that's not possible, all tests pass. Same with memory leaks, which are never caught by unit tests, not in C++ and not in Java either).

First thing to do is to get the program having a backend and a frontend separately, then introduce a system test. Something that does what the main customer(s) do with your application, and does it while seriously stressing the system. You need it to send out mails ? Make it send out 10000000 mails while only using the backend calls the webserver calls while activating a sleep(1) in the mock database routines, and impose a memory limit of half a gig, checking if memory usage actually drops when you stop pushing the application. Make it use an actual database, and make it send actual emails to a fake "smarthost" email server. If it doesn't complete in 5 minutes, something is wrong. I've been meaning to implement this test atop docker. I tried to generate VMs to test the full application, start the app in a simulated network and run most of the actual production code, but making the VMs took ~25 minutes (and they weren't even on the correct machine by then), which made it unusable. I wonder if docker can do better.

Yes this test will be flaky, yes if it does indicate failure figuring out what's wrong is bloody hard, yes it means abandoning several holy cows of software development, yes it means some program will tell you where to focus attention, yes it will force you to actually think about debugging production failures BEFORE shipping, it forces you to have test systems for the program's components, it generally makes you deal with reality. You will have discussions with this test if you do it right ("No that hash-table MUST be faster than that linked list ! Aaaargh !"). And that is something sorely needed in software development.

Second thing I use this for. Boss-man wants a new feature. Ok, sure. I add a system test that uses it. Needless to say, it will fail. But it allows me to have a good think about what the high-level changes needed are, without unit tests constantly badgering me that some method now accepts different parameters. It will tell me if I got it right. And when I deliver it, the freaking program will start, and won't crash the first time someone uses this feature. You tell me how to make unit tests that deliver 10% of those guarantees. Forgot the big picture after spending 2-3 hours on getting a module fixed after adding new functionality to it ? Just single step through the system test and see where it blows up, and hey. There's the next thing needed for the high-level change.

Unit test programmers are terminally afraid of changing module interfaces to suit higher-level objectives. Given the amount of unit tests needed to even approach 95% coverage, I fully understand the resistance. But if they choose to write unit tests and I need the module change, I just go in and do it, disabling the unit tests. I tend to get some attitude in code reviews.

It also means that replacing the database with something else is something you can just do in an afternoon. Changing the core logic of the system, from a set of rules to a rule interpreter ? Changing that rule interpreter to jython ? No problem, the test will tell you if common use cases of your product are actually still working. And that's what the business/boss (should) care about.

Ops will love you if you do this. You will consistently deliver a working product that operates well within known parameters, as opposed to a system where the tiny wheels all turn, just not in the same direction.

Unit tests optimize for the wrong thing : they make sure programmers' jobs become easy, because the modules behave the way they "should", and can be changed with "predictable" results, which of course turn out not to be predictable in the real world. Furthermore the interactions between the "should"s of multiple modules ... turn out to be less than simple. Unit tests allow you to say "not my fault" (e.g. if your method is merely slow, but suddenly gets called billions of times because some other system changed and makes the system suddenly crash when too much is going on, but not due to a logic error). They make it easy to change systems by only touching the smallest components, which of course also massively limits the changes possible in a system.

Unit tests impose strong limitations on the changes that you can make to code (in reasonable time). Therefore, for the large majority of code, they are a burden, not a boon. There is one (big) exception. Algorithm and data structure code should be thoroughly unit tested. In most projects however, you have maybe 2-3 methods that fall into that category. If you have more, find an open source project to replace 50% of them, or figure out how to use more general data structures to do your thing.

davidtpate · on Sept 4, 2013

Depending on the nature of your product and your rewrites in my experience a customer facing system that has parts improved in parts can be interpreted as not solving the right problems (if something UI related is being updated).

The best thing I could say about it is that when approaching behind-the-scenes pieces this is my preferred approach, but for customer-facing pieces I prefer (and have had better reception on updates) to update those pieces as a whole. But again, this depends on the complexity, updating a small piece such as a button should not require a large change but updating the layout or styling would require more significant updates.

nawitus · on Sept 4, 2013

The problem with that seems to be that you're tying yourself to the old application in one way or another (e.g. old interfaces, old architecture, old programming paradigms).

tomjen3 · on Sept 4, 2013

Not at all. It is perfectly possible to go from a big freaking desktop app (BFDA) and rewrite it, piecemeal, to be a single page web app.

The way you do that is that you first, piece by piece, split the BFDA into multiple layers, then you make it client-server, split the data-storage into its own server, split the gui of, split the business logic of. Then you rewrite the parts that request stuff for the client into using JSON and rest, then all you have to do is to port the (hopefully) tiny amount of gui interface the user needs to Javascript and the entire app still works.

Is it easy? No. Does it waste some work? Yes.

MartinCron · on Sept 4, 2013

What you're really giving up on is conceptual integrity. Some developers find it very distasteful to have an application that is only partially composed of a newer architecture, programming paradigm, whatever.

A few years ago I stopped an impending complete code re-write and a fellow developer was aghast at the very idea of supporting an application that was "only half MVC".

outworlder · on Sept 4, 2013

Not necessarily.

You can replace parts of an application with webservice calls, for instance.

hammock · on Sept 4, 2013

This is the flip side of staying cash flow positive. Just as cash flow is more important than profits when it comes to surviving, undertaking a full rewrite of code, while theoretically long-term beneficial, could have the effect of causing you to run out of cash.

sdoering · on Sept 4, 2013

Thanks for pointing this one out. Never heard of it before, but was inspired by it. Being a product-manager with a little spaghetti-coding-skill, I see this urge to bulldoze something and build it anew in a lot of our IT-population.

Maybe I can introduce some things like the strangler-pattern in upcoming projects.

moron4hire · on Sept 4, 2013

This article, when I originally read it 5 years ago, was hugely influential to me. It completely convinced me of the superiority of refactoring existing systems to an ideal over rewriting. It doesn't take anywhere near as much time to fix cruft as it is to create a system from scratch and try to avoid your own cruft. But it's scary and it's an unknown. But if you approach it with courage, it falls quite easily.

EDIT: refactoring large systems has also convinced me of the superiority of statically typed programming languages for large projects. When I break code, I want it to break as hard as possible. I want to see absolutely everywhere function XYZ is called, and one way to do that is to rename XYZ to XYZ___ and see where all of the compiler errors show up.

famousactress · on Sept 4, 2013

The article and Fowler's Refactoring book had a similar effect on me as well. Your point about statically typed programming languages is one that I've been feeling the brunt of, though I'm not convinced yet.

I've been 100% python for the last few years, after a career mostly in Java and I definitely miss the sweeping refactoring abilities and hard-breaks during those kinds of refactorings. It does lend courage for sure. I'm just SO MUCH MORE productive in python for most of what I do that I'm not willing to throw out the baby with the bathwater yet. Instead I've been trying to focus on leveling up test coverage, and experimenting with Rope for refactorings ( http://rope.sourceforge.net/ ). I'm also slowly learning patterns for coding to make refactorings easier moving forward.. choosing code and data structures that are more easily affect-able later, or even things as simple as naming things for grep-ability.

GyrosOfWar · on Sept 4, 2013

Modern statically typed languages (cough cough not Java cough) generally have static type inference, which removes a lot of redundancy from the code. Being able to do

val object = MyObject(someOtherObject(12, "hello")) // Scala

instead of:

MyObject object = new MyObject(new someOtherObject(12, "hello")); // Java

saves a lot of time and it just looks nicer. All the while retaining the advantages that static, strong typing have (great potential for refactoring, a lot of errors can't happen at all). Of course, there are some things that will still take up more space than they do in Python or Ruby but it's a great improvement. Languages like Rust, Scala or C# also have a good variety of functional programming tools that can make your life a lot easier.

dllthomas · on Sept 4, 2013

Interestingly, this (narrow) case is supported by C++ (auto) now.

Even C, kind of:

#define AUTO_TYPE(X,Y) __typeof(Y) X = Y

... though I'm not sure I'd recommend it.

Type inference in full generality can give you a bunch more, as I just went into over here: https://news.ycombinator.com/item?id=6327327

sebcat · on Sept 4, 2013

Java 8 will have static type inference, and some other goodies too.

moron4hire · on Sept 4, 2013

I find C# extremely productive, if you don't try to do like most C# developers seem to want to do and treat it like Java, with interfaces out the ass. An interface can be handy, but generics fill the role more often. However, there are some others like Boo and Nemerle are still statically typed but much, much closer to Python in terms of lack of syntax verbosity.

famousactress · on Sept 4, 2013

I used the very first version of C# and enjoyed the language. I found it an improvement (and obviously a reaction) to Java, for sure.

marcosdumay · on Sept 4, 2013

Just remember that you can hint types in Python, and statically enforce those hints. You can create a type system as powerfull as Haskell's.

People just don't do that because it makes them less productive, but it may be a huge bonus for refactoring.

scrabble · on Sept 4, 2013

I don't know about other IDEs, but in Visual Studio I'm able to just right-click on a method and hit "Find All References" to see a list of everywhere something is called.

quantumstate · on Sept 4, 2013

I work on a large piece of C++ software (millions of lines) and unfortunately this feature does not work well enough to be worth using. The first problem is that it takes far too long, much longer than the rename method and waiting to recompile. The other pretty serious problem is that it sometimes misses references so I just can't trust it.

moron4hire · on Sept 4, 2013

While that is true, and I do use it frequently (so much so that I prefer the keyboard shortcut CTRL+K, CTRL+R), it's not available everywhere. Also, I tend to still use the "break-it" pattern in Visual Studio because the errors serve as a good TODO list to clean up all of the references.

Though, it gets a little more complicated when dealing with overloaded methods. In that case, Find All References really is a lot better.

dllthomas · on Sept 4, 2013

In my vim setup, ctrl-\ <Tab> drops a search for the word my cursor is sitting on (with word boundaries) in my quickfix buffer, for easy stepping through. More or less the same thing.

agscala · on Sept 4, 2013

Could you paste your config for doing this? I'd like to give it a try.

dllthomas · on Sept 4, 2013

The actual vim config is terribly simple:

    nmap <silent> <Leader><Tab> :make search PATTERN='\<^R^W\>'<Cr>

Where ^R and ^W are literal ctrl-R and ctrl-W (typed with a leading ctrl-v).

Then my Makefile contains:

    search:
        git grep -n "$$PATTERN"

plus some project-specific bits to prune out portions of the tree which should be skipped.

The following are also nice:

    clip:           
        xclip -o        
                    
    todo:           
        PATTERN=TO''DO make search | sed -n 's/\([^ \t]*\)\(.*TO''DO \(([0-9]*)\)\):\?\(.*\)/\1 \3 \2\4/p' | sort -k 2

I've also found it useful to make sure my unit tests output some relevant file:line: info and run those from my makefile.

Relatedly, I have:

    nmap <silent> <Tab> :cnext<Cr>
    nmap <silent> <S-Tab> :cprev<Cr>

to make it easy to jump between quick-fix items.

romaniv · on Sept 4, 2013

This is the most overrated and overquoted piece of (bad) software engineering advice ever. Sure, it's harder to read unfamiliar code than write new code, because your understand the reasoning behind the new code. That's exactly why rewrites often make much more sense than labor-intensive incremental changes.

I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked. All that without understanding how legacy code works. You're wondering what's the secret here? Instead of trying to deduce the undocumented logic behind legacy code I gathered current requirements and implemented them in the simplest way possible.

In most cases the results were dramatically more readable, because I used build-in language capabilities and standard libraries whereas old code spectacularly failed to do so. Also, I didn't have to worry about requirements that were no longer relevant.

enraged_camel · on Sept 4, 2013

>>I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked. All that without understanding how legacy code works. You're wondering what's the secret here? Instead of trying to deduce the undocumented logic behind legacy code I gathered current requirements and implemented them in the simplest way possible.

Except one of the main reasons legacy code tends to be so long is that it deals with tens/hundreds of small edge-cases or bugs that your "sub-100-line method" simply cannot account for. So what you are doing when you re-write the legacy code without even understanding what it does is regressing the product by several years in terms of maturity and stability. Your new code will face the same problems the legacy code did, except it will break. And then you will have to start adding to it...

Now, it is absolutely possible that those edge-cases and bugs that the legacy code dealt with are no longer an issue: maybe they were added back in Windows XP days, and your users no longer use Windows XP, or something. In that case, yes, go ahead and rewrite it. But you need to think about and understand what it does first, and why.

barking · on Sept 4, 2013

Exactly, It comes across as really arrogant to replace thousands of lines like that without even looking at them. Maybe those old timers first iteration only had 100 lines too.

nairteashop · on Sept 4, 2013

I've done massive refactorings in the past as the parent is suggesting. But before I do it, I try to figure out whether the code is hard to read because the developer was bad, or because a good developer was forced to add support for hundreds of edge cases over the years.

If I notice at a cursory glance that I can replace more than a few ridiculously convoluted chunks of code with something simpler and be sure that I broke no edge case for those chunks, I just assume that the previous developer was simply incompetent and rewrite the whole thing to meet the original requirements. (And perhaps down the line a developer better than I will do the same with my code!)

coolsunglasses · on Sept 4, 2013

Well, maybe. It really depends and you won't know unless you're reading the code and cross-referencing with applicable unit/regression tests.

Some of my most productive days have been in eliminating unnecessary/poorly written code, not in adding lines.

But you can't do that unless you know:

1. What the program should do

2. What the specific code you're looking at is trying to accomplish vs. what it actually does

Bringing #2 (what it actually does) into synchronization with what it's supposed to do can allow you to eliminate code, and the opportunities for reducing the overall amount of code grows as you look at a wider scope/% of the overall project.

But it all depends.

romaniv · on Sept 5, 2013

I didn't say I never looked at them. I said I didn't need to understand how they worked. There is a huge difference.

For example, I've seen logic that looked at roughly 10 user input fields and generated one out of 30 pre-defined values. That was the extent of what it did. It had multiple classes, complex global state management, database and session interactions and thousands lines of code (with no automated tests or comments). That's how real legacy code looks.

When asked to modify something like that, a bad developer would simply add a few more lines of code, do some manual testing and move on.

Another kind of bad developer would start refactoring that code 'by the book', writing unit tests... and would never finish.

A good developer would start by asking what is the point of that final value the code generates and why we really want to modify the logic. Sure, you want another classification of stuff, but why?

In my case I realized that everything and everyone that were relying on that final value could use something else. Instead of spending time messing around with pathological code from above I spent my time switching systems and people to using other values. Then I simply removed that legacy classification mess altogether. The number of tickets we got for the system dropped by at least 50%. I still don't know exactly how all that code worked, because I merely traced out where it interacted with the rest of the world.

Someone will surely comment "But that's refactoring!". I don't care how you call it, because at the time and place I was working on the project it was considered a rewrite, and that's what matters. If everyone followed Joel's advice, no one would question the necessity of that code, no one would look at deleting it altogether it would still be there, except bigger and buggier.

snowwrestler · on Sept 4, 2013

A full rewrite is a great excuse to re-examine all those exceptions and see if they are really, or still, needed. In fact this can be a reason to choose a full rewrite over incremental change.

enraged_camel · on Sept 4, 2013

Yes, sure. But if you read the parent's comment, he's bragging about replacing the code without even understanding what it does.

jbert · on Sept 4, 2013

> I gathered current requirements and implemented them in the simplest way possible.

If you can get complete current requirements, that's fine. The issue with this is "all the stuff which falls through the cracks".

In the example from the post, some code to "handle case where internet explorer is not installed". That the sort of thing which would often be missed in requirements gathering, discovered in the field and then patched into the codebase.

Sure - if you have one of:

* close-enought-to-perfect requirements

* close-enought-to-perfect regression tests

then a rewrite isn't too hard. But achieving either of the above is something I'd consider a hard problem.

Do you have a particular way of getting signoff that your current requirements are "complete"? That would be worth knowing :-)

kyllo · on Sept 4, 2013

Yeah, my favorite definition of "legacy code" is "code that has poor test coverage."

The regression test suite should literally implement all of the business requirements, and then the two become the same thing (if there's no test for it, it's not a requirement, and vice versa), so then you can work on the codebase with confidence, and it's therefore not legacy code, no matter how old it is or what old school language it's written in.

vectorpush · on Sept 4, 2013

I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked.

This just about never happens outside of start-up world. You can't go around deleting thousands of lines that you don't understand and just pray that your new implementation won't silently (or catastrophically) break subtle side-effects (errors or otherwise) that the organization has adapted its business around. That's just madness and would probably get you fired in most places.

ebiester · on Sept 4, 2013

Eh, We just did the equivalent in an enterprise app.

It's amazing how convoluted code can get when someone doesn't sit back and think.

It's true -- if a codebase is predominantly written and maintained by competent developers who write good tests and are given the authority to refactor as they go, this shouldn't happen.

On the other hand, who wrote it can be a significant factor in the decision.

rblatz · on Sept 4, 2013

I hate working with people that have this attitude. Blowing away old code that works and rewriting it, because obviously they are the smartest person, and can do it better than anyone else.

I've had no end of regressions and just full out failures because someone rewrites code and doesn't understand what it does or why it does it. Then I come in and see that if they had only spent an hour reviewing the code and understanding it, we would have saved weeks of development and testing time.

avenger123 · on Sept 4, 2013

I would be hesitant to just make a broad statement like 'overquoted pice of (bad) software engineering advice ever' just based on your experience.

It's one thing to replace legacy plumbing code (which what you describe is likely just that) with new 5 line plumbing code.

It's quite another to walk through lines and lines of classic ASP code (sub your own 'legacy' language here) that mixes data access with business logic and perfectly meets the businesses needs.

I have found it hard to go to a client and say we will build you a new version of the application but most of the cost is in the rewrite to make the software more "modern" and only a small portion of the cost is new features.

Also, try getting current requirements from business stakeholders who sometimes don't even have a clue what the behind the scene 'magic' is happening in the application but it fits their business needs and gathering requirements amounts to 'do what the app already does'.

I certainly don't disagree with your opinion but I would venture that if Joel's advice were to be treated as a software development pattern, it would generally fit the situation.

RogerL · on Sept 4, 2013

We have plenty of empirical evidence that you can rewrite from scratch - every competing product in the marketplace is essentially a 'rewrite from scratch' from the perspective of all the other products. Almost very start up poster on here is 'rewriting' an app in that sense.

If you can't 'rewrite from scratch' (i.e. write a new app) you probably don't have very good programmers (they are only comfortable bandaiding and throwing a few routines in existing code), or, perhaps, management is tying their hands someway. So, I don't buy into the horror stories. Well, I buy that they exist, just not that they should drive my decisions.

Of course, you should expect the cost of your new app to be about the same as entering the business cold. It's particularly difficult to meet every requirement/feature/bug of the old system. Orgs get fossilized around features. If you were to download and install a new IDE you'd expect it to have different features from the old one. You will miss some of the old features, but presumably the new features more than make up for it (otherwise why would you switch). Yet, when we upgrade apps, it's "remove nothing at any cost". It doesn't necessarily make a lot of sense. It can take enormous effort to re-implement some broken feature as opposed to just offering a new way to do things.

None of that is to say I blithely throw code away. I did, once, amidst a bunch of hue and cry of 'oh no', but it was the correct decision because the old system was hacked together by 'weekend programmer' (my boss at the time who was very smart but learned programming from a book). I turned it into a modular, decoupled system, and most of those parts got reused over and over in different parts of our vertical. If you have a plan, and the old code doesn't, it can pay off.

There's more to be said, but this is a wall of text already. Every day I work in the refactor mode, even while mostly adding new features. Its an extremely powerful idiom. But I also have my eye on code that really needs to go to the big bit bucket in the sky. No one understands it, every fix introduces new bugs, and what it is trying to do is very well served by existing open source code, or code be written from scratch.

obiterdictum · on Sept 4, 2013

Note the article is from 2000. Joel mostly talks about big and mature desktop software (big as in Microsoft Excel). Average startup's code is small, narrowly focused, has a (relatively) short lifespan, so it can be rewritten from scratch by 1 or 2 people.

If you've ever worked with legacy systems older than 10 years, you'd notice that they become a victim of their own "success". You can't throw away features added over years, because a lot of users depend on it, and if you tried to rewrite it, you'd have to rewrite bug-to-bug.

I worked on such systems, and I did try rewriting, and gave up because of sheer volume of work and the knowledge of institutional lore that was required to do it. On the other hand I'm currently embarking on our startup's "rewrite" and things are much much easier, because the feature set is small and I can freely throw away stuff that didn't work.

See also, http://en.wikipedia.org/wiki/Gall's_law

smacktoward · on Sept 4, 2013

> Average startup's code is small, narrowly focused, has a (relatively) short lifespan, so it can be rewritten from scratch by 1 or 2 people.

But the reason those startups can be narrowly focused is because they can leverage huge volumes of working code written by others. In that sense, Joel's advice still holds, if you think of "rewrite" less as "throwing out your own code and starting over" as "throwing out memcache and writing your own caching layer/throwing out Postgres and writing your own database/etc." Which is a temptation that lots of startups fall victim to, rarely with a happy ending.

RogerL · on Sept 4, 2013

My bad, I brought up 'start up', but that was not the point of my comment. Big as in Excel? Well, we have OpenOffice, Google Docs, and so on. They were written from scratch with the aim to compete with Excel/Office.

As for the victim of their own success, I addressed that, arguing that it is often inertia and silliness, as we accept different feature sets when we switch our tools out.

The reality is, we rewrite all the time. I used to develop in Borland and OWL. I moved apps into MFC and Visual Studio. Now stuff is being moved into Qt. People have switched from native to cloud, and so on. We endlessly move to new platforms, new software, and so on. In all those cases we accept that the feature set will be different, yet for some reason we don't accept it when we are rewriting an app. I'm not dismissing the cost - if you need to generate a TSP report, and I don't offer that feature, then you are pulling out Python or something to hack it together yourself, and that has to be counted as part of the cost of the project.

We do rewrite large infrastructure software. We do. All the time.

mathattack · on Sept 4, 2013

It isn't just that people depend on the features, you frequently lose track of which features were created for who, and for what reason. Sometimes it's just easier to keep things in motion, rather than start over and wait for someone to scream. (A case can be made for both)

RogerL · on Sept 4, 2013

Right. I do not argue that rewrite is always the right thing to do; indeed, I argue that very most often it is the wrong thing.

But, answer me this. How many companies have imploded because their software was not maintainable? That's the other half of this article (Joel only wrote half an article, I contend). You can no longer make competitive bids for work because your impossible to understand. It takes months for the simplest change. Your customers leave in droves because your code is endlessly buggy, and you are pouring money into the drain of bug fixes that just introduce new bugs. Or, it is just a long drawn battle, as your profit margins slowly erode away as each new feature becomes incrementally more expensive to implement, until you are at negative return.

I say again; we have massive empirical evidence that total rewrites of very large infrastructure works. If you don't do it, your competitors will do it for you. And, of course, if you do it when there is no competitive need for it, you will be flushing money and/or your company down the drain.

(edited to fix some grammar and clarify a few poorly worded points)

mathattack · on Sept 4, 2013

I wish good data existed on this, but most would be confidential, and this is very hard to measure to begin with. Ultimately it's a judgment call. There isn't a black and white, but Joel is giving a Year 2000 plea that people at the time were tilting the wrong way.

melbourne_mat · on Sept 4, 2013

"If you can't 'rewrite from scratch' (i.e. write a new app) you probably don't have very good programmers"

A statement like this makes me think you haven't been doing this stuff very long. Software development is a continuous compromise between good engineering practise and business priorities. The skill of your team has nothing to do with it!

prospero · on Sept 4, 2013

Nothing? Why all the fuss around hiring good developers, then?

chaosphere2112 · on Sept 4, 2013

I suspect that he meant that the skill of your team has nothing to do with being able to do a rewrite from scratch.

samatman · on Sept 4, 2013

The key sentence you buried there is "It's particularly difficult to meet every requirement/feature/bug of the old system".

Because that is precisely the point that Joel is trying to make. My whole job consists of doing that for code that will never, ever be 64 bits or multicore, and therefore had to go. But it is painful. Avoid it if possible.

Aqueous · on Sept 4, 2013

"(they are only comfortable bandaiding and throwing a few routines in existing code)"

In my experience it actually takes a better programmer to refactor existing code than to start from scratch, because anyone can type 'git init' and the temptation to do so is high. Truly experienced programmers can read and understand other people's code, and more than just bandaiding it, can fix it where it goes wrong.

DanBC · on Sept 4, 2013

> Of course, you should expect the cost of your new app to be about the same as entering the business cold.

That's a pretty big cost if I'm WordPerfect.

jscore · on Sept 4, 2013

I think you completely missed Joel's argument.

shubb · on Sept 4, 2013

This is an interesting article because though the rewrite killed Netscape, the rewrite became Firefox, possibly the most popular browser in the world.

It became that way because developers were able to run laps round Microsoft (who clung to an old, bad codebase).

And because it was a redesign not just a rewrite, and the redesign redefined what browsers did.

Firefox, as we use it today is a rewrite of that rewrite, with god forbid far less features than the original!

I wonder, with a little more money behind Netscape, and without anti-competitive MS browser bundling, would Joel's piece look so correct in hindsight?

hkarthik · on Sept 4, 2013

I remember using IE when they started bundling it with Windows and it was a FAR better experience with instant loading instead of waiting Netscape to load. Everyone likes to blame the bundling of IE as causing Netscape to fail, but in reality, the initial versions of IE just worked better. Netscape's engineers knew this, and probably realized they had to rewrite to compete with the speed of IE.

Interestingly enough, these days hardly anyone I know uses IE or Safari even though both are bundled with their respective operating systems. Most of these folks install Chrome because they are familiar with it and it doesn't impose any significant performance penalty.

Ideally, Netscape should have been optimizing their browser for speed all along. But since they were the only kid on the block for so long, they became numb to how slow things had gotten. It took IE coming in with a much better experience to shake them into action. Unfortunately, by then it was too late for Netscape as a business to recover.

emiliobumachar · on Sept 4, 2013

For your information, The bulk of IE loading happened during Windows boot. The user just wasn't informed of it.

I don't know whether Netscape could have done the same and didn't, or whether it required some inside access to Windows.

shubb · on Sept 4, 2013

In oldschool firefox, maybe in navigator too, pre-loading was implemented as a tray icon. The tray icon would load at startup, and could be used to launch the browser. It still felt overly heavy.

to3m · on Sept 4, 2013

Sure. You make a great point. A mere four and half years after the article was written... ta da, Firefox! Straight outta nowhere, refuting the article's core point. One day they chuck out their dirty old codebase, then just four and half years later out pops their shining new browser. Stick that in your pipe and smoke it, Spolsky!

nickm12 · on Sept 5, 2013

The rewrite wasn't Firefox. The rewrite was Netscape 6. Firefox was streamlined version of Netscape 6 which hit 1.0 four years after Netscape 7. So from Netscape 4.0 to Firefox 1.0 was 7 years.

I think Joel's argument was that continual refactoring and improvement could have allowed one to evolve into the other in a much shorter time.

nostrademons · on Sept 5, 2013

And I think that's a dubious proposition, because the advantage Firefox had over both Netscape and IE was simplicity and speed. Can you name a single software product that gets faster, smaller, and simpler over time? I can't think of any; in general, software seems to accrete features, and if it manages to lose them, it usually grows new features in their place. Netscape was replaced by IE, which was replaced by Firefox, which was replaced by Chrome, which is getting replaced by native mobile widgets. In each case, they needed to build off something that started off at an earlier point in their evolutionary line, throwing away all features that had been developed as the software met actual users. Mosaic begat Netscape, then Microsoft started back with the Mosaic code by licensing Spyglass, then Firefox went back to the rewritten Netscape 6 code, then KHTML begat Safari begat Chrome.

outworlder · on Sept 4, 2013

We have no way of knowing that.

But I can imagine we'd living in a very different environment, had Netscape succeeded.

freework · on Sept 4, 2013

If I had superpowers that allowed me to delete one website from the in internet forever, it would be this article.

I find the "complete code rewrite" to be the most powerful programming technique I have in my repertoire.

In 2007 I wrote an application called FlightLogg.in. It was basically a clone of logshare.com that I wrote in my spare time with PHP. It was my first really 'big' project. You can see the code here: https://github.com/priestc/old-flightloggin

If you look around that codebase, you'll see nothing but spaghetti. I starting writing the app in late 2007, back when you could say I was a 'noob' (so noobish in fact, I didn't even use source control). At the time I thought I was writing the most awesome code ever. By the middle of 2008, that project had reached a state where it was pretty much 'done'. I then stopped working on that codebase, and moved on to other projects.

Then by late 2008, FlightLogg.in's traffic had kept growing, and there was a list of bugs and really cool features that I had thought up since I last touched the codebase a few months earlier. I set out to continue development on the PHP codebase. The problem was that in the past few months I've been away, I had become a better programmer, and the sight of that spaghetti code made me vomit.

It was less about the code being 'bad', and more about me not remembering how things worked. The small set of fetures and fixes I wanted to make would have taken me a few hours to do back when I was first working on this codebase. Since a few months time had passed, it was taking me much longer.

Eventually I decided to do a complete code rewrite. This was before I ever saw that Joel Spolsky article. The new codebase is here: https://github.com/priestc/flightloggin2 It took me roughtly the same amount of time to build the new codebase as it did the old codebase.

Ever since, each and every project I take on, I accept the reality of doing a complete code rewrite might come up. For my own personal projects, I do complete code rewrites all the time. On the other hand, at the various jobs I've had over the years, its a much different story. If I came into work today and brought up the idea of doing a complete code rewrite, I'd either get laughed out of the room, or even worse, threatened with being fired, thanks to this Joel Spolsky article. Thanks Joel.

trustfundbaby · on Sept 4, 2013

Joel isn't talking about the kind of software that can be rewritten in a few days or weeks. He's talking about "large scale commercial applications" ...

If you have an app with a complex codebase that is making money in a pretty competitive space and you decide to stop everything and do a complete refactor (the blocking kind in which you can't do anything else while that is going on) then you find yourself in a situation where you can easily be surpassed by a competitor. Its probably a smarter move to do a more 'gradual' refactor.

etler · on Sept 4, 2013

Assuming your code base is even clean enough to do that. Lack of encapsulation and modularity is also part of technical debt. Getting a disastrous codebase to the point where gradual refactorization is even possible will be a blocking task, and if the technical debt has built up high enough, just that single blocking task may cost you more than a complete rewrite. It depends on the code base and how much foresight was had when it was first written. If technical debt is such a serious problem that you're even considering a total rewrite, the blocking debt may be so massive that it makes a gradual refactor impossible.

tieTYT · on Sept 4, 2013

Did you really mean to use the phrase "complete refactor" here? AFAIK, refactoring means modifying your code without modifying features. That's very different from a "complete rewrite".

freework · on Sept 4, 2013

Its not like I had to shut off the old site while I worked on the new codebase... The old codebase was running perfectly fine, up until I shut it off and replaced it with the new code.

aidos · on Sept 4, 2013

There are plenty of projects where by the time you've finished your rewrite the product will have moved on so far that you'll have to rewrite some again. I seem to recall Facebook making several attempts to rewrite in a language other than php but the codebase just evolved too quickly.

etler · on Sept 4, 2013

You don't rewrite for features. You rewrite for better maintainability and productivity. After your rewrite you shouldn't have to rewrite it again. The balancing act is that a rewrite should improve your implementation velocity, so while it may be a setback, your velocity should be much higher than your competitors, so over time you would be able to surpass them and then gain a lead they cannot beat thanks to your higher productivity. As the technical debt grows, your productivity will get slower and slower, so you won't be able to keep up with your competitors anyways. That's the gamble.

Just because Facebook was written in PHP doesn't mean that the code was unmaintainable. I don't know if it was or not. The problem they had was that it was unscalable, and they didn't do nothing about it. They wrote a PHP to c compiler, and that's a monumental task. So they paid their technical debt in a different way that they decided was the most efficient way for them to deal with it. If the problem isn't scalability, and is maintainability, you may not have that luxury. Unmaintainable code's problem is inherently in its structure, and the only solution for that is refactorization or a rewrite. The code may be so unmaintainable that even refactorization isn't really possible. The best offense is a good defense. Deal with technical debt from the get go. Sure, write your MVP in a weekend just to test it, but rewrite it early for maintainability while it's still small and possible.

aidos · on Sept 4, 2013

Fair points. I guess I was reacting more to the fact that the original example probably wasn't of the scale (single person) to use as an example. The Facebook case is definitely different in that they were building new features too fast to keep up with.

My project (single developer) is undergoing a big underlying change (switching from MongoDB to Postgresql) at the moment. I still need to keep development going on the live branch while I do this work. I'm doing it because I'm having maintainability issues with the data store that I've been putting off for a couple of months. It's as you say - make sure you refactor before you discover that refactoring is impossible. It's something I do regularly and my business partner has noticed that after a refactor of a component he gets other features faster so that's the only justification he needs.

cpeterso · on Sept 4, 2013

But are you adding new features to the old codebase? The challenge is when the rewrite takes longer than planned (e.g. Netscape) and your old product is no longer competitive but the replacement (which, as a rewrite, may not include any new features) is not ready.

GFischer · on Sept 4, 2013

It appears that for your case, a pretty new and small project, doing a complete code rewrite was the right choice.

However, for BIG codebases (especially those that you didn't write yourself !!!! ), I'd say Joel is spot on.

The company I work for is currently busy throwing five million dollars down the drain on a greenfield re-development of a bad but servicieable insurance package, when they could have either overhauled it or bought a commercial product - which they deemed "too expensive, but I'll eat my backpack if the final rewrite isn't both late and over the budget of the commercial product. It's not a Netscape-style failure, but it will probably cost it a lot of market share.

Osiris · on Sept 4, 2013

Even for smaller projects it may not be the right choice.

I had a side project I work one that would take a complete re-write to make it testable as well as being able to work better in Windows 8.

But I find that when I think about it, it's a huge amount of work and while new features and bugs can be annoying to fix (lack of tests also means inevitable regressions), but still far less time consuming than a re-write.

Plus, if I completely rewrite it so it's awesome and more flexible, I expect that my revenue will probably be about the same as it is now. Rewriting it won't increase my sales unless I'm able to add features that would be impossible in the current code.

MartinCron · on Sept 4, 2013

It's not just(!) the five million dollars that they are throwing down the drain, think of the opportunity cost spent in what you didn't develop during all that time you were basically running to stand still.

GFischer · on Sept 4, 2013

Yeah, I'll be abandoning the Titanic ASAP :) .

It's an insurance company, it shouldn't think of itself as a software shop. It does sometimes make sense to build stuff in-house, but not an insurance package (reinventing a pretty expensive wheel).

MartinCron · on Sept 4, 2013

Well, if they look at their ability to make an effective insurance package as a competitive advantage instead of a necessary evil, I don't see anything absolutely wrong with an insurance company making insurance software, but I'll defer to your domain expertise :)

Good luck.

GFischer · on Sept 4, 2013

It's a small subsidiary of a big insurance company, with 120 employees in the local branch (20 of which are now in software) and 45 million dollars in premiums (that's sales, not profits).

They hired a new CTO, who hired a new team specifically to write the new software, and are NOT using any of the old developers' knowledge (me and 2 others are stuck maintaining the old software).

The new team blew the deadline for the first small module (a small subpart of claims management) by 6 months (the old team had estimated 2 weeks for delivery of that module if we had done it ourselves).

They vastly underestimate the effort and have no idea of what they got themselves into (and they're paid employees, it's not like they can be sued for not delivering).

It doesn't make sense for a company that small to divert that much effort into a non-core competency.

Now, if the parent company (10.000 million in premiums) decided to build their own insurance software, I believe they could do it right :) .

oblio · on Sept 4, 2013

Personal projects are one thing. But company projects involving more than 3 programmers? Forget it! He's right about these projects - you shouldn't rewrite them. First of all the experienced people will probably move on, secondly business pressures will never allow you to do things "just right". I can't tell you how many rewrites I've seen fail, compared to the handful of rewrites that actually achieve their goals. And this is a very risky proposition for a company.

MartinCron · on Sept 4, 2013

If I came into work today and brought up the idea of doing a complete code rewrite, I'd either get laughed out of the room, or even worse, threatened with being fired

If that's really the case, then you're working with a bunch of abusive, pedantic, and narrow-sighted assholes. I will generally say "no" to the idea of a complete code rewrite of any meaningful and currently useful system (based on my own experience as much as Joel's article), but I won't laugh at people or threaten to fire them over it.

umsm · on Sept 4, 2013

Or they may have experience with people that walk in and declare all code as "insecure" and "dangerous" or "unusable". They usually promise the world and deliver a small fraction, if even that.

Then, for the next 9 months delay the "1 month" project week after week. The product owners love this one. It cost them hundreds of thousands of dollars to get to square one again.

And all of this happening to a product that wasn't released yet.

At the end, the developer quit and everyone in the company agreed that it was good to see them leave.

MartinCron · on Sept 4, 2013

Good to see you're not bitter about that.

umsm · on Sept 4, 2013

Well, I am young, so I may go through something like this again. But at the moment, I was totally against it.

Now, when something like this comes up again, I learned enough to PLAN through it better. Realistic deadlines will make any work place a better place to be.

MartinCron · on Sept 4, 2013

Of course, and it is easier to plan for incremental replacement and refactoring than for a wholesale re-write.

etler · on Sept 4, 2013

Technical debt will never go away unless you confront it directly. If you put it off for too long you simply cannot recover it. Yes a complete rewrite is painful, and can kill you, but so will technical debt. By the time you're drowning in it, there really isn't any easy escape, if there is one. The cost to fix is an investment, it pays off over time, so any refactorization work will eventually pay off, it's just a matter of whether you can pay the cost. Eventually the cost will be so big that a full rewrite will be cheaper. If Netscape didn't do a rewrite, their technical debt would have stayed and they would have died a slow painful death. The real worst mistake you can make is having a policy of putting off technical debt. I've seen it kill companies, and for them, it's a permanent death with no hope of a revival.

silentmars · on Sept 4, 2013

Joel's article is cautioning against the kind of cost/benefit-blind thinking that makes "rewrite all the code" a default option. Actually, I think in a certain way your story - perhaps unintentionally - agrees with this point.

In your case, you accepted the cost of doing the project over, which as you said was to write it again taking the same amount of time. The fallacy that Joel is talking about is engineers thinking that they can do a total rewrite of some software that took years to build in some much shorter period of time with better code. The predictable result is huge financial losses, and sometimes even a total market abdication.

Aqueous · on Sept 4, 2013

"It took me roughtly the same amount of time to build the new codebase as it did the old codebase."

Imagine if you spent that time re-engaging with, documenting, and cleaning up the old codebase - you could have also added a bunch of new features in the time it took you to get up to feature parity with the new rewrite, which also doesn't have documentation and doesn't have any additional features. If you're like me, you'll find out when you go back to look at it in 7 months that you again don't understand it. What then?

It's better just to accept that there is cognitive overhead involved with going back in time in your projects. But it's worth it since the cost of a re-write is higher.

There are very few circumstances where re-write costs less - and I think it's when the exact financial cost of the technical debt from the first version outweighs the overhead of the re-write. Sometimes this happens, but it's not very often.

I'll admit that I've done exactly one rewrite and it was exactly in this type of situation. I was rushed on a project with a new employer and forced to use a language I despise, PHP. The project was to create a document management system for power plant engineers. I had to come up with a production ready system in a month and a half. I did, but of course, it was initially very buggy, poorly designed, with no unit tests, and written in a sub-par language(Sorry PHP fans.)

In this very rare case, the (in my mind unavoidable, thanks to the time constraint) technical debt from the first version was so high that a complete re-write made sense.

I have since rewritten the entire project in Scala with the initial project in production (and most of the initial bugs fixed). I did so understanding that from a financial perspective, there was a certain amount of overhead in re-writing it. But I felt that the technical debt outweighed that overhead. In this case, I did so much integration testing and unit testing on the re-write that I was able to get it up to feature parity without many bugs in about the same amount of time.

But I know that because this code is properly documented, architected, and unit and integration tested, I will never, ever re-write this version, because it will never make sense to do so.

Which is the lesson: Take the time to design your system right, and test every component, the first time and there won't ever be a need for a re-write, because you'll always be able to look at the code and tell what it does.

"I'd either get laughed out of the room, or even worse, threatened with being fired, thanks to this Joel Spolsky article."

I doubt very many people remember this blog post from 14 years ago.

Osiris · on Sept 4, 2013

The last project I worked on we were doing rewrites by replacing specific functionality and displaying that within an iframe. Over time we slowly replaced various pieces and added new functionality in a new code base with the end goal to eventually replace it all.

In the end, the last project we took upon ourselves turned out to basically require a rewrite of the majority of the app and the rewrite was scraped at about 80% completed.

cpeterso · on Sept 4, 2013

> the rewrite was scraped at about 80% completed.

Which rewrites were scrapped? All of the iframe rewrites or just the last feature rewrite? Was the result considered a (partial) success or a failure?

Osiris · on Sept 4, 2013

Just the last feature, the previous ones were completed and deployed. The department shifted focus which left us with a small window to finish up the project and we weren't able to get all the features done in that time frame. Had the department shift not happened, we could have finished within about 2-3 months.

I actually think we should have deployed without re-implement every existing feature, but enough to make it work for most use cases (with an option for users to switch between old and new), but there was concern that we'd drive up support costs during our pivot and they wanted to avoid the distraction.

So the project wasn't a failure, but a department-wide pivot put restrictions on the project and made it less attractive to finish.

mtdewcmu · on Sept 4, 2013

I think an essential part of Joel's point is that the existing code was written by other people, and you get the impulse to rewrite it rather than read it. If you wrote the first version entirely by yourself, then you probably have a good idea of what's in there and what it would entail to rewrite it.

ibudiallo · on Sept 4, 2013

Complete code rewrite is only good when you are in a learning stage or when the project is not too big. Incremental update is much better and allows new features to be added without halting all the project for rewrite.

MartinCron · on Sept 4, 2013

You have to remember that at the time it was written, complete code re-writes were a lot more common and often had disastrous results.

Sure, I've also completely re-written systems in a way that wasn't disastrous. I'm doing one right now. But the legitimate case for it is sufficiently rare to deserve some extra scrutiny, even if it is a more fun approach for the developers involved.

zimpenfish · on Sept 4, 2013

If you read jwz's output, it's not that "Mozilla decided to rewrite it", it's more like "bunch of idiots who didn't know what they were doing took over and tried to rewrite things whilst ignoring all the sane advice".

hvs · on Sept 4, 2013

Great article that is still relevant today (sadly, I remember reading it when it first came out). If you really, really feel the need to rewrite from scratch, I recommend instead picking up a copy of "Working Effectively with Legacy Code" by Michael Feathers [1]. It will give you ways to improve those terrible code bases while not throwing out the existing code. Plus you'll still get that "new car smell" of working on your code.

[1] http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...

TheRealDunkirk · on Sept 4, 2013

Seems to me that a lot of what Joel was complaining about could be mitigated by some well-placed comments in the code. "This function is 2 pages long! Oh, this part does that, and THIS part does THAT. I see what's going on. This is all good."

This seems to be something people don't even bother to argue about any more. Everyone's given up! I don't do much myself, but no one I have worked with over the past 20 years has done ANY. Scary. I've only recently moved out of engineering to a "real" coding firm, but I'm still not seeing it.

I guess coders, in general, either think that everyone should immediately understand WHY code should be doing what it's doing, or be fired, or they think it's job security to not share these details.

pjungwir · on Sept 4, 2013

That's sad to hear. Part of it is an ideology that code that needs commenting should simply be rewritten. But I've written many paragraph-long (or page-long) comments, especially at the class and module level. Where comments are really necessary is to give a big picture overview how all the parts come together. Or to explain the "why". My personal theory is that comments are like chess annotations. Sure you could just read the notation, but the comments show you what is unseen, like what would have happened in six moves if white had taken that pawn. . . .

omegaham · on Sept 4, 2013

This is a really insightful comment. I've struggled in the past with what the purpose of commenting is. I've written some programs where comments were the majority of the code, and it was actually more difficult to read than if I'd just left it unannotated. I've also written quick-and-dirty programs that gradually turned into behemoths, left them alone for six months, and then come back to them and felt like kicking myself for not putting comments in.

I just had a crazy idea for commenting, though - I'm imagining an IDE that is built for two monitors. The left monitor is your source code. The right monitor is your comments. You wouldn't put your comments in and around the code; you'd put them into this application that would be running on the right monitor. When you select a line of source code, the relevant comments pop up on the right side. They can be as complex as you want them to be. If your class needs five pages of comments, you can put five pages of comments and not have to worry that it's going to disrupt the source code. Then, when you select something else, that comment disappears and is replaced by the comments for that line. You could even have extra monitors (or sections of that comment monitor) showing multiple things - for example, you could have the function's comments along with the line-by-line comments.

Is this crazy, or more importantly, has this been done before? I think the first thing that someone will say is that it's too much effort, but it doesn't have to be used for everything. You don't need to make a five-page comment for your for-loop, but you might for your massive class that encapsulates ten objects and has forty-five functions that modify different things.

pjungwir · on Sept 4, 2013

I've never heard of showing two different view styles of the same source code document in two separate windows/monitors, but it makes sense. Perhaps this could be done with literate programming, or LaTeX packages (where you use the same file either to build the documentation or install the package), or even Javadoc and its imitators, with an IDE that lets you collapse either the comments or the function bodies. Maybe Light Table could have something like this? But if your tooling had good support for it, like being able to click on a function in the code window and make the doc window bring up that function's info, that'd be pretty nice.

btilly · on Sept 4, 2013

Fixing unmaintainable code with guaranteed to be unmaintained comments is a losing cause. To the extent possible, you should not duplicate logic in code and comment.

I find that I comment for three reasons. 1) To make clear what the public API is that shouldn't change versus what can change at any time. 2) To comment on non-obvious but important details of other APIs that I am calling. 3) To give a very general overview of the code.

But NEVER do I put comments of the form, "This part does this, and that part does that." If I feel tempted, well, maybe I've discovered a function that I want to use instead...

ronilan · on Sept 4, 2013

The rewrite meant Netscape 6 used Gecko, which brought us Firefox, which revived the browser wars, which brought us the WebKit browsers.

History is a twisted mess...

dubcanada · on Sept 4, 2013

The real question is would Firefox on old Netscape 4/5 engine run the same and allow them to do what they did with Firefox? Or would they have eventually rewrote it into what Gecko is today?

yuhong · on Sept 4, 2013

I think Mariner (the codename for the cancelled project) was supposed to be developed in parallel with Gecko.

GVIrish · on Sept 4, 2013

One of the huge pitfalls in the writing from scratch situation is that often times the original application didn't have adequate requirements. So the team sets about building the new version without a comprehensive and detailed understanding of what the old version did. That alone can lead to huge timeline and cost overruns.

And if your organization didn't learn from its mistakes the first time around, it'll make many of the same mistakes the second time around, except this time they'll be concentrated in one monster release rather than a number of smaller releases. Especially if the original developers have left.

brudgers · on Sept 4, 2013

It is easier to write an incorrect program than understand a correct one.

-- Alan J. Perlis, Epigram 7

danso · on Sept 4, 2013

I picked up the Pragmatic Programmer some time ago and among the great lessons it had, the one that sticks with me every day at the code editor is: You spend far more time reading your code than you will writing it.

With that in mind, it makes it easy to write tests and more comprehensible variable names (still thinking and re-thinking how to best do documentation...was checking out github's Ruby guide and Tomdoc yesterday http://tomdoc.org/)

nahname · on Sept 4, 2013

That looks like noise to me. There is almost four times more code dedicated to comments than useful code. If you write unit tests it would cover/explain all these cases. I don't see the point.

moron4hire · on Sept 4, 2013

I agree. It would work a lot better of the function were called something like "stringRepeat" or "duplicateText" or something other than "multiplex", because that's really not what multiplexing means. Then, rename that one parameter from "count" to "repetitions" and the function is almost completely self-explanatory.

Not to mention, it probably doesn't deserve to exist at all. "text * count" is simple enough that it should probably be inlined.

nahname · on Sept 4, 2013

Comments are generally superfluous, sometimes incorrect and often used to explain away bad code.

collyw · on Sept 4, 2013

Depends on the comments. I had to go and fix someone else's scripts, with comments like:

#open files and read contents

#loop through list

No idea what was in the file, or why the contents were being read. What was the purpose of the loop.

Commenting on those things would have saved me a lot of time. (It took me a week to fix something that should have taken a day, if it had been coded better in the first place.)

(To put it into context, it was a previous workers code which interacted with the database I had written. When I changed some things in my code, her code stopped working. I didn't know how exactly her code worked, or the nature of the files that it was reading from the system, but I did know what working right would be. I am sure this is not an uncommon situation.)

moron4hire · on Sept 4, 2013

Yes. I consider comments to be part of the code, and thus in need of being maintained as much as the code. The more code you write, the higher probability of defects to be written. As comments have absolutely automated 0 checking applied to them, defects in them are all but ensured.

chatman · on Sept 4, 2013

This is esp. true for dynamically typed languages.

An IDE helps a lot more deterministically for a static typed language like Java.

mcphilip · on Sept 4, 2013

I took over a large Groovy codebase that was damn near unreadable since it's controller and service layers heavily abused Groovy 'features': the def keyword (basically never defining intended types at all); not declaring return or argument types; implicit return statements; etc.

My first project was to spend a few days and literally remove all possible occurrences of def and add back in return and parameter types everywhere.

It was possibly the best decision I've made since it didn't involve throwing away any code and made existing code much more readable.

It's just sad that the highly capable developers that wrote the original code base didn't take the extra few seconds to declare their types.

Roboprog · on Sept 4, 2013

Groovy 2.x makes a bit of a nod in that direction. There are annotations to turn static typing on (or off) on classes and methods as needed.

This allows you to be "safe" in most of the code, and explicitly label the parts with needed "magic".

I agree with you that it's probably best to limit def or type-inference-like operations to local variables within functions/methods.

Tangent: I like how Go does strict typing with interfaces, but still allows duck typing of parameters by not binding classes to interfaces at class definition time.

vorg · on Sept 4, 2013

The static typing features in Groovy 2 were written by just 1 person only 1 year ago, Grails doesn't use them in its codebase, and Groovy's only other user with any traction, Gradle, still ships with Groovy 1.x. Best not to trust its features until Grails does.

Roboprog · on Sept 5, 2013

Hmm. Thanks for the tip. I've been working on trying to graft Groovy 2.1 onto an old Java app. The static type annotations don't work with Java 1.5. You have to have at least Java 1.6. (I discovered today) So, if you have Java 1.5, you can really only use Groovy 1.x language features, as well.

Otherwise, it turned out to be not so hard to start putting little Groovy seeds into a Java garden, which I'm not going to rewrite as Grails. Just being able to use closures with the Groovy Collection extensions is pretty useful.

kyllo · on Sept 4, 2013

Say what you want about Java, its awful "kingdom of nouns" paradigm, its lack of nice functional features, etc., but it is a highly readable programming language.

jeremyjh · on Sept 4, 2013

I think all the boilerplate really does detract from the readability. Your eyes sort of glaze over and you can miss small departures from the boilerplate you expect to see.

crucini · on Sept 4, 2013

Really? You have trouble with code like this?

    RegistrarInstallationsChoicesSubclass registrarInstallationsChoicesSubclass = millisDOMX509Grow.getRegistrarInstallationsChoicesSubclass();
    KeyedGSSPolygons keyedGSSPolygons = new KeyedGSSPolygons(incompatibleTransliterationNoPerfCombiningContention, iSO885915ConnectionlessAsyncDiagonal, dotDataetJDIMarshallingResourceszhObservablesDiagonal, native2asciizhDelegateEnvarLearned, insertStrategyRTFLearned);
    keyedGSSPolygons.addISO885915ConnectionlessAsyncDiagonal(iSO885915ConnectionlessAsyncDiagonal);
    AppendingGuiceJustificationMarshalling appendingGuiceJustificationMarshalling = instantiatorParenthesisForeachOutlineCounted.getAppendingGuiceJustificationMarshalling();
    AdaptiveReflogMerges adaptiveReflogMerges = new AdaptiveReflogMerges();
    adaptiveReflogMerges.addIishCoolbarTuple(iishCoolbarTuple);
    AppendingGuiceJustificationMarshalling appendingGuiceJustificationMarshalling = keyedGSSPolygons.getAppendingGuiceJustificationMarshalling();
    RanksPessimisticSeparableDialFirst ranksPessimisticSeparableDialFirst = new RanksPessimisticSeparableDialFirst(native2asciizhDelegateEnvarLearned, colorizeHmacComplexJanitorEnvar);
    ranksPessimisticSeparableDialFirst.removeIncompatibleTransliterationNoPerfCombiningContention(incompatibleTransliterationNoPerfCombiningContention);
    AdaptiveReflogMerges adaptiveReflogMerges = fixerAppletPerfEs.getAdaptiveReflogMerges();
    ReferencerBreakingViewersvAes256RenderingsGJ referencerBreakingViewersvAes256RenderingsGJ = new ReferencerBreakingViewersvAes256RenderingsGJ(membershipDriverConsistency, iishCoolbarTuple, sectorPartsGJForking, ranksPessimisticSeparableDialFirst);
    referencerBreakingViewersvAes256RenderingsGJ.addISO885915ConnectionlessAsyncDiagonal(iSO885915ConnectionlessAsyncDiagonal);
    DitherTWSDLGraphemeInserting ditherTWSDLGraphemeInserting = cnxnsCNTWSDL.getDitherTWSDLGraphemeInserting();
    Message12DL2ExposeReconcilablePerfEditionUnaligned message12DL2ExposeReconcilablePerfEditionUnaligned = new Message12DL2ExposeReconcilablePerfEditionUnaligned(dotDataetJDIMarshallingResourceszhObservablesDiagonal, sAXXMINamessvInconsistencyAnswerLoopingCommunicatorEdition, insertStrategyRTFLearned, dotDataetJDIMarshallingResourceszhObservablesDiagonal, nIDiffuseGraphemeINAes256Kind);

kyllo · on Sept 5, 2013

Lulz. This is why you don't do graphics programming in Java.

kyllo · on Sept 4, 2013

Perhaps, but I think the fact that it's such a boilerplate-oriented language, requiring explicit type declarations, arguments, and returns, and lacking lambda expressions, it really discourages cleverness and subtlety, making it easier to read than dynamic and functional languages. It also lacks the pointer arithmetic, bit shifting, and operator overloading of C/C++ that make those languages so difficult to read. Java may be repetitive, verbose, and dumbed-down, but that definitely keeps it from being a "write-only" language.

I do agree that Go is a good compromise, very readable but with less boilerplate, and probably an improvement on the Java paradigm, but the lack of generic data structures makes it kind of hard to use Go as a Java replacement.

Gravityloss · on Sept 4, 2013

Perhaps java is fast to read in characters per minute. But for actual functionality, I'll take something like Ruby any time.

kamaal · on Sept 5, 2013

Readable by IDE's.

Large Java applications have 200+ files. Its next to impossible to handle that kind of an interaction without an IDE.

randyrand · on Sept 4, 2013

This has always been my major complaint with dynamically typed languages. This and bad optimization - its a wonder that dynamic languages are so popular. GO is a good compromise though.

eldavido · on Sept 4, 2013

This article shows how much progress we've made in languages and tools over 13 years. It's vastly more difficult to read large, c++ codebases with tons of memory allocation, pointer arithmetic, macros, and complex locking schemes than much of the Java/Ruby/other mainstream language code written today.

Also, it strikes me that this was written in an era where most large-scale systems were giant monolithic codebases sharing an address space, rather than distributed groups of services communicating over a network. Distributed systems are inherently easier to rewrite because a lot of the hard problems around abstraction, state, representation, etc. already have to have been solved by the architecture -- you can't exactly pass a naked void* over the wire.

semjada · on Sept 4, 2013

So he was wrong then? NS6/firefox seems to be doing alright today... Rewriting your OWN codebase from scratch certainly makes it get better with each iteration - as long as "from scratch" implies constantly referring to the old code.

rythie · on Sept 4, 2013

In 2000 Netscape had 19.25% market share, it took 8 years to get back to that position with Firefox: http://en.wikipedia.org/wiki/Usage_share_of_web_browsers#The...

kbd · on Sept 4, 2013

This is from 2000. People need to stop posting old articles without putting a date in the title.

mattlutze · on Sept 4, 2013

Relevance and recency are not always directly proportional.

etler · on Sept 4, 2013

The new historical perspective on this is very interesting. The code rewrite killed the company, but it gave birth to mozilla, which came back to topple IE from its throne and help bring through a new renaissance of browser development. If they hadn't rewritten their code how would have history changed? Would we be where we are today? How many years behind would that setback have put us? The release of Firefox was instrumental at disrupting and evolving the web. With a poor code base could they have done that as effectively as they did?

A company may be transient, and killed by choices that hurt them in the short run, but will support them in the long run. But good open source code is eternal. The quality speaks for itself. Their decision may have killed their company but it was astronomical for the web. From a higher level perspective, they did the right thing, and it shows that good code is so powerful, it can even outlive the company that made it.

masswerk · on Sept 4, 2013

On the other hand it was particularly for this stand-still of NS, why IE was on the throne. And this made HTML/JS effectively a proprietary standard, which brought web development eventually to a stand-still also.

When open standards enforcing browsers finally became available (and – saying this as a once ardent NS-follower* – what a buggy mess NS 6.0 was in the beginning!), it was essentially a political question, whether they would succeed or not (as they were incompatible to most of the existing web code base).

*) I used to code NS4.x-compatibility and open standards calls then, even, if I wasn't able to bill for these as clients were only interested in "does it work on IE, so it's fine".

lxe · on Sept 4, 2013

I know I don't have the expertise to argue with Joel Spolsky about software engineering, and while I agree that rewriting from scratch is a huge strategic mistake, it is hardly the worst.

Have any of you ever worked for a large engineering firm with a sizable operating budget? There are many "strategic" errors that happen on the daily basis in those kind of shops that range from software stack choices to engineering process drowned in bureaucracy. Rewriting code is hardly the worst.

Also, don't assume that even though rewriting code is bad for business, it's bad overall.

You iterate, and learn from mistakes you made that were caused by the lack of experience at the time previous code was written.

You de-couple components during a rewrite and eliminate unnecessary features, drastically improving maintainability.

You make it easier for other programmers (and yourself) to read the code.

You can alter core platform pieces, make tech stack alterations, and eliminate reliance on legacy components during a rewrite.

The list goes on...

ht_th · on Sept 4, 2013

At first sight, this statement seems odd. For sure, writing a text is more difficult than reading it, and programming is writing a text. But a different kind of text. A program is written to communicate ideas with a computer first, and with humans second, whereas an article or novel is written to communicate ideas with humans only. What if we would have a programming language that allowed us to write to communicate ideas to humans first and have the compiler/interpreter automate the communicating with the computer?

Or could it be that a non-trivial program is essentially more complex than an article or novel or any other regular text we encounter daily?

Or is reading a program similar to reading a text in an unknown language? Like trying to read some Roman text when you don't really know Latin, but you've access to a Latin-English dictionary and a good grasp of history and the context and topic of the Latin text?

derefr · on Sept 4, 2013

Reading and writing themselves are the least difficult parts of reading and writing programs. The majority of writing a program is conceiving of a well-defined model--after that, all you are doing is typing. The majority of reading a program is attempting to reconstruct someone else's conception of a model into your own brain from the typed description. Reading just the text written for the computer answers all of the "what" (as if it didn't, the program would not run), but none of the "why" -- and so makes reconstructing that model very difficult.

jeremyjh · on Sept 4, 2013

Yes and even after you have enough working knowledge to support or enhance the code you will still find yourself continuously suprised because your model is not a perfect understanding of the author's intended model, AND the code itself is not a perfect representation of that model to begin with.

Gravityloss · on Sept 4, 2013

That's why it's so nice to read higher level language code. It's more weighed towards the why and less towards housekeeping.

cehlen · on Sept 4, 2013

Code may not rust overtime but it does become harder to maintain and more importantly harder to find quality innovative people willing to work on it. I would guess that very few of the people making a pro argument for Joel Spolsky’s article would be willing to support 1970’s COBOL banking software for a living.

YZF · on Sept 4, 2013

It's not black and white. There are situations where investing in your old code base is throwing good money after bad and there are situations where the old code base can be better. You need to make educated decisions not blindly follow rules.

Refactoring can be a problem because you're always working towards some local minima. It's a little bit like the decision to renovate an old house vs. raze it and build a new one or buy an old car/boat and repair it.

There's a good discussion of various factors here: http://programmers.stackexchange.com/questions/6268/when-is-...

drderidder · on Sept 4, 2013

There's a time and place for everything. Firefox and Mac OS X are examples of successful rewrites. I've had some positive experiences with re-writing certain things from scratch. Not to downplay the issues Joel raised back in 2000, but if a product is tied to a technology that's limiting its effectiveness, and better alternatives have become available, a next-generation implementation bears consideration. I think companies that fail to iterate and evolve this way will ultimately stagnate.

mtdewcmu · on Sept 4, 2013

I can't imagine how a codebase could be so horribly bad that it doesn't do even one thing right, and you have to throw it out wholesale. That is, assuming it was good enough to ship at some point.

I don't like to feel that I'm wasting my time doing rework, and vulnerable to getting in way too deep with no backup plan. I'd definitely want to rewrite in pieces without ever discarding the whole, even if eventually there might come a day when every single piece has been rewritten.

ryanackley · on Sept 4, 2013

You usually only hear the catastrophic failure anecdotes. Large, complex, yet successful rewrites I know about:

* The Windows kernel. From my understanding, Windows NT was a completely different kernel from Windows 95 and eventually replaced it in Windows XP.

* Coldfusion, I used to work with an ex-Adobe guy that told us the story how this was rewritten from scratch at one point. From C++ to Java

* I used to work on the SQL Server team at Microsoft. We completely rewrote the Reporting Services product between the 2005 and 2008 releases.

masswerk · on Sept 4, 2013

Windows NT was an incremental update to what would have been the next VAX-OS. (As for my humble knowledge.) See "Dave Cutler and Windows".

But another good example for a very successful rewrite would have been Adobe's ActionScript for Flash 5 (turning to an ECMA-script-like language).

to3m · on Sept 4, 2013

Windows NT development started around the time of Windows 3.0.

kailuowang · on Sept 4, 2013

I always deem readability the number one priority in writing working code. Readability is not just about understanding what the code in front of you is doing but also easiness to find where in code a particular business logic was implemented. Code written with readability as the top priority has the best chance of avoiding the need for a full re-write. And even if it needs to be re-written from scratch, it's so much easier with a readable codebase.

bluedino · on Sept 4, 2013

How does this apply to a platform change? Say, going from an application written on 80's UNIX hardware that's still going strong today on a 9-year old Itanium system?

Start writing replacements for individual modules? The original programmers are set to retire in another 5 years and then it's going to get really, really ugly.

igl · on Sept 4, 2013

April 06, 2000.

Any progress report on this?

trebor · on Sept 4, 2013

This article has been posted around 3-4 times this year alone. Most people should already be well aware of the cost of a full rewrite due to all the horror stories. Yet, sometimes the best software has come from a full rewrite.

moron4hire · on Sept 4, 2013

Unless the original system just did nothing of what it was supposed to do, a full rewrite being successful would be in spite of itself.

niuzeta · on Sept 4, 2013

Shakespeare is still an entertaining read after five centuries.

lowmagnet · on Sept 4, 2013

(2000)

yuhong · on Sept 4, 2013

An example is legacy color parsing. I personally figured out where legacy color parsing is in the Netscape classic source: http://stackoverflow.com/questions/8318911/why-does-html-thi...

It is so subtle even Netscape's own Gecko rewrite did not get it completely right the first time: https://bugzilla.mozilla.org/show_bug.cgi?id=121738

powertower · on Sept 4, 2013

Does anyone have a good way to distinguish between -

A) Rewriting the codebase.

B) Fixing & modularizing the codebase - breaking it up into self-contained units/parts while improving, cleaning, and modernizing them?

Because if you throw away the edge-case of completely starting from scratch, B) can easily be seen as A) and vice versa.

Is there a thin line here?

(*I have a 200K line C#/.NET app that will be evolving and pivoting with it's next major version).