How to write untestable code (2008) (googleblog.com)
95 points by jeremylevy on Feb 15, 2022 | 80 comments



> OO cowboys will want to have a whole polymorphic soup of collaborating objects. Say no to the OO-ist. When you nest and branch conditionals, all you need to do is read the code from top to bottom. Like a great novel, one simple linear prose of code. With the OO-overboard paradigm, it's like a terrible choose-your-own-adventure kid's book. You're constantly flipping between classes and juggling patterns and so many more complex concepts. Just if-things out and you'll be fine.

This is meant sarcastically, but I think this is actually one example of where a lot of backlash against OO comes from, especially in 2022.


I come from the opposite camp. I detest functions that are five pages of boolean logic, which I guarantee you contain at least a dozen bugs, no matter how much you think you've tested it.

Code coverage is like those skin creams that say "Scientifically tested!" on the label. Yeah, well, tested doesn't mean that it worked. It just means that there was a test.

Watch this video of Andrei Alexandrescu's talk at CppCon 2015: https://www.youtube.com/watch?v=LIb3L4vKZ7U

I've done things like this in my career, and it's always worked out brilliantly. Bugs just evaporate when you let the compiler "combine" things for you, instead of just plopping everything into a bowl and mashing it together with your hands like some sort of animal.

Especially in today's C++ or Rust, you can "have your cake and eat it too". It's possible to build up these nice, single-purpose class hierarchies and combine them into complex, powerful constructs via templates / traits. The compiler takes care of the edge cases for you, because it never gets bored or distracted. There's no performance downside because of static dispatch, inlining, and aggressive optimisers.
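
As a rough C++ sketch of what I mean (all names invented for the example): each piece is trivial to test on its own, and the compiler composes them with static dispatch, so there is nothing to mock and no runtime branching left over.

  #include <cctype>
  #include <iostream>
  #include <string>

  // Two tiny, single-purpose components...
  struct UpperCase {
    static std::string transform(std::string s) {
      for (char& c : s) c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
      return s;
    }
  };

  struct ConsoleSink {
    static void write(const std::string& s) { std::cout << s << '\n'; }
  };

  // ...composed at compile time: no virtual dispatch, everything inlinable.
  template <typename Transform, typename Sink>
  struct Logger {
    void log(const std::string& msg) { Sink::write(Transform::transform(msg)); }
  };

  int main() {
    Logger<UpperCase, ConsoleSink> logger;
    logger.log("hello");  // prints "HELLO"
  }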


But having a function spanning five pages isn't smart practice, regardless of OOP or Functional style.


Well, usually the way to avoid a 5-page function is by encapsulating concepts away into self-contained code. It is not by encapsulating them away into code colocated with state.

OOP is not the answer here. Class hierarchies can be, as long as they are not OOP and "classes" are just a synonym for "modules".

Code and data colocation has its uses, but it's normally a better fit for interface layers than for internal code.


> Be Defensive - They're out to Get Your Code! - Defensively assert about the state of parameters passed in methods, constructors, and mid-method. If someone can pass in a null, you've left your guard down. You see, there are testing freaks out there that like to instantiate your object, or call a method under test and pass in nulls! Be aggressive in preventing this: rule your code with an iron fist! (And remember: it's not paranoia if they really are out to get you.)

This seems nuts to me? Sacrificing guarantees for the sake of testability? I mean, defensive coding is one of the good tools for failing early, with meaningful error messages and confidence about the state within your method.

Moreover, there are tons of points here just to enable mocks'n'stuff. Why not go higher up the abstraction layers and test things functionally there? IMHO that's more useful than some mocked state that may never be true in real life.

It reads like chasing some code coverage metric with fake state. When in reality you may still get a NullReferenceException because your assumption about that fake state was incorrect, or some method call there may fail with an unhandled error code or whatever. It doesn't take an uncovered branch for fully covered code to fail.

IMHO unit tests are useful, but within a limited scope.


In my experience, if you find you need to check every parameter because your code keeps blowing up from all the bad parameters it's getting, the problem is at the source of the bad data. Checking input everywhere is a questionable treatment of the symptom; it won't fix the bigger problem that some part of your code is producing invalid output. It's a lot harder to fix bad data as a consumer of that data than as its producer.

This primarily applies to internal code. For external APIs there is no such thing as too much validation.


I don't trust my future self not to start feeding in bad data. Of course there are some boundaries, but I don't take it too deep.

I started doing more defensive coding after I read about a piece of software (I'd forgotten what it was called) that went through a security audit which found no important security flaws. Their ideology was to not trust the input, even if it's your own input. Today your assumptions are correct; tomorrow, because of changes, they can be incorrect.

I'll post a link here if I find it. I think I read it on HN a few years ago.

Edit: A-ha, here it is. Dovecot. HN Article: https://news.ycombinator.com/item?id=13407233 and https://wiki.mozilla.org/MOSS/Secure_Open_Source/Completed#d...

> The Cure53 team were extremely impressed with the quality of the dovecot code. They wrote: "Despite much effort and thoroughly all-encompassing approach, the Cure53 testers only managed to assert the excellent security-standing of Dovecot. More specifically, only three minor security issues have been found in the codebase, thus translating to an exceptionally good outcome for Dovecot, and a true testament to the fact that keeping security promises is at the core of the Dovecot development and operations."

Practices documented in source: https://github.com/dovecot/core/blob/master/doc/securecoding...

> Don't rely on input validation. Maybe you missed something. Maybe someone calls your function somewhere else where you didn't originally intend it. Maybe someone makes the input validation less restrictive for some reason. Point is, it's not an excuse to cause a security hole just because input wasn't what you expected it to be.

Perhaps it's just my way of dealing with "I know that input is validated". Yeah, defensive coding can get verbose, but it makes me feel confident. Plus, the NASA coding guidelines state:

> Each calling function must check the return value of nonvoid functions, and each called function must check the validity of all parameters provided by the caller.

> The code's assertion density should average to minimally two assertions per function.

/just extracting useful bits of advice from veterans
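
Roughly what those two rules look like in practice, as a small C++ sketch (the function names and constants are made up for the example):

  #include <cassert>
  #include <cstdio>

  // Rule 1: callers check every non-void return value.
  // Rule 2: at least a couple of assertions per function.
  int read_sensor(int channel, int* out_value) {
    assert(out_value != nullptr);
    assert(channel >= 0 && channel < 8);
    *out_value = 42;  // stand-in for real hardware access
    return 0;         // 0 == success
  }

  int sample() {
    int value = 0;
    int rc = read_sensor(3, &value);  // return value checked, never ignored
    if (rc != 0) {
      std::fprintf(stderr, "read_sensor failed: %d\n", rc);
      return rc;
    }
    assert(value >= 0);
    return value;
  }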


My experience is that this type of coding actually introduces more fragility and sensitivity to change. If, at every level the data passes through, you validate conditions that you think will be true but do not explicitly depend upon, you'll find your application randomly blowing up when those conditions change.

I think the problem boils down to a violation of separation of concerns. Defensive code inevitably concerns itself with ensuring guarantees about state that is the concern of other modules.


Blowing up can be the correct answer. What should you do if a slider that ranges from 0 to 100 sends you a negative number? Fail loudly, and log it.

Silently trying to work with invalid inputs just results in bugs that people may not notice and that are much harder to track down.


Right, I would say the problem exists within the slider, and not the consumer of its state. With a sensible design, you might even have the slider producing a type that makes strong guarantees about the range of its value. That moves the concern of validating the value range to that type's constructor, and removes it from the rest of the codebase.
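
A minimal C++ sketch of that idea (SliderValue is an invented name): the range check lives in one constructor, and everything downstream accepts the type and trusts the invariant instead of re-validating.

  #include <stdexcept>

  class SliderValue {
  public:
    explicit SliderValue(int raw) : value_(raw) {
      if (raw < 0 || raw > 100)
        throw std::out_of_range("slider value must be in [0, 100]");  // fail loudly, once
    }
    int get() const { return value_; }

  private:
    int value_;
  };

  // Consumers take the type by value and need no defensive checks of their own.
  void apply_volume(SliderValue v) { /* v.get() is guaranteed to be 0..100 */ }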


That doesn't resolve the issue. If at runtime the physical memory associated with your strongly typed data has a negative value before the function is called, then either the type system checks it again, which then blows up, or it assumes it's still valid and your function operates on faulty data.

Essentially what you’re describing is automating such checks.


Hardware failure is virtually impossible to identify and deal with from software, because the validation relies on the same unreliable hardware.

If you want to harden your stuff against that category of errors, then the solution exists in hardware. RF shielding and ECC memory, for starters.


Technically true, but practically hardware is very unlikely to fail in a way that maintains self-consistency.

E.g., if you checksum all the data, what you can expect from bad RAM is either to detect a mismatch, or to miscalculate the checksum so that something else detects a mismatch later.

You certainly can't count on it, but enough of that sort of thing makes it a lot more likely you'll notice something is not quite right eventually.


Most hardware failure is extremely intermittent; otherwise it would almost instantly result in a crash. ECC etc. helps, but you still run into manufacturing defects and the like.

That said, I agree you can't guarantee everything is working via software, but detecting hardware failure via software is simply good practice.


100% agree. That's why eventually I want to learn F# (currently my main driver is C#). It features a more advanced type system, and I love static typing because by itself it provides nice guarantees. F# takes that further by letting you guarantee specific states. I may learn F#, but I don't think I'm going to use it in an existing code base, as maintainability for fellow developers or whoever works after me is perhaps more important than F#...

Not that your example cannot be achieved in C# - it can. But I suspect F# just takes those static typing guarantees further: https://fsharpforfunandprofit.com/posts/designing-for-correc...


This is turning into an argument for strong typing - though, in light of your original claims, only for external inputs, not for internal interfaces. But if it is useful in the former case, why not the latter?


In some cases, say when you work very close to a business domain, this may very well be a good thing. But for larger applications, especially ones that do low-level stuff, the tradeoff is cluttering your namespace with single use parameters. You quickly run into scaling problems.

It's exactly the same reason blindly following "textbook" OOP design is a bad idea. Unless you're able to do some very prescient upfront design of a large part of your application, you're in for a big headache later.


> But for larger applications, especially ones that do low-level stuff, the tradeoff is cluttering your namespace with single use parameters.

What does this mean? I have not figured out any way to interpret this, such that it is a consequence of choosing strong typing.

Your second paragraph is more of the straw man that we are discussing in an adjacent thread.


I mean that you need to assign each type a distinct name, and in many programming languages, the namespace for types is effectively global.

If you add, say, a thousand types for all the combinations of things you may want to represent, then you've effectively added the need for a thousand type names.


That is merely being explicit about what is implicit otherwise. When you are being explicit, there is at least the possibility that these declarations could be scoped - and, in many languages, this is something you can do.


> I think the problem boils down to a violation of separation of concerns. Defensive code inevitably concerns itself with ensuring guarantees about state that is the concern of other modules.

This is something of a straw man argument. If someone is doing this, they are simply doing it wrong. It is certainly possible to write validations that only check for constraints that must hold within the code they are gatekeeping for.


The context of this discussion is adding null checks for parameters that are not explicitly used within the module, thus impairing the ability to test specific parts of the code without instantiating or mocking huge parts of the application.

That is, this pattern

  F(x, y, Z) {
    if (x == null) bad();
    if (y == null) bad();
    if (Z == null) bad();
    Z.doThing(x, y);
  }
I would argue that Z may have reason to validate x and y, but in this scenario, F only has reason to ensure that Z is not null. What Z does and requires is Z's concern, not F's.
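
Or, keeping the same made-up sketch, roughly:

  F(x, y, Z) {
    if (Z == null) bad();   // F actually uses Z, so F checks it
    Z.doThing(x, y);        // whether x and y are acceptable is Z's concern
  }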


That's what I am saying - your argument boils down to "bad validation is harmful" - true, but hardly controversial; bad code in general tends to be harmful.


Is it controversial to state that defensive programming boils down to "more validation is better"?

This is a style of programming I've seen in the field, and it seemed to emerge within code bases with major data quality problems but did little to actually fix them. You're less likely to get disastrous train wrecks, but the number of errors largely remains the same unless the cause is addressed.


> Is it controversial to state that defensive programming boils down to "more validation is better"?

It is a straw man to take this as being intended to justify incorrect and useless validation.

If this is something to be decided by personal anecdote, I will just add mine: I haven't seen this, while I have seen many cases where validation has picked up subtle and edge-case errors.


It's probably fallen out of favor, but I saw it a lot around the same time the OP was written, late-2000s, early 2010s.

Look, this is the paragraph we are discussing:

> Be Defensive - They're out to Get Your Code! - Defensively assert about the state of parameters passed in methods, constructors, and mid-method. If someone can pass in a null, you've left your guard down. You see, there are testing freaks out there that like to instantiate your object, or call a method under test and pass in nulls! Be aggressive in preventing this: rule your code with an iron fist! (And remember: it's not paranoia if they really are out to get you.)

In which way am I misrepresenting it?


That's just the same straw man that you are also arguing against!

As for your anecdotal evidence, it does not count twice through repetition.


Why wouldn't anecdotal evidence count?

This may not be the flavor of defensive programming you favor, but it is something that exists, and also something that is explicitly mentioned in the start of the discussion.


It counts for something, but yours does not count more than anyone else's - and it is certainly not sufficient to make a general case against validation. Pointing out that bad or incorrect validation is bad is news to no-one, and is merely a specific case of bad code being harmful.

>...something that is explicitly mentioned in the start of the discussion.

Yes - it started with a straw man in the article.


Have a nice day.


When I need internal checks and flexible testing, I create testing types. So, instead of "Null", I let the test pass a "TestIgnore" type object and react to that accordingly.

These types should be clearly meant to be used ONLY in testing environments.
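
Something like this, as a rough C++ sketch (all names invented): the sentinel is a distinct type, so production callers can't pass it by accident, and the test never has to hand in a null.

  #include <iostream>
  #include <string>

  struct TestIgnore {};  // only ever constructed from test code

  struct Printer {
    void print(const std::string& s) { std::cout << s << '\n'; }
  };

  class Report {
  public:
    explicit Report(Printer& printer) : printer_(&printer) {}
    explicit Report(TestIgnore) : printer_(nullptr) {}  // test-only constructor

    void render() {
      // The interesting logic runs either way; only the output step
      // reacts to the "ignore" case.
      if (printer_ != nullptr) printer_->print("rendered report");
    }

  private:
    Printer* printer_;
  };

  // In a test: Report r{TestIgnore{}}; r.render();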


It seems like you need comprehensive unit tests if your implementations are shallow and grant few guarantees. I like TDD-style example tests to create a fast feedback loop during initial development or refactoring though.


> This seems nuts to me? Sacrificing guarantees for the sake of testability?

TDD zealots tend to not care about efficiency, maintainability, readability, or any of a number of very important characteristics of a program if it interferes with the kind of false-security testability they are zealots about.


> Depend on Concrete Classes - Tie things down to concrete classes - avoid interfaces wherever possible.

Well, yes, if those classes are implementation details why would you want otherwise?

Just to write “unit” tests with mocks all over the place that later will be broken after the first refactor?


Wherever possible, I try to avoid using mocks as they just end up obscuring what the code is doing.

More specifically, I follow the maxim of only using in the tests what is publicly available to a user of the class. If it is a private instance of a class, or a class created in a function that does not have a mechanism to be switched for something else, you shouldn't be poking the internals to modify it, you should either:

1. provide a mechanism to change that class -- e.g. if the class under test is talking to a database;

2. not change the code -- e.g. if the class under test is using a particular data structure class.

You should not have/use mocks if you have a class A that holds and uses class B to perform its work. If you do, you are not properly testing class A's behaviour. That is, if you have a rotation helper class that uses a quaternion class, you shouldn't be mocking the quaternion class behaviour -- that way only leads to madness, especially if the internal implementation of the rotation class changes its representation/logic to use matrices.

Acceptable places to use mocks or similar techniques are:

1. for the class that wraps the database object, or some other external system -- for which, I don't class the filesystem as external (see the rough sketch after this list);

2. for setting up parts of an application (e.g. IntelliJ) when testing a plugin -- and there, I try to get as close to the application behaviour as possible/necessary to perform the tests (i.e. wherever possible, use the application's real implementation class).

3. for setting up things like Spring, or other dependency injection components.
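
For the first case, a rough C++ sketch of what I mean (the names are hypothetical): the only seam is the class wrapping the external system, and the test supplies a hand-written fake rather than poking at internals.

  #include <string>
  #include <vector>

  struct UserStore {
    virtual ~UserStore() = default;
    virtual void save(const std::string& name) = 0;
  };

  // The production implementation would talk to the real database.
  // struct SqlUserStore : UserStore { ... };

  // Test double: records calls instead of touching any external system.
  struct FakeUserStore : UserStore {
    std::vector<std::string> saved;
    void save(const std::string& name) override { saved.push_back(name); }
  };

  // The class under test only ever sees the interface.
  class SignupService {
  public:
    explicit SignupService(UserStore& store) : store_(store) {}
    void signup(const std::string& name) { store_.save(name); }

  private:
    UserStore& store_;
  };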


One of the premises/promises of OO is the ability to substitute implementations (in particular, dynamically, but in a modular programming sense also statically). An over-reliance on concrete classes makes this harder to achieve, and (in the context of the list, which is about testing) means you have a larger portion of the program that has to be brought into the test harness to test something.

On the former, using concrete classes instead of interfaces can be worked around if they're still configurable, but now you have inappropriate (or potentially inappropriate) subclasses to compensate for the lack of an interface. On the latter, well, it makes testing harder, which is the entire point of the list. If A has two components B and C which are both tied to concrete classes and not interfaces, then you have to bring A, B, and C into the test harness when you really just want A and (possibly) B or C or stubs/mocks for them.

If you don't care about testing (and, in particular, unit testing), and don't care about substitutability, then don't worry about it.


I care about testing, and that's why I hate breaking my tests every time a refactor comes. If the behavior doesn't change, my tests shouldn't break. Tests are meant to be there to give me confidence about the main functionality. They should tell me whether my main functionality is broken or not after a change.

It's not very useful to me if I get red tests just because the structure of my code changed. Those red tests don't tell me if the functionality is the same or not. If my tests break every time even when the functionality is the same, then I start to ignore the warning and just blindly make them pass. They become "The Boy Who Cried Wolf".


It's ironic that what attempted to be "satire" and a write-up of "good tips" back in 2008 has become a list of red flags in 2022.


Yep. Static utility classes with pure functions are a good thing. Not everything needs to be an instance or an object. Not every if or switch needs to use polymorphism instead. Still lots of good tips here. Can anyone name a Google library they've actually enjoyed using? Every library I've greatly enjoyed has come out of Facebook, never Google.


The Tensorflow C++ api seems like it was written by someone whose dog was raped and murdered by the author of a Modern C++ book, so they vowed to only code in '90s Java style and make maximal use of the C preprocessor.


> Can anyone name a Google library they've actually enjoyed using?

No, I'm mostly with you here. Closure Library is OK if you don't have anything else that replaces it (or if something you use ships with it, hello ClojureScript), but other than that, their libraries leave a lot to be desired. Google Closure Compiler is magic (in a good way) though, but you can't call that a library.

> Every library I've greatly enjoyed has come out of Facebook, never Google.

I'm sure I'm reading this the wrong way, because surely you can't mean that every library you've greatly enjoyed has only come from Facebook? Most of the really well-made libraries I've used in my time have come from individuals, React aside. But now React has better replacements available as well, like Preact. Other than that, larger companies seem to be really bad at writing libraries for the world at large, which makes sense: they mostly care about their own use cases.


I'm pretty sure that they meant that FB tends to write more useful libraries than Google.


I agree. I personally disagree with many of the points made. I don't tend to write OO. I like clear functional pipelines, putting state and side effects at the top (singleton-like?), and static dispatch by default, because then I can clearly _see_ all the fundamental decisions and effects. Functions are easy to test; their "mocks" are just data structures.

Also, on a slightly different note: the advice about using small constructors and non-defensive methods goes pretty much against the whole point of encapsulation IMO, which is to protect and, more importantly, guarantee invariants. What's the point of a class that doesn't do much of anything besides naming things?


The number 1 way to write untestable code is to build your entire project without writing tests at all. Adding tests after the fact will make you want to kill yourself, whereas when you write tests either before or at the time of writing the code, you will generally take the path of least resistance towards code that can pass those tests.


With that kind of take, I'll raise you "write untestable code by writing only for you, and only for current you."

I'm often surprised how thoughtful past me was with code, knowing how people might try to screw it up. Even to the point of having useful and sometimes fun error messages. Feedback from other people agrees with that assessment.

Avoiding misuse is avoiding the untestable.


How much refactoring do you tend to do at the start of a project and do your tests help refactoring?


Not the parent poster, but I tend to write extensive tests from the very beginning. Sometimes they are indeed annoying while refactoring, but more often than not they are very helpful in ensuring that the whole thing still works.

And if the tests are annoying that’s usually a sign that they can be improved.


Personally, I don't do much refactoring. If I'm exploring a new idea, I write tests after I've experimented a bit and have a good idea of what I'm going to build - and then write mostly integration tests with very few mocks (usually only calls to external services). I don't refactor until I come back to add features, and yes, the tests help immensely.

Red->Green->Keep Moving


How much experimenting do you do? You could take that as far as writing it once without tests (experimenting), then writing tests for what you created, then starting over and keeping the tests.


Interesting how OO-focused this is. I work at Google today, in relatively old C++ and JS (now converted to TS) codebases, and we have no particular emphasis on forcing every function to be a method of a class. I don't think this has especially harmed testability.


What part of OO was about testability? Single responsibility? Liskov substitution? Interfaces to test instead of implementations? Just chunking things into bite-sized pieces?

I honestly don't get it.


These seem like a great way to write simple, high-performance code that is easy to write (because you just think about your actual problem, rather than about what someone who wants to ignore the problem wants -- what arbitrary slice are they expecting your code to cover?) and easy to test with simple integration tests that use quality fixtures to exercise your code thoroughly.

Endless classes of bugs eliminated through these "anti-patterns".

Guess what: if unit tests pass in an OO system, it doesn't mean a fucking thing, especially in dynamic languages where simply changing a name in a large codebase is a risky operation.

And fuck virtual methods.


if i could only afford a single test, i'd much prefer it to be an integration test rather than a unit test.

but, some kinds of code can permit a huge number of execution paths with different behaviour and interactions that need to be tested. If you want to cover all the different paths using only integration tests without decomposing a system into simpler subsystems that can be tested more locally, then you end up with a combinatorial explosion in the total number of execution paths and the number of test cases you have to write and maintain.


I was a huge fan of this article and Misko's accompanying talks back in 2012 or so, but a lot of this hasn't aged that well. These ideas all assume that in all cases, the class should be the unit under test, and you can see how this plays out in Angular, where the command-line tool will generate stub unit tests for every new component you create. If you fill out those tests with the aim of maximizing code coverage, then what you'll find is that all of those tests will need to be rewritten at the first refactor. On a personal project, that might be an enjoyable way to spend a bit of time, getting to understand your project better, but on a team with actual deadlines, there's not going to be enough time to do that. Either you remove the tests entirely, or you put off the refactor indefinitely.

Class-based unit tests lock you into a specific implementation, providing a checksum on the current behavior, making it as hard as possible to fix an actual bug when it's discovered. Collaborators-included unit tests on the other hand, do not break unless your product is broken. It can be harder to chase code coverage when you're doing laparoscopic surgery on your code, but the end result is that every conditional has an actual use case behind it, and if the requirements change, you can see the name of the test with the bad requirement and delete it.

I still think that a lot of the patterns that make your code testable are good, though. It's not always clear at what level of abstraction you want to start mocking. Do you mock HTTP? Do you mock a marshalled API? Do you mock a business logic abstraction that depends on that API? If you use DI for all of these layers, then you don't have to get it right the first time. You have options when you're writing your tests, and can use concrete or mock collaborators according to whatever your current need is.


How to write untestable code? Use any of Google's cloud libraries. (Well, not just Google's).

Good luck testing your cloud functions, dataflows, BigQuery integrations etc. etc. etc.


What's the least testable code you've had the pleasure of dealing with?

From my first commercial software job: a few thousand lines of complex C++ (a mix of decoding the output of a mathematical model and business rules) structured as a single "god class" with heaps of internal state and a dozen methods -- comically, all of the methods returned void, took no arguments, and called each other. It took days of careful refactoring and consultations with a colleague until it was possible to instantiate one of the things in a test harness and do nothing with it. In some sense this was dead simple code, as there were no dependencies on external resources or concurrency.


I completely disagree with the general sentiment here, this is still very relevant today, misunderstood, and much needed in many codebases.


> C++ specific: Use non-virtual methods

Are they suggesting that all/most of my methods should be virtual? That's just nuts. Next they will say that all of the members should be public, since it's easier to test that way.


I have unironically used `#define private public` before including headers in my test code...


Well, I hope you know that in theory it's undefined behavior, and in practice it can actually affect layout of objects.


Why is it undefined behavior?


It looks like it's a C++ standard library requirement:

https://eel.is/c++draft/macro.names#2

Otherwise this is one of the unintended effects (on Itanium ABI):

https://godbolt.org/z/bY4534zhc


FYI, MSVC doesn't do this. I use this same hack in a project where I need access to private/protected members in the current stage, and memory layouts need to match ... You gave me a bit of a heart attack there.

mingw gcc (11.2.0) does have the same 8 vs 12 difference.


MSVC might not alter layout, but public and private member functions have different name mangling:

https://godbolt.org/z/ddqd3vr5x

Notice the call to `?foo@A@@AEAAXXZ` in one compilation and `?foo@A@@QEAAXXZ` in the other, supposedly to the same function. You might get fun linker errors this way, possibly nothing nastier though.


Yeah, I'm aware of that. I've had to edit mangled names a few times too many - transferring between idbs and changing const char * to char * (PBD to PAD I think? thankfully I'm already not sure anymore), for example. Or changing __int64 to int ...

Luckily I don't need to link anything, so I don't care about that for now.


This article actually has educational value beyond laughs. Counter-examples are sometimes easier to remember than straight examples, especially if you're reading about pitfalls - your brain will match those patterns when it sees you writing them. A text that explains what the wrong way looks like and mocks it will, in my opinion, be more effective than a text that explains the right way in great detail.


Hah! I actually took this sort of joke further than I probably should have a while ago with this https://boyter.org/posts/expert-excuses-for-not-writing-unit...

It also covers how to avoid writing them to begin with.


Following Pareto principles: 80% of testing should be covered by a good statically typed language and a clean, layered structure.

Only the remaining 20% needs to focus on testing business logic.


> Following Pareto principles: 80% of testing should be covered by a good statically typed language and a clean, layered structure.

My experience in dynamically typed languages is that it's very rare to see a typing error that doesn't cohabit with at least one logic error.


If most testable paths are covered by typing, then what? You're working on some kind of filter app? Sure, then, probably more than 80% of problems are caught with type checking.

I'm not saying that's not coding, because lots of things are coding. But I am saying that's not the usual case that I see.


November 5, 2009 (29 comments) https://news.ycombinator.com/item?id=924859

I thought I'd seen this here more recently, but I guess it was another similar article.


I disagree with

>Create Utility Classes and Functions/Methods

This doesn't make something untestable. Take for example a max function. A max function such as Math.max is a part of a utility class, but it's easy to test.


I believe they mean utility classes which would need explicit instantiation and lifetime management to run any test that might call them at some point like:

MathOperationsProvider::Init(endianness_config, locale, wolfram_alpha_connection).


If you can't justify it, and keep justifying it every time someone asks, then don't do it.

Max... max is pretty easy to justify. But MaxMinusMagicFiveForUnseenReasons... not so much.

The point of all of this (I assume TotT) is to have the conversations. If you're not having them at design time or code review time or incident response postmortem times, you're doing it wrong. And you're reminded every time you visit the restroom (again, if it's TotT).

That's my interpretation.


That is more of a maintainability problem than a testability problem.


Defensive programming is important. It's better to get an assertion at the API level about a null than an error message saying that + cannot add 42 and null.


> When you use static methods, they can't be mocked using mocking libraries

Except they can, in most ecosystems.


I would add:

* Throw useless errors on getter or has methods!

I recently gave up on using MapboxGL.js because of this.


This can be summarized to:

How to write untestable code: Use OOP ;-)


This is really funny



