
And if that hypothesis is based on data, great. And if there is no other alternative, fine -- but this technique should be viewed as a last resort (and an unusual one), not a first step.



Wait a minute. First of all, the data is simply "there is a reproducible bug in the program". The real hypothesis is that a small, localized portion of the code is responsible for the defect. To find it, you can try binary search. If you have a better idea of the location, then by all means.

Second, when you have a bug and you write a unit test, you are effectively commenting out the entire codebase except for the function under test. When you have a compiler error, whether it's a syntax/semantics bug in your code or a bug in the compiler, sometimes you need to produce a minimal example, so you have to cut cut cut until the bug is just barely provoked. When you have a pipeline of data transforms, and the end result is suddenly borked, it can work to chop off half the transforms and look at the result. When LaTeX is crashing for some unintelligible reason, just comment out half of your document and see if the problem goes away.

Sure it's really dumb if you're just excluding a*.c through m*.c, but figuring out if the problem is in the first or second half of main is not outrageous. I don't think the guy was presenting it as the first step ("If you have no idea where your bug lives"), but I do agree that it comes across as a little naive, since he should have talked about all of the other techniques available first. So the problem isn't so much the lack of a hypothesis, but the inefficient experimental approach of using a brute force technique indiscriminately.

I think the last resort is reached a little sooner for some types of bugs and some levels of experience (with the language, the environment, the codebase, or programming in general), and yes, in many cases it won't even do anything for you.

Personally I always liked dtrace. This guy gave a demo of it at my university once, I thought it was great, one of the best talks I've seen.


Part of the scientific method is that you generate hypotheses based on the data you have, not that you generate random hypotheses. It's a directed way of thinking.

> The real hypothesis is that a small, localized portion of the code is responsible for the defect. To find it, you can try binary search.

In your example, you used this as an assumption, not a hypothesis. And it's not quite right: the assumption you made is that the _presence_ of a small portion of code is responsible for the defect. That's very different than saying that the code is more broadly responsible for the defect. In my experience, very few bugs are caused by the mere presence of some code.

> When you have a compiler error, whether it's a syntax/semantics bug in your code or a bug in the compiler, sometimes you need to produce a minimal example, so you have to cut cut cut until the bug is just barely provoked. When you have a pipeline of data transforms, and the end result is suddenly borked, it can work to chop off half the transforms and look at the result. When latex is crashing for some unintelligible reason, just comment out half of your document and see if the problem goes away.

Those are fine solutions for those very specific, very simple problems. Given the problem space, the assumption that the error is caused directly by the presence of some input is well-founded.


Sorry, I'm kind of confused here. Why is the approach I'm defending considered random and not based on data? Is the generation of random hypotheses in general considered unscientific? What about fuzz testing or pharmaceutical R&D? What is the precise difference between hypotheses and assumptions in the context of the scientific method? What is the difference between the presence of a small portion of code being responsible and the code being more broadly responsible? Why the emphasis on presence?


Because it seems you're making an assumption about the problem being reproducible and identifiable by running half the code. I've never found this to be the case. Ever. It's a strange idea. Where does this come from? This isn't a hypothesis based on data, it is an assumption. And a bizarre one.


The following program segfaults:

  main() { a(); b(); }
You don't have a debugger. You don't have the source for a or b. Strategy? Sure printf works, but so does //.


I get the feeling you don't actually program. Or your writing is disconnected from the programmer part of you.


That's interesting, could you elaborate?


I can. It's long and tedious, because a lot of it is very simple things you learn very early on writing and maintaining software.

You seem to be arguing that your example demonstrates the fact that you can localize a bug in the code with binary search; specifically that the bug must exist in either a() or b(), and that running them independently is both possible and will determine the single location of the bug.

This is not true. It is only true in the case that one assumes bugs must have single locations, and that those locations can be found with binary search over what amount to incomplete programs. In other words, you're assuming the prior, begging the question, etc.

It's tempting to say "real code isn't like this contrived example" but in fact this contrived example is the best possible demonstration of why blind binary search is a poor strategy. Let us say the 'bug' exists in a(). You seem to be assuming this is the necessary consequence:

  main () { a(); b(); }       -- segfaults
  main () { a(); /* b(); */ } -- segfaults
  main () { /* a(); */ b(); } -- doesn't segfault

But this isn't the only possibility. With the bug existing in a(), you could also see this result:

  main () { a(); b(); }       -- segfaults
  main () { a(); /* b(); */ } -- segfaults
  main () { /* a(); */ b(); } -- segfaults

You would expect this in situations where b() uses data structures created by a().

We may also see this result, still assuming the bug exists in a():

  main () { a(); b(); }       -- segfaults
  main () { a(); /* b(); */ } -- doesn't segfault
  main () { /* a(); */ b(); } -- segfaults

If above we learned nothing, here we've actually got a falsehood - our binary search has localized the bug to b(), but it actually exists in a()!

So, in practice, binary search fails to localize a bug in a(). All of these situations can be created by having a() write a global which is relied upon by b() - a() may write to a protected area, b() may have a default value, a() may write nonsense, or pass an integer where b() expects an address - none of this is particularly exotic, they're all the sort of things you get every day when debugging segfaults.

We might now delve into a competing series of contrived examples of a() and b() and argue about their relative prevalence in the world (which none of us are capable of knowing), because if some case is particularly rare, it may make binary search very slightly better than flipping a coin in this case.

Instead, I will point out that this is once again assuming the prior, and that we have these simple (if shockingly annoying) facts:

1) side effects exist

2) in the presence of side-effects, binary search cannot predict the location of a bug.

And it follows that in the sense that a "scientific" hypothesis has predictive power, then binary search over the codebase is basically the homeopathy of debugging.


If you comment out b, and it segfaults, the segfault is caused by some code in a. Otherwise, the segfault is caused by some code in b. The root cause of the segfault in b may be found in a, but even looking at the stack trace from a debugger will still point you at b. This is nevertheless helpful information.

Further, there's a misunderstanding about the method. You do a binary search on the "percentage" of code viewed as an instruction stream that is allowed to execute, starting from the beginning. I understand that a can have a different length from b. Given more code, you can also search based on timestamps. You never comment out a and allow b to execute without it, this is nonsense. If that's what you thought I meant, I can understand comments about not actually being able to program. I did not explain this clearly, but only because I had forgotten that it was necessary to explain this clearly, because some people actually don't know how to program, even on HN.


Let's review the conditions of your example:

  no source,
  no debugger,
  two function calls
And the assertion that // works as a strategy. There are only two things you could possibly comment out, so really the example is not so much "explained poorly" as "conceived poorly". Don't shoot the messenger - it's your example.

As for the further explanation here, allowing a debugger makes the search irrelevant - there's only one piece of information the binary search can tell you: which function causes the segfault, but as you observe, that information's available from the debugger already. And no, it doesn't matter if you reframe things as "a percentage of an instruction stream".

The objection to binary search is not "binary search in the codebase can never tell you anything," it's that it's a method that can't -- not "doesn't" but "can't" -- produce a non-trivial testable hypothesis. Does this mean you can't ever use it? You may have no choice. But it's a terrible place to start, and most of the information it offers is available through other means, often with more precision.



