Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

No incremental testing strikes me as a rather valueless constraint. Small experiments are one of the most crucial skills -- if not the crucial skill -- in writing correct code. It's like asking me to swim with one arm tied behind my back to see if I am a good swimmer.

Sure, I can run these 'tests' on scratch paper or even in my head, if you require me to. But real life never requires this. I prefer to prove to myself that each piece of an algorithm is correct--using a combination of reasoning and microtesting--before moving on to the next. Why would I waste attention doing this all in my head? The computer will do many pieces of it for me quickly, accurately, in a way that catches language idiosyncracies, and most importantly frees my attention for considering more abstract errors.

Edit: I had to try anyway, and (as far as I can tell) I was able to do it correctly in about three minutes. But I suspect it has less to do with native programming ability and more to do with having written dozens of variants on binary searches in the past year (I've had reason to want to handle various types of fuzzy matches, filtering and desired multiplicities in return values). FWIW, even after a fairly exhaustive suite of random input tests, I trust this binary search I have just written far less than the many I've recently written using my usual incremental methodology. Were I to include this in an actual release, I would not attempt to prove what I just wrote correct. I would destroy it and do the job properly from scratch.

Which rather underscores my point. I just wrote some code the wrong way, and even in spite of quite a lot of recent domain experience, a pretty good test suite, and a lot of staring at it and thinking about it . . . I don't trust it because I didn't test the pieces while I was writing it.



It's like asking me to swim with one arm tied behind my back to see if I am a good swimmer

I look at it more as an exercise akin to athletes running with weights. When compared with competition conditions, it is totally unrealistic, but it definitely strengthens you.


Do you want to send me your code for testing in my test suite? You've obviously tested it now, but it would still be interesting.


All right. It's in perl. Here's what I have:

  sub binary_search {
    my($ary, $val) = @_;
  
    print "Looking for $val...\n";
  
    $val < $ary->[0] and return undef;
    $val > $ary->[-1] and return undef;
  
    my $min = 0;
    my $max = @$ary - 1;
    my $mid = int(($max + $min)/2);
  
    while($max - $min > 1) {
      $ary->[$mid] == $val and return $mid;
      $ary->[$mid] >  $val and $max = $mid;
      $ary->[$mid] <  $val and $min = $mid;
  
      $mid = int(($max + $min)/2);
    }
  
    $ary->[$min] == $val and return $min;
    $ary->[$max] == $val and return $max;
  
    return undef;
  }
  
  my $ary_size = int(rand(10));
  my @ary;
  for(my $i = 0; $i < $ary_size; $i++) {
    push(@ary, int(rand(10));
  }
  
  @ary = sort @ary;
  
  print join(",", @ary), "\n";
  my $rv = binary_search(\@ary, int(rand(10)));
  print "Returned $rv\n";

I see already that it has the famous 'large array' bug. Not something I encounter in my work, so I'm not on the lookout for it. I'm curious if you'll find anything else wront with it.


Seems ok to me, liked the "($max - $min > 1)" and "$ary->[$min] ==/$ary->[$max] ==" that removed the need to add or subtract 1 inside the while, nice to see a variation once in a while :)


Can/do you write in C? Would someone care to translate this?

I'll do it tomorrow at work, but if someone can post and/or email me an equivalent C version it will reduce my time. Not a problem, but just more convenient.

You'll need to give me some way to contact you so I can send you the results.

EDIT: I've just realised the spec I usually give asks for the first occurence of a key to be returned in the case where there are duplicate keys. I'll modify my tests to account for that. It's an especially useful feature in a binary search, but rarely implemented.


Having to do it in C is even more perverse than doing it in my head.

;)


In all seriousness, if you have to translate it, don't bother. I write in C nearly daily, and I certainly wouldn't trust myself not to lose something in translation.

I'd be interested in what's in your test suite, though. I must sheepishly admit that my 'fairly exhaustive' test consisted of running the above random input until I'd seen all the usual tricky cases go by:

  - an array of length zero
  - an array of length one
  - an array of length two
  - the value is present
  - the value is absent and too small
  - the value is absent and too large
  - the value is absent and in the middle
  - the value is the first entry
  - the value is the last entry
  - the array is all one value 
  - there are multiple copies of the value you're looking for
If you've got anything more interesting in that test suite, that'd be good to know. Like I said, I represent the one-bazillionth of programmers that actually have a recurring professional interest in correctly coding binary searches.


"the array is all one value" got me. After fixing that, despite of the excessive verbosity, my experiment in literate programming passes the other tests - but I'm sure I coded a few more bugs in it - if you have some stock suite to get it through - would be very interested to know how many!

http://stdio.be/mybsearch/mybsearch.txt for the text and the http://stdio.be/mybsearch/mybsearch.c for the notangle-d C code, if you fancy to look at it.


An array of length Maxint?


Good call. As per link in http://news.ycombinator.com/item?id=1277701 (elsewhere in this page):

the binary search program that Bentley proved correct and subsequently tested in Chapter 5 of Programming Pearls contains a bug [...] it fails if the sum of low and high is greater than the maximum positive int


Mine is:

    -- find index of thing in sorted things in the range [first, last)
    function bsearch_2(things, thing, first, last)
       if first == last then return nil end
       assert(last > first)

       local midpoint = math.floor((first + last) / 2)
       assert(midpoint >= first)
       assert(midpoint < last)

       local test = things[midpoint]
       if test == thing then return midpoint end
       if test > thing 
       then return bsearch_2(things, thing, first, midpoint)
       else return bsearch_2(things, thing, midpoint+1, last)
       end
    end

    function bsearch(things, thing) 
       return bsearch_2(things, thing, 1, #things+1) 
    end
That's Lua, which uses 1-based indices. I'd be happy to hook up input-output stuff to it if you like.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: