
A 1M-line codebase can mean many things. It can mean code that belongs to many systems mashed together as if it were a single thing. It can mean that code that should belong to different systems is tightly coupled into one monolithic entity that should have been several smaller ones. It can mean lots of boilerplate code too. It can indicate a lack of timely refactoring on an aging codebase so interconnected that nobody has the courage to separate it into more manageable pieces. None of those can be solved by clever language choices alone.

You also mention things going wrong in unanticipated ways - this may signal that the problematic code where errors bubble up is not problematic at all - it is called by code written by people who don't really understand what the functions do and who probably didn't write adequate tests for that - because the tests should catch the unanticipated parts. The problematic code is the one calling the parts where errors bubble up. The canary is not responsible for the gases in the mine.

While you may be right that you need static typing to deal with multi-million-line codebases, I'd much rather split that codebase into smaller units that could be more easily managed.

Wearing a straitjacket is often a consequence of an underlying condition that can, sometimes, be corrected.




I would expect that dons is talking about a 1M-line codebase which already consists of manageable pieces. It's only that the pieces have to work together, talk to each other, and know about each other (but not too much).

Sometimes software solves problems which provoke incidental complexity because of sheer size, and my experience (albeit not above 500k lines) tells me that indeed every bit helps, including compiler-enforced type checks. I would never bet my life on tests. As you write: "because the tests should catch the unanticipated parts". That's the point: tests never catch unanticipated parts, by their very nature. Sometimes, by sheer luck, yes.


> tests never catch unanticipated parts

No, but calling a function with arguments of the wrong type is something tests would catch.


> No, but calling a function with arguments of the wrong type is something tests would catch.

Or the compiler.


A compiler can only go so far. It'll happily compile:

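  #include <stdio.h>

  /* Declared up front so the call in main() sees a prototype. */
  int do_something(FILE *f);

  int main(int argc, char *argv[])
  {
    FILE *fp = fopen("outfile", "w");
    fclose(fp);
    do_something(fp);  /* type-checks, but fp is already closed */
    return 0;
  }

  int do_something(FILE *f)
  {
    return fprintf(f, "This should work if you know what you're doing");
  }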
And it won't work.


In C++, proper scoping with RAII (similar to CL's with-blah idioms) would alleviate that issue, as the call to do_something in the code below won't compile: file is already out of scope by then.

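    #include <cstdio>

    // Opens the file on construction and closes it when the scope ends.
    struct WithOpenFile {
        FILE *fp;
        WithOpenFile(const char *fileName) : fp(fopen(fileName, "w")) {}
        ~WithOpenFile() { fclose(fp); }
    };

    int do_something(FILE *f)
    {
        return fprintf(f, "This should work if you know what you're doing");
    }

    int main(int argc, char *argv[])
    {
        {
            WithOpenFile file("outfile");
        }   // file is destroyed (and closed) here
        do_something(file.fp);  // error: 'file' is not declared in this scope
        return 0;
    }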
But Lisps will happily accept this:

    (defun do-something (foo)
        (1+ foo))

    (with-open-file (file "outfile")
        (do-something file))


Fine. But now you'll have to accept both files and network sockets. What do you do?


Pass the int representing the FD instead of the FILE*?


And then you are back to dynamic typing again, except that when the integer you send points to something fprintf doesn't like, you'll get a segfault instead of a doesNotUnderstand.
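For comparison, here is a rough sketch of one way to accept both kinds of sink without dropping to a bare int: a small interface that each sink implements. Writer, FileWriter and BufferWriter are names made up for this sketch, and BufferWriter only stands in for a socket-backed writer so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical interface: anything that can accept text output.
struct Writer {
    virtual ~Writer() = default;
    virtual void write(const std::string& text) = 0;
};

// Wraps a FILE* that the caller opened and still owns.
struct FileWriter : Writer {
    explicit FileWriter(std::FILE* fp) : fp_(fp) { assert(fp_ != nullptr); }
    void write(const std::string& text) override { std::fputs(text.c_str(), fp_); }
private:
    std::FILE* fp_;
};

// Stand-in for a socket-backed writer; here it only buffers in memory.
struct BufferWriter : Writer {
    void write(const std::string& text) override { buffer += text; }
    std::string buffer;
};

// Accepts any Writer; a bare int or an unrelated pointer is rejected at compile time.
void do_something(Writer& out) {
    out.write("This should work if you know what you're doing");
}
```

With this shape, do_something(42) is a compile error rather than a segfault, although nothing here stops someone from wrapping a FILE* that has already been closed.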


The compiler will happily compile that because C's standard library thinks it's a fantastic idea to just throw type-safety out the window. Even C libraries can protect themselves from this via strong typing instead of overloading what FILE* effectively means.
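A minimal sketch of that idea, written in C++ rather than C to match the RAII example upthread: give each kind of handle its own type, so mixing them up is a compile error. FileHandle, SocketHandle and file_descriptor_of are hypothetical names for illustration, not any real library's API.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical handle types: each wraps a descriptor but is its own distinct type.
struct FileHandle {
    explicit FileHandle(int descriptor) : fd(descriptor) { assert(fd >= 0); }
    int fd;
};

struct SocketHandle {
    explicit SocketHandle(int descriptor) : fd(descriptor) { assert(fd >= 0); }
    int fd;
};

// Only a FileHandle is accepted; returns the wrapped descriptor for illustration.
int file_descriptor_of(const FileHandle& handle) { return handle.fd; }

// Neither a raw int nor a SocketHandle converts implicitly to a FileHandle,
// so file_descriptor_of(3) and file_descriptor_of(SocketHandle(3)) do not compile.
static_assert(!std::is_convertible<int, FileHandle>::value,
              "a raw int must not silently become a FileHandle");
static_assert(!std::is_convertible<SocketHandle, FileHandle>::value,
              "a SocketHandle must not silently become a FileHandle");
```

This only separates kinds of handle. It does not track whether a handle is still open, which is what the fclose example actually trips over.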


I think you mean contracts or defensive coding. If someone calls my code wrong they will have to think to test that particular case themselves, which is hard. Unless I've written contracts; then when it blows up they'll know what they did wrong.
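Something like this is what I have in mind, as a rough sketch only. It reuses the do_something signature from upthread, changed here to throw on a violated precondition; the exception types and messages are my own choice. Note that C gives no portable way to ask whether a FILE* has already been closed, so the contract can only cover the null case.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <stdexcept>

// Writes a fixed message to f and returns the number of characters written.
// Precondition: f is non-null. A violation is reported with a message that
// names the function and the broken rule, instead of crashing later.
int do_something(std::FILE* f) {
    if (f == nullptr) {
        throw std::invalid_argument("do_something: f must be a non-null, open stream");
    }
    const char* message = "This should work if you know what you're doing";
    const int written = std::fprintf(f, "%s", message);
    if (written < 0) {
        throw std::runtime_error("do_something: write failed");
    }
    // Postcondition: a successful fprintf reports the whole message as written.
    assert(written == static_cast<int>(std::strlen(message)));
    return written;
}
```

A caller who passes a null stream gets an exception that says which rule was broken, which is the "they'll know what they did wrong" part.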


The question we have to answer to properly understand what went wrong is where the argument originated. If it's being generated inside function A, which then calls function B with it, a test of A should fail when it calls B with the wrong argument. In any case, I would imagine the test coverage in the A-B system is lower than it should be.



