There aren't very many statisticians/MLers who suggest (or practice) reimplementing algorithms yourself, except for quite simple things, because the risk of getting something wrong is high and the work needed to make an implementation efficient is non-trivial. If anything, the current push is in the other direction: encouraging more people to share their code, and more people to use well-tested code, through initiatives like http://jmlr.csail.mit.edu/mloss/ , http://www.jstatsoft.org/ , and CRAN.

For example, you could reimplement your own SVM instead of using http://svmlight.joachims.org/ , but your chance of producing something as correct and as efficient is pretty low...
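
To make that concrete, here's a minimal sketch of what "use the well-tested code" looks like in practice. It's Python with scikit-learn's SVC, which wraps LIBSVM (a vetted library comparable to SVMlight, not SVMlight itself); the dataset and parameters are purely illustrative:

    # Lean on a vetted SVM implementation instead of rolling your own.
    # SVC wraps LIBSVM; training and evaluation take a few lines.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = SVC(kernel="rbf", C=1.0)  # well-tested solver, sensible defaults
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))

A homegrown solver has to reproduce the numerical care and caching hidden inside that fit() call, which is exactly where reimplementations tend to go wrong.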




I think the choice between using existing libraries and implementing your own mostly depends on how central a particular algorithm is to your product. If a better algorithm makes a big difference for my customers, then it's insane for me to use an existing library.

I don't even find much value in looking at existing code as a starting point, because it's bound to be either obscured by layers of optimization, naively written, or university code left behind by someone who finished their thesis in a hurry. For code beyond a certain level of complexity, I prefer to either use it as a black box or implement it myself.

Obviously, if the algorithm is not a core component of my product, it's insane to waste time reimplementing it, provided there is a good-quality implementation with the right license.


It's a fine line to walk. On the one hand, community-vetted code is a spectacular idea for core algorithms; on the other, overly restrictive licenses (like the [L]GPL) effectively preclude deriving maximum utility from them.


OTOH, if you do need to write your own code, you could reuse the existing implementation's test suite to make validation a lot easier. I think the GPL would allow this.
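
As a rough sketch of that idea in Python: treat the vetted library as a test oracle and require your reimplementation to agree with it on generated data. MySVM below is a hypothetical homegrown class, and scikit-learn's SVC stands in for the reference implementation:

    # Use a reference implementation as a test oracle: run both on the
    # same inputs and require near-total agreement.
    import numpy as np
    from sklearn.svm import SVC

    from my_svm import MySVM  # hypothetical homegrown implementation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # easy, well-separated labels

    reference = SVC(kernel="linear", C=1.0).fit(X, y)
    mine = MySVM(C=1.0).fit(X, y)

    # Different solvers won't match bit-for-bit, so check predictions
    # statistically rather than exactly.
    agreement = np.mean(reference.predict(X) == mine.predict(X))
    assert agreement > 0.99, f"only {agreement:.1%} agreement with reference"

The same trick works with an existing library's own unit tests: point them at your implementation and see what breaks.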



