
open training data, training source code & hyperparameters, model source code, weights

I'm not an FSF hippie or anything (meant that in an endearing way), but even I know if it's missing these it can't be called "open source" in the first place.



I don't think the weights are required. They're an artifact created by burning vast amounts of money. Providing the source and methods that would allow someone with the same amount of money to reproduce those weights should still count as open source. Similarly, you can have open source software without a compiled binary, and open source hardware without providing the actual, costly hardware.


The popularity of fine-tuning demonstrates that the weights are actually the preferred form for making changes.

The precursor form (training data etc.) is only needed if you want to recreate it from scratch, which is too expensive to bother with.


My point is, wanting a finished product that cost millions, without paying for it, is very different from it being open sourced. Models are an artifact, a result, not a source.


I would argue that the weights are as much source code as source code. Them being generated doesn't demote them.

I don't even think the distinction is important. The "system" should be open, and that includes data central to the system's operation within certain bounds.

You can open source parts of a system at whichever fine slice you wish; you just have the part which is open, A, and the part which isn't, B.

It's the value of A and B being open that matters, not what A and B are composed of.


Great point. Open source is different from a free product.



