
I have a hard time with the "cannot reproduce" categorization.

There are places (e.g. in the Linux kernel? AMD drivers?) where lots of generated code gets pushed. Apart from rants about huge, unwieldy commits and complaints that it would be better, engineering-wise, to get hold of the code generator, no one seems to be saying the AMD drivers aren't GPL-compliant or OSI-compliant.

There is probably a lot of OSS that is filled with constants and code its maintainers couldn't easily rederive, and we still call it OSS?




But with generated code, what you end up with is still code that can be edited by whoever needs to. If AMD stopped maintaining their drivers, people would end up maintaining the generated code. It wouldn't be a nice situation, but it would work. Model weights, by contrast, are akin to the binary blobs you get in the Android world, binary blobs that nobody calls open source…


I personally think that model artifacts are simply programs with tons of constants. Many math routines have constants in their approximations, and I don’t expect the source to always include the full derivation of those constants. I see LLMs as being in the same category, just with (much) larger sets of parameters. What is better about LLMs than the constants in some complicated function approximation is that I can go and keep training an LLM, whereas the math/engineering libraries might not make it easy for me to modify them without also figuring out the details that led to those particular parameter choices.
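To make the "constants" point concrete, here is a minimal Python sketch of the well-known fast inverse square root trick. The function name `fast_inv_sqrt` is my own; the magic constant `0x5F3759DF` is the one commonly cited for this trick. The code runs fine without any derivation of that constant, which is the kind of thing I mean.

```python
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for a positive, normal float32 value."""
    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The "magic" constant: shipped in source with no derivation attached.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back as a float32 initial guess.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)


if __name__ == "__main__":
    for value in (1.0, 4.0, 100.0):
        print(value, fast_inv_sqrt(value), value ** -0.5)
```

You can tweak that constant by trial and error, but improving it in a principled way means reconstructing the reasoning behind it, which is not in the code.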



