Sure, the Llama 3 Community License isn't one of the standard open licenses, and it sucks that you can't use it for free if you're an entity the size of Google.
Correct me if I'm wrong, but isn't that the code for doing inference?
A Meta employee told me just the other day: "However, at the moment we haven't open sourced the pre-training scripts." Can't imagine they would be wrong about that?
To me, "open" implies I can download the weights without signing an agreement with Meta, and that I can do whatever I want with them. But I understand the community seems to think otherwise, especially considering the messaging Meta has around Llama, and how little the community is pushing back on it.
So Meta doesn't allow downloading the Llama weights without accepting their terms, doesn't allow unrestricted usage of those weights, and doesn't share the training scripts or the training data used to create the model.
The only thing that could be considered "open" is that I can download the weights after accepting the terms. Personally, I wouldn't call that "open" so much as "possible to download", but again, I understand others see it differently.
Doesn't a training script need to have at least a training loop? Loss calculation? An optimizer? The script you linked contains none of these; pretty sure that's for inference only.
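To make the distinction concrete, here's a toy sketch (plain Python, hypothetical names, nothing to do with Meta's actual code) showing the three pieces an inference-only script lacks: the loop, the loss, and the optimizer step. If a file has none of these, it can't train anything:

```python
# Toy example: fit y = 2x with plain gradient descent.
# A training script needs all three numbered pieces below;
# an inference script only ever does the forward pass (pred = w * x).

def train(data, epochs=200, lr=0.01):
    w = 0.0  # single trainable parameter
    for _ in range(epochs):              # 1. the training loop
        grad = 0.0
        loss = 0.0
        for x, y in data:
            pred = w * x                 # forward pass (this part inference shares)
            loss += (pred - y) ** 2      # 2. loss calculation (squared error)
            grad += 2 * (pred - y) * x   # gradient of the loss w.r.t. w
        w -= lr * grad / len(data)       # 3. optimizer step (plain SGD)
    return w, loss / len(data)

w, loss = train([(1, 2), (2, 4), (3, 6)])
```

After a couple hundred epochs the parameter converges to roughly 2.0. The point isn't the math; it's that `model.py` files like the one linked define only the forward pass and contain no loop, no loss, and no optimizer.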
Here is the Llama source code, you can start training more epochs with it today if you like: https://github.com/meta-llama/llama3/blob/main/llama/model.p...
It's rumored Llama 3 used FineWeb, but you're right that they at least haven't been transparent about that: https://huggingface.co/datasets/HuggingFaceFW/fineweb
For models I prefer the term "open weight", but to assert they haven't open sourced models at all is plainly incorrect.