Hacker News
DeepDive in everything of Llama3: revealing detailed insights and implementation (github.com/therealoliver)
189 points by therealoliver 22 hours ago | 12 comments





I like the use of the functional API here. I learned through a similar route and it was very helpful for me compared to trying to understand `torch.nn.Module`.

Here's a gist of my learning path if it's helpful to anyone: https://gist.github.com/kevmo314/294001659324429bae6749062a9...


Yes, these are two different learning paths. Working through the detailed process is beneficial for future research, while the API-style approach is a quick, convenient way to get started. Both are very useful!

I hadn't realized OpenAI's tiktoken Python library could work with other models outside of the OpenAI family, that's really useful: https://github.com/therealoliver/Deepdive-llama3-from-scratc...

It's more than just that: practically every notable open model released in the past year or so uses tiktoken as the tokenizer.

I'm glad to have helped you :)

great need; mulling over; shows up all the time in AI paradigms

glad to have helped you :)

[flagged]


If you’ve got nothing constructive to say… don’t say anything? OP brings a lot of value in a style they like, your comment brings absolutely nothing.

I said what I said with purpose. I can't share this shit with anyone serious. It looks like someone who watches too much cartoon porn made this. I'm sorry these types of real life viewpoints offend you. I do sincerely suggest you reflect on what I am saying and consider it with weight.

Thank you for helping me out. Such comments are really depressing.

Ignore the troll that cries over Anya.

Real life feedback: drop the anime shit. It is way out of place. I'm not kidding.


