Looking for guidance here. There are a lot of courses out there on AI from estee...

tonmoy · on July 2, 2023

Although I myself am not related to the industry or academia pertaining to AI, I have heard many people speak highly of the zero to hero course by Andrej Karpathy: https://youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9Gv...

I myself loved it and learned a lot, but YMMV

kekebo · on July 2, 2023

Seconded. It's a hands-on approach starting with implementing a pytorch-like api from the ground up with manual backprop up to implementing a simple transformer / gpt variant in actual pytorch.

jsight · on July 2, 2023

That course is fantastic. Just don't be afraid to pause and rewatch. It took me a long time to get through the first three videos.

nocoder · on July 3, 2023

Would you mind sharing, what level of programming & mathematical background do I need? I know basic python (read python for data analysis) & currently half way through elements of statistical learning. What else do I need to learn?

pedrosorio · on July 8, 2023

> half way through elements of statistical learning

If you're able to make it that far on ESL, mathematical background certainly won't hold you back when learning anything "neural networks" related. Specially not a spelled-out practical intro.

> I know basic python (read python for data analysis)

You may want to get more comfortable with programming in general (outside of the data analysis realm), but you can learn everything you're missing while watching Karpathy's series (and referencing the python docs).

m3affan · on July 4, 2023

Karpathy is great researcher in the field, but that doesn't always translate into good explanation skills automatically.

barbazoo · on July 2, 2023

Thanks for sharing.

godelski · on July 2, 2023

> The goal at the end is to have a deep understanding of the LLM space and its adjacency.

This is kinda a hard thing to quantify. How are we defining deep? Like you want to understand how they work? The Karpathy videos are good for that. But I wouldn't call this "deep".

If you want to get down into the weeds and into the mud, you need a hell of a lot more than 13hrs of education. You're also going to have a hard time doing this because most people are going from an engineering perspective of "enough to work with it" rather than "I fundamentally want to understand all inner workings". If you are the former, then the fastai course and others are great for you. If you want to really get deep though, you're going to need a lot more than programming. You're going to need some pretty advanced maths too: high dimensional statistics, metric theory, and optimization theory are some. (Most researchers aren't doing this btw) But if you do go down this path you'll also be able to understand the full spectrum of generative models and have a clearer picture. But I should also say that there is still a black box element to these models as they are so large that they are near impossible to analyze. But it is definitely achievable to learn a 2 layer transformer autoregressive network and fully understand its inner workings. But programming skills alone won't get you there.

blueyoda · on July 2, 2023

Thanks for the helpful advice. What would you recommend to someone who is interested in learning about diffusion models? I have a CS degree but I have 0 knowledge about AI. Things like Stable Diffusion have blown my mind and I’m really interested in learning about this field. Lots of courses out there but I lack the expertise to discern which one is good.

godelski · on July 2, 2023

Yeah no problem, this is even closer to my area of focus! What do you know about physics and thermodynamics?

I'd say a good intro for low background is from Tomczak[0]. He has a book, but the blog posts are nearly identical. He did a post doc with Max Welling (someone you should learn about if you want to get deep, like I was suggesting before). So I'd switch things up slightly. I'd go Intro -> Autoregressive -> Flow -> VAE -> Hierarchical VAE -> Energy Based Models -> Diffusion. It is worth learning about GANs btw, but this progression should be natural and build up.

Continuing from there, you're going to want to learn about things Langevin Dynamics, Score Matching, and so on. Start with Yang Song's blogs[1]. Your goal should be to understand this paper[2]. Once you get there, you should be able to understand the famous DDPM paper[3]. But why we went through Tomczak wasn't just to get a good understanding of diffusion at a deeper level, but because you need these tools to understand Stable Diffusion which really is just Latent Diffusion[4]. This should connect back with Tomczak's 2 Improving VAE papers and you should also be able to understand NVAE.

This is probably the quickest way to get you to a good understanding but if you want to dig deeper, which I highly encourage (because there are major issues that people aren't discussing) then you'll need more time. But you'll probably have to tools to do so if you go through this route. Other people I suggest looking into: Diederik Kingma, Ruiqi Gao, Stefano Ermon, Jonathan Ho, Ricky T. Q. Chen, and Arash Vahdat.

[0] https://jmtomczak.github.io/

[1] https://yang-song.net/

[2] Deep Unsupervised Learning using Nonequilibrium Thermodynamics https://arxiv.org/abs/1503.03585

[3] https://arxiv.org/abs/2006.11239

[4] High-Resolution Image Synthesis with Latent Diffusion Models https://arxiv.org/abs/2112.10752

balthigor · on July 3, 2023

The quality of these recommendations reflects favourably on OP.

azmodeus · on July 2, 2023

I would start with a fastai course such as practical deep learning for coders.

After doing one of the fastai courses you will have some applied Python project experience and you can hone in deeper on a particular part of the project you are more interested in intellectually.

chongli · on July 2, 2023

How long has it been since you studied/used university-level math? Calculus and linear algebra in particular.

I ask because it’s pretty difficult to get through the math of backprop without a firm grasp of these. The Python part is trivial by comparison, the main difficulty being the matching of dimensions.

bob88jg · on July 2, 2023

It's nothing more than the chain rule...University level it is not...the engineering aspect is the non trivial part IMHO...

godelski · on July 2, 2023

While many learn calculus in high school, many also don't get it till uni. Not everyone is at your level, or took your same path, and that's okay. Don't shame people for not knowing when they're trying to learn.

chongli · on July 2, 2023

The gradient of softmax is beyond most high school calculus students.

theptip · on July 2, 2023

I think you are asking specifically about practical LLM engineering and not the underlying science.

Honestly this is all moving so fast you can do well by reading the news, following a few reddits/substacks, and skimming the prompt engineering papers as they come out every week (!).

https://www.latent.space/p/ai-engineer provides an early manifesto for this nascent layer of the stack.

Zvi writes a good roundup (though he is concerned mostly with alignment so skip if you don’t like that angle): https://thezvi.substack.com/p/ai-18-the-great-debate-debates

Simon W has some good writeups too: https://simonwillison.net/

I strongly recommend playing with the OpenAI APIs and working with langchain in a Colab notebook to get a feel for how these all fit together. Also, the tools here are incredibly simple and easy to understand (very new) so looking at, say, https://github.com/minimaxir/simpleaichat/tree/main/simpleai... or https://github.com/smol-ai/developer and digging in to the prompts, what goes in system vs assistant roles, how you guide the LLM, etc.

samstave · on July 2, 2023

This might be taboo, but I used ChatGPT to educate me on some basic concepts, then deeper concepts - and put together a learning-plan and a syllabus for me with also a glossary of terms....

The cool thing, is it helped me put a more structured thought process on how I should pursuing AI leanings...

I couldnt find anything concise out there - and this helped me to better think through everything:

If anything - its a good primer for getting your own thought process on the subject going...

https://chat.openai.com/c/7a36b5dc-0016-4b4c-bf1a-c3a66dac7c...

rahimnathwani · on July 2, 2023

The link fails for me. I see a red banner saying: Unable to load conversation 7a36b5dc-0016-4b4c-bf1a-c3a66dac7c6d

samstave · on July 3, 2023

I think you have to be logged in it seems, as when I click in private, it asks for login first :-/

rahimnathwani · on July 3, 2023

I'm logged in.

Did you click the 'share' link on the left (which generates a public URL for a snapshot of the chat), or did you just copy the URL from the address bar (which will only work for you)?

The shareable URLs contain the word 'share' in the path.

samstave · on July 3, 2023

huh - I DID click share - lemme try again ;

https://chat.openai.com/share/b8f06d5e-f2d9-47d7-9c60-69b088...

AH... I fd up and apprently copied the URL...

EDIT,

You may link this link as well

https://chat.openai.com/share/0e2d652e-09a6-4cba-a766-dcf2c6...

tikkun · on July 2, 2023

I put this list together for 4 different angles on learning about LLMs: https://llm-utils.org/AI+Learning+Curation

kulikalov · on July 2, 2023

Define “deep understanding” here? You certainly have to lean python, at least because you are gonna need it for data manipulation and cleaning no matter what you do in this field.