There are a lot of courses out there on AI from esteemed institutions at that. What do people recommend as a curriculum for someone with a formal univ education in CS albeit from a while ago and who has programmed extensively though not in Python.
The goal at the end is to have a deep understanding of the LLM space and its adjacencies.
Seconded. It's a hands-on approach starting with implementing a pytorch-like api from the ground up with manual backprop up to implementing a simple transformer / gpt variant in actual pytorch.
Would you mind sharing, what level of programming & mathematical background do I need? I know basic python (read python for data analysis) & currently half way through elements of statistical learning. What else do I need to learn?
> half way through elements of statistical learning
If you're able to make it that far on ESL, mathematical background certainly won't hold you back when learning anything "neural networks" related. Specially not a spelled-out practical intro.
> I know basic python (read python for data analysis)
You may want to get more comfortable with programming in general (outside of the data analysis realm), but you can learn everything you're missing while watching Karpathy's series (and referencing the python docs).
> The goal at the end is to have a deep understanding of the LLM space and its adjacency.
This is kinda a hard thing to quantify. How are we defining deep? Like you want to understand how they work? The Karpathy videos are good for that. But I wouldn't call this "deep".
If you want to get down into the weeds and into the mud, you need a hell of a lot more than 13hrs of education. You're also going to have a hard time doing this because most people are going from an engineering perspective of "enough to work with it" rather than "I fundamentally want to understand all inner workings". If you are the former, then the fastai course and others are great for you. If you want to really get deep though, you're going to need a lot more than programming. You're going to need some pretty advanced maths too: high dimensional statistics, metric theory, and optimization theory are some. (Most researchers aren't doing this btw) But if you do go down this path you'll also be able to understand the full spectrum of generative models and have a clearer picture. But I should also say that there is still a black box element to these models as they are so large that they are near impossible to analyze. But it is definitely achievable to learn a 2 layer transformer autoregressive network and fully understand its inner workings. But programming skills alone won't get you there.
Thanks for the helpful advice. What would you recommend to someone who is interested in learning about diffusion models? I have a CS degree but I have 0 knowledge about AI. Things like Stable Diffusion have blown my mind and I’m really interested in learning about this field. Lots of courses out there but I lack the expertise to discern which one is good.
Yeah no problem, this is even closer to my area of focus! What do you know about physics and thermodynamics?
I'd say a good intro for low background is from Tomczak[0]. He has a book, but the blog posts are nearly identical. He did a post doc with Max Welling (someone you should learn about if you want to get deep, like I was suggesting before). So I'd switch things up slightly. I'd go Intro -> Autoregressive -> Flow -> VAE -> Hierarchical VAE -> Energy Based Models -> Diffusion. It is worth learning about GANs btw, but this progression should be natural and build up.
Continuing from there, you're going to want to learn about things Langevin Dynamics, Score Matching, and so on. Start with Yang Song's blogs[1]. Your goal should be to understand this paper[2]. Once you get there, you should be able to understand the famous DDPM paper[3]. But why we went through Tomczak wasn't just to get a good understanding of diffusion at a deeper level, but because you need these tools to understand Stable Diffusion which really is just Latent Diffusion[4]. This should connect back with Tomczak's 2 Improving VAE papers and you should also be able to understand NVAE.
This is probably the quickest way to get you to a good understanding but if you want to dig deeper, which I highly encourage (because there are major issues that people aren't discussing) then you'll need more time. But you'll probably have to tools to do so if you go through this route. Other people I suggest looking into: Diederik Kingma, Ruiqi Gao, Stefano Ermon, Jonathan Ho, Ricky T. Q. Chen, and Arash Vahdat.
I would start with a fastai course such as practical deep learning for coders.
After doing one of the fastai courses you will have some applied Python project experience and you can hone in deeper on a particular part of the project you are more interested in intellectually.
How long has it been since you studied/used university-level math? Calculus and linear algebra in particular.
I ask because it’s pretty difficult to get through the math of backprop without a firm grasp of these. The Python part is trivial by comparison, the main difficulty being the matching of dimensions.
While many learn calculus in high school, many also don't get it till uni. Not everyone is at your level, or took your same path, and that's okay. Don't shame people for not knowing when they're trying to learn.
I think you are asking specifically about practical LLM engineering and not the underlying science.
Honestly this is all moving so fast you can do well by reading the news, following a few reddits/substacks, and skimming the prompt engineering papers as they come out every week (!).
I strongly recommend playing with the OpenAI APIs and working with langchain in a Colab notebook to get a feel for how these all fit together. Also, the tools here are incredibly simple and easy to understand (very new) so looking at, say, https://github.com/minimaxir/simpleaichat/tree/main/simpleai... or https://github.com/smol-ai/developer and digging in to the prompts, what goes in system vs assistant roles, how you guide the LLM, etc.
This might be taboo, but I used ChatGPT to educate me on some basic concepts, then deeper concepts - and put together a learning-plan and a syllabus for me with also a glossary of terms....
The cool thing, is it helped me put a more structured thought process on how I should pursuing AI leanings...
I couldnt find anything concise out there - and this helped me to better think through everything:
If anything - its a good primer for getting your own thought process on the subject going...
Did you click the 'share' link on the left (which generates a public URL for a snapshot of the chat), or did you just copy the URL from the address bar (which will only work for you)?
The shareable URLs contain the word 'share' in the path.
Define “deep understanding” here? You certainly have to lean python, at least because you are gonna need it for data manipulation and cleaning no matter what you do in this field.
There are a lot of courses out there on AI from esteemed institutions at that. What do people recommend as a curriculum for someone with a formal univ education in CS albeit from a while ago and who has programmed extensively though not in Python.
The goal at the end is to have a deep understanding of the LLM space and its adjacencies.