Reading list to join AI field from Hugging Face cofounder (thomwolf.io)
114 points by triyambakam 8 months ago | 27 comments



Thanks for the list.

Interesting to see that the book by the late David MacKay made the list; it is available to download in several formats from the book's website.

Information Theory, Inference, and Learning Algorithms:

https://www.inference.org.uk/mackay/itila/

The lectures based on the book by David MacKay himself:

https://youtu.be/BCiZc0n6COY

It's probably no surprise that Claude Shannon, the father of information theory, was the earliest to propose a stochastic language model based on Markov chains.
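
Shannon's 1948 paper sketches exactly this: estimate n-gram transition statistics from text, then sample from them. A minimal word-level sketch in Python (the toy corpus is my own placeholder, not Shannon's data):

    import random
    from collections import defaultdict

    # Toy corpus standing in for Shannon's printed-English statistics.
    corpus = "the cat sat on the mat and the dog sat on the log".split()

    # First-order (bigram) transition table: word -> observed followers.
    # Duplicates are kept so random.choice samples proportionally to counts.
    transitions = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        transitions[prev].append(nxt)

    # Generate by sampling each next word conditioned on the current one.
    word = random.choice(corpus)
    output = [word]
    for _ in range(10):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    print(" ".join(output))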


I've always wondered if Hugging Face is meant to evoke the feeling of being attacked by a Facehugger.


It's meant to evoke the feeling of being in therapy. It started as an attempt to be an AI therapy company, and they pivoted to "GitHub for AI" around 2019.


Apparently, the name goes back to the Unicode emoji for a hug (🤗):

https://qz.com/hugging-face-microsoft-artificial-intelligenc...


Dumb question, but... are AI and ML the same thing? I ask because I am seeing a lot of my ML peers claim to be AI experts with years of experience, even though the term AI wasn't really used at all until OpenAI got big.


This is the Venn diagram I keep seeing when this question is asked: https://www.researchgate.net/publication/363930739/figure/fi...

GenAI ⊆ Deep Learning ⊆ Machine Learning ⊆ AI, where ⊆ means "is a subset of."


Also, I think machine learning was originally more closely tied to the field of statistics, whereas other AI methods came out of more traditional CS.


AI as a field has been around since the 50s and gone through multiple paradigms, but modern AI (since its resurgence around 2010) is effectively all machine learning.


Not really though.

Modern "publicly known and commercial" AI is effectively all machine learning

For a sample of non-ML AI problems that ML still has trouble solving, see the papers published in:

https://ijcai-23.org/main-track-accepted-papers/


What "AI" means has always been shifting; right now when most people hear it they assume it means generative AI and deep learning, but it's really a very broad category.

Early incarnations/uses of the term include what is now sometimes referred to as Good Old-Fashioned AI (GOFAI), with such things as expert systems and ontological classification systems; these still technically fall under the umbrella of "AI". After GOFAI came other forms of technology, among them the precursors to our current deep learning models: much simpler and smaller neural networks. Again, these are still "AI", even if that's not what the public thinks of when they hear the term.
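
To make "expert system" concrete: at its core it is rules applied to known facts until nothing new can be derived. A toy forward-chaining sketch (rules and facts invented purely for illustration):

    # Each rule is (set of premises, conclusion).
    rules = [
        ({"has_feathers", "lays_eggs"}, "is_bird"),
        ({"is_bird", "cannot_fly"}, "is_flightless_bird"),
    ]
    facts = {"has_feathers", "lays_eggs", "cannot_fly"}

    # Fire any rule whose premises all hold, until a fixed point is reached.
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    print(facts)  # now includes the derived "is_bird" and "is_flightless_bird"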

See: https://en.wikipedia.org/wiki/GOFAI and https://en.wikipedia.org/wiki/Symbolic_artificial_intelligen...


I'm not a programmer, but my understanding is that back in the '80s investors got in a frenzy when the idea of AI was big. They threw money at it and... it was decades away. Then they tried the "e" phase that became the dot-com boom... and it was overhyped. This pattern has occurred several times in software.

Technology has been progressing, and the definition of AI is loose at best when marketing is nearby. Consider how people describe "unemployed" on LinkedIn: it gets glossed over and hyped with language that is difficult to verify but at least adjacent to reality. I think something similar happens when ML researchers classify themselves as AI researchers. However, there may be a lot more overlap than simply "adjacent", so someone please correct me if "machine learning" or "AI" are regulated terms for public advertisement.


OK, I hear a lot of 'AI is a superset of ML' definitions, and yes, that's historically true, but today AI is shorthand for LLMs and others (image generation, embeddings, DL-based time series). I'd draw the line at systems using emergent phenomena that aren't realistically reducible to understandable rules.

Should we say 'this uses AI' for a Prolog/expert system/A*/symbolic reasoning/planning/optimization system today? I dunno; I had scruples even about calling classical and Bayesian statistical models 'machine learning', reserving that term for models that prioritize computational properties over probabilistic interpretation.


In my experience they mean basically the same thing in practice most of the time. The industry used "AI" for a long time (up through the 2000s, IIRC), but it developed a bit of a reputation for going nowhere. Then there was a revival in the 2010s-era "The Social Network" funding environment, when "machine learning" was preferred, I think to shake off some of the dust associated with the old name. Now the pendulum seems to be swinging back to "AI", driven by several high-profile and well-funded thought leaders who have rewarmed the old AGI narrative thanks to some major leaps forward in models for certain applications.


Machine Learning is a newer term than AI. AI has been used for decades.


If we want to be precise (and it seems like you do):

AI = machines doing human-like tasks.

ML = machines learning to do stuff.

Doing financial modeling or code-base security auditing 100x better than a person is good ML and not AI at all.


Eh, a lot of it’s more down to marketing. ‘AI’ as a term has periodically come in and out of fashion; for instance, is OCR AI? Well, not 20 years ago, certainly, the term being unfashionable at the time, but now: https://learn.microsoft.com/en-us/azure/ai-services/computer...

(Computer vision, in particular, is basically always classified as AI today, but the term was mostly avoided in the industry until quite recently.)


> Probabilistic Graphical Models Specialization from coursera (https://www.coursera.org/specializations/probabilistic-graph...)

Are Probabilistic Graphical Models still being used? They don't get much visibility these days.


I have read all of them and watched almost all of the training videos mentioned, and that is the list I would give anybody I wanted to discourage from joining the field. The textbooks, I mean, COME ON. They are comprehensive and neatly academic but not pragmatic at all. There are playlists on YouTube that would cut through all the unnecessary content and get you building RAG systems in a couple of hours. Kudos to the LangChain team for all the content online.
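
For what it's worth, the core RAG loop those playlists teach is small. A dependency-free sketch, where call_llm is a stand-in for whatever model API you'd actually use (retrieval here is naive word overlap; real systems use embeddings):

    docs = [
        "Hugging Face hosts models and datasets.",
        "RAG retrieves documents and feeds them to an LLM as context.",
    ]

    def retrieve(query, k=1):
        # Rank documents by shared words with the query (toy retriever).
        overlap = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
        return sorted(docs, key=overlap, reverse=True)[:k]

    def call_llm(prompt):
        # Placeholder for a real model call (OpenAI, HF Inference, local model...).
        return f"[LLM answer conditioned on: {prompt!r}]"

    def rag(query):
        context = "\n".join(retrieve(query))
        return call_llm(f"Context:\n{context}\n\nQuestion: {query}")

    print(rag("What does RAG do?"))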


The fastest way to learn is always to go after a difficult problem you find fascinating. Learning by the books is the second fastest, and is suitable for those without the experience to have a fascinating problem to motivate them. This applies to many but not all; for example, a university student might have a vague interest in AI but not know where to start. If you're already in the field, though, it's best to just keep hacking forward, even if you can't prove every theorem from the major textbooks.


Feel free to post those playlists for those discouraged by the list of academic resources :)


I second this!


Bengio and Goodfellow's book, while a great reference, is certainly not what I'd recommend for beginners. It's great for experienced folks IMHO, but otherwise it's quite opaque. I've used it to teach in the past, and beginners really struggled with it.

Still an amazing reference though!


He probably felt the need to prescribe dense theoretical books given his position.

He'd probably look like a fraud if he recommended 'Grokking Machine Learning' on manning.com.


Are you implying that he is adjusting his reading list to fake competence? Thomas Wolf has the second-highest number of commits on huggingface/transformers. He is clearly competent and deeply technical.

https://github.com/huggingface/transformers/


This comment seems unnecessarily personal. The difference between the reading list and Grokking Machine Learning appears to be the layer of application.

Grokking's description indicates it is for a very applied role: "teaches you how to apply ML to your projects using only standard Python code and high school-level math".

The linked reading list appears to be targeted at one level up the stack, so to speak. Instead of learning how to string a few Python libraries together (which is useful, not knocking it), the goal would possibly be writing the Python library itself for an ML architecture.

Given the author’s position, it makes sense why they found the content of the latter useful in getting to that position.


Aren't the LLMs on hf good enough to learn from?


I appreciate simplicity, but this site could take some advice from https://perfectmotherfuckingwebsite.com/



