Overview of Large Language Models (aman.ai)
1 point by OJFord on July 25, 2023 | 1 comment


I don't have the experience to know if this is excellent, or riddled with factual errors, but I was looking for something like it; ideally a book.

For all the discussion of doing this or that - fine-tuning parameters or re-training a model - I haven't seen much in the way of a step-back explainer: what are all the pieces? When can I use a canned binary model (cheap), and when do I need to train my own (expensive)? What can I do with them? Why has impressive image/video 'generative AI' arrived suddenly at the same time as ChatGPT and the like when, naively, there's no clear link to 'LLM' or NLP other than the relatively trivial piece of 'understanding' the request? (If that were the breakthrough, surely we would for years have had 'generate me a random Midjourney-esque image', or some selectable pieces.)

What's good reading material for someone at the level of 'I took one "Intro to ML" course years ago, just enough to grasp it conceptually as applied stats'?



