
RLHF is arguably a bigger jump than LLMs, at least from my perspective, having begun studying NLP in 2015/16.

Well, what exactly is RLHF, practically? The ability to go from 8 Google search snippets to correctly ranking them and rewriting the top one into agreeable, cohesive, grammatical, and helpful English is just incredible. It enables so much more, and it is the real step change from these models that led to virality. It also increases consistency, which was always the worry for business use cases.
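To make the ranking part concrete, here is a minimal sketch of the selection idea behind RLHF-style systems: a reward model trained on human preference comparisons scores candidate answers, and the best-scoring one wins. This is not OpenAI's actual pipeline, and `reward_model` is a hypothetical stand-in:

    # Minimal sketch, not OpenAI's pipeline. `reward_model` is a hypothetical
    # stand-in for a model trained on human preference comparisons.

    def rank_candidates(reward_model, prompt, candidates):
        """Order candidate answers by a scalar human-preference score, best first."""
        return sorted(candidates, key=lambda c: reward_model(prompt, c), reverse=True)

    # Toy reward standing in for learned human preferences: prefer longer answers.
    snippets = ["snippet one ...", "a longer, more helpful snippet ..."]
    best = rank_candidates(lambda p, c: len(c), "some query", snippets)[0]
    print(best)

In real RLHF the reward model's scores are used to fine-tune the generator itself (e.g. via PPO), not just to rerank at inference time.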

Why is that more noteworthy than the base GPT-3? A lot of the LLM progress of scale --> more accurate autoregressive prediction was predictable; RLHF on text was not (the early sparks, for most of us, coming with the release of T5 and its multiple tasks-in-text).

What else could be a big idea coming up? There is an ongoing wave of innovation in embeddings that has largely been missed by the hype curve. Increasingly, GPT embeddings are useful for compression, and similarity-based embeddings enable much more accurate KNN search for tasks like matching curricula to learning content (even multilingually - see the recent Kaggle competition, whose outstanding performance is due to similarity-based embeddings from the last 3 years). This wave may lead, to some extent, to the partial replacement of anthropomorphic computing concepts like files, as information becomes much more addressable, combinable and useful as various-sized embeddings. More vitally, embeddings can be aligned across different models and modalities for better results (e.g. the Amazon ScienceQA paper showed that text questions about physical situations were answered more accurately when images of the situation were used during training - even when the images were held out afterwards). Multimodality has always been on the AI radar (not necessarily the ML one), but similarity-based embeddings, and also GPT embeddings (they behave differently and are sensitive in different ways), are getting us there much quicker than would have been expected.
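A minimal sketch of the KNN-over-embeddings idea, with cosine similarity via numpy. The `embed` function here is a placeholder, not any specific model's API; a real system would use a trained encoder and an ANN index such as FAISS:

    import numpy as np

    # Sketch of embedding-based KNN matching (e.g. curricula -> learning content).
    # `embed` is a hypothetical placeholder returning random unit vectors; a real
    # system would call a trained sentence-embedding model here.

    def embed(texts):
        rng = np.random.default_rng(0)
        v = rng.normal(size=(len(texts), 384))
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    def knn(query, corpus, k=5):
        """Return the k corpus items most cosine-similar to the query."""
        q = embed([query])[0]
        m = embed(corpus)
        sims = m @ q  # cosine similarity, since all rows are unit-normalised
        return [corpus[i] for i in np.argsort(-sims)[:k]]

    print(knn("fractions for grade 5",
              ["intro to fractions", "poetry unit", "decimals"], k=2))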

Setting aside engineering and technique improvements (e.g. scaling up data, or learning positional encodings rather than using pre-programmed sinusoidal ones), there are lots of things that could be big, like capsule networks, or energy-based models (which learn a compatibility/energy score to minimise, rather than maximising a likelihood). However, like you mentioned, a lot of these ideas are years old and regularly come and go. If you want somebody who is pushing for more exploration here and decries GPT a little, check out Yann LeCun.
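For the positional-encoding point, here's a small illustrative contrast (PyTorch, names my own): sinusoidal positions are fixed by a formula, as in the original Transformer, while learned positions are just an embedding table trained with the model, GPT-style:

    import math
    import torch
    import torch.nn as nn

    def sinusoidal_positions(max_len, d_model):
        """Fixed positional encodings from the original Transformer paper."""
        pos = torch.arange(max_len).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, d_model, 2).float()
                        * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        return pe  # no parameters, nothing to train

    pe = sinusoidal_positions(512, 768)   # fixed tensor, shape (512, 768)
    learned = nn.Embedding(512, 768)      # trained like any other weight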



It's an interesting example of how much seemingly superficial, non-fundamental things can matter.

A lot of AI experts are asserting (probably correctly) that OpenAI has really done nothing new and is just putting a shiny sticker on already known and published research.

But human perception being what it is, having ChatGPT produce a beautifully formed, polite and friendly sentence seems massively better to lay people than a terser, more unpolished response. It wouldn't surprise me if there is already a giant layer of heuristics pasted onto the end of the Transformer model for ChatGPT, cleaning up all sorts of ugly corner cases - something researchers would consider highly impure and completely valueless, while it is actually responsible for a large amount of ChatGPT's success.
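To be clear, that layer is pure speculation, but the shape of it is easy to sketch: plain string heuristics applied to raw model output before the user sees it. None of these rules are known to exist in ChatGPT; they only illustrate the idea:

    # Hypothetical post-processing heuristics; purely illustrative.

    def polish(raw):
        text = raw.strip()
        if text and not text[0].isupper():
            text = text[0].upper() + text[1:]   # capitalise the first letter
        if text and text[-1] not in ".!?":
            text += "."                         # close with punctuation
        return " ".join(text.split())           # collapse stray whitespace

    print(polish("  the answer is 42   "))  # -> "The answer is 42."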

I think there is a lesson there in how much academia undervalues the polishing part of research work, even if fundamentals ultimately drive progress.



