Hacker News new | past | comments | ask | show | jobs | submit login

I struggled to get an intuition for this, but on another HN thread earlier this year saw the recommendation for Sebastian Raschka's series. Starting with this video: https://www.youtube.com/watch?v=mDZil99CtSU and maybe the next three or four. It was really helpful to get a sense of the original 2014 concept of attention which is easier to understand but less powerful (https://arxiv.org/abs/1409.0473), and then how it gets powerful with the more modern notion of attention. So if you have a reasonable intuition for "regular" ANNs I think this is a great place to start.



Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: