This is cool, I highly recommend Jay Alammar's Illustrated Transformer series to anyone who wants to understand the different types of transformers and how self-attention works.

The math behind self-attention is also cool, and it's easy to extend to variants like dual attention.
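
For anyone curious, the core computation is just softmax(QK^T / sqrt(d_k)) V. Here's a rough NumPy sketch of single-head self-attention (the function and variable names are my own, not from the articles):

    import numpy as np

    def softmax(x, axis=-1):
        # Subtract the max for numerical stability before exponentiating
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) learned projections
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len) similarity
        return softmax(scores, axis=-1) @ V      # weighted sum of value vectors

    rng = np.random.default_rng(0)
    d_model, d_k, seq_len = 8, 4, 5
    X = rng.normal(size=(seq_len, d_model))
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 4)

Each output row is a convex combination of the value vectors, weighted by how strongly that position's query matches every position's key, which is why the formulation generalizes so easily: you can swap in different Q/K/V sources or run multiple attention maps in parallel.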



