
I enjoy a good visualization, but at best they're high-level graphical PowerPoints, and in this case I found the animations more distracting than useful.

Also, if you're going to do a 30k foot view of a technical topic, you might want to tell people what GPT3 is somewhere in there.



I agree that in this case the animated parts of the graphics were not needed. It's an easy pitfall to get distracted by the beautiful aspects of visualisations when crafting them.

I feel the need to defend the author, though: it's hard to make research accessible while still distilling valuable insight. His post on transformer networks [1] did a good job, for example, and you'll appreciate the lack of animations.

[1] https://jalammar.github.io/illustrated-transformer/


Yes, this seems like an early work in progress compared to Jay's previous Transformer articles.

In addition to your link, I've found a really good Transformer explanation here (backed by a GitHub repo with lively discussion in its Issues): http://www.peterbloem.nl/blog/transformers

Additionally, there's a paper on visualizing self-attention: https://arxiv.org/pdf/1904.02679.pdf


Can't edit the post anymore so adding it here - further research reading on improving the current attention model: https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_b...


That's a good complement, thank you for the links


I feel this comment is overly negative. Just to provide a counter-datapoint, I have seen quite a bit of GPT3 on HN lately but could not understand the research papers at all. They're too abstract, and I often fail to see what they really mean.

This article and the animations definitely helped me a lot in understanding this. I learned quite a few things, so thanks a lot to the author!


It explains a sequence-to-sequence model which, granted, is a class of models that GPT-3 falls under.

But these animations/diagrams are so high-level that they could be used for explaining all sorts of NLP models from the past 5 years.


When I open OP's page on a slow 4G connection (hotspotting from my smartphone), the whole page makes no sense, because I can't tell whether I should wait for something to move or carry on.


I was getting dizzy and had to stop midway. People were smart enough to create animations but not sensitive enough to know when it is too much.



