I enjoy a good visualization, but at best they're high level graphical powerpoin...

m3at · on July 28, 2020

I agree that in this case the animated parts of the graphics were not needed; it's an easy pitfall to be distracted by the beautiful aspects of visualisations when crafting them.

I feel the need to defend the author, though: it's hard to make research accessible while still distilling valuable insight. I think his post on transformer networks [1] did a good job, for example, and you'll appreciate the lack of animations.

[1] https://jalammar.github.io/illustrated-transformer/

ypcx · on July 28, 2020

Yes this seems like an early work in progress, compared to Jay's previous Transformer articles.

In addition to your link, I've found a really good Transformer explanation here (backed by a Github repo w/ lively Issues talk): http://www.peterbloem.nl/blog/transformers

Additionally, there's a paper on visualizing self-attention: https://arxiv.org/pdf/1904.02679.pdf

ypcx · on July 28, 2020

Can't edit the post anymore so adding it here - further research reading on improving the current attention model: https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_b...

m3at · on July 28, 2020

That's a good complement, thank you for the links

stingraycharles · on July 28, 2020

I feel this comment is overly negative. Just to provide a counter-datapoint, I have seen quite a bit of GPT3 on HN lately but could not understand the research papers at all. They’re too abstract, and I often fail to see what they really mean.

This article and the animations definitely helped me a lot in understanding this. I learned quite a few things, so thanks a lot to the author!

wodenokoto · on July 28, 2020

It explains a sequence to sequence model which, granted, is a class of models that GPT-3 falls under.

But these animations/diagrams are so high level that they could be used for explaining all sorts of NLP models from the past 5 years.

poutrathor · on July 29, 2020

When I open OP's page on a slow 4G connection via hotspotting from my smartphone, the whole page makes no sense because I can't tell whether I should wait for something to move or carry on.

maxwin · on July 30, 2020

My head was getting dizzy and I had to stop midway. People were smart enough to create animations but not sensitive enough to know when it is too much.