
I always wonder what people expect when they talk about Twitter's algorithm as if it were some sort of Deus Ex-level AI developed in the depths of Area 51. If anyone wants Twitter's algorithm, I guess all you need to do is download some arXiv papers on state-of-the-art recommender systems, and it's probably going to look like that.



I work on FAANG-scale recommender systems. Most of the time, due to the scale of the problem, the algorithms powering these systems are not anywhere near state of the art. On the other hand, they are highly optimised for the situation. You can bet that every aspect of the Twitter algorithmic feed has been thoroughly debated, thought through and most importantly A/B tested.
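And "A/B tested" here usually isn't anything exotic. A lot of it boils down to something like a two-proportion z-test on an engagement metric per experiment arm. A minimal sketch with made-up numbers (not any real experiment's data):

    from statistics import NormalDist

    # Hypothetical engagement counts from a feed-ranking experiment.
    control_clicks, control_views = 48_200, 1_000_000
    variant_clicks, variant_views = 49_100, 1_000_000

    p1 = control_clicks / control_views
    p2 = variant_clicks / variant_views
    p_pool = (control_clicks + variant_clicks) / (control_views + variant_views)

    # Standard two-proportion z-test on click-through rate.
    se = (p_pool * (1 - p_pool) * (1 / control_views + 1 / variant_views)) ** 0.5
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    print(f"lift = {(p2 - p1) / p1:+.2%}, z = {z:.2f}, p = {p_value:.4f}")

The hard part isn't the statistics, it's picking the metrics and running enough of these to cover every aspect of the feed.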


People speak of "the algorithm" as if it were an animal spirit or American god to appease and to implore. I suppose this is our zeitgeist. Your explanation helps give me and others a conceptual grasp on the nature of the thing, so thank you.


Would you be open to having a discussion about FAANG-scale recommender systems? I have a small website that could profit heavily from such a recommender system. With 200k daily visitors there should be enough data to train a model, but our current implementation sucks and barely beats random.

If yes, how can I contact you? Twitter?


Not the person you're asking, but I'm curious what your current approach is. In my first job (an ecommerce site) I built a (very slow) recommendation system based on k-means clustering, and it did a pretty good job of grouping customers by interest and suggesting products, definitely better than random.

Building it today would be much faster because there are actually proper libraries/programs for doing this, rather than my inefficient vanilla Python implementation.
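Roughly along these lines, if it helps (a toy sketch with made-up interaction data, using scikit-learn instead of the hand-rolled version I actually wrote):

    import numpy as np
    from sklearn.cluster import KMeans

    # Toy user-product interaction matrix: rows are customers,
    # columns are products, values are purchase/view counts.
    rng = np.random.default_rng(0)
    interactions = rng.poisson(0.3, size=(1000, 50))

    # Cluster customers by their interaction profiles.
    kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
    labels = kmeans.fit_predict(interactions)

    def recommend(user_idx, top_k=5):
        """Recommend products popular within the user's cluster
        that the user hasn't interacted with yet."""
        cluster = labels[user_idx]
        cluster_popularity = interactions[labels == cluster].sum(axis=0)
        seen = interactions[user_idx] > 0
        cluster_popularity[seen] = -1  # exclude already-seen products
        return np.argsort(cluster_popularity)[::-1][:top_k]

    print(recommend(0))

Swap the toy matrix for real purchase/view counts and tune the cluster count, and you're most of the way to what I had.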


For sure - @arshamg_ on Twitter. DMs are open.


I worked at one of the FAANGs and the algorithms were definitely state of the art in some ways. Maybe they didn't use the latest models, but the sheer size of the models, for example, was huge. Easily 500+ input features and hundreds of millions of parameters for the deep models.
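To give a rough sense of where those parameters come from: with a few hundred categorical features each getting its own embedding table, the embedding tables alone dominate the count. A toy sketch (feature counts and sizes are illustrative guesses, not any company's actual config):

    import torch
    import torch.nn as nn

    class RankingModel(nn.Module):
        def __init__(self, num_features=500, vocab_size=1_000, emb_dim=32):
            super().__init__()
            # One embedding table per categorical feature. Almost all of the
            # parameters live here: at a production-ish vocab of ~20k ids per
            # feature, 500 * 20_000 * 32 is already ~320M weights.
            self.embeddings = nn.ModuleList(
                nn.Embedding(vocab_size, emb_dim) for _ in range(num_features)
            )
            self.mlp = nn.Sequential(
                nn.Linear(num_features * emb_dim, 512),
                nn.ReLU(),
                nn.Linear(512, 1),
            )

        def forward(self, feature_ids):
            # feature_ids: (batch, num_features) integer ids, one per feature
            embs = [emb(feature_ids[:, i]) for i, emb in enumerate(self.embeddings)]
            return self.mlp(torch.cat(embs, dim=-1))

    model = RankingModel()
    print(f"{sum(p.numel() for p in model.parameters()):,} parameters")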


Yes and no.

Twitter's scale and real-time nature make it a difficult beast. Their network graph contains hundreds of millions of nodes and billions of edges.

And it's constantly updating. So any graph ML algorithms you want to use have to deal with an underlying graph that's eventually consistent at best — and oftentimes very sparse in terms of feature availability.
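Concretely, any feature lookup has to assume a node's features may simply not have landed yet. A toy sketch of the kind of fallback handling I mean (the zero-vector fallback here is just my assumption, not how Twitter actually does it):

    import numpy as np

    EMB_DIM = 64

    # Embeddings for nodes we've actually computed features for; in a
    # constantly-updating graph, many nodes won't be in here yet.
    node_embeddings = {
        "user_123": np.random.rand(EMB_DIM),
        "user_456": np.random.rand(EMB_DIM),
    }

    def neighborhood_embedding(node, neighbors):
        """Average the available neighbor embeddings, falling back to
        zeros when none of the neighbors have features yet."""
        available = [node_embeddings[n] for n in neighbors if n in node_embeddings]
        if not available:
            return np.zeros(EMB_DIM)  # cold node: no usable signal
        return np.mean(available, axis=0)

    print(neighborhood_embedding("user_789", ["user_123", "user_999"]))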


Also yeah, most people talk about "Twitter's algorithm" but have no idea what it is they're talking about — that's exactly why I wanted to write on this topic :)


Yup, and temporal graphs are a super hard problem without a lot of great ML solutions currently.


I wouldn't be surprised if it's just a handcrafted spaghetti of conditional blocks in PHP.



