I always wonder what people expect when they talk about Twitter's algorithm as if it were some sort of Deus Ex-level AI developed in the depths of Area 51. If anyone wants Twitter's algorithm, I guess all you need to do is download some arXiv papers on state-of-the-art recommender systems, and it's probably going to look like that.
I work on FAANG-scale recommender systems. Most of the time, due to the scale of the problem, the algorithms powering these systems are not anywhere near state of the art. On the other hand, they are highly optimised for the situation. You can bet that every aspect of the Twitter algorithmic feed has been thoroughly debated, thought through and most importantly A/B tested.
People speak of "the algorithm" as if it were an animal spirit or American god to appease and to implore. I suppose this is our zeitgeist. Your explanation helps give me and others a conceptual grasp on the nature of the thing, so thank you.
Would you be open to having a discussion about FAANG-scale recommender systems? I have a small website that could profit heavily from such a recommender system. With 200k daily visitors there should be enough data to train a model, but our current implementation sucks and barely beats random.
Not the person you're asking, but I'm curious what your current approach is. In my first job (ecommerce site) I built a (very slow) recommendation system based on k-means clustering, and it did a pretty good job of clustering customers based on interest and suggesting products, definitely better than random.
Building it today would be much faster because there are actually proper libraries/programs for doing this, rather than my inefficient vanilla Python implementation.
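For anyone curious what a k-means approach like the one described above could look like, here is a minimal sketch in plain Python. It is not the commenter's actual implementation: the customer names, the purchase data, the binary purchase vectors, the farthest-point initialisation and the `recommend` helper are all assumptions made up for illustration. The idea is to cluster customers by what they bought, then suggest items that are popular among a customer's cluster peers but that the customer does not own yet.

```python
from collections import Counter

# Hypothetical toy data: customer -> set of purchased products.
PURCHASES = {
    "ann": {"tent", "stove", "boots"},
    "bob": {"tent", "boots"},
    "cat": {"stove", "boots"},
    "dan": {"laptop", "mouse"},
    "eve": {"laptop", "mouse", "monitor"},
    "fay": {"monitor", "mouse"},
}


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster label per point.

    Initialisation is deterministic (assumption, chosen for reproducibility):
    start from the first point, then repeatedly add the point farthest
    from the centroids picked so far.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recommend(customer, purchases, k=2, top_n=3):
    """Suggest items bought by cluster peers that `customer` does not own."""
    customers = list(purchases)
    catalog = sorted(set().union(*purchases.values()))
    # One binary vector per customer: 1.0 if they bought the item, else 0.0.
    vectors = [
        [1.0 if item in purchases[c] else 0.0 for item in catalog]
        for c in customers
    ]
    labels = kmeans(vectors, k)
    own_label = labels[customers.index(customer)]

    counts = Counter()
    for other, label in zip(customers, labels):
        if other != customer and label == own_label:
            counts.update(purchases[other] - purchases[customer])
    # Most popular first; ties broken alphabetically for a stable order.
    return sorted(counts, key=lambda item: (-counts[item], item))[:top_n]


print(recommend("bob", PURCHASES))  # ['stove']
print(recommend("dan", PURCHASES))  # ['monitor']
```

On real traffic you would likely reach for a library implementation and richer features (views, dwell time, categories) rather than raw purchase flags, but the cluster-then-rank structure stays the same.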
I worked at one of the FAANGs and the algorithms were definitely state of the art in some ways. Maybe they didn't use the latest models, but the sheer size of the models, for example, was huge. Easily 500+ input features and hundreds of millions of parameters for deep models.
Twitter's scale and real-time nature make it a difficult beast. Their network graph contains hundreds of millions of nodes and billions of edges.
And it's constantly updating. So any graph ML algorithms you want to use have to deal with an underlying graph that's eventually consistent at best, and oftentimes very sparse in terms of feature availability.
Also yeah, most people talk about "Twitter's algorithm" but have no idea what it is they're talking about — that's exactly why I wanted to write on this topic :)