Trending at Instagram (instagram-engineering.tumblr.com)
173 points by mikeyk on July 1, 2015 | 12 comments



Differentiation between popularity and novelty? What a succinct explanation. Incidentally, Twitter originally used popularity, but switched to novelty[1] because Justin Bieber regularly dominated the trending tweets list.

[1]http://mashable.com/2010/05/14/twitter-improves-trending-top...


Surprisingly, I've applied most of the mentioned algorithms to real-world data while working on some side projects. I realized that even big organizations use fundamental machine learning algorithms to get their tasks done. This feels quite reassuring, as you don't need an MS/PhD to solve such complex/interesting problems.


Very interesting read, I wonder if Twitter does it in a similar way.

Does anyone know where to learn more about the system design they use, with the pre-processor, parser, scorer, and ranker?

How do these interact? What's the respective input and output?

Links to similar systems that go into more depth in this area would be appreciated.


I would imagine it's all closed source.


Nice. I learned a ton from this. Definitely adding the various types of calculations to my reading list. I haven't googled for this yet, but is there an open source hashtag trends detector around? If not, it would be pretty cool to use the teachings of this post to build one.


This was quite good.

Interesting that most hashtags don't get more than 3 posts per hour - that breakdown would be a good graph to see! It's also curious that they don't apply some type of seasonal decomposition to the time series data, though perhaps that makes sense with Instagram's particular data.


Google [zipfian], and I bet that's what the distribution looks like...


This is an awesome post! I really appreciate the level of detail they go into.

Also, I found a typo:

> ... to requests comping from the app

(I think that should be 'coming from the app')


I couldn't help but notice S(h, t). Nevertheless, it was a very good article.


I agree.

"Every few minutes, a line with the current value for S(h, t) is emitted."

Hard to believe the author didn't notice this...

Maybe just some humor for the trending classification overview?


I doubt it was done deliberately to be funny - using S for Score, h for Hashtag and t for time are about as standard as you could get.


True. I guess it's just coincidental humor for some readers.



