Hacker News

This wouldn't scale. If we replaced Twitter with this right now, that's:

500 million tweets per day. 140 bytes (140 characters * 8-bit ASCII) per tweet.

140 bytes * 500 million = 70 GB

That's 70 GB per day before metadata. Use this social network for a month and we've passed the 2 TB mark (70 GB * 30 days = 2.1 TB).

Remember, this isn't just 70 GB per day on one server; this is 70 GB per day on every user's PC.
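The arithmetic above can be checked with a short script. This is only a back-of-the-envelope sketch: the constants are the figures quoted in this comment, the function names are made up for illustration, and "GB" is taken to mean 10^9 bytes.

```python
# Figures quoted in the comment above; they are estimates, not measurements.
TWEETS_PER_DAY = 500_000_000
BYTES_PER_TWEET = 140  # 140 characters at one byte each, no metadata


def daily_bytes(tweets_per_day=TWEETS_PER_DAY, bytes_per_tweet=BYTES_PER_TWEET):
    """Raw bytes produced per day if every tweet is stored uncompressed."""
    return tweets_per_day * bytes_per_tweet


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (10^9 bytes)."""
    return n_bytes / 1e9


daily = daily_bytes()
monthly = daily * 30  # assuming a 30-day month

print(f"{to_gb(daily):.0f} GB per day")
print(f"{to_gb(monthly) / 1000:.1f} TB per 30-day month")
```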




How do you read 500 million tweets per day?

The idea is to only replicate the people you care about, plus a couple of their friends.


In that case it would work. I was under the impression that you'd be syncing with everyone so someone in New Zealand could contact someone in Canada.

The average person has 208 Twitter followers. So let's say you have 208 'friends' plus a couple of additional 'friends' for each of your original friends. That's 208 + 2 * 208 = 624 people total.

There are 100 million active Twitter users each day and 500 million tweets per day, so that's 5 tweets per person.

5 * 624 = 3120

That's 3120 posts you'll be processing per day. Multiply this by 140 bytes per post and you have 436,800 bytes per day, or about 159.4 MB per year.

That's manageable.
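The per-user estimate can be reproduced the same way. Again this is a rough sketch: the inputs are the numbers quoted in this comment, "a couple" is assumed to mean exactly 2 extra feeds per friend, the function names are hypothetical, and metadata is ignored.

```python
# Inputs taken from the comment above; all of them are rough estimates.
FRIENDS = 208                 # average follower count quoted above
EXTRA_PER_FRIEND = 2          # assumption: "a couple" means 2
TWEETS_PER_DAY = 500_000_000
ACTIVE_USERS = 100_000_000
BYTES_PER_POST = 140


def replicated_feeds(friends=FRIENDS, extra_per_friend=EXTRA_PER_FRIEND):
    """Number of feeds one user replicates: friends plus friends-of-friends."""
    return friends + friends * extra_per_friend


def yearly_bytes(feeds, posts_per_feed_per_day, bytes_per_post=BYTES_PER_POST, days=365):
    """Raw bytes one user stores per year, before metadata or compression."""
    return feeds * posts_per_feed_per_day * bytes_per_post * days


feeds = replicated_feeds()
posts_per_person = TWEETS_PER_DAY // ACTIVE_USERS
per_year = yearly_bytes(feeds, posts_per_person)

print(f"{feeds} feeds, {feeds * posts_per_person} posts per day")
print(f"{per_year / 1e6:.1f} MB per year")
```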


Except for pubs, which are helpful broadcasters of both private and public stuff. It makes sense to host those on infrastructure that can ingest 70 GB a day and keep a couple of days of retention.


Pubs don't follow everyone on the network. People set them up and give out invites to their friends.


Which means even fewer requirements for such "private" pubs. However, if SSB is ever to replace Twitter, I would guess there would be other, "public" pubs that try to get all the content possible.


You don't need to store everything uncompressed. Just gzip everything and you cut the storage needed considerably.
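How much gzip actually saves depends on the content, so it is worth measuring rather than guessing. Here is a minimal sketch using Python's standard `gzip` module; the `compression_ratio` helper and the sample posts are invented for illustration, and posts are compressed as one batch because very short inputs compress poorly on their own.

```python
import gzip


def compression_ratio(posts):
    """Return raw size divided by gzipped size for a batch of posts."""
    raw = "\n".join(posts).encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw) / len(packed)


# Hypothetical sample data; real feeds will compress differently.
sample_posts = [
    "Just set up my first pub, invites going out to friends.",
    "Syncing only the feeds I follow keeps my disk usage small.",
    "How many hops of friends-of-friends do you replicate?",
] * 200

print(f"compression ratio: {compression_ratio(sample_posts):.1f}x")
```

Run it against an export of real posts to get a figure that means something for your own data.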



