
Cool technology, good explanation. Legitimate questions below.

What you're describing (uploading photos + storing metadata) sounds like something Facebook has discussed at length in tech talks at multiple venues. Their solution was to use a distributed FS for images (such as HDFS, though FB uses their internal "Haystack") and then use HBase for the metadata. To be honest, your solution, while it works now, looks like a weak home-grown HBase that leverages PL/pgSQL for unique IDs. Why not go the Snowflake + HBase route? While it may add Ops complexity, it is a fairly battle-proven stack, and JVM ops is pretty well documented.
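For readers who haven't seen the Snowflake-style approach to unique IDs, here is a minimal sketch of the general idea: pack a millisecond timestamp, a shard ID, and a per-shard sequence number into one 64-bit integer so that IDs sort roughly by creation time. The 41/13/10 bit split, the epoch value, and the names `make_id` and `split_id` are assumptions chosen for illustration; they are not taken from this thread and are not Twitter's or anyone else's actual implementation.

```python
import time

# Hypothetical layout: 41 bits of time, 13 bits of shard ID, 10 bits of sequence.
TIME_BITS, SHARD_BITS, SEQ_BITS = 41, 13, 10
EPOCH_MS = 1_300_000_000_000  # hypothetical custom epoch, in milliseconds


def make_id(shard_id, seq, now_ms=None):
    """Pack (time, shard, sequence) into a single time-sortable integer."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    elapsed = now_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIME_BITS):
        raise ValueError("timestamp out of range for this layout")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range for this layout")
    seq %= 1 << SEQ_BITS  # the sequence wraps around within its bit budget
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(id_):
    """Recover (timestamp_ms, shard_id, seq) from an ID built by make_id."""
    seq = id_ & ((1 << SEQ_BITS) - 1)
    shard_id = (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed = id_ >> (SHARD_BITS + SEQ_BITS)
    return elapsed + EPOCH_MS, shard_id, seq
```

Because the timestamp occupies the high bits, an ID generated later compares greater than any ID generated in an earlier millisecond, regardless of shard. Whether the generation happens in a separate service or inside the database is a deployment choice this sketch does not address.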

Or, if you insist on using an RDBMS for metadata, why not just throw money (and not that much) at the problem and buy an SSD for your DB? Increase your IOPS from 100 or 200 up to 30,000 or 40,000 with a cheap drive, and call it a day. Surely this would be less expensive than the engineering effort that went into (and will continue to go into) this project. It also has the added benefit of no impact on Ops complexity, and it should scale to quite a staggering number of QPS.

Thanks!




Valid questions!

We're on EC2, which has its own set of limitations but means we can run a system with 10 million+ users with two-and-a-half engineers (and no ops team or overhead). So while we hear about more and more folks using SSDs in their DBs, it's not an option in our near-term future.

For SQL vs HBase/Haystack, we don't really have to worry about the photo storage itself, since S3 handles all of it. The data we shard out is more suited to an RDBMS, and since we're way more familiar with that world than with HBase and similar, it was the choice that let us make the most progress in a short time with a small team. Hope that's a helpful description of how we thought about it.


From what I understand of the post, you are actually sharding other tables like photos, likes, and comments, but not the main users table, right?

So, if one day you want to shard the users table, it will render all current sharding useless, right?



