
Interestingly, the chart at the end suggests implementations for everything except the raw data, but generating batch views from an arbitrary store and keeping your raw data reliably are both hard problems. The chart motivates dumping this stuff in Hadoop (or some distributed file system), but any reliable store would do. (@nathanmarz: would love recommendations)



A distributed filesystem, such as HDFS or MapR, is ideal for the master dataset.
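To make the "master dataset" idea concrete, here is a minimal single-machine sketch. It assumes the master dataset is an append-only file of JSON records and that a batch view is recomputed from scratch over all of them. The file layout, the `url` field, and every function name are hypothetical choices for illustration, and a local file stands in for what would be HDFS or another replicated store in practice.

```python
import json
import os
import tempfile
from collections import Counter


def append_record(path, record):
    """Append one raw record; existing records are never modified or deleted."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_records(path):
    """Yield every raw record in the master dataset, in write order."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def pageview_batch_view(path):
    """Recompute the view as a pure function of all raw data (hypothetical 'url' field)."""
    return Counter(record["url"] for record in read_records(path))


def demo():
    """Write three raw records to a temporary master file and build the view."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "master.jsonl")
        for url in ["/a", "/b", "/a"]:
            append_record(path, {"url": url})
        return dict(pageview_batch_view(path))
```

Because the view is derived only from the raw records, it can be thrown away and rebuilt at any time. The hard part raised above, keeping those raw records reliably, is exactly what this sketch leaves to the underlying store.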



