Driving down the cost of Big-Data analytics (allthingsdistributed.com)
15 points by DanielRibeiro on Aug 19, 2011 | 3 comments



What open-source tools are people using to build real-time big data analysis systems?

Jeff Jonas of IBM doesn't recommend batch systems such as Hadoop (http://jeffjonas.typepad.com/jeff_jonas/2011/04/the-data-is-..., http://techcrunch.com/2010/10/27/big-data/) for real-time context accumulation systems. I have heard that IBM is making use of topological embedding algorithms to build distributed graph databases capable of real-time analysis, but very little of that research is public.


The production systems I know of are:

- Yahoo's S4 (http://s4.io/).

- Storm, from Twitter (née BackType) (http://engineering.twitter.com/2011/08/storm-is-coming-more-...).

- Esper (http://esper.codehaus.org/).

While Amazon's announcement is great news for Hadoop users, I do think the future is in these real-time systems. However, there isn't a clear winner yet, so it makes sense that AWS has yet to bundle one up as an offering.


The non-open-source one I keep hearing about is Vertica (http://www.vertica.com/resources/videos/), which was recently acquired by HP and was co-founded by Mike Stonebraker (http://en.wikipedia.org/wiki/Michael_Stonebraker), the creator of Ingres and of Postgres, the predecessor of PostgreSQL.



