Hive - A Petabyte Scale Data Warehouse using Hadoop

rjurney · on June 11, 2009

I love it. I hate it. I love it because it's a very powerful way to run SQL on petabytes of data. I hate it because SQL needs to die.

Personally, I'm really looking forward to Apache Pig having both a SQL and dataflow abstraction available.

bjclark · on June 11, 2009

There's nothing stopping you from just running Map/Reduce scripts. Hive just compiles the SQL down to Map/Reduce.
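
To make that concrete, here is a rough sketch of what a query like `SELECT url, COUNT(*) FROM page_views GROUP BY url` boils down to as a map step, a shuffle, and a reduce step. It is plain Python, not Hadoop and not what Hive actually generates; the `page_views` table, the sample rows, and the function names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for a table page_views(user, url).
ROWS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/about"),
    ("carol", "/home"),
]

def map_phase(rows):
    """Emit (url, 1) for every row, like the map side of a GROUP BY."""
    for _user, url in rows:
        yield url, 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values for each key, which is COUNT(*) per group."""
    return {key: sum(values) for key, values in groups.items()}

def count_by_url(rows):
    """Run the three phases in order on an in-memory list of rows."""
    return reduce_phase(shuffle(map_phase(rows)))

if __name__ == "__main__":
    print(count_by_url(ROWS))
```

On a real cluster the map and reduce functions run on different machines and the framework does the shuffle for you, but the shape of the job is the same.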

bjclark · on June 11, 2009

Hive is great, but note that, like most Hadoop things, it's a lot better when you have 100 machines than when you have, like, 2.