Hacker News | icaromedeiros's comments

I was wondering about Spotify's contributions to Hadoop stack such as Snakebite and Luigi. What will happen to these projects if they're moving to BigQuery?


Spotify are not moving to BigQuery alone. They are also using Dataproc, Google's managed Hadoop service (which coincidentally went GA yesterday), which lets them get value out of all the orchestration tools they already use, like Luigi. (They are also adopting Cloud Dataflow, Cloud Pub/Sub and other parts of our Big Data stack.)

Luigi has had BigQuery support for some time: https://github.com/spotify/luigi/pull/1002

Spotify will be talking more about the specifics of their data pipelines at GCP Next in SF in March. I would expect to see a lot more on the new Google Cloud Big Data blog at https://cloud.google.com/blog/big-data/ too.

[Disclosure, GCP guy, etc]


Google Cloud has a managed Hadoop service called Dataproc: https://cloud.google.com/dataproc/


Nothing in this move should affect contributions to Snakebite and Luigi. If anything, it will just make them easier to use in cloud environments in addition to bare metal.

With over 20K jobs/day, Hadoop will be a part of Spotify's data processing stack for quite a while. BigQuery is just an (awesome) piece of the full puzzle.

[spotifier]


Great provocation. I really missed the references for the algorithms, dataset publication and breakthrough announcements. Interesting follow-up here: https://twitter.com/drewconway/status/699728784455573504


It is rewarding working at globo.com with such skilled engineers and great solutions, in almost every area. Congrats to Leandro, Juarez, and everybody involved.


Feedback is welcome

