
Yeah, it does seem a bit high. We use Spark for our adtech data pipeline and we handle tens of billions of events a day in less time. It may be a function of how much data they're pulling in from other systems, or dumping back out into a variety of systems. Spark itself is parallelizable, so in theory it can be sped up just by running more nodes.
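For what it's worth, the kind of job that scales out this way looks roughly like the sketch below. It's a minimal example, not anyone's actual pipeline: the S3 paths, the campaignId column, and the dataset shape are all made up for illustration.

    // Minimal Spark sketch: a shuffle-based aggregation that scales by
    // adding executors. Paths and column names are hypothetical.
    import org.apache.spark.sql.SparkSession

    object DailyEventCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-event-counts")
          .getOrCreate()

        // Each input file split becomes a partition; more nodes means
        // more partitions scanned and pre-aggregated concurrently.
        val events = spark.read.parquet("s3://example-bucket/events/date=2024-01-01/")

        val counts = events
          .groupBy("campaignId")  // partial aggregation per partition, then a shuffle
          .count()

        counts.write.parquet("s3://example-bucket/reports/daily_counts/")
        spark.stop()
      }
    }

The point is just that nothing in the job depends on a previous result, so throwing hardware at it genuinely helps.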



Financial processing is typically sequential - you can't calculate some metric until some other thing has been calculated (or its data pulled)... not well parallelizable, in other words. Or so it is with some of the systems I deal with.
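To make the dependency concrete, here's a toy sketch of that kind of chain (the names are illustrative, not from any real system): each day's result needs the previous day's, so there's no independent chunk of work to hand to another node.

    // Toy example of an inherently sequential calculation chain.
    case class DayInput(grossRevenue: Double, fees: Double, fxRate: Double)

    def runningBalances(days: Seq[DayInput], openingBalance: Double): Seq[Double] =
      // Each day's closing balance depends on the previous day's result,
      // so this fold can't simply be split across nodes and recombined.
      days.scanLeft(openingBalance) { (prevBalance, d) =>
        val net = (d.grossRevenue - d.fees) * d.fxRate  // fxRate has to be pulled first
        prevBalance + net
      }.tail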



