My team is Observability at Twitter; we work on monitoring and we’re looking for distributed systems engineers and full-stack engineers. We run one of the largest monitoring stacks in the industry, writing up to 15 million metrics per second for all production services at Twitter. We also have a front-end service that most engineers at Twitter use every day. We write our services in Scala, use a state-of-the-art Cassandra-like database called Manhattan, and if you join, you’ll get to work on challenging problems from day one.
Here are some of the things we’ve done in the past 12 months:
- Made our alerting execution service seamlessly fail over across datacenters
- Implemented a temporal set membership service for our database to keep track of metric groupings
- Added tiering policies for metrics based on their automatically-derived significance
- Added hybrid online/offline processing of data for different use cases
- Optimized the time-series query language to make reads more efficient
- Built an asynchronous query processor to support expensive queries with less strict latency requirements
- Wrote a client-side agent that collects and reports metrics to the storage system
Our team is 12 people, including back-end, front-end, full-stack and reliability engineers. You can find out more by reading last year’s article here: https://blog.twitter.com/2013/observability-at-twitter. A more formal list of requirements for the position is here: https://about.twitter.com/careers/positions?jvi=oO0WXfwr,Job.
Engineers from foreign countries are welcome, as are H-1B initiations and transfers. You can reach out to me directly at yann@twitter.com.