We are hiring for quite a few engineering and non-engineering positions, but I would like to highlight the roles on Chango's Large Scale Data team, where we have two open positions for intermediate/senior software engineers.
At Chango, we process roughly 20 billion unique data points per day, and we work to minimize the latency between collecting the data and making it available to the real-time distributed systems that consume it.
We use traditional SQL databases (PostgreSQL), columnar data warehouses (Vertica), networked and local key/value stores (Aerospike, Kyoto Cabinet, LMDB, memcached), as well as a many-node, petabyte-scale map/reduce cluster (Disco/DDFS).
You will be programming in Python, Cython, and C, with forays into Go and Erlang. You are comfortable with your code running on hundreds of nodes across multiple data centres. You have solid knowledge of Linux internals and tools; of networking, disk, cache, and memory subsystems; and of techniques to measure and optimize all of the above.
You have thorough familiarity with relational technology and theory, ACID, transaction processing, networked and local key/value stores, data warehousing, and columnar databases. You should have experience with map/reduce and distributed file systems (Hadoop/HDFS, Disco/DDFS).
You have a firm understanding of the various and sometimes exotic data structures and algorithms used for processing large data sets. You will use Bloom filters, HyperLogLog, skip lists, B+ trees, prefix tries, hashing in all its multitudinous forms, MVCC, two-phase commit, data compression/encoding, etc.
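To give a flavour of one of these structures, here is a minimal Bloom filter sketch in pure Python. This is illustrative only: the sizing formulas are the standard textbook ones, the double-hashing scheme over SHA-256 is one common choice among several, and production code on our scale would be optimized C or Cython rather than this.

```python
import hashlib
import math

class BloomFilter:
    """Probabilistic set membership: no false negatives, tunable false-positive rate."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing: m bits and k hash functions for the target error rate.
        self.m = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit indices from two 64-bit base hashes.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter(capacity=1000)
bf.add("user:42")
print("user:42" in bf)   # True: added items are always found
print("user:99" in bf)   # False here; false positives are possible, but rare at this rate
```

The appeal at our data volumes is the footprint: roughly 10 bits per element buys a 1% false-positive rate, regardless of element size.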
You are familiar with distributed systems and eager to build them: shared-nothing architectures, replication, fault tolerance, queueing, messaging models, the CAP theorem, and distributed locking.
Competitive salary, equity and benefits.
contact: tim AT chango.com or on the site