Scaling CouchDB

smharris65 · on June 11, 2010

The nice thing about CouchDB is that a single server is still ACID compliant. Replication can be done with a simple script using wget. It is also very easy to use because the data format is JSON and you can access it easily over REST from any language.
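To illustrate the replication point: CouchDB starts replication when you POST a JSON body to the server's `_replicate` endpoint, which is why a wget script is enough. Here is a minimal Python sketch that only builds that request. The server address, database name, and backup host are hypothetical placeholders.

```python
import json
from urllib.request import Request


def replication_request(server, source, target):
    """Build (but do not send) a POST to CouchDB's _replicate endpoint.

    `server`, `source`, and `target` are caller-supplied; the values used
    in the example below are hypothetical.
    """
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return Request(
        f"{server}/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical hosts and database name, for illustration only.
req = replication_request(
    "http://localhost:5984",
    "blog",
    "http://backup.example.com:5984/blog",
)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server; nothing is sent here.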

However, keep in mind that all your map/reduce "queries" generate indexes on disk. This results in fast queries, but when you change a query's definition, the entire index is rebuilt. This is time consuming if your queries return a "large" set of data. I performed a simple test with about 30k-40k rows and it took about 5-6 minutes to rebuild the index. This would be acceptable in my case since I would update the query at the same time as my server software update, so I would manually force the index update as part of post-load testing.

Also, try not to emit your entire document in a query. This basically results in the entire document being included in the index.
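As a rough sketch of what that looks like, here is a design document with a lean map function that emits only a key and a null value, so the index stores keys and document IDs rather than full documents. The map function itself is JavaScript (CouchDB's default view language), held here as a string in a Python dict. The design document name, view name, and the `doc.date` field are assumptions made up for illustration.

```python
import json

# Hypothetical design document: "posts", "by_date", and doc.date are
# illustrative names, not taken from any real database.
design_doc = {
    "_id": "_design/posts",
    "views": {
        "by_date": {
            # Lean: emit the key only, with null as the value.
            # The heavy alternative would be emit(doc.date, doc),
            # which copies every document into the index.
            "map": "function(doc) { if (doc.date) { emit(doc.date, null); } }"
        }
    },
}

# JSON body that would be PUT to /<db>/_design/posts on a CouchDB server.
design_doc_json = json.dumps(design_doc, indent=2)
```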

Understand CouchDB and you'll find it very useful and easy to work with. I have no regrets so far.

dantheman · on June 11, 2010

There is no need to emit the entire document in a query, because when you query the view you can request the full documents (using the "include_docs" URL parameter).
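For example, a view query URL with `include_docs=true` can be put together like this. This is a small sketch: the helper name `view_url` and the database, design document, and view names are hypothetical, while the `_design/.../_view/...` path and the `include_docs` parameter are CouchDB's own.

```python
from urllib.parse import urlencode


def view_url(base, db, ddoc, view, include_docs=True, **params):
    """Build a CouchDB view query URL (hypothetical helper).

    With include_docs=true, each returned row carries the full document
    in a "doc" field, so the view itself can emit a small value.
    """
    query = dict(params)
    if include_docs:
        query["include_docs"] = "true"
    return f"{base}/{db}/_design/{ddoc}/_view/{view}?{urlencode(query)}"


# Hypothetical names, for illustration only.
url = view_url("http://localhost:5984", "blog", "posts", "by_date", limit=10)
```

The trade-off is that the server looks up each document at query time instead of reading it straight from the index, which costs some read speed in exchange for a much smaller index.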

Also avoid building large hashtables in your views. Performance will be fine for 100k or so documents, but as the hashtables merge in the rereduce step, performance starts to crawl.
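To make the rereduce problem concrete, here is a simplified Python simulation, not CouchDB code. Real reduce functions are JavaScript with the signature `(keys, values, rereduce)`; the two functions below are hypothetical stand-ins that keep only the parts needed to show how the output size behaves.

```python
from collections import Counter


def hashtable_reduce(values, rereduce=False):
    """Anti-pattern: the reduce output is a hashtable of counts.

    On rereduce the partial hashtables are merged, so the result grows
    with the number of distinct keys in the data.
    """
    merged = Counter()
    for v in values:
        if rereduce:
            merged.update(v)  # v is a partial hashtable from an earlier reduce
        else:
            merged[v] += 1    # v is a single emitted value
    return dict(merged)


def count_reduce(values, rereduce=False):
    """Compact alternative: the output is one number at every level."""
    return sum(values) if rereduce else len(values)


# Simulated emitted values: 1000 distinct tags, reduced in two chunks.
tags = [f"tag{i}" for i in range(1000)]
merged_table = hashtable_reduce(
    [hashtable_reduce(tags[:500]), hashtable_reduce(tags[500:])],
    rereduce=True,
)
total = count_reduce(
    [count_reduce(tags[:500]), count_reduce(tags[500:])],
    rereduce=True,
)
```

The merged hashtable ends up with one entry per distinct tag, while the count stays a single integer. One common way to get per-tag counts without the hashtable is to emit the tag as the view key and query a counting reduce with `group=true`.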

sofuture · on June 11, 2010

Not a bad (albeit brief) overview. We're just starting out with CouchDB and haven't gotten to the point where we need to scale past a single instance -- though I can see it coming soon :)

I'm pretty head over heels with Couch -- it fills its niche extremely well.

alexpopescu · on June 11, 2010

Indeed it's a bit brief... more of an overview. I'll definitely try to find some time to work on something more detailed.

emehrkay · on June 11, 2010

How much data, and what are you guys doing with it? If you don't mind me asking.

j2d2 · on June 11, 2010

Is that because you haven't tried redis or because you haven't tried mongo?

sofuture · on June 11, 2010

With no followup, that's not very compelling. Enlighten me!