CouchDB vs. MongoDB (panoply.io)
88 points by yanivleven on June 15, 2017 | 52 comments


The article is strangely outdated on the CouchDB side, like someone wrote it two years ago.

It fails to mention that CouchDB now has Mango, which is a MongoDB-inspired query language.
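For anyone who hasn't seen Mango: a query is a JSON body posted to the database's `_find` endpoint. Below is a minimal sketch of building one in Python. The field names (`type`, `year`, `title`) are made up for illustration, and nothing is actually sent to a server here.

```python
import json

# Hypothetical document fields; the selector operators ($eq, $gt) follow
# the MongoDB-style syntax that Mango uses.
mango_query = {
    "selector": {"type": {"$eq": "lease"}, "year": {"$gt": 2015}},
    "fields": ["_id", "title", "year"],
    "limit": 10,
}

# This string would be the body of a POST to /<database>/_find.
request_body = json.dumps(mango_query)
```

Anyone used to writing MongoDB `find` filters should recognize the shape of the selector.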

Since 2.0, CouchDB also has Dynamo-like clustering thanks to Cloudant's open sourcing of the BigCouch code.

I wonder if the MongoDB side of the comparison is more up-to-date, or equally stale.


Mango seems interesting, thanks for that. I got thrown into a tight spot at work: I'm building a lease calculator for the accountants, and I need to build the whole thing as a self-contained application, due on Monday. Long story short, it wasn't as easy as I thought, and I ended up having to replicate journal entries, a trial balance (TB), and financial statements (FS). Even though Mongo doesn't have transactions, I found a good way to make my journals balance. I'm using the aggregation framework to post journals to the TB and FS.

The last part I was worried about is the self-contained bit, so I'm going to give CouchDB a try and use Mango to avoid redoing everything.


You don't need a different NoSQL database, you need a relational database.


I would assume a SQL database would make accounting-like tasks at least a tad easier?


No mention of Mongo "/dev/null as a service" DB's infamous issues with data loss?

Apparently their new replication protocol isn't as utterly broken as the old one, but my understanding is that data recovery after server crashes is still a problem.


I ran into a doozy in our production Mongo 3.2 replica set just this week, where initial sync completely failed because the oplog was growing too fast. It seems this class of errors was actually fixed ( https://docs.mongodb.com/manual/core/replica-set-sync/#initi... ) in 3.4, which also fixed a number of the data reliability issues you mentioned (see https://jepsen.io/analyses/mongodb-3-4-0-rc3 ). But since we hadn't upgraded, the only way to fix things was to throttle back our write-heavy background tasks during a period of low usage - definitely not the kind of thing you want to have to explain to users.

At the end of the day (see my other comment in this thread), I'm bullish about the developer experience of Mongo/Meteor, and I think that 363 days a year of significantly increased developer productivity is worth the 1 day of devops hell that might ensue from that stack choice, and 1 day of realizing that your performance problems all stem from Mongo query performance being much more reliant on manually creating indices on fields vs. an unindexed Postgres table of the same size. Sigh. On the reliability end, it helps that our (largely append-only) data model is such that inconsistencies are possible to clear up manually if needed. But on those couple days a year, I'm definitely not a Mongo fan.


I've seen this 'increased productivity' argument for Mongo over and over but I just don't see it in practice.

I've dealt with many apps and teams that use Mongo, and the amount of time spent tracking annoying bugs because of a lack of schema, writing migration scripts and the abysmal performance of the database leads me to believe that the only reason Mongo devs think they are more productive is because they can get 3 lines of code to make a new collection + record in the database in 5 seconds, and ignore the 5 months of time they spent down the line.


Developer productivity vs devops hell is a compelling argument for MongoDB for fast iteration when building out. I'm fortunate to deal with relatively low scale with my Mongo deployment, but I'm hoping by the time the seams show in Mongo for our needs, we'll be large enough to abide the long road to a hardened and reliable data infrastructure.

MongoDB 3.4 seems like it has a lot of great stuff and I hope to move to it soon.


> Mobile support: CouchDB stands out, in that it can run on an Android or iOS mobile device. In addition to being mobile, the database can also synchronize with a remote master database, allowing the data to be shared easily between mobile devices and servers.

Meteor actually provides exactly this for MongoDB; it has a "minimongo" package in the browser that supports Mongo's query language, running it synchronously against an in-memory copy of the collection [0]. And with Meteor, you can specify "subscriptions" declaratively that enable bidirectional synchronization while their owner components are in scope.

Mongo certainly has some reliability issues (see other comments here) but I've yet to find a full-stack system so painless to develop in, especially if you need realtime support. With things like ToroDB Stampede [1] and a general approach of "write all your code in React with Meteor dependencies factored out into containers," there's a clear migration path towards the relational-based separate-backend-frontend world when you need to go there.

[0] https://guide.meteor.com/collections.html

[1] https://www.torodb.com/stampede/docs/1.0.0-beta2/about/


The CouchDB one is actually a fully persisted database, not just an in-memory cache. Both are useful, but it's not quite the same thing.


Plus the CouchDB ecosystem has Couchbase for iOS/Android (with their own tradeoffs versus mainline CouchDB) and PouchDB which is a JS version of CouchDB that runs in the browser on just about any platform and can store directly in IndexedDB.


CouchDB replication has got to be among the easiest and nicest in the industry. Setting up master/master is a breeze.


> Snapshots: Any changes to a document occur as a revision and appends the information to the file. This means you can grab a “snapshot” of the file and copy it to another location even while the database is running without having issues with corruption.

This is the main feature I sell when pushing CouchDB.

Use it to project events and you'll see what I mean.


On the other hand, with v2 sharding it's actually harder to pass these files around, and how to do so successfully could be better documented.


I want to make an obvious point, but I know it confused me quite a bit for a while, so just in case it helps anyone else: There's CouchDB and then there's Couchbase which is similarly named but completely different.

I recommend that anyone shopping for databases with easy master/master replication, eventual consistency, and no single points of failure consider Couchbase as well. It's not the same and will have its own set of pros and cons.


Couchbase is not eventually consistent


I really liked this article for this kind of thing:

https://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis


> MongoDB is schema-free, allowing you to create documents without having to first create the structure for that document. At the same time, it still has many of the features of a relational database, including strong consistency and an expressive query language.

The author clearly has a different definition of "strong consistency" than most. I don't see how any claims of consistency (in a data usage, not CAP sense) can be made of a database that can't properly store a number as a number or even guarantee that it's a number at all.

Also, does anyone actually like the Mongo query language? It was cute when I first saw it but I pity anyone trying to do anything complicated by manually writing those JSON strings.


MongoDB absolutely stores numbers as numbers. Always has.

And yes in a schemaless database you are required to manage the schema in your application layer as opposed to within the database. If we wanted a database with a rigid schema we would just use a SQL database.


He asked whether it can guarantee that the field "length" will always be a number and not, e.g., a string, a list, or an object.


Schemaless by definition means this is not guaranteed. The same property name can hold different types of data from one object to the next. Several other key-value/wide-column stores have the exact same setup.


Why would it guarantee that? It's a schemaless database. It's not supposed to do that.

And my point is that if you query a single document and the field is a number then it will be returned as a number i.e. it physically stores and understands numbers.


Just to be clear, a better term would be "dynamic schema". You can ask for data type validation since 3.2.


Elasticsearch is schemaless as well, however it can guarantee that a field is a number once used.


Elasticsearch is not schemaless. It typically requires you to define a mapping (schema) upfront. You can set it to automatically infer the mapping based on the available data but this does not make it truly schemaless.

MongoDB allows a free form schema for every document. You're again comparing apples and oranges.


> a database that can't properly store a number as a number or even guarantee that it's a number at all.

What do you mean?


I meant it doesn't have a decimal type, but it looks like as of v3.4 it does: https://docs.mongodb.com/manual/tutorial/model-monetary-data...


Ok, but even prior to 3.4, in what sense did you mean it couldn't guarantee something was a number at all?


Not having an enforced schema means that there is no guarantee a given key exists or that, if it does exist, the data type matches what you'd expect it to be. One errant piece of code could insert a bad record that could break some other piece of code.


Since version 3.2 you've been able to enforce data types with validation.
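As a rough sketch of what that looks like: the `create` command accepts a `validator` document written with ordinary query operators. The collection name `items` and the `length` field are assumptions borrowed from the example upthread, and the snippet only builds the command document rather than sending it to a server.

```python
# Command document for MongoDB's `create` command (validation arrived in 3.2).
# "items" and "length" are illustrative names, not from any real schema.
create_items = {
    "create": "items",
    # Reject documents whose "length" is missing or is not a numeric BSON type.
    "validator": {"length": {"$type": "number"}},
    "validationLevel": "strict",   # apply the rule to all inserts and updates
    "validationAction": "error",   # refuse the write instead of only warning
}
```

With a driver such as pymongo, a document like this would typically be passed to something like `db.command(create_items)`. Documents written before the validator was added are not checked retroactively, so existing bad records still have to be cleaned up by hand.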


One huge benefit of adopting a CouchDB backend is that you get replication for free; with PouchDB on the client side, this opens up so many opportunities.


PouchDB is one of CouchDB's killer features. During development you can effortlessly switch between storing things locally, in the browser, and then sync with a remote database.


For native mobile apps, there are also a variety of native libraries that can be used:

Couchbase-Lite: https://github.com/couchbase/couchbase-lite-ios https://github.com/couchbase/couchbase-lite-android/

Cloudant Sync: https://github.com/cloudant/sync-android https://github.com/cloudant/CDTDatastore

We've had a good experience with CouchDB with an Android native app, iOS native app, and a web app (using PouchDB with it there).


I'd like to see "vs Postgres".


Postgres' JSON support is very impressive.


We've found the Postgres json store works just fine for our purposes, thank you very much.


The reason I rejected PG's JSON store was its inability to update fields inside of a JSON doc without replacing the whole document. In your case, does this constraint push you to use more types of smaller documents, or do you just read the whole doc, update it in the application, and then write it back to the DB?

Have you ever had an issue with conflicts, where multiple instances of the app read, modify and write different things to the same document at the same time?


The JSONB data type allows modifying specific fields in the JSON object. The operators look ugly in hand-crafted SQL, but it works a treat.


> Have you ever had an issue with conflicts, where multiple instances of the app read, modify and write different things to the same document at the same time?

This problem is what pessimistic (select... for update) or optimistic (using a version column) locking is for. If you don't want any race conditions to sneak into your code, as a rule you should probably be using one or the other regardless of whether or not you use postgresql JSON.
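A minimal sketch of the optimistic variant, assuming a hypothetical `docs` table with an integer `version` column. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the same `UPDATE ... WHERE version = ?` pattern applies to PostgreSQL.

```python
import json
import sqlite3

# Hypothetical schema: one JSON document per row plus an integer version.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL, "
    "version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO docs VALUES (1, ?, 1)", (json.dumps({"count": 0}),))
conn.commit()


def read_doc(conn, doc_id):
    """Return (document, version) as last committed."""
    body, version = conn.execute(
        "SELECT body, version FROM docs WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(body), version


def write_doc(conn, doc_id, doc, expected_version):
    """Write only if nobody bumped the version since we read it."""
    cur = conn.execute(
        "UPDATE docs SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (json.dumps(doc), doc_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means we lost the race


# Two "app instances" read the same version of the document.
doc_a, ver_a = read_doc(conn, 1)
doc_b, ver_b = read_doc(conn, 1)

doc_a["count"] += 1
first_ok = write_doc(conn, 1, doc_a, ver_a)   # succeeds, version becomes 2

doc_b["note"] = "stale write"
second_ok = write_doc(conn, 1, doc_b, ver_b)  # rejected, version is no longer 1

final_doc, final_version = read_doc(conn, 1)
```

The second writer finds out its copy was stale instead of silently overwriting the first change, and can then re-read and retry.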


Does PG's JSON support multi-master sync? Native DB-level support for that feature simplifies a lot of my use cases. It'd be interesting to see a SQLite/WebSQL <-> Postgres multi-master syncing system. It'd be the equivalent of CouchDB <-> PouchDB. Maybe even using the same CouchDB protocol! :-)

CouchDB and Couchbase both support only whole document updates. So you get document conflicts in those document stores as well, meaning your app needs to understand and handle 409's. But those conflicts are relatively easy to handle in most cases, at the cost of a new round trip. Mostly it's a matter of downloading the new document state and merging your change to it and re-post. If you're using Redux/Vuex/Event Sourcing this becomes trivial to support. Another way to handle it is to split a single large document into smaller pieces and write a map/reduce view that returns a composite document. That should be possible in Postgres as well with a prepared statement.
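The download, merge, and re-post loop can be sketched roughly as below. `FakeStore` and `Conflict` are hypothetical in-memory stand-ins rather than any real client API, and integer revisions are a simplification of CouchDB's revision strings.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response."""


class FakeStore:
    """Hypothetical in-memory whole-document store with revisions."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current_rev = self._docs.get(doc_id, (None, None))[0]
        if rev != current_rev:
            raise Conflict(doc_id)  # caller's copy is stale
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


def update_with_retry(store, doc_id, change, max_attempts=5):
    """Fetch the latest state, merge our change into it, re-post on conflict."""
    for _ in range(max_attempts):
        rev, body = store.get(doc_id)
        body.update(change)  # naive merge: our fields win
        try:
            return store.put(doc_id, body, rev)
        except Conflict:
            continue  # someone else wrote first; costs one more round trip
    raise Conflict(doc_id)


store = FakeStore()
store.put("invoice-1", {"status": "draft"})
stale_rev, _ = store.get("invoice-1")
# Another writer updates the document after we read it.
store.put("invoice-1", {"status": "draft", "total": 100}, stale_rev)

try:
    store.put("invoice-1", {"status": "sent"}, stale_rev)
    conflicted = False
except Conflict:
    conflicted = True

new_rev = update_with_retry(store, "invoice-1", {"status": "sent"})
final_rev, final_body = store.get("invoice-1")
```

A field-level merge like `body.update(change)` only works when writers touch different fields; anything smarter is application-specific.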


Mongo really isn't about storing JSON. At least that shouldn't be its selling point. The selling point is that, since it's basically just a glorified key-value store, it's very replicable and distributable. Postgres is very much not either of those things. The JSON is nice for a few things and I use it sometimes, but Postgres is and will always be extremely difficult to cluster and replicate.


Postgres 10 added support for logical replication. It makes horizontal scaling fairly painless.


Until you need to reseed the original master, want to do quick rolling restarts, or want automated failover. PG has a long way to go.


Logical replication by default doesn't even handle DDL statements so something as simple as adding a new column requires extra process. Postgres is probably the weakest of all relational databases when it comes to scalability, both vertical and horizontal.


I don't agree with the article saying MongoDB is better at reading.

From my experience MongoDB is fast, but CouchDB really shines when you have a read heavy application.

Also, the article didn't mention Mango queries, which are a blessing (indexing is as fast as with Erlang views), but in my opinion this feature could be a lot better, with stale results, for instance.


I use couch, pouch, elasticsearch and sql dbs. I never really got the "schemaless" selling point of nosql though.

I do see the point of storing documents rather than rows for some use cases, or dbs extra strong on searches, etc.

But what is the problem with an "add column" or "change datatype" operation in SQL?


Partly because MySQL requires a full table lock and full table copy to implement those operations. Much of the nosql movement originated from implementing schema changes on production MySQL databases. In many if not most cases, NoSQL is really NoMySQL.

Many other engines, PostgreSQL for example, can add a new column without constraints as a nearly instant metadata-only change. Data type changes that do not require validation (expanding a VARCHAR, as opposed to converting a CHAR to INT) are also rapid.


+1

In my experience, "schemaless" just means that I'll have to manage the schema manually through some "updater" scripts.

Any data without schema is just noise, so if you have any data, it means that you have schema for it somewhere.


I find MongoDB easier to deploy and maintain because it's easier to build and monitor. CouchDB with Erlang not so much (for our environment, it's totally alien).

EDIT: I made a statement about our preference, for our environment. I did not make broad claims about these DBs for other people. How am I upsetting HN?


Why do you build your DBMS from source?


Specifically for CouchDB, because packages provided for CentOS/RHEL were outdated and now they don't even exist.


Now do ArangoDB vs MongoDB. Given Arango's clustering tools recently, it should be a fair comparison.



