
A company hosting an online service seems to think it deserves a medal for discovering that S3 buckets from a cloud provider are crap and cost a fortune.

The headline makes you think they're running custom FPGAs like Gmail does, not just running on bare metal... As for drive failures: welcome to storage at scale. Build your solution so that replacing 10 disks at a time is a routine weekly task, not a critical incident at 2am when a single disk dies...

Storing and accessing tonnes of <4kB files is hard, but other providers are doing this on their own metal with Ceph at PB scale.
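For context, this is roughly what small-object access looks like through Ceph's Python librados binding; the pool name and object key here are made up for the sketch:

    import rados  # python3-rados, ships with Ceph

    # Connect using the local cluster config; 'mail-blobs' is a
    # hypothetical pool name for this sketch.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('mail-blobs')
        try:
            # Tiny objects like this are exactly the <4kB case:
            # per-object overhead dominates, so metadata cost and
            # latency matter far more than raw bandwidth.
            ioctx.write_full('msg:12345', b'...raw rfc822 bytes...')
            data = ioctx.read('msg:12345')
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

The hard part at PB scale isn't this API, it's that billions of tiny objects stress the metadata and recovery paths rather than throughput.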

I love ZFS; it's great for per-disk redundancy. But Ceph is really the only game in town for inter-rack/inter-DC resilience, which I'd hope my email provider has.
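To make that concrete: in Ceph the failure domain lives in the CRUSH rule, so a pool can be told to spread its replicas across racks instead of disks. A minimal sketch with the standard CLI (the rule and pool names are made up):

    # Create a replicated rule whose failure domain is the rack,
    # then point a (hypothetical) pool at it:
    ceph osd crush rule create-replicated rack-spread default rack
    ceph osd pool set mail-blobs crush_rule rack-spread

With that rule, losing an entire rack costs you at most one copy of each object, which is the inter-rack resilience ZFS on a single box can't give you.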



Ceph is most certainly not the only game in town. It's good and stuff, but it's just tech. We're using protocol-level replication for each of our data stores.


No, let's be honest: Ceph is the only solution for data management at this scale (sub-PB to a few PB) that is independent of application or workload. The market share, the fact that IBM is internally moving people off other projects onto it, and the massive backing all show this.

Yes, you can get some or all of these features, like failure domains, via other routes and products, but none of them bring everything together in one place the way Ceph does.

There's a reason people call it the "Linux of storage". The only alternatives are to manage this at a higher level in your stack (reinventing the wheel) or to buy PB-scale solutions from a corporate vendor, which is like choosing Oracle and Microsoft over Linux.

Protocol replication means you've reimplemented something storage-related elsewhere in your stack. That's not wrong, but better solutions and alternatives exist now.


I mean, I'm happy to have this argument. Ceph is content-agnostic, and that's fantastic most of the time. Cyrus replication is data-aware: it's not just replicating the bytes, it's doing integrity checking and data-model consistency handling.

Most of all, it's doing split-brain recovery, which we wouldn't need if we wanted CP rather than AP, but that wasn't the original design.
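For the shape of the idea (a hypothetical sketch, not Cyrus's actual algorithm): a data-aware merge after a split can union the messages from both sides and push modseq past both, so every client is forced to re-sync:

    # Hypothetical sketch of data-aware split-brain merging; NOT
    # Cyrus's actual algorithm. Each replica is modelled as a dict
    # with 'uidvalidity', 'modseq' and a uid->message mapping.
    def merge_replicas(a: dict, b: dict) -> dict:
        merged = {**a['messages'], **b['messages']}  # union by uid
        return {
            # taking the max is one policy; bumping uidvalidity would
            # instead force clients to drop their caches entirely
            'uidvalidity': max(a['uidvalidity'], b['uidvalidity']),
            # move modseq past both sides so the merge is visible
            'modseq': max(a['modseq'], b['modseq']) + 1,
            'messages': merged,
        }

A content-agnostic store can't do this, because it doesn't know what a "message" or a "modseq" is.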

If I were redoing this from scratch, I'd maybe use Ceph or something similar and update Cyrus to work well with it, but that would be a big change from the current design.

Anyway, I'm happy to stipulate that Ceph is great tech, without going and telling other people that it's the only choice.


Do you honestly think Ceph isn't doing data consistency handling? I'll pay for your ticket to Cephalocon if you'll speak to that effect(!)

Split-brain stuff only happens when you split a single-threaded task and then put it back together. The MDS in Ceph has this problem, but that's so far into the weeds as to be off topic here.

Again, you're implementing something storage-related outside of the storage layer, on top of whatever storage sits underneath. Fine if you want to do it that way, but talk about _that_, not about hecking ZFS being mah saviour. (BTW I daily-drive it and love it too, but an email provider _relying_ on it should raise eyebrows)...


I do believe we are talking past each other here. Of course Ceph does data consistency, but it sure doesn't assert that a modseq is monotonically increasing or that a mailbox/uidvalidity/uid triple never changes digest, because it's not data-model-aware.
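To illustrate the kind of invariant that needs data-model awareness (hypothetical helper, not actual Cyrus code):

    # Hypothetical checks a data-aware replicator can make and a
    # content-agnostic store cannot; NOT actual Cyrus code. 'old' and
    # 'new' are successive states of one mailbox (same uidvalidity),
    # each with a modseq and a uid->digest mapping.
    def check_invariants(old: dict, new: dict) -> None:
        # a modseq must only ever move forwards
        assert new['modseq'] > old['modseq'], 'modseq went backwards'
        # a mailbox/uidvalidity/uid triple must keep its digest: the
        # same message identity can never silently change content
        for uid, digest in old['digests'].items():
            if uid in new['digests']:
                assert new['digests'][uid] == digest, \
                    f'digest changed for uid {uid}'

Ceph will happily replicate a state that violates both of these, byte-perfectly.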

sigh



