> Second, Etcd and Fleet are themselves designed to be modular, so it’s easy to ...

jdoliner · on Feb 11, 2015

Hi, JD @ Pachyderm here.

Our feeling is that there's a big difference between having projects like this in the ecosystem and having a canonical implementation that comes preloaded in the software. "Batteries included but removable" is the best phrasing of this we've found (stolen from Docker). I've seen tons of pipeline management tools in the Hadoop ecosystem (even contributed to some) it's not that they're bad it's just that they're not standardized. This winds up being a huge cost because if 2 companies use different pipeline management it gets a lot harder to reuse code.

I think etcd is a pretty good example of this. Etcd is basically a database (that's the way CoreOS describes it) by database standards it's pretty crappy. But it's simple and it's on every CoreOS installation so you can use it without making your software a ton harder to deploy.

Hope this clears some stuff up!

on Feb 11, 2015

[deleted]

exacube · on Feb 11, 2015

Redis cannot replace etcd: etcd (like zookeeper and chubby) offer strong consistency, which Redis does not. These requirements are key for building things like a directory service (which is how CoreOS uses etcd).

_hyn3 · on Feb 11, 2015

Here's a Jepsen runthrough on etcd:

https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

nickysielicki · on Feb 11, 2015

what's not modern about Condor?

sgt101 · on Feb 11, 2015

Isn't Tez good for DAG's?

a8da6b0c91d · on Feb 11, 2015

> somebody should be building a modern HTCondor

A bug-free, cross platform, rock-solid HTCondor would be great. Sadly the existing HTCondor is a buggy turd. I've fixed some stuff in it and a lot of the code is eye-roll worthy. I can practically guarantee you hours of wondering why something that by any reasonable expectation should be working isn't working, and not giving any meaningful error message.

But like you said, a really solid HTCondor rewrite with the Stork stuff baked in would pretty much be all anyone should need, I think. It's not like you'd even really need to dramatically re-architect the thing. The problems with HTCondor really are mostly just code quality and the user experience.

lrm242 · on Feb 11, 2015

I use HTCondor every day. I find it to be rock solid with top notch documentation. It is cross-platform today as well, so I'm not sure where you're coming from on this.

chubot · on Feb 11, 2015

Interesting -- does HTCondor solve the same problem as Hadoop? I know it is for running batch jobs across a cluster.

How big are the data sets? How does it compare with the MapReduce paradigm?

I see there is this Dagman project which sounds similar to some of the newer Big Data frameworks that use a DAG model. I will make an uneducated guess that it deals with lots of computation on smaller sized data (data that fits on a machine). Maybe you have 1 GB or 10 GB of data that you want to run through many (dozens?) of transformations. So you need many computers to do the computation, but not necessarily to store the data.

I would guess another major difference is that there is no shuffle (distributed sort)? That is really what distinguishes MapReduce from simple parallel batch processing -- i.e. the thing that connects the Map and Reduce steps (and one of the harder parts to engineer).

In MapReduce you often have data much bigger than a single machine, e.g. ~100 TB is common, but you are doing relatively little computation on it. Also MapReduce tolerates many node failures and thus scales well (5K+ machines for a single job), and tries to reign in stragglers to reduce completion time.

I am coming from the MapReduce paradigm... I know people were doing "scientific computing" when I was in college but I never got involved in it :-( I'm curious how the workloads differ.

a8da6b0c91d · on Feb 11, 2015

Do you administer the clusters, or has somebody set everything up for you? Do you use it entirely on linux, or have you made the mistake of trying to get it to work on Windows? Which universe do you use: java, vanilla, or standard? I can imagine a perception that it's rock solid in a particular configuration and usage pattern, especially one where somebody else went through the pain of setting it up.

lrm242 · on Feb 11, 2015

I admin the cluster, so you're gonna need to try again.

a8da6b0c91d · on Feb 11, 2015

[flagged]

peteforde · on Feb 11, 2015

Why are you being so mean? You look like a huge asshole in this thread.