Tank – A very high performance distributed log service (github.com/phaistos-networks)
136 points by olalonde on Dec 1, 2016 | 29 comments



GH is now a magic land where "high performance" means "no numbers to back this up", "secure" means "we haven't actually been audited", and "awesome" means something to the effect of "here, have this".


Some benchmarks added https://github.com/phaistos-networks/TANK/blob/master/README.... Will add consumer benchmarks later (our ops folks are on it). Apologies for not backing the "high performance" claim with numbers in the first place.


No performance numbers. No tests. Nothing about how it manages concurrency, how it actually writes to disk, how it does replication.

It appears to be a server, not a lib. Why would someone use this instead of Kafka?


It is a service, and a library (C++). You can always check the implementation, but obviously that's the wrong answer to that question :)

The README states that replication is not implemented yet. I encourage you to also check this issue https://github.com/phaistos-networks/TANK/issues/14 for some details on the I/O semantics, and the Wiki for answers to other questions: https://github.com/phaistos-networks/TANK/wiki


So single-threaded, blocking file I/O. Not very impressive or useful (as a lib).


Yes, single-threaded, because the contention on the various files and the cost of serialisation would likely negate the benefits of using multiple threads.

Network I/O is obviously asynchronous - you may want to check the codebase. Disk I/O is synchronous, but:

- it uses sendfile() and readahead() to reduce or eliminate the likelihood of blocking reads (sketched below): see https://github.com/phaistos-networks/TANK/issues/14#issuecom...

- AIO is either broken or only properly supported on XFS, depending on the kernel release (and, in the past, appending to a file on an XFS filesystem could block and degrade performance). We're also not using Direct I/O, so writes land in memory and only get flushed when the commit count is hit, or periodically - so, particularly for local disks (HDDs or SSDs), in practice this never blocks for more than a few ms when flushing, if at all.
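
To make the read path above concrete, here's a simplified sketch of that readahead() + sendfile() combination (illustrative only - stream_segment() is a made-up helper name, not code from the repo):

    // Simplified sketch of the readahead() + sendfile() pattern described above;
    // stream_segment() is a hypothetical helper, not a function from the repo.
    #include <fcntl.h>        // readahead() (glibc, Linux-specific)
    #include <sys/sendfile.h> // sendfile()
    #include <unistd.h>
    #include <cerrno>

    // Stream `len` bytes of an immutable log segment to a non-blocking client socket.
    ssize_t stream_segment(int client_fd, int segment_fd, off_t offset, size_t len) {
        // Hint the kernel to prefetch the range, so the sendfile() calls below
        // are unlikely to block on disk reads.
        readahead(segment_fd, offset, len);

        size_t remaining = len;
        while (remaining) {
            const ssize_t n = sendfile(client_fd, segment_fd, &offset, remaining);
            if (n == -1) {
                if (errno == EAGAIN)
                    break;      // socket send buffer is full; resume when it's writable again
                return -1;      // real error
            }
            if (n == 0)
                break;          // reached EOF on the segment
            remaining -= size_t(n);
        }
        return ssize_t(len - remaining); // bytes actually handed to the kernel for this client
    }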


> Yes, single-threaded, because the contention on the various files and the cost of serialisation would likely negate the benefits of using multiple threads.

Multithreaded does not mean multiple files.

> We're also not using Direct I/O, so writes land in memory and only get flushed when the commit count is hit, or periodically - so, particularly for local disks (HDDs or SSDs), in practice this never blocks for more than a few ms when flushing, if at all.

Unless you are actually using it under a high-throughput scenario, which is why you would use a lib like this. It will work great until you hit the actual flush point, then possibly block for seconds, even minutes.

If your performance needs are high, mandating XFS is not unreasonable.


It's totally unreasonable to mandate a particular FS for modern general purpose software.

Multiple files do mean multiple threads. Unless you manage to run this without a modern operating system.


> It's totally unreasonable to mandate a particular FS for modern general purpose software.

That is completely debatable. We are talking about "a high performance distributed log service", not a word processor.

> Multiple files does mean multiple threads.

Which is something entirely different from what I said...


You are making all kinds of assumptions without knowing anything about the implementation.


They address this in the README by saying you should use Kafka.


They say Tank is a good choice if you need simplicity and performance. Yet they do not provide any evidence of it being faster than Kafka, or any indication of it being fast at all.

If you're not mentioning kernel calls, allocations, thread synchronization, I/O mechanism, etc., you're not going to be very believable.


Please see the earlier comment. Specifically, re: sys calls and concurrency, you may want to start from this issue: https://github.com/phaistos-networks/TANK/issues/14


Starting from that, you may not want to dismiss AIO & O_DIRECT in the span of a paragraph.


I don't see any information other than a claim that it's "engineered for optimal (very high) performance": https://github.com/phaistos-networks/TANK/wiki/Why-Tank-and-...


No JVM required. Way easier to set up than Kafka. I'm sold :)


Is a headless install of the Oracle JVM that hard? Isn't it just a Puppet or a Chef snippet?


I think I (or maybe someone else?) need to run some benchmarks and produce some meaningful comparison metrics -- as pointed out by commenters here, and elsewhere. I suppose I should have done that already. Apologies for the lack of concrete numbers. There's some information about performance in the Wiki and Issues though.


I'm looking for something like Kafka with an at-least-once guarantee. I believe this can be achieved with the Kafka Java client (not sure on that), but librdkafka (C++ client) doesn't seem to support this guarantee. Performance is secondary to messages not getting dropped in my use cases.

What kind of guarantees does Tank make?


The Tank Client will immediately publish to Tank (it doesn't buffer requests). You get at-least-once semantics with Tank (exactly-once pretty much means at-least-once plus dupes detection).
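
To sketch what I mean by dupes detection (illustrative types only, not the actual Tank client API): redeliveries carry the same broker-assigned sequence number, so a consumer that remembers the highest sequence number it has already applied can simply drop them.

    // Illustrative only (not the Tank client API): at-least-once delivery plus
    // duplicate detection by sequence number gives effectively-once processing.
    #include <cstdint>
    #include <string>

    struct Message {
        uint64_t    seq;     // broker-assigned, strictly increasing per partition
        std::string payload;
    };

    struct DedupingConsumer {
        uint64_t next_seq = 0; // lowest sequence number we have not applied yet

        // Returns true if the message was applied, false if it was a duplicate.
        template <typename Apply>
        bool consume(const Message &m, Apply &&apply) {
            if (m.seq < next_seq)
                return false;     // redelivery of something already applied
            apply(m);             // the side effect happens once
            next_seq = m.seq + 1; // persist this together with the side effect
            return true;
        }
    };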


So if I have a subscriber that simply publishes a transformed message onto another topic, can I have a guarantee that if the publish fails it won't move on to the next message in the subscription?


The consumer applications (which interface with Tank brokers via a Tank client library) are responsible for maintaining the sequence number they are consuming from.

Suppose one such application is consuming every new event that's published ("tailing the log"). As soon as another application successfully publishes one or more new messages, the consumer will get them immediately. If the application that attempted to publish failed to do so, or didn't get an ACK for success, then you are guaranteed that no new message(s) were published (i.e. no partial message content).

I am not sure if that answers your question; if not, can you please elaborate?


I believe so. I suppose I'm asking for an abstraction that makes maintaining the sequence number simple and fails safely in the presence of errors.

I'd basically like to be able to map messages from one topic to another with a guarantee that none of those messages will be lost, even when some error occurs (a programming error, system downtime, or a network partition). I'd prefer the application to stop producing messages rather than lose any of them.
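
Roughly, what I have in mind is something like the loop below (generic callbacks rather than any particular client API), where progress is only committed after the downstream publish has been ACKed, so any failure stops the loop instead of skipping messages:

    // Sketch of the topic-to-topic mapping I'm describing, written against generic
    // callbacks (not any particular client API). The stored sequence number only
    // advances after the transformed message has been ACKed, so a failure stops
    // the pipeline instead of dropping data.
    #include <cstdint>
    #include <functional>
    #include <string>
    #include <vector>

    struct Record {
        uint64_t    seq;
        std::string payload;
    };

    void forward_loop(uint64_t start_seq,
                      const std::function<std::vector<Record>(uint64_t)> &fetch,       // read source topic from seq
                      const std::function<std::string(const std::string &)> &transform,
                      const std::function<bool(const std::string &)> &publish,         // true only on broker ACK
                      const std::function<void(uint64_t)> &save_seq) {                 // durably record progress
        uint64_t next = start_seq;
        for (;;) {
            for (const Record &r : fetch(next)) {
                if (!publish(transform(r.payload)))
                    return;         // stop here; on restart we resume from `next` and
                                    // re-process (at-least-once) rather than lose messages
                next = r.seq + 1;
                save_seq(next);     // commit progress only after the ACK
            }
        }
    }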

It sounds like that is possible with Tank so I may end up giving it a try.


FWIW exactly once is being worked on now: https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+E...


Looks like it is abandoned; the first commit was in June and the last in September.


I am the core developer. It's not abandoned at all. There are updates that haven't been pushed upstream, but no new features. I was going to support replication via an external consensus service (etcd, Consul, etc.) -- but it looks like implementing Raft directly in Tank for interfacing with other cluster nodes is a better idea, all things considered (no external deps., simplicity).

The reason this hasn't happened yet is that, other than a lack of free time to pursue it, we (at work) haven't really needed that feature yet. We run a few instances, they are very idle, and we can also mirror to other nodes (via tank-cli).


Can you please please please use the etcd implementation[1] of Raft and not the usual go-raft or Consul Raft implementations? They've done some serious-business fault injection and integration testing with etcd as part of Google's hosted Kubernetes (GKE). There are still some lingering issues with Consul at scale that make me a bit gun-shy. Mesosphere did some of this work themselves: https://mesosphere.com/blog/2015/10/26/etcd-mesos-kubernetes..., but I know that Google engineers have done tons of work on this as well.

[1] https://github.com/coreos/etcd/tree/master/contrib/raftexamp...


Thank you - yes, I was planning to base the implementation on etcd's. I appreciate the heads up :)


Cool, thanks for clarifying!



