A multithreaded fork of Redis that is faster (keydb.dev)
450 points by ericblenkarn on Oct 7, 2019 | hide | past | favorite | 159 comments



Any time these "much faster than Redis" databases come up, the sysadmin in me wonders how many people have had actual performance limitation issues with Redis. I've seen Redis servers handle hundreds of GB of traffic per hour. I've worked at companies where Aerospike and others are proposed as replacements for Redis because "they're faster" - and I point out the 98% idle CPUs on the Redis server, and the near-100%-usage CPUs on the app server fleet and mouth "But... why?"

Replacing Redis with "something faster" is a bit like removing the doors on a car because "lighter means faster!". It might look good on a racetrack, but it's about as pragmatic as climbing through a window every morning before setting off for work.


As the Sidekiq maintainer, I've seen many customers need to shard Redis around 5000-10000 jobs/sec. Sharding is a major operational headache, so this could be very useful to heavy job processors if it does prove to scale better.

I also find it interesting that the BSD license enables this third-party company to fork Redis and build closed-source commercial software on top of it. One of the trade-offs to consider when licensing a project.


I'm not sure if you're referring to KeyDB or something else as "closed source commercial software", but KeyDB is BSD licensed like Redis: https://github.com/JohnSully/KeyDB/blob/unstable/COPYING


They sell closed source extensions to Redis AFAICT.

https://keydb.dev/modules-redis.html


May have been better to use GPL2?


Better how? If using GPLv2 means companies stay away from the project instead of building on top of it, that's not a win for anyone.


This way is a win for a company selling closed-source software, which saves R&D money and contributes nothing back.

As a user of commercial software I am fine with it; I'm not so sure FOSS advocates at large will be so happy when only non-copyleft licenses survive and we are back in the days of shareware and public-domain libraries.


> and contributed nothing back

Why do we assume closed-source software vendors contribute nothing back?

Speaking as an employee at a company that produces a closed-source software product that uses open-source libraries, I've contributed plenty back to various libraries, including publishing some of my own.

Libraries with permissive licenses get more users, and more users mean more opportunities for receiving contributions. Something like GPLv2 is really only truly effective at soliciting contributions that it wouldn't have received otherwise if there's no viable alternative.

Or to give another example, Rust is dual-licensed under the Apache License, Version 2.0 and MIT. This permissive license made it really easy for lots of people (including myself) to contribute to it. If it were released instead using the GPL, it would likely be a shadow of the language it is today, if even still alive at all.


Because that has been my experience at almost every employer and customer I have worked for since the mid-'90s.


Working at places that were built on permissively licensed software, the default position we had was "no upstreaming", and if I wanted to upstream something I needed to ask for permission.

When you're contributing to GPL software, you don't need your employer's permission for it to be upstreamed. If they release the code and don't violate the GPL, there's nothing the employer can do to stop it from going upstream.


> when you're contributing to GPL software, you don't need your employer's permission for it to be upstreamed

Yes, you do; if you are contributing to it, you need to have exactly that permission. If you are working on a derivative, you need that permission before you can contribute it to anyone else, including upstream.

> If they release the code

Plenty of people work on internal code for their employers, so this is not a given.


> when you're contributing to GPL software, you don't need your employer's permission for it to be upstreamed.

Yes you do. Except the permission in this case is "permission to use the library" in the first place, which is a much higher bar than permission to upstream changes to a permissively-licensed library because using a GPL library has much farther-reaching implications than merely contributing changes back.


> If using GPLv2 means companies stay away from the project instead of building on top of it, that's not a win for anyone.

Like companies staying away from Linux or GCC? It's debatable if these projects would have been as successful using MIT/BSD.


> If using GPLv2 means companies stay away from the project instead of building on top of it, that's not a win for anyone.

Companies having problems with GPL is a problem of companies and not a problem of the license. If the library you want is GPL, then why blame the project and not your company's legal department?


Ok, what is a job/second?


Look up what Sidekiq is. It is awesome.


Aw, thanks! I was mis-parsing this as '5000 Redis jobs per second' instead of the intended 'needed to use sharding when Sidekiq is scheduling more than 5000 jobs/second'.

Thanks!


Jobs per second


Facebook just published a blog post about moving petabytes per hour. I read it and found the design interesting. I don't need it, but if I ever do, it's good to know that it's possible, that someone has solved it, to have a reference, and possibly to use their tool if it's free. So relax, I for one welcome whatever folks are working on out there. If it's true, we can learn from it and apply it in other domains too.


> Facebook just published a blog about moving petabytes per hour.

For the curious: https://engineering.fb.com/data-infrastructure/scribe/

Edit: HN thread: https://news.ycombinator.com/item?id=21181982


Absolutely astounding to me: petabytes an hour? That's in the region of a meg to several megs per user per hour, looking at their monthly active user figures.


It's mostly telemetry data. /s or not /s not sure.


Mmm it's not only "telemetry data". It's that (e.g. Scuba) and other types of logs, and not only Facebook (e.g. Instagram as well).

Basically, everything that needs logging and post-processing by both real-time systems (e.g. Puma) and batch processing (e.g. all of the data that's ingested and sent to the data warehouse) goes through Scribe.

(disclaimer: I work on Scribe)

Scuba: https://research.fb.com/publications/scuba-diving-into-data-...

Puma: https://research.fb.com/publications/realtime-data-processin...


Does that include Instagram and WhatsApp? Or just FB branded stuff like Messenger and the network app...


Not everything that flows through Scribe is tied to an (external) user, though. Tons of internal systems use it as well, notably anything that logs to Scuba (which is pretty much everything at Facebook. Wide-structured system logs are awesome).


You should post that.


>Facebook just published a blog about moving petabytes per hour.

But it's not like that's through one pipe. We don't talk about how many zillions of tons of steel per minute are moved on freeways.


We're using a Redis alternative called SSDB [1] in production because the entire data set doesn't need to fit in RAM. This saves us thousands of dollars per month in server costs. Like you say, the latency gain is nominal. The main reason I see alternative databases used is one missing key feature that's specific to a particular use case.

[1]: https://github.com/ideawu/ssdb


Did you test out ARDB [1] when you chose SSDB? I've been considering these for a while since I know I'm going to hit that choice point though that's not happened yet.

[1] https://github.com/yinqiwen/ardb


We looked at it, but setup for SSDB was easier and we were already using SSDB in production. We're hesitant to change something that's working so well. We do have periodic connection issues with SSDB, but they're so infrequent that it's easy to work around them client-side. We're using the Python redis [1] client with Flask.

[1] https://pypi.org/project/redis/


I've personally written a (somewhat hacky but simple) setup for multiplexing high-throughput job queues for a Rails app across multiple Redis instances because our chosen job framework at the time relied a lot on Lua operations on large sets in Redis, and these could easily result in 100% CPU utilization for the Redis process.

We picked the job queue framework long before we started moving tens of thousands of jobs a second through it with a later feature, and probably exceeded its design constraints - in that instance we effectively were using Redis + jobs as a pauseable and throttleable write buffer for MySQL.

With major tooling changes we likely could have come up with something a lot more elegant (I'm no longer with that company), but our plan was always to get rid of the need for that system to hit MySQL at all, which would have eliminated a lot of our need to control throughput with our Redis buffer. Of course, doing that would have taken a lot more time and effort, and in our case doing "the simplest possible thing that would work reliably and serve our customers" meant that we probably could have used a faster Redis instead of multiplexing requests and workers across multiple Redis instances on the same hardware.

Sometimes an "improved" version of a tool you're already using can be really useful, if you find yourself in a bind and the alternative is major architectural changes.


The idle CPU might very well be due to Redis being single-threaded. It might be hitting its limits and still show low CPU utilization.

Also, you didn't mention that, but I've seen this happening often, you should never use redis as a primary data store if your data is important.


You'd expect utilization of 1/n, where n is the number of available cores, in the case of Redis being CPU bound.

I guess I don't know how redis deals with the network, but I'd assume that it handles concurrent requests. If not, then I guess that's the case where you'd see less than 1/n cpu utilization and it could still be CPU bound.
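To make the 1/n arithmetic concrete, here's a tiny illustrative sketch (the function name and numbers are mine, not from the thread): a single saturated thread on a many-core box shows only a small fraction of total CPU, so "low" utilization can still mean CPU bound.

```python
# A fixed number of saturated threads on an n-core machine caps out at
# busy_threads/n of total CPU, so a single-threaded Redis on 16 cores
# can be completely CPU bound while the box reports ~6% utilization.
def max_total_utilization(cores: int, busy_threads: int = 1) -> float:
    """Percent of total CPU that `busy_threads` saturated threads can show."""
    return 100.0 * min(busy_threads, cores) / cores

print(max_total_utilization(16))     # one thread on 16 cores -> 6.25
print(max_total_utilization(16, 4))  # e.g. 4 I/O threads -> 25.0
```

This is why per-process (or per-core) monitoring matters: the machine-wide average hides a pegged core.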


Redis handles all requests serially; because of that, it is much simpler to get correct and doesn't have to worry as much about distributed gotchas.

Because Redis handles all requests serially, you need to be very careful if the Redis cluster is shared between different apps. You want to share the cluster among apps with similar usage patterns.


If the perf isn’t needed we also have Active Replication, subkey expiries, and S3 integration. Multithreading is how we got our start and the reason most people use us though.


True, it's rarely just raw performance. KeyDB has other advantages like multi-threading and disk-based persistence (instead of being limited by RAM) that make it better at utilizing your server resources and handling larger scales.


> ...it's rarely just raw performance. KeyDB has other advantages like multi-threading...

What are the advantages of "multi-threading" other than for performance?


Taking advantage of all your cores. Redis is awkward at high capacity, since most servers scale CPU with RAM and you'll end up with most cores doing nothing. You can manually run multiple instances on the same server, but now they're separate databases with operations and sharding overhead.

Multithreading IO also reduces the latency hit from disk-persistence and provides more concurrency and throughput, which is a great tradeoff when you don't really need sub-millisecond performance but want the same API across a larger dataset limited by disk instead of RAM space.


That's just performance spelled differently.


Ok, fair. Disk being primary storage is still an ops and scalability benefit.


>> What are the advantages of "multi-threading" other than for performance?

> Taking advantage of all your cores.

Uh, is there any reason you'd want to "take advantage of all your cores" other than performance?

The question/proposal/dialog we started from, don't forget, was:

1. Redis is already as performant as I want; it's not even close to being a bottleneck, so I have no need for multi-threading. This is a _very very common_ case, as Redis is performant enough for a lot of workloads.

2. There are other advantages of multi-threading than performance. (Ie, other reasons you'd want it despite 1).

You seem to just be going around in circles. If 1, why would you care about "taking advantage of all your cores"?


Sure, I concede multithreading is for performance. Still, the second point of having Redis structures available over a dataset that isn't limited by RAM is a big advantage.


> Taking advantage of all your cores.

That means nothing at all, if performance is not your concern.


But isn't the point of Redis to be used as a caching layer instead of a persistence layer?


But what if you could have the speed of RAM-based storage, and still have persistence? Sounds pretty appealing to me, personally.


You already get this with either Redis or KeyDB. The problem is that to avoid the disk being the bottleneck you must write to RAM first and to the disk later (you can't wait for confirmation that it was written), which means that even though a key may be successfully written to the DB, it hasn't been confirmed to persist.

Basically, Redis can't necessarily guarantee data safety, which is perfectly fine for its normal use case as a caching layer holding data you don't necessarily care about losing (for example, rate limiting relies on tracking the number of requests within, say, an hour; if you lose the data you don't really care). User sessions too: worst case, people have to log in again, meh, whatever.
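The rate-limiting example above can be sketched as a fixed-window counter. This is my own minimal simulation (class and parameter names are invented); in production the counters would live in Redis via INCR and EXPIRE, and a plain dict stands in here, which is exactly why losing the data is harmless: clients just get a fresh allowance.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per key.
    A dict stands in for the Redis counters (INCR + EXPIRE)."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.counters = {}  # key -> (window_start, count)

    def allow(self, key: str, now: float) -> bool:
        start, count = self.counters.get(key, (now, 0))
        if now - start >= self.window:   # window expired: start fresh
            start, count = now, 0
        if count >= self.limit:
            return False                  # over the limit in this window
        self.counters[key] = (start, count + 1)
        return True

limiter = FixedWindowLimiter(limit=3, window=60.0)
print([limiter.allow("1.2.3.4", now=0.0) for _ in range(4)])
# -> [True, True, True, False]
```

If the counters vanish (process restart, data loss), the worst case is a briefly over-generous limiter, which is why this workload tolerates Redis's relaxed durability.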


The major bottleneck for most applications using Redis is the RAM capacity, not the write performance. Disk persistence can be more than fast enough as proven by dozens of key/value stores out there.


Sounds like an embedded key value store, ie. RocksDB. I've used it with great success in constrained environments and also when pushing cache to the edge for stable, low-latency, high volume APIs.


Redis is a data structure server rather than a simple key/value store. It's very useful in lots of scenarios, even if the persistence means it doesn't have ACID semantics.

It would be great to remove limitations like RAM-only capacity in exchange for a slight performance hit, while also gaining better core utilization. We used ScyllaDB (a very fast Cassandra clone) in the past for the CPU/disk scalability but always felt Redis offered better APIs. Now it's a real option.


You can use redis as a write-back cache that serves spikes well and streams the state to disk (or an RDBMS) in a rate-limited manner in the background.
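The write-back pattern described above can be sketched in a few lines; this is a hand-rolled simulation (all names are mine), with a dict standing in for both the Redis role and the slow backing store, and flushing done in bounded batches to model the rate limiting.

```python
class WriteBackCache:
    """Absorb write spikes in fast memory, drain them to a slower
    backing store (MySQL/disk in the thread's example) in bounded batches."""

    def __init__(self, backing_store: dict, batch_size: int = 100):
        self.cache = {}               # hot copy served to readers (the Redis role)
        self.dirty = set()            # keys written but not yet persisted
        self.backing = backing_store  # stand-in for the slow durable store
        self.batch_size = batch_size

    def set(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)           # fast path: never touches the slow store

    def get(self, key):
        return self.cache.get(key, self.backing.get(key))

    def flush_once(self) -> int:
        """Persist up to batch_size dirty keys; call from a background loop."""
        batch = [self.dirty.pop() for _ in range(min(self.batch_size, len(self.dirty)))]
        for key in batch:
            self.backing[key] = self.cache[key]
        return len(batch)

db = {}
wb = WriteBackCache(db, batch_size=2)
for i in range(5):
    wb.set(f"user:{i}", i)            # spike of 5 writes, absorbed instantly
flushed = 0
while wb.dirty:
    flushed += wb.flush_once()        # drained 2 + 2 + 1
print(flushed, len(db))               # -> 5 5
```

The batch size (or a sleep between `flush_once` calls) is the throttle knob: the spike lands in memory at full speed while the durable store sees a smooth, bounded write rate.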


If Redis isn't multi-threaded, then the more cores and threads you have, the less CPU usage the machine will show, even if Redis is being stressed.


Except if you monitor per CPU usage, as you should.


Except that you can't, because most operating systems will not use the same CPU core consistently for a single thread. This is done for a variety of reasons, including spreading thermal loads to ensure that turbo boost can consistently kick in.


You're right, but still, I don't think it is difficult to realize that a particular process is only using one core. If you think it is, you should probably consider another application for monitoring system usage.


If you use cloudwatch, the “should” part is tricky. “Should” you be able to and “should” you do it if metrics were reasonably priced based on their basic required storage / computational footprint? Yes.

However, “should” you in a world where you only have cloudwatch because you don’t want to roll your own monitoring or bring in a third party vendor? Probably not. It’s not a good thing, but it’s a reality. 1/n approximation is your go to here.


For some apps, where you can easily scale app servers horizontally but are stuck with a single Redis (because Redis cluster mode is not well supported by drivers, or because it can't access some atomic Lua operations), Redis can become your number 1 bottleneck. Especially since you can throw more cores at PostgreSQL, Redis quickly becomes the only thing that doesn't scale.


Not only that, assuming we're not trying to hyperoptimize before code ever hits production, typically as a dev I look for performance gains when a bottleneck/slowdown seems to occur. Redis is a possibility I guess, but it's also not in the top 20 places I'm going to look first. It's plenty fast for nearly every use case I've come across and while I'm sure someone out there needs something faster, I agree - I can't imagine there's broad market appeal for it.


Happens more often than you think with larger tech companies. Especially if one has hot keys.

However yeah, 99% of all deployments could get by with SQLite/MySQL/Postgres performance amounts.


The ability to scale vertically instead of sharding is a very nice sysadmin feature that, although it isn't exactly "much faster than Redis", is related to it via multi-threading.


Two HN threads from around when the project started:

Show HN: KeyDB – A Multithreaded Fork of Redis, https://news.ycombinator.com/item?id=19257987

KeyDB: A Multithreaded Redis Fork, https://news.ycombinator.com/item?id=19368955


So this year at RedisConf, antirez demoed a threaded version of Redis (only the transport needs to be multi-threaded; the core remains single-threaded). The numbers were already amazing. I will pick the community version of Redis any day over forks.


I feel like this take is somewhat unfair. Antirez was against threading until jdsully proved that it worked in exactly the way you're describing: multi-threaded transport with a single-threaded core. In addition, KeyDB allows users to use SSDs in addition to RAM, while that is a paid feature in Redis Labs' enterprise offering.

jdsully spurred the implementation of two of the best changes you might see in Redis in the near future, and you consider even using KeyDB pointless.

Yea, ok.


This is historically not correct: the threaded I/O branch was started 1.5 years ago; you can check the history online, it's all public. Moreover, before me and after me, a number of individuals did the same work many times, including the Alibaba team, the AWS team, and so forth. It's an obvious feature; the reason for not doing this, or doing it in a very limited fashion, is philosophical rather than technical. Btw, KeyDB's way of doing threading has nothing to do with the trick used by Redis I/O threading AFAIK, of just fanning out to N threads only in the hot places.


The Redis version does not thread query parsing. It's limited to I/O only, which is why its performance is much lower.


The Redis threaded I/O does parsing as well, but parsing does not change the obtained speedup a lot, so it is disabled by default (but you can enable it via config).


So he changed his mind regarding threads? Will it actually be committed, or is it more of an experiment?


It is live on unstable, which should become stable by the end of the year: https://github.com/antirez/redis/commits/unstable?after=c653...


> only transport needs to be multi-threaded

Could he have gone with a non-blocking approach like libuv?


Redis is single threaded because it heavily relies on non-blocking IO like libuv.

The multithreaded option is also based on non-blocking IO. You can think of it as one Event Loop per CPU core, rather than only one Event Loop per computer.
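The "one event loop per core" idea can be illustrated with a short Python sketch of my own (thread and loop names are invented; a real server would also pin threads to cores and share listening sockets): each worker thread owns a private event loop doing non-blocking work, and parallelism comes from running several loops side by side.

```python
import asyncio
import threading

results = []

async def serve(name: str) -> str:
    # stand-in for accepting and handling connections on this loop
    await asyncio.sleep(0.01)
    return f"{name} done"

def worker(name: str):
    # Each thread owns a private event loop: non-blocking I/O within a
    # loop, parallelism across loops -- roughly "one event loop per core".
    loop = asyncio.new_event_loop()
    try:
        results.append(loop.run_until_complete(serve(name)))
    finally:
        loop.close()

threads = [threading.Thread(target=worker, args=(f"loop-{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))
```

Single-threaded Redis is the degenerate case of this picture: exactly one such loop for the whole process.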


Isn’t that basically what KeyDB does?


I’m sure it’s the same. I was mostly outlining the pattern in general, not trying to imply it’s redis specific.

A sibling post says that the keydb implementation also lets the multithreaded executor perform parsing. So I guess that is one difference.


I've always considered the single-threaded nature of Redis to be one of its greatest features.


Only one thread can access the data at any given time, so it seems like most of the things you'd expect to be guaranteed by a single thread still are. I found this comment particularly interesting:

   Unlike most databases the core data structure is the
   fastest part of the system. Most of the query time
   comes from parsing the REPL protocol and copying data
   to/from the network.
I wonder if anyone in the Redis ecosphere has explored a binary client-server protocol: something that could be parsed/compiled on the client and then executed without parsing on the server. If the above is really true, it seems like that might offer even more perf gain than multithreading on the server.
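To make the idea concrete, here's a sketch contrasting RESP's text framing with a hypothetical length-prefixed binary framing; the opcode table is entirely invented and not part of any real Redis protocol, but it shows how a server could dispatch on a fixed-size header without any string scanning.

```python
import struct

def resp_encode(*args: bytes) -> bytes:
    """RESP-style text framing: the server must scan for \\r\\n and parse digits."""
    out = b"*%d\r\n" % len(args)
    for a in args:
        out += b"$%d\r\n%s\r\n" % (len(a), a)
    return out

# Hypothetical binary framing: the client compiles the command name to an
# opcode once; the server dispatches on a fixed header, no strcasecmp loop.
OPCODES = {b"SET": 1, b"GET": 2}  # invented for illustration

def bin_encode(cmd: bytes, *args: bytes) -> bytes:
    out = struct.pack("!BH", OPCODES[cmd], len(args))   # opcode, arg count
    for a in args:
        out += struct.pack("!I", len(a)) + a            # length-prefixed args
    return out

def bin_decode(buf: bytes):
    op, argc = struct.unpack_from("!BH", buf, 0)
    pos, args = 3, []
    for _ in range(argc):
        (n,) = struct.unpack_from("!I", buf, pos)
        args.append(buf[pos + 4:pos + 4 + n])
        pos += 4 + n
    return op, args

print(resp_encode(b"SET", b"foo", b"bar"))
print(bin_decode(bin_encode(b"SET", b"foo", b"bar")))
```

The trade-off discussed downthread applies: the text form is debuggable with telnet and human eyes, while the binary form saves parsing work on the hot path.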


From having played with/worked on profiling and optimizing Redis in the 2.6 timeframe, I can confirm that at least for small/simple operations, this is true, the data structure access is a small fraction of the cost.

One related choice that Redis makes (or made at the time) is to rely extremely heavily on the malloc implementation, rather than doing work to manage its memory internally. Even a very trivial, naive free list provided a modest speed-up, for example.

There are a lot of these choices in the code base, largely owing to maintainability concerns (though antirez can surely speak for himself). Given how easy it is for an otherwise uninitiated C programmer such as myself to hack on it, I struggle to disagree with the prioritization. :)


The excerpted comment in a format mobile readers can see without left/right scrolling:

"Unlike most databases the core data structure is the fastest part of the system. Most of the query time comes from parsing the REPL protocol and copying data to/from the network."


The human-readable/writable protocol is one of my favorite things about Redis, tbh.

I can see cases where a really optimized system could benefit from a binary protocol, but I suspect it'd be a loss for most people.


Why not just offer both?


That was my thinking as well, though taking a peek at the actual code suggests that there's a pretty deep expectation that the client is speaking strings, e.g. in code that handles the ZRANGE command[1] I see

    if (c->argc == 5 && !strcasecmp(c->argv[4]->ptr,"withscores"))
and a quick grep suggests that's a common pattern

    % grep argv src/*.c | grep -c -e 'str\(case\)*cmp'
    482
I guess this means someone would have to tackle creating an intermediate binary format first, rewriting the command handlers to expect that format, and then making client libraries that can produce the format. Perhaps still worth it in the end, but not trivial.

[1] https://github.com/antirez/redis/blob/unstable/src/t_zset.c#...


Is this really "unlike most databases"? I remember MySQL posting profiling data years ago showing that for looking up by primary-key, 3/4 of the time was spent parsing SQL. (They went on to introduce support for querying with the Memcached protocol to address this)


That's really surprising if true, considering the SQL should only need to be parsed once.

    SELECT foo FROM Table WHERE key = @mykey;
Then you bind the parameter to whatever you're interested in.


Prepared statements are per-connection, and a lot of the time you want to use connections from a single pool that's shared by all your different queries, so you can't really use them.


Even with that, the SQL would be parsed once per connection? So, the costs should be de minimis, unless the benchmark were short indeed?


> Even with that, the SQL would be parsed once per connection?

In a webserver-like context it's once per query one way or another: the server process is stateless-ish between page loads, so each page load is either a from-scratch connection or a connection taken from a pool. But even if you're pooling, you can't use prepared statements in practice: you can't leave a prepared statement on a connection that you return to the pool (you'll eventually exhaust the database server's memory that way), and you'd have to resubmit the prepared statement every time you took a connection out of the pool anyway, because there's no way to know whether this connection has already run this page or not.

If you assume a page that's just displaying one database row, which is not the only use case but a common one, then each page load is one query and that query will have to be parsed for each page load, short of doing something like building a global set of all your application's queries and having your connection-pool logic initialise them for each connection.


In a database product I'm familiar with, prepared statements are cached according to their content, and those cached objects are shared between connections. Only if they fall out of the cache do they have to be re-parsed. I had assumed that's how all databases worked.

I'm somewhat surprised at the mechanism you're describing, but now that I read the documentation it does seem to be the case. I wonder if a small piece of middleware might be sufficient to replicate the behavior I'm describing on a connection pool, and whether that would be desirable.
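The content-keyed cache being described could be sketched as pool middleware like this; everything here is hypothetical (the "parser" is a stand-in for the database's real parse/plan step), but it shows why identical SQL text only pays the parsing cost once, regardless of how many pooled connections run it.

```python
class StatementCache:
    """Prepared-statement cache keyed by SQL text and shared across a
    connection pool, so identical SQL is parsed once rather than once
    per connection. A real version would also need an eviction policy."""

    def __init__(self, parse_fn):
        self.parse_fn = parse_fn  # the expensive parse/plan step to avoid
        self.cache = {}
        self.parse_count = 0      # instrumentation for the demo

    def prepare(self, sql: str):
        if sql not in self.cache:
            self.parse_count += 1
            self.cache[sql] = self.parse_fn(sql)
        return self.cache[sql]

# toy "parser": in a real system this is the DB's SQL parser/planner
cache = StatementCache(parse_fn=lambda sql: ("plan", sql.upper()))

for _conn in range(100):  # 100 pooled connections running the same page query
    cache.prepare("SELECT foo FROM t WHERE id = ?")
print(cache.parse_count)  # -> 1
```

The catch with doing this outside the database is that the cached object usually has to be a server-side handle, which reintroduces the per-connection problem; caching by content inside the server (as the product above does) avoids that.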


See my post for essentially a "binary" interface. For my key-value store (which I wrote because I thought writing the code would be faster than understanding Redis :-)), a client uses just standard TCP/IP sockets to send a byte array. The array is the serialization of an instance of a class. My key-value store receives the byte array and deserializes it to get a copy of the client's instance of the class. So, with the byte array, maybe the interface counts as "binary"? I'm unsure of the speed of de/serialization.
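The scheme described (serialize an instance to a byte array, ship it over a socket, deserialize a copy on the other side) maps onto standard serialization machinery. The store above isn't written in Python, so this is only an analogy using pickle; the `Record` class is invented for the demo.

```python
import dataclasses
import pickle

@dataclasses.dataclass
class Record:
    key: str
    values: list

# client side: serialize an instance to a byte array and send it over a socket
wire_bytes = pickle.dumps(Record("user:42", [1, 2, 3]))

# server side: deserialize the byte array back into an equal copy
copy = pickle.loads(wire_bytes)
print(copy == Record("user:42", [1, 2, 3]))  # -> True
```

(One caveat with this style of interface: deserializers like pickle are unsafe on untrusted input, so a real service would want a restricted format.)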


Can you elaborate on why? As someone who works on parallel/concurrent algorithms and data structures a single-threaded system design seems anathema to most research over the past decade (maybe excluding H-Store, which uses local single-threadedness in an interesting way).


This reminds me of LMAX, who found the fastest way to build their stock exchange was to make the core logic single-threaded, surrounded by multithreaded I/O:

https://www.martinfowler.com/articles/lmax.html

I believe other (grown-up/legacy!) exchanges work the same way.

I wonder how much of the direction of concurrency research is driven by the fact that there is much more publishable work to be done in managing concurrency rather than avoiding it!


Disruptor benefits from lock free access using a ring buffer. You can get amazing results with a single thread but that aspect isn't central to the pattern.


The disruptor isn’t single threaded at all. Maybe you mean it can pin a core to a consumer that just busy waits on the queue. That is a common technique used in very low latency systems when not running on a real time OS.


The matching engine is single threaded. It uses disruptors to communicate with IO threads.


Not parent but the single threaded approach really shines when executing Lua scripts in Redis. Except for some very specific edge cases, the execution of a script is atomic and since no command is ever processed concurrently you can rule out data races.

I am sure it would be possible to provide these guarantees while offering concurrent execution, but most likely at the expense of a simple design.


KeyDB provides that guarantee via a lock. However that does mean long running scripts don’t benefit from the multithreading. Ditto for modules.

The goal is 100% Redis compatibility so I can’t compromise on atomicity.


It shines right up until the moment someone writes a Lua command that takes more than a trivial amount of time; now your entire server is blocked in a pile-up of connections waiting on some relatively trivial amount of code being executed.


Couldn't you just rollback the change or fix the lua script?


Or just embrace modern hardware and use threads. This is the experience many people have running Redis in production with 21st-century hardware: https://twitter.com/rbranson/status/540565059337195520


That's a different issue. If they're in that situation without any custom scripts and the latency is cropping up, they should consider using a different tool or changing their codebase.


You can run multiple instances of redis on the same machine, which gets you parallelism from multiple processes instead of multiple threads. There's some extra deployment effort in doing so, but it's not generally a big issue.

It also isn't exactly single threaded. It calls fork when it wants to persist data to disk, which is functionally similar to starting up a thread to do disk IO.


Going from a single processes holding all your data to having your data sharded across multiple processes can be more than a little effort.

Also, it might not have any benefit for you. Imagine a certain key is particularly hot. Having one multithreaded Redis process handling access to it might speed things up. Running multiple sharded Redis processes won't, since only one of them will have that key.


Most redis users are running multiple instances regardless. That some of them happen to be on the same machine is largely a detail.

It would make a little difference if there was a single hot key, but that's a bit unusual. Typically there's some subset of keys that are hot, and you can get them to hash across instances. People also tend to cache those values on the clients, as banging on redis constantly is a waste.


> but that's a bit unusual

It's pretty common when using a single Redis stream as a lightweight Kafka.


Doesn't that end up duplicating lots of data and work since the processes are independent? They don't share data do they?


fork() on Linux (and many other OSes) shares pages between the parent/child processes in a copy-on-write scheme


That doesn’t mean multiple long-running processes will share memory, though, unless they actually create a shared mapping, which I think the GP is referring to. In addition to tremendous memory waste, this also kills cache performance.


I'd presume the persistence / I/O process would be pretty short lived: its job is to save a snapshot to disk so there's not much reason to let it live long.


How does that share data between multiple Redis processes?


Is it because of the consistency guarantees you can get from single threaded operation?

Because it seems like this guarantees that. It mostly parallelizes the command parsing and networking side. The actual core hash table is guarded by a global lock, so you could still get all those single-threaded guarantees.


Won't it potentially allow out of order updates? Or am I missing something.


Oh... Tell me more about why not being multithreaded is a failure for Redis?


I would like to hear from somebody who is using this in production. We started using Dynomite recently after hearing good things. I'd like to hear comparisons.


Same! I'd also be curious to hear about production scenarios that would really benefit from Redis going 5x faster. It's pretty darn fast to start with!


At my job we're currently re-building our website in django, and we make heavy use of redis caching.

Our website is definitely not "high traffic" but we get somewhere around 300,000 requests a day, mostly concentrated around business hours (We're a local clothing wholesaler).

I haven't tested it under production loads, but just swapping our redis for keydb (THANKS DOCKER!) I saw no improvement in my artificial load tests.

I didn't expect to see much real improvement for this use case, but I just thought it was worth mentioning that it isn't necessarily faster for all workloads.


We currently do nearly a million requests per minute at peak, on a non clustered redis pair for caching and rate limiting (so at least one read/write per request). This design won’t hold up forever, but we’ve got at least a few years headroom before we need to think about anything more complicated


You would probably need 30,000 million a day, but then other parts of your setup would fail. Redis being the bottleneck is an unusual case.


Sorry we didn’t improve your use case. If you are bottlenecked by Redis I would be curious about your workload.


I was thinking the same thing, and it looks like someone else has already experimented with them: https://medium.com/@diego_pacheco/running-multithreaded-redi...


Since this has been around for a while, why hasn't Redis adopted this strategy into core?


From Antirez's (Redis Maintainer) blog:

> Another thing to note is that Redis is not Memcached, but, like memcached, is an in-memory system. To make multithreaded an in-memory system like memcached, with a very simple data model, makes a lot of sense. A multi-threaded on-disk store is mandatory. A multi-threaded complex in-memory system is in the middle where things become ugly: Redis clients are not isolated, and data structures are complex. A thread doing LPUSH need to serve other threads doing LPOP. There is less to gain, and a lot of complexity to add.

http://antirez.com/news/126
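The LPUSH/LPOP interaction in the quote is essentially the blocking-client problem (think BLPOP): in a threaded design, a push has to wake up any threads blocked waiting to pop, which typically means condition-variable coordination per list. A toy sketch of that coordination (not Redis's or KeyDB's actual code):

```python
import threading
from collections import deque

class BlockingList:
    """Toy model of the cross-client coordination the quote describes:
    a thread doing LPUSH must serve/wake threads blocked in (B)LPOP."""
    def __init__(self):
        self.items = deque()
        self.cond = threading.Condition()

    def lpush(self, value):
        with self.cond:
            self.items.appendleft(value)
            self.cond.notify()          # wake a blocked popper, if any

    def blpop(self, timeout=None):
        with self.cond:
            while not self.items:
                if not self.cond.wait(timeout):
                    return None         # timed out, like BLPOP's nil reply
            return self.items.popleft()

q = BlockingList()
threading.Timer(0.05, q.lpush, args=("job-1",)).start()
print(q.blpop(timeout=1))  # blocks briefly, then prints job-1
```

In a single-threaded event loop, none of this locking exists; the server just checks its list of blocked clients after each push. That is the complexity trade-off antirez is pointing at.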


This should be the top comment. I came here to chat about the potential downsides introduced by the complexity of having multiple threads accessing the internal data structures. Until someone runs this in production where they actually use the performance it delivers over vanilla Redis, I'll probably hold off. I'd like to know it's stable under very high load with contention.

No offense to the creator(s) and I have a ton of gratitude and respect for them pushing the boundaries. But I also know Antirez is a smart dude and Redis has delivered insane performance thus far with few issues.


No offense taken. KeyDB has different goals than Redis so you’ll see us try things Redis might not. I’m willing to make the code more complex if it makes the user’s life easier in some way.


That KeyDB exists and others like it is great! I likely won’t be working on projects that will outstrip the performance of Redis but if I did I’d be looking at the work of people much, much smarter than I for alternatives and from what I’ve seen KeyDB looks really compelling. It’s definitely a project I’ll be following.


Is there a milestone the multi-threaded implementation could achieve that would make it a candidate to merge into core Redis?


I mean the first requirement would be a reason for needing multithreading in the first place. If you can then demonstrate that this implementation solves the initial problem, while not adversely affecting existing performance and stability, then it would probably be a candidate to merge into upstream.


It's a design choice to be single-threaded. Using threads requires locking or sophisticated concurrent data structures, whose costs can sometimes outweigh the benefits in both code correctness/maintainability and performance.


Redis is actually exploring multithreading for IO. Here is antirez/Salvatore talking about the performance results in the write path: https://twitter.com/antirez/status/1110973404226772995


https://news.ycombinator.com/item?id=19368955

Previous discussion about keydb, commented by antirez.


Redis unstable introduced threaded I/O, but it's still not as fast in my benchmarks.
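For reference, the threaded I/O in Redis unstable (what became Redis 6) is opt-in via redis.conf; something like:

```
# redis.conf (Redis 6 / unstable) -- off by default
io-threads 4
io-threads-do-reads yes
```

By default only writes are offloaded to the I/O threads; `io-threads-do-reads` extends them to reads and parsing. Command execution itself still runs on the main thread.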


I'm really curious about the latency metric. I'm running elasticache redis and the round trip average of our Redis get commands is around 1ms. We may have 5-10 in a transaction and look for 10-20ms total time.

Anecdotal, but it leaves me slightly confused


Higher traffic loads result in higher latency. This benchmark was run at maximum load, so latency will be much higher than what you'd see at typical utilization.


Thanks for explaining.


Q. Interesting. For my Web site, I wrote a simple key-value store, which I use for Web user session state, based on two standard .NET collection classes. My code is single threaded. Sure, multi-threading could be better, but with my code design I'd have to use the multi-threaded versions of the collection classes, which IIRC ARE in .NET.

But, I'd guess that multi-threaded collection classes, due to the logic for locking or other means of concurrency control, would be slower and not faster.

So, any thoughts on why, how multi-threaded could be so much faster, e.g., the OP's 5X, not just for the OP here but in general and maybe general enough to apply to my code?

By the way, with my code I get a weak version of multi-threading because the software interface to my key-value store is just standard TCP/IP sockets moving byte arrays from object instance de/serialization. So I'm taking advantage of the standard TCP/IP FIFO (first in, first out) queue for the incoming work to be done. I.e., more than one Web server can be sending a key-value request to my key-value server at the same time; TCP/IP handles that multi-threading, and I get a weak version of it. Broadly I'm wondering if making my actual code and the collection classes multi-threaded would have any chance of being faster. Okay, the server has 8 cores, so that MIGHT be the key to being faster.
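One way to see where the extra cores could help: in designs like the one described, deserialization and socket I/O usually dominate, and those can overlap across threads even if the dictionary itself stays behind a single lock. A rough Python-flavored sketch of that idea (the sleep is a stand-in for parsing work; names are illustrative):

```python
import threading
import time

store = {}
store_lock = threading.Lock()

def handle_request(raw):
    # The expensive part: deserializing the request payload.  This runs
    # outside the lock, so 8 cores can do 8 of these at once.
    time.sleep(0.001)           # stand-in for parsing/deserialization work
    key, value = raw
    # The cheap part: the actual dictionary operation, held only briefly.
    with store_lock:
        store[key] = value

requests = [(f"k{i}", i) for i in range(64)]

start = time.perf_counter()
threads = [threading.Thread(target=handle_request, args=(r,)) for r in requests]
for t in threads: t.start()
for t in threads: t.join()
elapsed = time.perf_counter() - start

# Done back to back on one thread, the parse time alone would be ~64ms;
# with parsing overlapped it finishes much sooner, even though the
# dictionary writes still serialize.
print(len(store), elapsed)
```

So the multi-threaded collection classes aren't necessarily the win; the win, if any, is overlapping the per-request work around the collection while each access to it stays short.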


Friendly reminder that Kyoto Tycoon might be an option. It has real persistence (not just dump everything / reload everything, or crippling performance by turning on AOF), is multi-threaded, is scriptable via Lua, and has amazing performance.


KT has been unmaintained for years and doesn't have the higher-level data structures and operations that Redis does. There are a lot of options if you just want fast key/value with persistence: ScyllaDB, Tarantool, LMDB, RocksDB, etc.


Eh, some of those are embedded; KT is a server like Redis or memcached. KT was good enough for Cloudflare, if that's enough of an endorsement.


I know, my point is that performance is rarely the need. Usability and useful APIs are more important which Redis excels at.

Cloudflare stopped using KT because it wasn't anything more than simple and fast, and was missing a lot of other features.


This post / the topic was performance.


Normally, when choosing components for a software stack, infra engineers are pretty good about choosing the components with the fewest features that fit the semantics that the software's design demands. You wouldn't choose Postgres where sqlite would do; you wouldn't choose Kubernetes where a single machine running Docker would do; etc. Components with simpler semantics are not only lower-maintenance, but usually can be more highly optimized due to having fewer "gotcha" requirements going into building them.

Which is to say, if someone is looking for a "faster Redis", it's probably because they originally went with Redis as the "least software they can get away with" for their particular design needs.

And, therefore, any software that has narrower semantics than Redis itself is not, in fact, a viable "faster Redis" for anyone but those who had no reason to be running Redis in the first place.


This is pretty interesting. At work we run a few dozen cache nodes serving several million QPS during peak. We were debating switching some of this workload to memcached from redis - but this may be a viable alternative.


This looks promising.

But I still love Redis.

1. It is one of the most reliable pieces of software I have used in production.

2. Not everyone wants 'faster than light' software.

3. The simplicity of Redis blows my mind. It is very easy to maintain, and we almost never have a single issue.

4. These days everyone talks about scale but very few projects need FAANG-level scale. Point is there are so many small projects where current redis fits seamlessly. Just making case for Redis.


While I get the performance improvement with multi-threading, won't you lose ordering of updates?

In many use cases that's not a problem, but I imagine for some it's a dealbreaker.


One big thing I like about redis is that it is single threaded.


On 32 cores? Doesn't seem like particularly great scaling.


In contrast, running N instances scales linearly. That's why Redis multi-threading just attempts to get the low hanging fruit of the write calls. In our vision, for serious scalability it's better to orchestrate multiple instances.
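The multiple-instances approach described above is commonly done by hashing keys to instances on the client side (Redis Cluster does something similar with CRC16 hash slots); a minimal illustration, where the instance list and hash choice are arbitrary:

```python
import hashlib

# Hypothetical pool of independent single-threaded Redis instances,
# each pinned to its own core.
instances = ["127.0.0.1:7000", "127.0.0.1:7001", "127.0.0.1:7002"]

def instance_for(key):
    # Stable hash so the same key always routes to the same instance.
    digest = hashlib.sha1(key.encode()).digest()
    return instances[int.from_bytes(digest[:4], "big") % len(instances)]

# Each instance runs one event loop on one core; adding an instance adds a
# core's worth of throughput, which is the linear scaling mentioned above.
print(instance_for("user:1001"))
print(instance_for("user:1001") == instance_for("user:1001"))  # stable routing
```

The cost is operational: resharding when the instance list changes, and no atomic multi-key operations across instances, which is the "orchestration" burden being traded against in-process threading.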


It looks like 8 to me. The 32 core mention was the test apparatus used to generate the requests.


I guess I can no longer use Redis. I mean what can the use case of normal Redis possibly be?


It works, it's simple, and it's battle tested.


Does no one read the sarcasm? This is the same cliché format stating that x is better than y, in a manner meant to shame people still using y.


Nobody is shaming anyone.


No one is shaming ANYONE?

Look around. Who is not shaming the oil and coal industry right now?


There is a guideline on Hacker News (and this would apply to any online conversation):

"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

https://news.ycombinator.com/newsguidelines.html


I upvoted you to get you back up a bit but sarcasm is not an HN thing :) unless you complete it with more details.


You're definitely right that sarcasm isn't an HN thing. However I still get seriously worried about the personalities of the programmers here who seem unable to recognise sarcasm.


It's not about being unable. Readers here recognize sarcasm, but they also recognize what happens to a web forum where it is allowed to proliferate. That's why https://news.ycombinator.com/newsguidelines.html asks commenters not to be snarky on HN.

Lame internet humor tends to grow like kudzu, and the users who post it underestimate how lame it is, as scott_s pointed out well long ago: https://news.ycombinator.com/item?id=7609289. Sarcasm isn't quite the same but it's related.


Thanks dang. I still have one question though. If what you say is a sufficient explanation, then wouldn't we expect HN users to respond to sarcasm with "Please don't post sarcasm here", or a link to the community guidelines, rather than <response that takes the sarcastic statement at face value>?

I do however agree about the kudzu and that it is better to just keep it at zero. The purpose of my two posts here is to better understand the community within which I work, not to argue for HN policy change.


I think it's really just a numbers game. Of the thousands of people reading any comment, there's always going to be someone who for whatever reason doesn't interpret it in the obvious way. It's more likely for random reasons (e.g. not paying attention) than somehow characterological (e.g. not getting sarcasm).


You went to the effort of forking Redis but didn’t bother to fix the glaring lack of TLS support?

What is the fucking point of a replicated KV store if the connections between nodes aren’t encrypted.


We have it in Redis 6, ETA for RC1 is end of year. However we did it the right way with a socket abstraction layer, one of the reasons it took so long.


Is there any relevant documentation/discussion available?


Yes, the implementation is in this PR: https://github.com/antirez/redis/pull/6236


Kudos to you for a calm response to an angry question.



