
Why would you choose to use a system that doesn't scale by default?

Single user local applications? Fair.

Web applications? Very strange choice imo.

Redis is great, but it is *not* a database, and it's thoroughly rubbish at high-load concurrency without clustering, which is (still) a massive pain in the ass to set up manually.
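From memory, the manual version goes something like this (addresses are placeholders; six nodes for three masters with one replica each) -- and that's before resharding, failover drills, and clients that actually understand cluster redirects:

    # start six instances with cluster mode on (one per node)
    redis-server --port 7000 --cluster-enabled yes \
        --cluster-config-file nodes-7000.conf --appendonly yes

    # then stitch them together into a cluster
    redis-cli --cluster create \
        10.0.0.1:7000 10.0.0.2:7000 10.0.0.3:7000 \
        10.0.0.4:7000 10.0.0.5:7000 10.0.0.6:7000 \
        --cluster-replicas 1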

Of course, you can just use a hosted version from a cloud provider... but it's generally about 10x more expensive than a plain old database.

/shrug

I mean, sure, it's (arguably...) a step up from just using sqlite... really, it's easy, and that's good, but it isn't good enough as a general replacement for a real database.

(To be fair, sqlite has some pretty sophisticated functionality too, even some support for concurrency; it's probably a step up from redis in many circumstances.)
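The concurrency bit is mostly WAL mode. A minimal sketch with Python's stdlib sqlite3 -- table and file names invented for illustration:

    import sqlite3

    conn = sqlite3.connect("app.db", timeout=5.0)
    # WAL lets readers proceed concurrently with a single writer,
    # unlike the default rollback journal, which blocks them
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")  # the usual WAL pairing

    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("greeting", "hello"))
    conn.commit()
    print(conn.execute("SELECT v FROM kv WHERE k = ?", ("greeting",)).fetchone())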




> Why would you choose to use a system that doesn't scale by default?

By all accounts Postgres seems to be a pain to scale off a single machine, much more so than redis.


Postgres is not as automatic as other tools, but that's mostly an artifact of it having been around so long, with the focus being on other things. Few projects have been around as long and stayed as relevant as Postgres.

Most of the time, you really don't need to scale postgres more than vertically (outside of the usual read replicas), and if you have tons of reads (that aren't hitting cache, I guess), then you can scale reads relatively easily. The problem is that the guarantees postgres gives you around your data are research-level hard to preserve across machines -- you either run a quorum protocol or you do two-phase commit.
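If you do want the quorum flavor, it's at least built in these days. A sketch of the primary's postgresql.conf -- the standby names are whatever you registered; s1/s2/s3 here are made up:

    # don't acknowledge a commit until any 2 of the
    # 3 named standbys have confirmed it
    synchronous_commit = on
    synchronous_standby_names = 'ANY 2 (s1, s2, s3)'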

Once you start looking into solutions that scale easily, if they don't ding you on performance, things get murky really quick and all of a sudden you hear a lot of "read-your-writes" or "eventual consistency" -- they're weakening the problem so it can be solved easily.

All that said -- Citus and Postgres-XL do exist. They're not perfect by any means, but you also have solutions that scale at the table level, like TimescaleDB and others. You can literally use Postgres for something it was never designed for and still be in a manageable situation -- try that with other tools.
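For reference, the Citus entry point is plain SQL on an otherwise vanilla Postgres. A sketch assuming the extension is installed; table and column names are invented:

    CREATE EXTENSION citus;

    CREATE TABLE events (
        tenant_id bigint NOT NULL,
        id bigserial,
        payload jsonb
    );

    -- spread the table across the worker nodes, sharded by tenant
    SELECT create_distributed_table('events', 'tenant_id');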

All that said, KeyDB[0] looks pretty awesome: multithreaded, easy clustering, and flash-as-memory in a pinch (see the config sketch below). I'm way more excited to roll that out than I am Redis these days.

[0]: https://github.com/EQ-Alpha/KeyDB
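The multithreading is a couple of lines in keydb.conf -- from memory, so check the docs for your version:

    # keydb.conf -- lift redis's single-threaded ceiling
    server-threads 4
    server-thread-affinity true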


KeyDB is really good. We use it in production to achieve millisecond response times on millions of requests per second.


It really does look absolutely amazing. I feel guilty because I want to run a service on it; there's almost no downside to running it everywhere you'd normally run Redis.

Also in the cool-redis-stuff category:

https://github.com/twitter/pelikan

Doesn't have the feature set that KeyDB has, but both of these pieces of software feel like they could be the basis of a cloud redis product that would be really efficient and fast. I've got some plans to do just that.


Are you Google search? How do you have millions of requests per second?


Lots of industries and applications can get to that scale. My last few companies were in adtech where that is common.


Thanks, I had no idea!


It's likely millions of internal requests, which as another comment mentions, is common in a number of industries.


Which PostgreSQL scaling pain point would you be referring to? Citus?


Redis is not a database. It's a key-value cache. If you're using it as a database, you're gonna have a bad time.


Why so? It has persistence and I'm not aware of any reported data loss happening with it.

It's also got loads of complex and useful commands.


Redis is inherently lossy as a matter of basic design, and that's not even touching on the many other issues born of the NIH solutions rampant within it. You may not hit the behavior until you push real loads through it, but if you talk to anyone who has, I'm confident they'll agree with the criticism: while it may be an excellent cache, it should never be treated as a ground-truth database. It's excellent as a slower memcached with richer features. It's not a database. You can also read Aphyr's reports over the years, which, to be utterly frank, bent over backwards to be charitable.


Data loss can occur between flushes to disk, for example (by default every 2 seconds / every however-many megabytes, I forget). It is most likely possible to fine-tune the configuration to make redis a very reliable data store, but it doesn't come with such settings by default, unlike most RDBMSes.
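For the record, the relevant knobs live in redis.conf. The strict end of the spectrum looks roughly like this, at a real throughput cost:

    # redis.conf -- trade throughput for durability
    appendonly yes
    # fsync the append-only file on every write, instead of
    # the default once-per-second batching
    appendfsync always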


Not all use cases require reliable data storage, and it is OK to lose a few seconds of data. Think simple discussion forums or internal chat applications. There are scenarios where ease of use and single-server scalability pay off in faster development and lower devops cost.


GP was asking why redis is not a reliable storage solution/database. Redis is great as an unreliable (not source-of-truth) storage.


For that temporary use case, how does it compare to memcached?


It mostly boils down to Redis having a richer API, and memcached being faster / more efficient. The new extstore stuff lets memcached leverage fast SSDs to cache stupidly huge datasets. Memcached is also one of the most battle-tested things out there in open source. I've used them both plenty over the years, but I tend to lean towards memcached now unless I really need some Redis API feature.
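To make the API gap concrete, a sketch with the usual Python clients, redis-py and pymemcache (localhost defaults assumed; extstore itself is a server-side flag along the lines of -o ext_path=..., not client API):

    import redis
    from pymemcache.client.base import Client

    # memcached: get/set/incr on opaque blobs -- and that's the point
    mc = Client(("localhost", 11211))
    mc.set("views:home", b"0")
    mc.incr("views:home", 1)

    # redis: server-side data structures, not just blobs
    r = redis.Redis(host="localhost", port=6379)
    r.zadd("leaderboard", {"alice": 1200, "bob": 900})
    print(r.zrevrange("leaderboard", 0, 2, withscores=True))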


I work on a SaaS community forums service and I can assure you data loss is not acceptable to our clients.

As a result we use MySQL w/ memcached, although we are considering a swap to redis for the caching layer.


> Not all use cases require reliable data storage, and it is OK to lose a few seconds of data. Think simple discussion forums or internal chat applications.

That is definitely not ok. I'd be really pissed as a user if I wrote a huge comment and it suddenly disappeared.


It only disappears if there is a catastrophic failure. The likelihood of that happening just as you write a huge comment is lower than hitting a jackpot in Las Vegas -- a sensible risk tradeoff for better development experience and cost.


> a sensible risk tradeoff

Note the tradeoff stops making sense as soon as you're operating at a meaningful scale. A small per-operation likelihood of failure at small scale translates to "I expect a failure a million years from now", whereas at large scale it's more like "a month from now". Accepting the same per-operation risk of data loss might be OK in the former case, but in the latter it's irresponsible -- provided whatever you're storing is not transient data.
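Back-of-the-envelope version, with a per-write loss probability invented purely for illustration:

    # suppose each write has a one-in-a-billion chance of landing
    # in a crash window and being lost (made-up number)
    p_loss = 1e-9

    for writes_per_day in (1_000, 100_000_000):
        days = 1 / (p_loss * writes_per_day)
        print(f"{writes_per_day:>11,} writes/day -> "
              f"~{days:,.0f} days to first expected loss")

    # 1,000 writes/day       -> ~1,000,000 days (millennia)
    # 100,000,000 writes/day -> ~10 days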


You are correct. I wrote the original comment with "single server" in mind, so I assume it's not a meaningful scale, and failures can be dealt with more effectively via a support ticket. Not everything needs to be a growth-trajectory SaaS.


How is that a sensible tradeoff compared to just using something that was actually designed to be a database when you need a database?


You do not need a relational database for a simple chat/forum application.


> sure, it's (arguably...) a step up from just using sqlite

How so? What's wrong with SQLite?


I suppose it's a bit more suitable for networked services than sqlite is, since it's natively a network server speaking its own protocol, while sqlite is natively a local, in-process solution.

...but: I started writing about clustering and the network API, and realized I can't really articulate why those are actually superior in any meaningful way to simply using sqlite, especially given the irritation I've had maintaining them in production in the past...

I guess you're probably right. If I had to pick, I'd probably use sqlite.


I would say Redis with RediSearch is a database.



