Landlord: Per-Tenant Stats in Postgres with Citus

sam0x17 · on July 31, 2018

Love Citus's technology, but once again $99/month is ridiculous for a "dev" plan. Small startups want to use the same database from the pre-MVP dorm room phase all the way through their Series A, and while your technology would theoretically allow that, your pricing closes a lot of doors. No one wants to switch databases once their product is doing well enough to warrant $99/month for a database, so you are scaring away a lot of potential customers such as myself.

Once again I implore you to consider adding a "starter" or "hobbyist" tier for $20/month or even $5/month. If you want to really be competitive, your product should be free to use until you have more than 200 MB of data stored in your database, at which point usage-based pricing kicks in based on the number of rows used.

craigkerstiens · on July 31, 2018

It's helpful feedback, and something we'd love to work towards some day. The short of it is that, because of its architecture, Citus requires multiple underlying nodes, and that overhead means Citus doesn't necessarily make sense until you reach a larger scale. If you've only got 10 GB of data, a single-node instance on RDS should suit you just fine.

We understand some of the costs of switching and have actively been working to build tools for you to plan to use something like Citus, but then migrate when the time comes. Libraries like activerecord-multi-tenant or django-multi-tenant ensure you're ready for Citus and work perfectly well on a single node Postgres database.

Then when the time comes, which could be at $100 of RDS or $500 of RDS or larger, we have Citus warp [1], which lets you replicate from RDS directly into a Citus cluster in a fully online way. Using it, we've had customers with > 4 TB of data cut over with less than a minute of impact to their database.

I know that's not as simple as starting on Citus and continuing to scale, so hopefully we can address this better in the future, but for now it's the best answer we have.

[1]. https://www.citusdata.com/blog/2017/12/08/citus-warp-pain-fr...

sam0x17 · on Aug 4, 2018

You could, for example, offer a plan that starts out extremely cheap, uses usage-based pricing, and only has Citus technology kick in when you reach the relevant scale. So < 10 GB it would just be a normal PostgreSQL node. I would use this for everything if it was like that.

manigandham · on Aug 1, 2018

$99 is pretty good value for what it is, especially if you're using it for a scalable core database.

For us the bigger issue is the steep jump in price for prod level plans and the lack of GCP hosting.

iKevinShah · on Aug 1, 2018

If the price point for Citus Cloud is a roadblock, you can easily set up Citus in your own environment, provided you have some server administration experience. I did the same while doing a PoC for the community edition, and all it took was 3 minimal-size servers, which would actually be around $20 a month. Whether it can be used until Series A, I'm not sure; it depends on the data.

Also, just a side note: to be fair, the community edition is lacking some very important features, like shard rebalancing, which is available in the Enterprise edition.

danharaj · on July 31, 2018

I respect Citus for the way they consistently praise the PostgreSQL community when promoting themselves.

jedberg · on July 31, 2018

Heh I thought this was for landlords to track tenant data. I was excited for a second!

mipmap04 · on July 31, 2018

Just curious - what sort of data would you want to track on tenants?

dominotw · on July 31, 2018

Noise levels would be good. When I am moving, noise is a total wildcard and down to luck. I would pay a premium to move in with less noisy neighbours, and possibly quiet tenants could be rewarded with that premium.

mipmap04 · on Aug 1, 2018

There's actually a startup that does this for Airbnbs called NoiseAware (I have no affiliation with them). I could definitely see that being interesting for more permanent rentals as well. Probably hard to get tenant consent for that purpose, though.

krisroadruck · on July 31, 2018

I know I'd pay extra not to have to live next to neighbors with small children or yappy dogs. Still annoyed that it's illegal for there to be childless apartment communities.

dragonwriter · on July 31, 2018

> Still annoyed that it's illegal for there to be childless apartment communities.

It's not, you are just probably not old enough for the ones that are legal.

tpetry · on Aug 1, 2018

Who doesn't love a session of bingo in the afternoon? :)

jasonmp85 · on Aug 6, 2018

You're in luck: you can! It's called buying a single-family detached home.

krisroadruck · on Aug 7, 2018

I love the sarcasm, truly, but that doesn't actually accomplish the goal. If the goal is to live near a group of similarly aged folks, all buying a single-family home does is separate the walls a bit. You are now locked into a neighborhood in which your neighbors can easily have children who will run around screaming in the area, often ignoring whose yard belongs to whom.

Source: Bought a house before, neighbors moved in with 4 kids who ran around screaming loudly and ignored that my yard wasn't theirs to play in. They also had 2 rather large labs that barked loudly day and night. Working from home with that sort of constant noise pollution is super irritating.

jedberg · on Aug 1, 2018

I honestly have no idea. Part of my curiosity was what kind of data they were collecting.

hemancuso · on July 31, 2018

I’d love to know if anyone here has done a deep eval of Citus vs CockroachDB. They seem to be the two most promising solutions for horizontal scale-out Postgres and both are under very active development.

manigandham · on Aug 1, 2018

This is a good question. We did, and they are very different databases.

CRDB: Pros: natively distributed so HA and scalability are built-in, simple deployment and configuration, can run on Kubernetes for automated HA operations. Scaling tables is automatic and single-key and small-range OLTP performance is very good. Supports JSON and most data types for compatibility with most things that use Postgres.

Cons: Still maturing and has bugs like `select unnest(some_array_col)` not working. Obviously can't run any Postgres extensions, so SQL w/JSON is all you get. Performance on large scans is slow; they're working on this, but the distributed consensus required for queries means they will never match the latency of a single-node Postgres. Advanced queries are either very slow or unsupported.

CITUS: Pros: pure Postgres including extensions so you have access to advanced functionality. If you use shard key for queries, lack of distributed consensus gives low-latency performance just like single-node, but distributed transactions are still possible. Citus scales queries across all CPUs (on nodes holding the accessed data) so greatly improves query performance.

Cons: only distributes data in "distributed" tables (sharded) or "reference" tables (full replicas on all nodes). All other data just sits on single master node. HA uses Postgres streaming replication, requiring an inefficient 2x increase in costs, and is not seamless with failovers. Generally requires much more maintenance because it is still Postgres. Sharding does not accept multiple columns. No columnstores so large scans can still be slow, but they have ZFS in beta.

--

Summary:

CRDB for simpler OLTP with very low ops overhead and great availability, scaling, and durability.

Citus for advanced OLTP or OLAP, low-latency sharded access, and full access to all Postgres features.

hemancuso · on Aug 1, 2018

Thanks! Did you consider any other options?

manigandham · on Aug 1, 2018

What scenario are you looking for?

TiDB is another competitor, but it uses the MySQL dialect and is still early, missing lots of features.

For pure data-warehousing, we used MemSQL which is incredibly fast but can be expensive.

SQL Server is a great all around database if you want in-memory tables, columnstores, native graph queries, full-text search, and very high performance and can live with a single-node design (with optional HA cluster).

hemancuso · on Aug 1, 2018

Mostly an easy and performant HA story that can scale to a huge number of queries per second for an app that is largely CRUD and vanilla webapp saas queries. Cockroach seems perfect but the performance seems pretty scary in certain scenarios.

Being able to just drop the DB into a k8s cluster and not worry too much about failover gotchas and leader election has a lot of value. As does being able to throw more nodes on for more performance. Complicated OLAP queries aren't in scope.

manigandham · on Aug 1, 2018

Either is a good fit for you, but since you have simpler queries, CRDB will be easier to run.

Performance is fine, why do you say it's scary? OLAP will just be slow, but it's also distributed and unoptimized. Highly concurrent OLTP can't really get slow unless you're trying to stretch the cluster over multiple geo regions.

qaq · on July 31, 2018

?? Outside of CockroachDB using the PostgreSQL wire protocol, they focus on very different use cases, it seems. From the CockroachDB website: "When is CockroachDB not a good choice? CockroachDB is not a good choice when very low latency reads and writes are critical; use an in-memory database instead.

Also, CockroachDB is not yet suitable for:

Heavy analytics / OLAP"

I think Citus is actually very well suited for Heavy analytics / OLAP

craigkerstiens · on July 31, 2018

At Citus we tend to focus on two use cases:

1. Analytical, but less data warehousing and more of a HTAP (hybrid transactional/analytical processing). In this case you're often ingesting a lot of data, often times sensor or log data from many endpoints, and then providing analytics across that data. The analytics needs to be up to date within minutes, and responsiveness of reports within seconds. You can see how Algolia (which powers the search for HN) uses Citus for this in their blog post - https://blog.algolia.com/building-real-time-analytics-apis/

2. Transactional. For a couple of years now Citus has had full transactional support when targeting a single node. Single-node transactions can actually cover a breadth of use cases because they can span across tables, as long as the tables are co-located within the same node. We often see this is the case for multi-tenant/SaaS applications. In recent releases we also added support for distributed transactions. These transactions do have a higher overhead, but they can be hard to untangle from an existing application, which is why we built support for distributed deadlock detection and then added distributed transactions.

Generally we're continuing to improve and support both of those use cases and have our usage base actually pretty evenly split between the two.

hemancuso · on Aug 1, 2018

Can you say anything to help understand the differences between cockroach and Citus?

hemancuso · on July 31, 2018

Yes, but they both target horizontal scale-out, and OLTP is a target for both.

And yes, cockroach isn’t Postgres but it has SQL, versus something like Mongo or Cassandra.

qaq · on July 31, 2018

Citus claims OLTP but for a large cluster 2 phase commit is not a very viable approach.

mslot · on Aug 1, 2018

I don't think you can get around using 2PC in a distributed OLTP database, e.g. Spanner also uses 2PC for distributed transactions across shards. Fortunately, the overhead is not really that high because the prepare and commit messages are sent to all nodes in parallel. It only adds one extra network round trip. It does lower per session throughput a bit, but you can always get better throughput by creating more sessions.
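To make the "one extra round trip" point concrete, here is a minimal toy sketch of a two-phase commit coordinator that fans out each phase to all participants in parallel. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration only; this is not how Citus or Postgres implement it, and it ignores timeouts, crashes, and recovery.

```python
from concurrent.futures import ThreadPoolExecutor


class Participant:
    """Toy stand-in for a worker node (hypothetical, not a Citus or Postgres API)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    with ThreadPoolExecutor(max_workers=len(participants)) as pool:
        # Phase 1: PREPARE goes to all nodes at once, so it costs one round trip.
        votes = list(pool.map(lambda p: p.prepare(), participants))
        if all(votes):
            # Phase 2: COMMIT also goes to all nodes at once.
            list(pool.map(lambda p: p.commit(), participants))
            return True
        # A single "no" vote aborts the transaction on every node.
        list(pool.map(lambda p: p.rollback(), participants))
        return False
```

Because each phase is sent to every node concurrently, the added latency is roughly that of the slowest node per phase rather than the sum over all nodes.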

mslot · on Aug 1, 2018

> Spanner runs consensus for each key range and does not need all nodes to be available to make progress. Also, my understanding is that, since it has a leader for each key range, writes scale better.

Spanner uses Paxos (consensus) for replication within a key range (shard), but two-phase commit across shards: https://ai.google/research/pubs/pub39966

Citus relies on PostgreSQL's streaming replication, which gives higher throughput than Paxos, but Paxos has better availability characteristics. On the other hand, Paxos with leader leases as used by Spanner is similar to streaming replication both in terms of performance characteristics and short downtime during failover.

qaq · on Aug 1, 2018

It's not similar in infrastructure requirements, though: Spanner does not have half the nodes in standby mode for HA; they are actually doing work.

qaq · on Aug 1, 2018

Spanner runs consensus for each key range and does not need all nodes to be available to make progress. Also, my understanding is that, since it has a leader for each key range, writes scale better.

elvinyung · on Aug 1, 2018

It's not pure 2PC, it's 2PC on a subset of shards (layered on top of Raft). If it's true that most workloads are primary-key-based or touch a small number of shards, it's fine.

qaq · on Aug 1, 2018

Could you point to info about them using Raft? "In Citus, we looked into the 2PC algorithm built into PostgreSQL. We also developed an experimental extension called pg_paxos. We decided to use 2PC for two reasons. First, 2PC has been used in production across thousands of Postgres deployments. Second, most Citus Cloud and Enterprise deployments use streaming replication behind the covers. When a node becomes unavailable, the node’s secondary usually gets promoted within seconds. This way, Citus can have all participating nodes be available most of the time."

elvinyung · on Aug 1, 2018

Oops sorry, thought you meant Cockroach!

The part about 2PC still holds.

manigandham · on July 31, 2018

what do you mean?

qaq · on Aug 1, 2018

You need all worker nodes to be available for 2PC to succeed. So the solution is to have a standby for each node, which is not very viable for a large cluster.
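As a rough back-of-the-envelope sketch of why this matters at scale: if a transaction needs every one of N nodes up, and each node is independently available with probability p, the chance that all are up is p^N. The function names below are made up for illustration, and the model assumes independent failures and instant failover to a standby, neither of which holds exactly in practice.

```python
def all_nodes_up_probability(per_node_availability, node_count):
    """Probability that every node is up, assuming independent failures."""
    return per_node_availability ** node_count


def with_standby(per_node_availability):
    """Availability of a primary/standby pair under the (optimistic) assumption
    that the pair is down only when both machines are down at the same time."""
    return 1 - (1 - per_node_availability) ** 2


if __name__ == "__main__":
    p = 0.99  # assumed per-node availability, purely illustrative
    for n in (1, 10, 100):
        bare = all_nodes_up_probability(p, n)
        paired = all_nodes_up_probability(with_standby(p), n)
        print(f"{n:>3} nodes: no standbys {bare:.4f}, with standbys {paired:.4f}")
```

The standby column shows why pairing each node helps availability, and the doubled machine count is the infrastructure cost being objected to here.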

manigandham · on Aug 1, 2018

Yes, the fact that Postgres is not natively distributed means that HA/replication will be very inefficient to implement.

Citus is best used when transactions don't cross shard boundaries, in which case they execute on a single node and give you the low latency to match.
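A minimal sketch of the idea, assuming hash sharding on a tenant ID: rows with the same shard key always map to the same shard, so a transaction scoped to one tenant touches one worker. The shard count, worker names, and use of CRC32 are all assumptions for illustration; Citus uses its own hash function and shard placement metadata.

```python
import zlib

SHARD_COUNT = 8                      # assumed number of shards
WORKERS = ["worker-1", "worker-2"]   # hypothetical worker names


def shard_for(tenant_id):
    """Map a shard key to a shard with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(str(tenant_id).encode()) % SHARD_COUNT


def worker_for(shard):
    """Place shards on workers round-robin (a simplification)."""
    return WORKERS[shard % len(WORKERS)]


def workers_touched(tenant_ids):
    """Set of workers a transaction must contact for the given shard keys."""
    return {worker_for(shard_for(t)) for t in tenant_ids}


if __name__ == "__main__":
    # A single-tenant transaction stays on one worker: no 2PC needed.
    print(workers_touched([42]))
    # A transaction spanning many tenants may touch several workers.
    print(workers_touched(range(100)))
```

When `workers_touched` returns a single worker, the transaction can be handled as an ordinary local Postgres transaction on that node.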

iKevinShah · on Aug 1, 2018

Not fully qualified to answer this, but one thing that helped me select Citus over Cockroach was Citus being natively Postgres, and hence extensions (like PostGIS) being supported out of the box.

brightball · on July 31, 2018

I wonder if/when Citus will have a GCP option?

craigkerstiens · on July 31, 2018

Hi, Craig from Citus here. We're always continuing to explore other platforms beyond Citus Cloud on AWS. Stay tuned for the future as we'll add support for others. The input and feedback on which platforms people prefer is very helpful, so always feel free to drop us a note directly on that.

brightball · on July 31, 2018

GCP and Digital Ocean would be excellent. :)

sam0x17 · on July 31, 2018

I also would be interested in GCP and Digital Ocean.

manigandham · on Aug 1, 2018

+100 for GCP

ci5er · on Aug 1, 2018

Why? Yes, AWS can be expensive for performance, but it scales well and is automatable. I 'get' why no Azure for greenfield, but why do you love GCP?

brightball · on Aug 1, 2018

You don’t even have to love GCP to want quality database options outside of AWS.

As far as I can tell, it would be a great move for Citus because right now they are competing against Aurora PG within AWS. On GCP and Digital Ocean they have no equivalent competitor in the datacenter.

With the rest of a company's infrastructure on GCP, you're not likely to make a move just for Citus's sake either.

thangngoc89 · on Aug 1, 2018

When the rest of your infrastructure is already on GCP, running the database on the same cloud would reduce latency and bandwidth costs.

ci5er · on Aug 1, 2018

Well, sure.

I guess the broader question would be: Why use GCP? I like them well enough, but I find it easier to prototype on AWS. Why would a developer develop the MVP on GCP vs AWS and then go to scale there? (I realized that K8S has made this, recently, a moot point, and it comes down to cost ... but beside that is there anything?)

EDIT: I realize that as a neo-Luddite, I just might be used to AWS, and would be open (mostly) to arguments to move to GCP if it is better (I'm not talking about $/Gbps)

EDIT2: As a (mostly) fan of Google - I probably would have gotten myself acclimatized to their platform if it was any good 10 years ago, but it wasn't, so I didn't, and I guess I'm asking: is there a good reason to learn how to switch?

manigandham · on Aug 1, 2018

That's a broad question for a side topic...

GCP has better performance (especially raw storage, compute and network), cheaper and simpler pricing, and more consistent experience. The raw primitives to build with are better designed and integrated, and scale without any tuning.

We're not building MVPs (which are just as easy) but have a working product and only use GKE and VMs. If you need all the managed services then AWS is the right choice, but K8S has made that a non-issue for us and allows us to use VMs (which are the most reliable part of any cloud), along with spot/preemptible pricing and better colocation for efficiency.

ci5er · on Aug 1, 2018

I don't use managed services, because I want portability, but you are right (and as I think I said upstream) K8S has made many of the issues in production moot.

You say that you find that prototyping is as easy with GCP as with AWS? Do you believe it is because you are familiar with GCP? Or do you believe that they are truly at par with prototyping productivity?

(And yes - we've gone far afield of the upstream topic - but I do appreciate your feedback)

manigandham · on Aug 3, 2018

I don't know what you consider as prototyping, but yes, they both can run a basic app with minimal effort.

Look into AppEngine which is a few CLI commands for deploying. Same with Firebase which is widely considered a great mobile platform with everything you need. You can also use the Serverless.com framework to build apps on Cloud Functions, and they recently announced running any container in a serverless fashion (similar to Azure's ACI).

If you still want to run servers, then they have the best VM platform around.

tpetry · on July 31, 2018

They currently seem to be too deeply invested in the AWS architecture. But I am still waiting for Citus Cloud on GCP too.

ddorian43 · on July 31, 2018

I wonder if/when they'll release multi-master?

craigkerstiens · on July 31, 2018

When you say multi-master can you clarify a bit of what you're looking for? We have Citus MX which allows you to read and write from any node removing a coordinator as a bottleneck on any write throughput. Not sure if it would meet your needs or if you're looking to solve some other use case.

ddorian43 · on Aug 1, 2018

Yes, I meant MX.