Yet another Prometheus/time-series backend project.
And yet again, it would be far better to just export the data from Prometheus into a distributed column-store data warehouse like ClickHouse, MemSQL, Vertica (or other alternatives). This gives you fast SQL analysis across massive datasets, real-time updates regardless of ordering, and essentially unlimited metadata, cardinality, and overall flexibility.
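For anyone wondering what that pipeline looks like in practice, here's a minimal sketch in Python that pulls a range query out of Prometheus's HTTP API and bulk-inserts the samples into a ClickHouse table over its HTTP interface. The table schema, hosts, and query are placeholders I made up for illustration; a real exporter would add batching, retries, and error handling.

    # Rough sketch: copy samples from Prometheus into a ClickHouse table.
    # Assumes a placeholder table roughly like:
    #   CREATE TABLE metrics (name String, labels String, ts DateTime, value Float64)
    #     ENGINE = MergeTree ORDER BY (name, ts)
    import json
    import time
    import requests

    PROM = "http://localhost:9090"        # Prometheus server (placeholder)
    CLICKHOUSE = "http://localhost:8123"  # ClickHouse HTTP interface (placeholder)

    def export_range(query, start, end, step="15s"):
        # Pull raw samples for a range query from Prometheus.
        resp = requests.get(f"{PROM}/api/v1/query_range",
                            params={"query": query, "start": start,
                                    "end": end, "step": step})
        resp.raise_for_status()
        rows = []
        for series in resp.json()["data"]["result"]:
            labels = series["metric"]
            name = labels.pop("__name__", "")
            for ts, value in series["values"]:
                rows.append({"name": name,
                             "labels": json.dumps(labels, sort_keys=True),
                             "ts": int(ts),
                             "value": float(value)})
        return rows

    def load_into_clickhouse(rows):
        # Bulk-insert via ClickHouse's HTTP interface using JSONEachRow.
        body = "\n".join(json.dumps(r) for r in rows)
        resp = requests.post(CLICKHOUSE,
                             params={"query": "INSERT INTO metrics FORMAT JSONEachRow"},
                             data=body.encode("utf-8"))
        resp.raise_for_status()

    if __name__ == "__main__":
        now = time.time()
        load_into_clickhouse(export_range("up", now - 3600, now))

Once the samples are in a column store, the "fast SQL analysis" part is just ordinary SQL over the metrics table, joined against whatever other data you keep there.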
Prometheus is good at scraping and metrics collection, but it's a terrible storage system.
Yup, and that's kinda intentional. The design is to be a monitoring system, not a generic TSDB. What it does needs to be simple, fast, and reliable so that your alerts get sent out.
The original design inspiration, Borgmon, was also a terrible storage system, and had an external long-term store layered on top of it.
This isn't a design flaw, it's an intentional trade-off to make the core use case as bulletproof as it can be. I've seen "monitoring systems" built on something like Cassandra, i.e. distributed storage, and it's cringe-inducing. The first thing to crash at the first sign of network trouble is the distributed storage.
My point is that monitoring != storage, and there are plenty of great storage systems to use, so there's not much reason to create another one. For some reason developers love to create home-made (time-series) databases.
Ze founder here. We adapted the scraper to our needs, and figured others have probably wanted to do what you're suggesting, so we decided to share it. In fact we pick up the scrape and ingest it into a distributed column store ourselves, where we combine it with logs and run anomaly detection on it. I think Prometheus is pretty good at what it does. But like you're saying, depending on what you want to do with the data, sometimes a different backend is useful.
The ability to export from Prometheus to a full-featured SQL database is definitely one aspect. In addition, for scaling, efficient transport from the scraper to the data store becomes pretty important.
Beyond just compression, it makes a big difference to have a transport protocol that takes advantage of the fact that much of the data does not change across successive scrapes, and that the data that does change is often incremental (e.g. counters).
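To make that idea concrete, here's a toy sketch: given the previous and current scrape as flat maps from series key to value, only the series that changed get sent, and counters are shipped as increments rather than absolute values. The encoding, names, and payload shape are made up for illustration; a real transport would also handle counter resets, staleness markers, and batching properly.

    # Toy delta encoder for successive scrapes (illustrative only).
    # A "scrape" here is a dict mapping a series key (metric name + labels)
    # to its current float value; counter series are identified by a set.

    def encode_delta(prev, curr, counter_keys):
        """Return only what changed since the previous scrape.

        Gauges that changed are sent as absolute values; counters are sent
        as increments, which are small and compress well. Series missing
        from `curr` are reported as stale.
        """
        payload = {"gauges": {}, "counter_deltas": {}, "stale": []}
        for key, value in curr.items():
            old = prev.get(key)
            if old == value:
                continue  # unchanged: nothing to transmit
            if key in counter_keys and old is not None and value >= old:
                payload["counter_deltas"][key] = value - old
            else:
                payload["gauges"][key] = value  # changed gauge, new series, or counter reset
        payload["stale"] = [k for k in prev if k not in curr]
        return payload

    prev = {'http_requests_total{code="200"}': 1000.0, "temperature_c": 21.5}
    curr = {'http_requests_total{code="200"}': 1012.0, "temperature_c": 21.5}
    print(encode_delta(prev, curr, {'http_requests_total{code="200"}'}))
    # -> {'gauges': {}, 'counter_deltas': {'http_requests_total{code="200"}': 12.0}, 'stale': []}

In a scrape where almost nothing changed, the payload collapses to a handful of small deltas, which is exactly where this beats shipping the full sample set every interval.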