
Yet another Prometheus/time-series backend project.

And yet again, it would be far better to just export the data from Prometheus into a distributed columnstore data warehouse such as ClickHouse, MemSQL, or Vertica. This gives you fast SQL analysis across massive datasets, real-time updates regardless of ordering, and no practical limits on metadata, cardinality, or overall flexibility.

Prometheus is good at scraping and metrics collection, but it's a terrible storage system.




Yup, and that's kinda intentional. The design is to be a monitoring system, not a generic TSDB. What it does needs to be simple, fast, and reliable so that your alerts get sent out.

The original design inspiration, borgmon, also was a terrible storage system and had an external long-term store layered on top of it.

This isn't a design flaw, it's an intentional trade-off to make the core use case as bulletproof as it can be. Seeing "monitoring systems" built on something like Cassandra, aka distributed storage, is cringe-inducing. The first thing to crash at the first sign of network trouble is distributed storage.


My point is that monitoring != storage and there are plenty of great storage systems to use so there's not much reason to create another one. For some reason developers love to create home-made (time-series) databases.


I agree. Prometheus is very good at what it was designed to do.


Ze founder here. We adapted the scraper to our needs, and figured others have probably wanted to do what you're suggesting, so we decided to share it. In fact we pick up the scrape and ingest it into a distributed column store ourselves, where we use it with logs and do anomaly detection with it. I think Prometheus is pretty good at what it does. But like you're saying, depending on what you want to do with the data, sometimes a different backend is useful.


The big innovation here is the data compression. The lack of metric type for the standard remote storage interface is a good note.

Wouldn't adding that let you switch back to the main project and lower the local storage buffer to as small as possible?


Hi.

I would be interested in a demo of the logs AI stuff using real data. Something like https://www.honeycomb.io/play/ would do.

Do you have anything like that?


Disclosure: Zebrium employee.

The ability to export from Prometheus to a full-featured SQL database is definitely one aspect. In addition, for scaling, efficient transport from the scraper to the data store becomes pretty important.

Beyond just compression, it makes a big difference to have a transport protocol that takes advantage of two facts: much of the data does not change across successive scrapes, and the data that does change often changes incrementally (e.g. counters).
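
To make the idea concrete, here is a rough sketch of delta encoding between two successive scrapes. This is not the actual Zebrium wire format, which isn't described in this thread; the names `encode_delta` and `decode_delta`, the message layout, and the sample metric names are all made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical representation of one scrape: series name -> sample value.
Scrape = Dict[str, float]


def encode_delta(prev: Optional[Scrape], curr: Scrape) -> dict:
    """Encode `curr` relative to `prev`, sending only what changed."""
    if prev is None:
        # First scrape: nothing to diff against, so send everything.
        return {"full": dict(curr)}
    # Series whose value changed are sent as a difference (small for counters).
    changed = {k: v - prev[k] for k, v in curr.items()
               if k in prev and v != prev[k]}
    # Series that appeared or disappeared since the previous scrape.
    added = {k: v for k, v in curr.items() if k not in prev}
    removed = [k for k in prev if k not in curr]
    # Unchanged series are omitted entirely.
    return {"changed": changed, "added": added, "removed": removed}


def decode_delta(prev: Optional[Scrape], msg: dict) -> Scrape:
    """Rebuild the current scrape from the previous one plus a delta message."""
    if "full" in msg:
        return dict(msg["full"])
    gone = set(msg["removed"])
    out = {k: v for k, v in prev.items() if k not in gone}
    for k, d in msg["changed"].items():
        # Caveat: float deltas may not round-trip bit-exactly for arbitrary
        # values; this sketch uses integer-valued samples to sidestep that.
        out[k] = prev[k] + d
    out.update(msg["added"])
    return out


first = {"http_requests_total": 100.0, "mem_bytes": 2048.0, "up": 1.0}
second = {"http_requests_total": 105.0, "mem_bytes": 2048.0, "up": 1.0}

msg = encode_delta(first, second)
# Only the counter that moved is in the message; the other two series cost nothing.
assert decode_delta(first, msg) == second
```

The receiver has to keep the previous scrape per target for this to work, so a real protocol would also need some way to resync (e.g. fall back to a full message) after a dropped connection.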


The compression part is definitely great work. It would be a valuable addition to the main Prometheus repo if they accept it.


Thanks, yes. It would make our lives a lot easier too if they accept it, since we would not have to constantly maintain this forked version.



