
There are services that might be used rarely but are still critical. For those you either implement a heartbeat-like approach, where the monitored system regularly signals that it's alive, or you use pull.

Using exporters might sound clunky, but in many cases you need to implement some kind of pull-based component anyway, because the monitored system often doesn't support the monitoring infrastructure directly. Example: most database systems have no Graphite or Prometheus support, they just expose their stats. So you'll end up writing or using a component that pulls these stats regularly and then pushes them. Also, with pull you only need to configure the monitoring system, not every monitored service separately (e.g. if you relocate the monitoring system to a different host, you don't need to update every single monitored service to point at the new host).
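To make that concrete, here's a minimal sketch of such a pull-then-push bridge (Go; the PG_DSN environment variable, the Graphite address, the metric name, and the lib/pq driver are all my own placeholder assumptions, not anything from a real setup): it periodically pulls a stat the database already exposes and pushes it out using Graphite's plaintext protocol.

    // pullpush.go: sketch of a "pull the stats, then push them" component.
    // Assumptions: Postgres reachable via $PG_DSN, Graphite's plaintext
    // listener at graphite.example.com:2003 -- adjust for your setup.
    package main

    import (
        "database/sql"
        "fmt"
        "log"
        "net"
        "os"
        "time"

        _ "github.com/lib/pq" // Postgres driver (assumed)
    )

    func main() {
        db, err := sql.Open("postgres", os.Getenv("PG_DSN"))
        if err != nil {
            log.Fatal(err)
        }

        for range time.Tick(30 * time.Second) {
            // Pull: read a stat the database already exposes.
            var conns int64
            if err := db.QueryRow(`SELECT count(*) FROM pg_stat_activity`).Scan(&conns); err != nil {
                log.Print(err)
                continue
            }

            // Push: forward it to the monitoring system (Graphite plaintext format).
            c, err := net.Dial("tcp", "graphite.example.com:2003")
            if err != nil {
                log.Print(err)
                continue
            }
            fmt.Fprintf(c, "db.primary.connections %d %d\n", conns, time.Now().Unix())
            c.Close()
        }
    }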

I've found that people who are new to larger-scale monitoring favor push because it somehow feels more intuitive, but pull works very well in practice; it's not clunky at all.




You are conflating metrics collection and metrics storage.

The metrics collection needs to be done by a local agent installed on the system, for the reason you gave: that's the only place the data is available.

The metrics storage is somewhere else.

Prometheus does the storage. For the collection, you still have to install collectd, statsd or similar on your hosts. Sure, Prometheus could do an HTTP check remotely, but that won't cover much of anything.


Prometheus does collection, though.

The Prometheus pattern for central things like a database (e.g. monitoring Postgres) is to let the exporter (the intermediary that exposes metrics for Prometheus to fetch) run anywhere it wants. It absolutely does not need to be a "local agent".

(In fact, if you're using something like Kubernetes, the only thing that needs local access to a node is the exporter that exposes node-specific metrics. Everything else can chat over the network.)

The benefit here is that if you have 10 Postgres databases, you can still run just a single exporter and have it extract data from all of them. Or you can run one exporter per database; there's conceptually little difference.
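As a rough sketch of that "one exporter, many databases" shape (Go with the client_golang library; the DSNs, the metric name, the port, and the lib/pq driver are invented for illustration), each database gets its own label value on a shared gauge and Prometheus scrapes the single /metrics endpoint:

    // multidb_exporter.go: one exporter process serving several Postgres
    // instances. DSNs, metric name and port are placeholders.
    package main

    import (
        "database/sql"
        "log"
        "net/http"
        "time"

        _ "github.com/lib/pq" // Postgres driver (assumed)
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // One gauge, labelled per database.
    var connections = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "pg_connections",
            Help: "Open connections per monitored database.",
        },
        []string{"database"},
    )

    func poll(name, dsn string) {
        db, err := sql.Open("postgres", dsn)
        if err != nil {
            log.Fatal(err)
        }
        for range time.Tick(15 * time.Second) {
            var n int64
            if err := db.QueryRow(`SELECT count(*) FROM pg_stat_activity`).Scan(&n); err != nil {
                log.Print(err)
                continue
            }
            connections.WithLabelValues(name).Set(float64(n))
        }
    }

    func main() {
        prometheus.MustRegister(connections)

        // One exporter process, N databases: just add DSNs here.
        targets := map[string]string{
            "orders":  "postgres://exporter@db-orders/orders?sslmode=disable",
            "billing": "postgres://exporter@db-billing/billing?sslmode=disable",
        }
        for name, dsn := range targets {
            go poll(name, dsn)
        }

        http.Handle("/metrics", promhttp.Handler())
        log.Fatal(http.ListenAndServe(":9187", nil))
    }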

On Kubernetes, we usually run an exporter as a sidecar container, which means it can talk to Postgres or whatever on localhost and just live alongside the process that it's exporting metrics for, and we rely on Prometheus' automatic discovery to make Prometheus pull from it. Start a new Postgres instance and its metrics are almost immediately fed into Prometheus.

You don't need collectd etc. with Prometheus. There are exporters around for just about anything.


We are both describing the same thing: you end up with "agents" spread across various systems to collect the metrics you are interested in, which is the point I wanted to make.

They can be called agents, exporters, collectors, whatever; the name is not important, the design pattern is.

A system that is exclusively pull-based from the Prometheus server does not work in practice.


No, you don't generally end up with agents or exporters everywhere. You might have node_exporter running on each host to collect system metrics (load, iops, etc.), but that's the exception. The rule is that the software under instrumentation mounts a /metrics endpoint and exposes Prometheus metrics to the Prometheus server directly.
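For what "exposing Prometheus metrics directly" looks like, here's a minimal sketch using the Go client library (the counter name, handler, and port are made up); the application itself serves /metrics and Prometheus just pulls from it, no separate agent involved:

    // app.go: the instrumented software serves its own /metrics endpoint.
    package main

    import (
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promauto"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // Placeholder metric; register whatever the app actually measures.
    var requests = promauto.NewCounter(prometheus.CounterOpts{
        Name: "myapp_requests_total",
        Help: "Requests handled by the application.",
    })

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            requests.Inc()
            w.Write([]byte("hello\n"))
        })
        http.Handle("/metrics", promhttp.Handler()) // Prometheus scrapes this
        http.ListenAndServe(":8080", nil)
    }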

Unless you take the presence of node_exporter to invalidate the premise (which would be stupid), a system that's exclusively pull-based is entirely practical.



