Postfix Prometheus monitoring is lacking bounce/rejection tracking
When I originally set up the Prometheus/Grafana architecture, I evaluated (in #30028 (closed)) the various exporters and dashboards available to replace the ones that typically ship with Munin, of which there are a lot. I naturally gravitated towards the postfix_exporter because it is packaged in Debian but, as it turns out, that package has a number of issues:
- some metrics are missing, particularly the number of rejected/bounced outgoing emails (which is critical for our needs, e.g. #33037 (moved))
- it hasn't seen a release since February 2020, over two years ago
So let's see if we can find an alternative to at least get rejection rates and better health metrics.
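To make "rejection rates" concrete, here is a rough sketch of the kind of counting we are after. It is plain Python rather than an actual exporter or mtail program, and it only assumes the usual `status=sent|bounced|deferred` field that Postfix writes on delivery log lines. The function names and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Postfix delivery agents (smtp, local, lmtp, ...) log a "status=<word>" field
# for each delivery attempt. We only look at that field.
STATUS_RE = re.compile(r"\bstatus=(\w+)")


def count_delivery_statuses(lines):
    """Count delivery attempts per status (sent, bounced, deferred, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def bounce_rate(counts):
    """Fraction of delivery attempts that bounced, 0.0 if there were none."""
    total = sum(counts.values())
    return counts["bounced"] / total if total else 0.0


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "postfix/smtp[101]: AAA111: to=<a@example.com>, relay=mx.example.com[192.0.2.1]:25, dsn=2.0.0, status=sent (250 ok)",
    "postfix/smtp[102]: BBB222: to=<b@example.com>, relay=mx.example.com[192.0.2.1]:25, dsn=5.1.1, status=bounced (550 no such user)",
    "postfix/smtp[103]: CCC333: to=<c@example.com>, relay=none, dsn=4.4.1, status=deferred (connection timed out)",
    "postfix/qmgr[104]: DDD444: removed",
]

sample_counts = count_delivery_statuses(SAMPLE_LOG)
```

Whatever replaces the exporter would expose these per-status counts as counters and leave the rate computation to Prometheus.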
Launch checklist:
- adapt the mtail program to our needs, requires:
  - make it work with our metrics (small patch)
  - add queue tracking to node exporter (hack, but makes us totally free from the postfix exporter)
- deploy and test mtail on crm-int-01
- make or adapt Grafana dashboard to mtail program
- deploy and test on eugeni
- deploy everywhere
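For the "queue tracking in node exporter" item, one possible approach is a small script that feeds the node exporter's textfile collector. The sketch below is only that, a sketch. It assumes the standard Postfix spool layout under `/var/spool/postfix`, and the metric name `postfix_queue_length` and the helper names are hypothetical, not something already deployed. It counts files in each queue directory and writes the result atomically as a `.prom` file.

```python
import os
import tempfile

# Standard Postfix queue directories (relative to the spool directory).
QUEUES = ("incoming", "active", "deferred", "hold", "maildrop")


def count_queue_files(spool_dir="/var/spool/postfix"):
    """Count queue files per queue, recursing into hashed subdirectories."""
    counts = {}
    for queue in QUEUES:
        total = 0
        # os.walk silently yields nothing if the directory does not exist.
        for _root, _dirs, files in os.walk(os.path.join(spool_dir, queue)):
            total += len(files)
        counts[queue] = total
    return counts


def render_metrics(counts):
    """Render counts in the Prometheus text exposition format."""
    lines = [
        # Metric name is an assumption, pick whatever fits the dashboards.
        "# HELP postfix_queue_length Number of messages in each Postfix queue.",
        "# TYPE postfix_queue_length gauge",
    ]
    for queue, value in sorted(counts.items()):
        lines.append(f'postfix_queue_length{{queue="{queue}"}} {value}')
    return "\n".join(lines) + "\n"


def write_textfile(text, path):
    """Write atomically so the collector never reads a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, path)
```

Run from cron or a systemd timer, this would amount to `write_textfile(render_metrics(count_queue_files()), "<textfile directory>/postfix_queue.prom")`, where the directory is whatever the node exporter's `--collector.textfile.directory` points at. It needs enough privileges to read the Postfix spool.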
Edited by anarcat