re-enable webPassword field

Quote from TPA-RFC-33:

Authentication

To unify the clusters as we intend to, we need to fix authentication on the Prometheus and Grafana servers.

Current situation

Authentication is currently handled as follows:

  • Icinga: static htpasswd file, not managed by Puppet, modified manually when onboarding/off-boarding
  • Prometheus 1: static htpasswd file with dummy password managed by Puppet
  • Grafana 1: same, with an extra admin password kept in Trocla, using the auth proxy configuration
  • Prometheus 2: static htpasswd file with real admin password deployed, extra password generated for prometheus-alerts continuous integration (CI) validation, all deployed through Puppet
  • Grafana 2: static htpasswd file with real admin password for "admin" and "metrics", both of which are shared with an unclear number of people

Originally, both Prometheus servers had the same authentication system but that was split in 2019 to protect the external server.

Proposed changes

The plan was originally to just delegate authentication to Grafana, but we're concerned this would introduce yet another authentication source, which we want to avoid. Instead, we should re-enable the webPassword field in LDAP, which was mysteriously dropped in userdir-ldap-cgi's 7cba921 (drop many fields from update form, 2016-03-20). Re-enabling it is a trivial patch.

This would allow any tor-internal person to access the dashboards. Access levels would be managed inside the Grafana database.

Prometheus servers would reuse the same password file, allowing tor-internal users to issue "raw" queries, browse and manage alerts.
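As a rough illustration of what "reusing the same password file" could look like, here is a minimal sketch that renders htpasswd-style lines from directory entries. It rests on several assumptions that are not in the RFC: the function name `render_htpasswd` and the entry shape are hypothetical, and it assumes the `webPassword` value is already a password hash in a format the web server accepts. It does not talk to LDAP at all. Entries would have to be fetched by whatever tooling already exports the directory.

```python
from typing import Iterable, Mapping, Optional


def render_htpasswd(entries: Iterable[Mapping[str, Optional[str]]]) -> str:
    """Render htpasswd-style "user:hash" lines from directory entries.

    Hypothetical input shape: each entry has a "uid" and an optional
    "webPassword" that is assumed to already be a password hash.
    """
    lines = []
    # Sort by uid so the generated file is stable between runs.
    for entry in sorted(entries, key=lambda e: e.get("uid") or ""):
        uid = entry.get("uid")
        hashed = entry.get("webPassword")
        # Users who never set a web password simply get no line.
        if not uid or not hashed:
            continue
        # A colon or newline in the user name, or a newline in the hash,
        # would corrupt the one-entry-per-line "user:hash" format.
        if any(c in uid for c in ":\n") or "\n" in hashed:
            raise ValueError(f"invalid characters in entry for {uid!r}")
        lines.append(f"{uid}:{hashed}")
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    # Placeholder values only; these are not real hashes.
    sample = [
        {"uid": "bob", "webPassword": "$apr1$salt$hash"},
        {"uid": "alice", "webPassword": None},
    ]
    print(render_htpasswd(sample), end="")
```

Generating one file and deploying it to both Grafana's proxy and the Prometheus servers would keep a single source of truth for passwords, which is the point of the proposal. How the file is actually distributed (Puppet, as with the existing htpasswd files, or something else) is left open here.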

Note that this change will negatively impact the prometheus-alerts CI, which will need another way to validate its rulesets.

We briefly considered making Grafana dashboards publicly available, but ultimately rejected the idea: it would mean maintaining two entirely different time series datasets, which would be too hard to separate reliably. It would also multiply the number of servers needed if we want to provide high availability.

So, TL;DR: revert the patch in userdir-ldap-cgi that removed the webPassword field so that it can be set by users again.