properly monitor mailman 3
in #40471 (closed), we focused on upgrading the mailman server from legacy, which we did. in doing so, we kept some of the current warts, which include that the service itself is not particularly well monitored.
in donate-neo, we acquired significant experience in monitoring django applications. we should leverage this to monitor mailman similarly. we could check latency but, more importantly, exceptions, which are normally sent by email but which we've disabled because they are too noisy.
- make sure that our HTTP(S) monitoring is working properly
- research other attempts at monitoring mailman with prometheus
- consider setting up django-prometheus in the django web service
- get other metrics about the backend services?
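
for the django-prometheus item, here is a rough sketch of what the settings change might look like. this is not tested against our mailman web deployment: the app and middleware names come from the django-prometheus README, while the `with_prometheus` helper and the example app list are hypothetical and only there for illustration.

```python
# sketch of a django settings fragment enabling django-prometheus.
# assumptions: "django_prometheus" and the two middleware paths are the names
# documented by django-prometheus; with_prometheus() and the example lists
# below are hypothetical, not taken from our actual configuration.

BEFORE = "django_prometheus.middleware.PrometheusBeforeMiddleware"
AFTER = "django_prometheus.middleware.PrometheusAfterMiddleware"


def with_prometheus(installed_apps, middleware):
    """Return (apps, middleware) with django-prometheus wired in.

    The "before" middleware goes first and the "after" middleware goes
    last, so that the whole request/response cycle is measured.
    """
    apps = list(installed_apps)
    if "django_prometheus" not in apps:
        apps.append("django_prometheus")
    # drop any existing prometheus entries so the call is idempotent
    inner = [m for m in middleware if m not in (BEFORE, AFTER)]
    return apps, [BEFORE, *inner, AFTER]


# example with made-up lists standing in for the real settings
INSTALLED_APPS, MIDDLEWARE = with_prometheus(
    ["django.contrib.auth", "hyperkitty", "postorius"],
    [
        "django.middleware.security.SecurityMiddleware",
        "django.middleware.common.CommonMiddleware",
    ],
)

# the metrics endpoint is then exposed from urls.py with something like:
#   path("", include("django_prometheus.urls"))
```

with that in place, prometheus could scrape the `/metrics` endpoint of the web service, which should cover request latency and exception counts without bringing back the noisy emails. whether that endpoint needs to be restricted at the web server level is left open here.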
Edited by anarcat