With the new monitoring system, we have a value of 0 for the timestamp of the last backup until the first full backup of a machine is run.
To avoid useless alerts right after provisioning a new host, we should modify our fabric scripts to trigger the addition of a silence for the backup alerts of that host (e.g. alias=$fqdn and job=bacula).
The expiration delay that we need is not super clear to me. Is 2 days about right?
i think i'll need some date parsing to get this finished, but it's pretty much almost done, yay for documented APIs.
anarcat changed title from "A silcence should be added for backups on newly-provisioned machines" to "A silence should be added for backups on newly-provisioned machines"
okay, so we now have natural date parsing too, i think the magic command is now:
fab silence.create --comment="machine waiting for first backup" --matchers job=bacula --matchers alias=idle-dal-02.torproject.org --ends-at "in 2 days"
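For reference, here is a rough sketch of the payload such a command could end up sending. It assumes the silences live in Prometheus Alertmanager and are created through its v2 HTTP API (`POST /api/v2/silences`), which this thread does not confirm. The `build_silence` helper is hypothetical and not the actual fabric task, and a fixed two-day `timedelta` stands in for the natural date parsing of `--ends-at "in 2 days"`.

```python
import json
from datetime import datetime, timedelta, timezone


def build_silence(fqdn, now=None, duration=timedelta(days=2), author="tor"):
    """Return a silence payload for the backup alerts of one host.

    Hypothetical helper: the field names follow the Alertmanager v2
    silence schema, the rest is an assumption made for illustration.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "matchers": [
            # Same matchers as the fab command: job=bacula and alias=<fqdn>
            {"name": "job", "value": "bacula", "isRegex": False, "isEqual": True},
            {"name": "alias", "value": fqdn, "isRegex": False, "isEqual": True},
        ],
        "startsAt": now.isoformat(),
        # Stand-in for --ends-at "in 2 days"
        "endsAt": (now + duration).isoformat(),
        "createdBy": author,
        "comment": "machine waiting for first backup",
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the silences endpoint
    print(json.dumps(build_silence("idle-dal-02.torproject.org"), indent=2))
```

The sketch only builds the request body; actually sending it (and handling authentication) is left out on purpose.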
@anarcat nice, I was able to create and update a silence with the two new tasks.
Maybe the only thing that might become annoying in the future is the default author being named "tor". If we only create short-lived silences, that's not much of an issue, but if we want longer-lived silences we should find a way to get the author from an environment variable or a configuration file.
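One possible shape for the author lookup, as a minimal sketch only: check an environment variable first, then fall back to the current default. The `SILENCE_AUTHOR` variable name and the `silence_author` helper are hypothetical and do not exist in the fabric tasks; falling back to `USER` is also just an assumption about what would be convenient.

```python
import os

# Current default author mentioned in this thread
DEFAULT_AUTHOR = "tor"


def silence_author(environ=None):
    """Pick the silence author from the environment, else use the default.

    SILENCE_AUTHOR is a hypothetical variable name used for illustration.
    """
    environ = os.environ if environ is None else environ
    for name in ("SILENCE_AUTHOR", "USER"):
        value = environ.get(name, "").strip()
        if value:
            return value
    return DEFAULT_AUTHOR


if __name__ == "__main__":
    print(silence_author())
```

A configuration file could be added as another lookup step in the same loop-then-default pattern.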