add pager playbooks for every alert in Prometheus

marked this issue as related to team#41633 (closed)

changed milestone to %TPA-RFC-33-B: Prometheus server merge, more exporters

added Roadmap::Future label

mentioned in issue team#41633 (closed)

this commit (0f9f0d4c) sketches out a playbook we could write, about full disks...

changed the description

moved from team#41659 (moved)

mentioned in issue #15

marked this issue as related to #15

note that #15 is very similar to this, but concerns other labels.

this ticket is separate because it actually involves writing a lot of documentation, and might never be completed. but at a minimum we should add a check in CI.

changed the description

we now have a check to enforce the presence of the runbook annotation. I've added the annotation with a value of "TODO" on all non-TPA rules.
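for illustration, here is a minimal sketch of what such a check could look like. it is not the actual CI check: it assumes the rule file has already been parsed into the standard Prometheus `groups` / `rules` structure, that the annotation is named `runbook`, and that "TODO" is the placeholder value. the function name, the alert names and the example URL are all hypothetical.

```python
from urllib.parse import urlparse

# Assumption: "TODO" is the placeholder value used on rules without a runbook yet.
PLACEHOLDER = "TODO"


def check_runbooks(rule_file):
    """Classify alerting rules by the state of their `runbook` annotation.

    `rule_file` is the already-parsed content of a Prometheus rule file:
    {"groups": [{"name": ..., "rules": [{"alert": ..., "annotations": {...}}]}]}

    Returns three lists of alert names: (missing, placeholder, invalid).
    """
    missing, placeholder, invalid = [], [], []
    for group in rule_file.get("groups", []):
        for rule in group.get("rules", []):
            name = rule.get("alert")
            if name is None:
                continue  # recording rules have no annotations to check
            value = (rule.get("annotations") or {}).get("runbook")
            if not isinstance(value, str) or not value.strip():
                missing.append(name)
            elif value == PLACEHOLDER:
                placeholder.append(name)
            else:
                parsed = urlparse(value)
                # Only accept absolute http(s) URLs with a host part.
                if parsed.scheme not in ("http", "https") or not parsed.netloc:
                    invalid.append(name)
    return missing, placeholder, invalid


# Hypothetical rules, for demonstration only.
example = {
    "groups": [
        {
            "name": "disk",
            "rules": [
                {"alert": "DiskFull",
                 "annotations": {"runbook": "https://example.com/howto/disk#full"}},
                {"alert": "DiskWillFill", "annotations": {"runbook": "TODO"}},
                {"alert": "NoRunbook", "annotations": {"summary": "no runbook here"}},
                {"alert": "BadUrl", "annotations": {"runbook": "see the wiki"}},
                {"record": "job:up:sum", "expr": "sum(up)"},
            ],
        }
    ]
}

missing, placeholder, invalid = check_runbooks(example)
```

a CI job could fail on anything in `missing` or `invalid`, and only report `placeholder` entries, so the "TODO" rules stay visible without blocking merges.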

marked the checklist item enforce a valid URL in alerting rules through CI as completed

the list of checks that need a runbook is currently comprehensive. but since things are still moving, I'm not ticking the item to "make a list" just yet.

changed the description

changed title from add runbooks for every alert in Prometheus to add pager playbooks for every alert in Prometheus

changed the description

marked this issue as related to team#41671

changed the description

added Icebox label and removed Roadmap::Future label

marked this issue as related to team#41816 (closed)

mentioned in issue team#41816 (closed)

added Technical Debt label

mentioned in merge request !61 (merged)

mentioned in commit f28317e5

for what it's worth, i started adding dashboard annotations in Disk* alerts. i am of the opinion this will make it much easier to handle incidents, see f28317e5 for an example.

i'm not sure those should be mandatory: if playbooks are any indication, that might push us to write a lot of Grafana dashboards, and i don't want this to explode any more than it already has.

removed the relation with team#41671

mentioned in commit wiki-replica@dfb33cb5

marked this issue as related to #26 (closed)
