Verified commit e29bf157, authored by anarcat

add a pager playbook for today's static mirror outage

Related to team#40432.
parent 2c3cc814
+20 −4
@@ -169,11 +169,27 @@ If we do *not* want to keep a vanity site, we should also do this:

## Pager playbook

TODO: add a pager playbook.
### Out of date mirror

<!-- information about common errors from the monitoring system and -->
<!-- how to deal with them. this should be easy to follow: think of -->
<!-- your future self, in a stressful situation, tired and hungry. -->
If you see an error like this in Nagios:

> mirror static sync - deb: CRITICAL: 1 mirror(s) not in sync (from oldest to newest): 95.216.163.36

It means that Nagios has found that the given host
(`hetzner-hel1-03.torproject.org`, in this case) is not in sync for
the `deb` component, which is <https://deb.torproject.org>.
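
As a rough sketch, the mirror address can be pulled out of the alert text with a small shell helper before looking up which host it belongs to. The function name `extract_stale_mirrors` is hypothetical and not part of the existing tooling. The helper also assumes that, when several mirrors are listed, they are separated by spaces or commas, which the single-mirror alert above does not confirm:

```shell
# Hypothetical helper (not part of the static mirror tooling):
# read a Nagios alert line on stdin and print one mirror address per line.
extract_stale_mirrors() {
    # Drop everything up to and including "not in sync (...): ", keeping
    # only the trailing address list. Lines without that text print nothing.
    # Assumption: multiple addresses are separated by spaces or commas.
    sed -n 's/.*not in sync ([^)]*): *//p' | tr -s ' ,' '\n\n'
}

# Example use with the alert quoted above:
#   echo 'mirror static sync - deb: CRITICAL: 1 mirror(s) not in sync (from oldest to newest): 95.216.163.36' \
#       | extract_stale_mirrors
```

Each address printed this way can then be passed to a reverse DNS lookup, for example `dig -x`, to find the matching hostname.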

In this case, the cause was a prolonged outage on that host, which
made it unreachable from the master server ([tpo/tpa/team#40432](https://gitlab.torproject.org/tpo/tpa/team/-/issues/incident/40432)).

The solution is to run a manual sync, for example by pushing to
Jenkins or by running `static-update-component` by hand; see
[doc/static-sites](doc/static-sites).

In this particular case, the solution is simply to run this on the
static source (`palmeri` at the time of writing):

    static-update-component deb.torproject.org

## Disaster recovery