Verified Commit eed52f36 authored by anarcat

document UNEXPECTED-TIMELINE fix for rude in team#41398

parent d9e1c713
+29 −0
@@ -1266,6 +1266,35 @@ including the former were removed by hand. Then a full backup was
performed. The BASE backup was missing because this followed a
failed upgrade (see [tpo/tpa/team#40809](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40809)).

### UNEXPECTED-TIMELINE

If the backup check script is complaining like this:

    [rude, main] UNEXPECTED-TIMELINE: rude/main.WAL.000000020000010200000015

It's likely because the [timeline](https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-TIMELINES) was bumped, which can happen in
certain restore scenarios. The check script doesn't handle this very
well. You need to inform the script of the timeline change by adding
a `timeline` entry to the `/etc/nagios/dsa-check-backuppg.conf`
configuration file. For example, the entry for rude was changed from:

```
  rude:
    main: ~
```

To:

```
  rude:
    main:
     timeline: 2
```
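
The value to use for `timeline` can be read off the WAL file name in
the alert: PostgreSQL names WAL segments with 24 hexadecimal digits,
and the first eight are the timeline ID. Here is a minimal Python
sketch, assuming the alert always ends in `.WAL.<segment>` as in the
sample message above. The `wal_timeline` helper is hypothetical and
not part of the check script:

```python
import re

# A WAL segment file name is 24 hex digits: 8 for the timeline ID,
# 8 for the log number and 8 for the segment number.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")


def wal_timeline(name):
    """Return the timeline ID encoded in a WAL file name from the alert.

    Hypothetical helper: the "<host>/<cluster>.WAL.<segment>" layout is
    assumed from the sample alert, not taken from the check script.
    """
    # Keep only what follows the last dot, i.e. the bare segment name.
    segment = name.rsplit(".", 1)[-1]
    match = WAL_NAME.match(segment)
    if match is None:
        raise ValueError(f"not a WAL segment name: {name!r}")
    # The timeline ID is stored in hexadecimal.
    return int(match.group(1), 16)


print(wal_timeline("rude/main.WAL.000000020000010200000015"))
```

For the sample alert this prints `2`, matching the `timeline: 2` entry
above.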

Alternatively, a dump/restore will reset the timeline to the normal
"1", but then you'd need to move the directory out of the way and make
a new full backup.

### OOM (Out Of Memory)

We have had situations where PostgreSQL ran out of memory a few times