tsa/howto/postgresql.mdwn +46 −4

@@ -422,11 +422,53 @@ If you get this kind of errors, it's because you forgot to restore the
 See also the "Direct restore procedure" troubleshooting section, which
 also applies here.
 
-Dealing with Nagios warnings
-----------------------------
+Monitoring warnings
+-------------------
 
+TODO: there's some more information about backup handling in the
+[Debian DSA documentation](https://dsa.debian.org/howto/postgres-backup/).
+
+### WAL-MISSING-AFTER
+
+Example message:
+
+    [troodi, main] WAL-MISSING-AFTER: troodi/main.WAL.00000001000000D9000000AD
+
+This means that one or more WAL files are missing after the specified
+file. Specifically, in the above scenario, the following files are
+present, in chronological order:
+
+    -rw------- 1 torbackup torbackup 16777216 May 10 05:08 main.WAL.00000001000000D9000000AA
+    -rw------- 1 torbackup torbackup 16777216 May 10 05:47 main.WAL.00000001000000D9000000AB
+    -rw------- 1 torbackup torbackup 16777216 May 10 06:20 main.WAL.00000001000000D9000000AC
+    -rw------- 1 torbackup torbackup 16777216 May 10 06:26 main.WAL.00000001000000D9000000AD
+    -rw------- 1 torbackup torbackup 16777216 May 10 13:57 main.WAL.00000001000000D9000000B5
+
+Notice the jump from `...AD` to `...B5`. We're missing `AE`, `AF`, `B0`,
+`B1`, `B2`, `B3` and `B4`, specifically. We can also tell that something
+happened between 06:26 and 13:57 on that day. It could be that the
+backup server went down during that time.
+
+ 1. List the files in chronological order:
+
+        ls -ltr /srv/backups/pg/troodi/ | less
+
+ 2. Find the file named in the warning, using `/` then the filename
+    (`main.WAL.00000001000000D9000000AD` in the example above)
+
+ 3. Look for a `.BASE.` file *following* the missing files, using `/`
+    again
+
+ 4. Either:
+
+    * if a `.BASE.` backup is present after the missing files, the gap
+      is harmless, because the missing timeframe is not needed to
+      restore from that base backup. TODO: how do we fix the warning anyway?
-TODO: there's some information about backup handling in the
-[Debian DSA documentation](https://dsa.debian.org/howto/postgres-backup/).
 
+    * if a `.BASE.` backup is *not* present after the missing files,
+      the backup integrity is faulty, and a new base backup needs to be
+      performed. See [Running a full backup](#running-a-full-backup)
+      above.
+
 Reference
 =========
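The manual check in the WAL-MISSING-AFTER procedure (spot the gap, then look for a later `.BASE.` file) can be sketched as two small bash functions. This is only an illustration, not part of the howto: the function names `wal_gaps` and `base_after` are made up here. It assumes file names of the form `main.WAL.<24 hex digits>`, a single timeline, and the default 16 MiB WAL segment size, which matches the 16777216-byte files in the listing, so the last eight hex digits run from `00000000` to `000000FF`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the documented tooling.

# Read file names (sorted by name) on stdin and print every WAL segment
# name missing between two consecutive segments.
# Assumption: one timeline, default 16 MiB segments (256 per "log" file).
wal_gaps() {
    local re='^(.*\.WAL\.)([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$'
    local name prefix tli cur n prev=-1
    while IFS= read -r name; do
        # Skip anything that is not a WAL segment (e.g. .BASE. files).
        [[ $name =~ $re ]] || continue
        prefix=${BASH_REMATCH[1]}
        tli=${BASH_REMATCH[2]}
        # Flatten the (log, segment) pair into one counter.
        cur=$(( 16#${BASH_REMATCH[3]} * 256 + 16#${BASH_REMATCH[4]} ))
        if (( prev >= 0 )); then
            for (( n = prev + 1; n < cur; n++ )); do
                printf '%s%s%08X%08X\n' "$prefix" "$tli" $(( n / 256 )) $(( n % 256 ))
            done
        fi
        prev=$cur
    done
}

# Read file names in chronological order on stdin and print the .BASE.
# files that appear after the WAL file given as first argument.
base_after() {
    local wal=$1 name seen=0
    while IFS= read -r name; do
        if [[ $name == "$wal" ]]; then
            seen=1
        elif (( seen )) && [[ $name == *.BASE.* ]]; then
            printf '%s\n' "$name"
        fi
    done
}
```

With the paths from the example, `ls /srv/backups/pg/troodi/ | wal_gaps` would list the missing segments, and `ls -tr /srv/backups/pg/troodi/ | base_after main.WAL.00000001000000D9000000AD` would print any base backup taken after the file named in the warning. Empty output from the second command corresponds to the "new base backup needed" case.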