Commit a4747f47 (verified), authored 5 years ago by anarcat
Parent: 58cadd0c

review and itemize the direct restore procedure, which now seems to work

1 changed file: tsa/howto/postgresql.mdwn (103 additions, 100 deletions)

@@ -184,116 +184,119 @@ harmless.
Direct restore procedure
------------------------

The above procedure assumes a bare-bones recovery, on a new server,
but it's also possible to sync an existing server from backups. This
is also an adaptation of the [official recovery
procedure](https://www.postgresql.org/docs/9.3/continuous-archiving.html#BACKUP-PITR-RECOVERY).

1. First install the right PostgreSQL version:

        apt install postgresql-9.6

   Make sure you run the same major version of PostgreSQL as the
   backup! You cannot restore across major versions. This might mean
   installing from backports or an older version of Debian.

2. On that new PostgreSQL server, show the `postgres` server public
   key, creating it if missing:

        [ -f ~postgres/.ssh/id_rsa.pub ] || sudo -u postgres ssh-keygen
        cat ~postgres/.ssh/*.pub

3. Then on the backup server, allow the user to access backups of the
   old server:

        echo "command=\"/usr/local/bin/debbackup-ssh-wrap --read-allow=/srv/backups/pg/$OLDSERVER $CLIENT\",restrict $HOSTKEY" > /etc/ssh/userkeys/torbackup.more

   This assumes we connect to a *previous* server's backups, named
   `$OLDSERVER` (e.g. `dictyotum`). The `$HOSTKEY` is the public key
   found on the postgres server above.

   Warning: the above will fail if the key is already present in
   `/etc/ssh/userkeys/torbackup`; in that case, edit the key in there
   instead.

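As a sketch of the quoting involved in step 3, the helper below builds the key line with `printf` instead of nested quotes in `echo`. The function name `backup_key_line` is hypothetical; the wrapper path and options are copied from the procedure, and the three arguments stand for `$OLDSERVER`, `$CLIENT` and `$HOSTKEY`.

```shell
# backup_key_line is a hypothetical helper, not part of the backup tooling.
# It prints one authorized_keys-style line for the temporary restore access.
backup_key_line() {
    oldserver=$1   # short name of the server whose backups are read
    client=$2      # value used as $CLIENT in the procedure
    hostkey=$3     # public key shown on the new PostgreSQL server
    # The format string is single-quoted, so the double quotes around the
    # command= value reach the output unchanged.
    printf 'command="/usr/local/bin/debbackup-ssh-wrap --read-allow=/srv/backups/pg/%s %s",restrict %s\n' \
        "$oldserver" "$client" "$hostkey"
}
```

On the backup server this could be used as `backup_key_line "$OLDSERVER" "$CLIENT" "$HOSTKEY" > /etc/ssh/userkeys/torbackup.more`, assuming those three variables are set in the shell.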
4. Then you need to find the right `BASE` file to restore from. Each
   `BASE` file has a timestamp in its filename, so just sorting them
   by name should be enough to find the latest one. Uncompress the
   `BASE` file in place, as the `postgres` user:

        sudo -u postgres ssh torbackup@$BACKUPSERVER $(hostname) retrieve-file pg $OLDSERVER bacula.BASE.$BACKUPSERVER-20191004-062226-$OLDSERVER.torproject.org-$CLUSTERNAME-9.6-backup.tar.gz | sudo -u postgres tar -C /var/lib/postgresql/9.6/main -x -z -f -

   Add a `pv` before the `tar` call in the pipeline for a progress bar
   with large backups, and replace:

    * `$BACKUPSERVER` with the backup server name and username
      (currently `bungei.torproject.org`)
    * `$OLDSERVER` with the old server's (short) hostname
      (e.g. `dictyotum`)
    * `$CLUSTERNAME` with the name of the cluster to restore
      (usually `main`)

The above might hang for a while, but it should complete. The
"hang" is because `retrieve-file` sends a header which includes a
`sha512sum` and it takes a while to compute. If it doesn't work,
use the indirect procedure to restore the `BASE` file.
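The "sort by name" rule from step 4 can be sketched as a small filter. It assumes the `BASE` file names are already available as a list on standard input, one per line; how that listing is obtained from the backup server is not covered here. The name `latest_base` is hypothetical.

```shell
# latest_base is a hypothetical helper: it reads file names on stdin and
# prints the BASE file name that sorts last.
latest_base() {
    # Keep only BASE files, then sort in plain byte order (LC_ALL=C) so the
    # embedded YYYYMMDD-HHMMSS timestamp decides the order when the rest of
    # the name is identical.
    grep -F '.BASE.' | LC_ALL=C sort | tail -n 1
}
```

This only works when all candidate names share the same prefix before the timestamp, which is the case when they come from a single server and cluster.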
5. Make sure the `pg_xlog` directory doesn't contain any files:

        rm -f -- /var/lib/postgresql/9.6/main/pg_xlog/*

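To confirm the result of step 5, a check along these lines may help. It is a sketch that assumes GNU `find` (as shipped in Debian); `has_no_files` is a hypothetical name. It only looks at regular files directly inside the directory, so a leftover subdirectory does not count as a file.

```shell
# has_no_files is a hypothetical helper: it succeeds when the given
# directory exists and contains no regular files at its top level.
has_no_files() {
    [ -d "$1" ] && [ -z "$(find "$1" -maxdepth 1 -type f | head -n 1)" ]
}
```

For example: `has_no_files /var/lib/postgresql/9.6/main/pg_xlog && echo "pg_xlog is clean"`.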
6. Then you need to create a `recovery.conf` file in
   `/var/lib/postgresql/9.6/main` that will tell postgres where to
   find the WAL files. At least the `restore_command` needs to be
   specified. Something like this should work:

restore_command = '/usr/local/bin/pg-receive-file-from-backup $OLDSERVER $CLUSTERNAME.WAL.%f %p'
   ... where:

    * `$OLDSERVER` should be replaced by the previous postgresql
      server name (e.g. `dictyotum`)
    * `$CLUSTERNAME` should be replaced by the previous cluster name
      (e.g. `main`, generally)

   You can specify a specific recovery point in the `recovery.conf`;
   see the [upstream documentation](https://www.postgresql.org/docs/9.3/recovery-target-settings.html) for more information. Also
   make sure the file is owned by postgres:

$EDITOR /var/lib/postgresql/9.6/main/recovery.conf
chown postgres /var/lib/postgresql/9.6/main/recovery.conf
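Instead of editing the file by hand in step 6, the substitution of `$OLDSERVER` and `$CLUSTERNAME` could be scripted roughly as follows. This is a sketch: `write_recovery_conf` is a hypothetical name, and it writes only the `restore_command` line shown above, overwriting any existing file at the given path.

```shell
# write_recovery_conf is a hypothetical helper: it writes a minimal
# recovery.conf holding only restore_command.
write_recovery_conf() {
    oldserver=$1   # previous server's short name, e.g. dictyotum
    cluster=$2     # previous cluster name, e.g. main
    conf=$3        # destination path of recovery.conf
    # %%f and %%p come out as literal %f and %p, which PostgreSQL expands
    # itself when it runs the command.
    printf "restore_command = '/usr/local/bin/pg-receive-file-from-backup %s %s.WAL.%%f %%p'\n" \
        "$oldserver" "$cluster" > "$conf"
}
```

A call such as `write_recovery_conf dictyotum main /var/lib/postgresql/9.6/main/recovery.conf` would still need to be followed by the `chown` above.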
7. Then start the server and look at the logs to follow the recovery
   process:

service postgresql start
tail -f /var/log/postgresql/*
You should see something like this:
2019-10-09 21:17:47.335 UTC [9632] LOG: database system was interrupted; last known up at 2019-10-04 08:12:28 UTC
2019-10-09 21:17:47.517 UTC [9632] LOG: starting archive recovery
2019-10-09 21:17:47.524 UTC [9633] [unknown]@[unknown] LOG: incomplete startup packet
2019-10-09 21:17:48.032 UTC [9639] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:48.538 UTC [9642] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:49.046 UTC [9645] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:49.354 UTC [9632] LOG: restored log file "00000001000005B200000074" from archive
2019-10-09 21:17:49.552 UTC [9648] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:50.058 UTC [9651] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:50.565 UTC [9654] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:50.836 UTC [9632] LOG: redo starts at 5B2/74000028
2019-10-09 21:17:51.071 UTC [9659] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:17:51.577 UTC [9665] postgres@postgres FATAL: the database system is starting up
2019-10-09 21:20:35.790 UTC [9632] LOG: restored log file "00000001000005B20000009F" from archive
2019-10-09 21:20:37.745 UTC [9632] LOG: restored log file "00000001000005B2000000A0" from archive
2019-10-09 21:20:39.648 UTC [9632] LOG: restored log file "00000001000005B2000000A1" from archive
2019-10-09 21:20:41.738 UTC [9632] LOG: restored log file "00000001000005B2000000A2" from archive
2019-10-09 21:20:43.773 UTC [9632] LOG: restored log file "00000001000005B2000000A3" from archive
... and so on.
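When the log gets long, the progress of step 7 can be summarized from the `restored log file` lines. The two helpers below are a sketch with hypothetical names; they assume the log format shown in the sample output above and take the log file path as their only argument.

```shell
# Both helpers are hypothetical and only parse lines of the form
#   ... LOG: restored log file "<segment>" from archive

# Print how many WAL segments have been restored so far.
count_restored_wal() {
    grep -c 'restored log file' "$1"
}

# Print the name of the most recently restored WAL segment.
last_restored_wal() {
    sed -n 's/.*restored log file "\([0-9A-F]*\)" from archive.*/\1/p' "$1" | tail -n 1
}
```

Pass whichever file under `/var/log/postgresql/` belongs to the cluster being restored.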
8. Then remove the temporary SSH access on the backup server, either
   by removing the `.more` key file or restoring the previous key
   configuration:

rm /etc/ssh/userkeys/torbackup.more
### Troubleshooting