The Tor Project / TPA / Wiki Replica

Commit 5d54dfe1
Authored 5 years ago by anarcat
Parent: 0f143ad0

document the new reboot procedures (#33406)
Showing 2 changed files with 17 additions and 23 deletions:

tsa/howto/ganeti.mdwn: 6 additions, 6 deletions
tsa/howto/upgrades.mdwn: 11 additions, 17 deletions
--- a/tsa/howto/ganeti.mdwn
+++ b/tsa/howto/ganeti.mdwn
@@ -562,18 +562,18 @@ use case.
 ## Rebooting
 
 Those hosts need special care, as we can accomplish zero-downtime
-reboots on those machines. There's a script (`ganeti-reboot-cluster`)
-deployed in the ganeti cluster that can be ran on the master to
-migrate all instances around and perform a clean reboot.
+reboots on those machines. The `reboot` script in `tsa-misc` takes
+care of the special steps involved (which is basically to empty a
+node before rebooting it).
 
 Such a reboot should be ran interactively, inside a `tmux` or `screen`
 session, and takes over 15 minutes to complete right now, but depends
 on the size of the cluster (in terms of core memory usage).
 
 Once the reboot is completed, all instances might end up on a single
-machine, and the cluster might need to be rebalanced. This is
-automatically scheduled by the `ganeti-reboot-cluster` script and will
-be done within 30 minutes of the reboot.
+machine, and the cluster might need to be rebalanced, see
+below. (Note: the update script should eventually do that, see [ticket
+#33406](https://trac.torproject.org/projects/tor/ticket/33406)).
 
 ## Rebalancing a cluster
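The new text in `ganeti.mdwn` summarizes the procedure as emptying a node, rebooting it, then rebalancing. The following is a minimal dry-run sketch of that sequence, not the actual `tsa-misc` `reboot` script, whose internals are not shown in this commit. The function name `ganeti_reboot_plan`, the example hostname, and the one-minute shutdown delay are assumptions for illustration. It only prints the standard Ganeti commands (`gnt-node migrate`, `hbal`) that one would expect to be involved.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands instead of running them.
# This is NOT the tsa-misc `reboot` script; it only illustrates the
# "empty the node, reboot it, rebalance" sequence described in the docs.
ganeti_reboot_plan() {
    node="$1"
    # Step 1: live-migrate every primary instance off the node.
    echo "gnt-node migrate -f $node"
    # Step 2: reboot the now-empty node (the 1 minute delay is an assumption).
    echo "ssh root@$node /sbin/shutdown -r +1"
    # Step 3: once the node is back, spread instances across the cluster again.
    echo "hbal -L -X"
}

# Example hostname is a placeholder.
ganeti_reboot_plan node-01.example.org
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere. A real run would also need to wait for the node to come back before the `hbal` step.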
--- a/tsa/howto/upgrades.mdwn
+++ b/tsa/howto/upgrades.mdwn
@@ -105,20 +105,17 @@ this:
 #### Rebooting guests
 
 If this is only a virtual machine, and the only one affected, it can
-be rebooted directly. This is a useful pipeline that will reboot the
-host and make sure it comes back within a certain delay:
+be rebooted directly. This can be done with the `tsa-misc` script
+called `reboot`:
 
-    HOST=foo.torproject.org &&
-    ssh root@$HOST /sbin/shutdown -r +5 new kernel &&
-    echo "waiting 5 minutes for reboot to happen..."
-    sleep 5m &&
-    echo "waiting for host to go down for 30 seconds..." &&
-    sleep 30 &&
-    echo "waiting up to 2 minutes for $HOST to come back..." &&
-    date &&
-    ping -c 10 -w 120 $HOST ; ssh $HOST uptime && echo "check uptime above"
-
-(Update: the above script is now in `tsa-misc/reboot-guest`.)
+    ./reboot -H test-01.torproject.org,test-02.torproject.org
+
+By default, the script will wait 2 minutes before hosts: that should
+be changed to *30 minutes* if the hosts are part of a mirror network
+to give the monitoring systems (`mini-nag`) time to rotate the hosts
+in and out of DNS:
+
+    ./reboot -H mirror-01.torproject.org,mirror-02.torproject.org --delay-nodes 1800
 
 If the host has an encrypted filesystem and is hooked up with Mandos, it
 will return automatically. Otherwise it might need a password to be
@@ -155,7 +152,7 @@ complains about `libvirt-qemu` processes, use the
 #### Rebooting Ganeti clusters
 
 This is documented in the [[ganeti]] section, but it's basically
-running the `ganeti-reboot-cluster` and `hbal` commands.
+running the above `reboot` sript and `hbal` commands.
 
 #### Generic upgrade routines
@@ -167,9 +164,6 @@ LDAP hosts have information about how they can be rebooted, in the
 rebooted one at a time
 * `manual` - needs to be done by hand
 
-The scripts (in `tsa-misc`?) `torproject-reboot-rotation` and
-`torproject-reboot-simple` take care of the latter two.
-
 ### Example runs
 
 Here's an example run of the upgrade tool:
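To show what the `--delay-nodes` value means in practice, here is a minimal sketch of the resulting timing. It assumes the delay applies between consecutive hosts (reading "before hosts" in the new text as "between hosts") and ignores the time each reboot itself takes. The helper name `reboot_schedule` is hypothetical and is not part of `tsa-misc`.

```shell
#!/bin/sh
# Hypothetical helper: prints "<offset in seconds> <host>" for each host,
# assuming hosts are rebooted one after another with a fixed delay between
# them. Illustrative only; the real `reboot` script may behave differently.
reboot_schedule() {
    delay="$1"
    shift
    offset=0
    for host in "$@"; do
        echo "$offset $host"
        offset=$((offset + delay))
    done
}

# Mirror network: a 30 minute (1800 s) delay gives monitoring time to
# rotate each host out of and back into DNS.
reboot_schedule 1800 mirror-01.torproject.org mirror-02.torproject.org
# prints:
# 0 mirror-01.torproject.org
# 1800 mirror-02.torproject.org
```

With the default of 2 minutes (120 s), the same two hosts would start 120 seconds apart. With many mirrors the 30 minute setting adds up quickly, which is one more reason to run the script inside `tmux` or `screen`.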