... | @@ -61,54 +61,53 @@ It's also possible to do a manual mass-upgrade run with |
... | @@ -61,54 +61,53 @@ It's also possible to do a manual mass-upgrade run with |
|
|
|
|
|
cumin -b 10 '*' 'apt update ; apt upgrade -yy ; dsa-update-apt-status'
|
|
cumin -b 10 '*' 'apt update ; apt upgrade -yy ; dsa-update-apt-status'
|
|
|
|
|
|
### Restarting services
|
|
### Restarting services by hand
|
|
|
|
|
|
After upgrades, there's a Nagios check that might trigger and tell you
|
|
After upgrades, there's a Nagios check that might trigger and tell you
|
|
that some services are running with outdated libraries. For example,
|
|
that some services are running with outdated libraries. Normally,
|
|
after a Bacula upgrade:
|
|
[needrestart](https://github.com/liske/needrestart) runs after upgrades and takes care of restarting
|
|
|
|
services, but it can't actually deal with everything. In Nagios, you
|
|
|
|
will see a warning like:
|
|
|
|
|
|
The following processes have libs linked that were upgraded: bacula: bacula-fd (1787)
|
|
[web-chi-03] needrestart is WARNING: WARN - Kernel: 5.10.0-15-amd64, Services: 1 (!), Containers: none, Sessions: none
|
|
|
|
|
|
While the entire host can be rebooted (using the procedure below) to
|
|
The detailed status information will show you which service it fails
|
|
fix this problem, it's sometimes less disruptive to just restart that
|
|
to restart:
|
|
one process.
|
|
|
|
|
|
|
|
For this purpose, `needrestart` is installed on all machines, and it makes sure to
|
|
WARN - Kernel: 5.10.0-15-amd64, Services: 1 (!), Containers: none, Sessions: none
|
|
restart services. It can also be useful to restart services manually, for
|
|
Services:
|
|
example with:
|
|
- cron.service
|
|
|
|
|
|
ssh root@cupani.torproject.org needrestart -u NeedRestart::UI::stdio -r a
|
|
|
|
|
|
|
|
(Note that earlier versions of needrestart showed spurious warnings in
|
|
|
|
this mode, see [bug #859387](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=859387), fixed in buster.)
|
|
|
|
|
|
|
|
If you cannot figure out why the warning happens, you might want to
|
|
If you cannot figure out why the warning happens, you might want to
|
|
run the check by hand:
|
|
run the check by hand:
|
|
|
|
|
|
/usr/lib/nagios/plugins/dsa-check-libs
|
|
needrestart -v
|
|
|
|
|
|
The `--verbose` flag also shows which files trigger the warning.
|
|
|
|
|
|
|
|
Some services will have `cron` as a parent, and will make
|
|
There are a few scenarios here:
|
|
`needrestart` wants to restart cron which is, of course,
|
|
|
|
ineffective. The only "proper" way to restart those services is to
|
|
|
|
reboot the host.
|
|
|
|
|
|
|
|
Services set up with the new systemd-based startup system documented in
|
|
|
|
[doc/services](doc/services) can be restarted with:
|
|
|
|
|
|
|
|
systemctl restart user@1504.service
|
|
|
|
|
|
|
|
There's a feature request ([bug #843778](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843778)) to implement support for
|
|
|
|
those services directly in needrestart.
|
|
|
|
|
|
|
|
### Packages blocked from automatic upgrades
|
|
* `cron.service`: typically services that should run under `systemd
|
|
|
|
--user`; reboot the box or ask the service admin to restart their
|
|
|
|
services
|
|
|
|
|
|
Those packages are currently blocked from automatic upgrades in `unattended-upgrades`:
|
|
* `cron.service`, special case: sometimes, userdir-ldap's
|
|
|
|
`ud-replicate` leaves a multiplexing SSH process lying
|
|
|
|
around. Logging into the LDAP server (currently `alberti`) and
|
|
|
|
killing all the `sshdist` processes will clear those:
|
|
|
|
|
|
|
|
pkill -u sshdist ssh
|
|
|
|
|
|
- **Open vSwitch** (`openvswitch-switch` and `openvswitch-common`, [bug
|
|
* `ganeti.service`: typically this is an OpenSSL upgrade that affects
|
|
34185](https://bugs.torproject.org/34185)): to upgrade manually, empty the server, restart OVS,
|
|
qemu, and restarting ganeti (thankfully) doesn't restart VMs. To
|
|
then migrate the machines back.
|
|
fix this, migrate all VMs to their secondaries and back:
|
|
|
|
|
|
|
|
gnt-node migrate fsn-node-XX
|
|
|
|
for instance in $instances_migrated_above; do
|
|
|
|
gnt-instance migrate $instance
|
|
|
|
done
|
|
|
|
|
|
|
|
* **Open vSwitch** (`openvswitch-switch` and `openvswitch-common`,
|
|
|
|
[bug 34185](https://bugs.torproject.org/34185)): to upgrade manually, empty the server, restart
|
|
|
|
OVS, then migrate the machines back.
|
|
|
|
|
|
1. on the Ganeti master, list the instances on the Ganeti node:
|
|
1. on the Ganeti master, list the instances on the Ganeti node:
|
|
|
|
|
... | @@ -130,12 +129,12 @@ Those packages are currently blocked from automatic upgrades in `unattended-upgr |
... | @@ -130,12 +129,12 @@ Those packages are currently blocked from automatic upgrades in `unattended-upgr |
|
|
|
|
|
Note that this might be fixed in Debian bullseye: [bug 961746](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961746) in
|
|
Note that this might be fixed in Debian bullseye: [bug 961746](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961746) in
|
|
Debian is marked as fixed, but will still need to be tested on our
|
|
Debian is marked as fixed, but will still need to be tested on our
|
|
side first.
|
|
side first. Update: it hasn't been fixed.
|
|
|
|
|
|
- **Grub** (`grub-pc`, [bug 40042](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40042)) has been known to have issues as
|
|
- **Grub** (`grub-pc`, [bug 40042](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40042)) has been known to have issues as
|
|
well, so it is blocked. To upgrade, make sure the install device is
|
|
well, so it is blocked. To upgrade, make sure the install device is
|
|
defined, by running `dpkg-reconfigure grub-pc`. This issue might
|
|
defined, by running `dpkg-reconfigure grub-pc`. This issue might
|
|
actually have been fixed in the package, see [issue 40185](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40185).
|
|
actually have been fixed in the package, see [issue 40185](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40185).
|
|
|
|
|
|
Packages are blocked from upgrades when they cause significant
|
|
Packages are blocked from upgrades when they cause significant
|
|
breakage during an upgrade run, enough to cause an outage and/or
|
|
breakage during an upgrade run, enough to cause an outage and/or
|
... | @@ -149,6 +148,14 @@ Packages can be unblocked if and only if: |
... | @@ -149,6 +148,14 @@ Packages can be unblocked if and only if: |
|
* we have good confidence that future upgrades will not break the
|
|
* we have good confidence that future upgrades will not break the
|
|
system again
|
|
system again
|
|
|
|
|
|
|
|
Services set up with the new systemd-based startup system documented in
|
|
|
|
[doc/services](doc/services) can be restarted with:
|
|
|
|
|
|
|
|
systemctl restart user@1504.service
|
|
|
|
|
|
|
|
There's a feature request ([bug #843778](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843778)) to implement support for
|
|
|
|
those services directly in needrestart.
|
|
|
|
|
|
### Kernel upgrades and reboots
|
|
### Kernel upgrades and reboots
|
|
|
|
|
|
Sometimes it is necessary to perform a reboot on the hosts, when the
|
|
Sometimes it is necessary to perform a reboot on the hosts, when the
|
... | | ... | |