Commit a861b918, authored 4 weeks ago by anarcat
draft trixie upgrade process proposal (team#41990)
parent e6fd0e3a
Pipeline #251277 failed 4 weeks ago (stages: build, test)
Showing 2 changed files with 349 additions and 0 deletions:
policy.md: 1 addition, 0 deletions
policy/tpa-rfc-80-debian-bookworm-upgrade-schedule.md: 348 additions, 0 deletions
policy.md (+1 −0) @ a861b918
@@ -28,6 +28,7 @@ the Git repository for this wiki, run the command:
* [TPA-RFC-45: Mail architecture](policy/tpa-rfc-45-mail-architecture)
* [TPA-RFC-47: Email account retirement](policy/tpa-rfc-47-email-account-retirement)
* [TPA-RFC-66: Migrate to Gitlab Ultimate Edition](policy/tpa-rfc-66-gitlab-ultimate-program)
* [TPA-RFC-80: Debian trixie upgrade schedule](policy/tpa-rfc-80-debian-trixie-upgrade-schedule)
## Proposed
policy/tpa-rfc-80-debian-bookworm-upgrade-schedule.md (new file, 0 → 100644, +348 −0) @ a861b918
---
title: TPA-RFC-80: Debian trixie upgrade schedule
costs: staff, 4+ weeks
approval: TPA, service admins
affected users: TPA, service admins
deadline: TODO
status: draft
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41990
---
# Background
Debian 13 "trixie", currently "testing", is going into freeze soon, which
means we should have a new Debian stable release in 2025. It has been
a long-standing tradition at TPA to collaborate in the Debian
development process and part of that process is to upgrade our servers
during the freeze. Upgrading during the freeze makes it easier for us
to fix bugs as we find them and contribute them to the community.
The [freeze dates announced by the debian.org release team][] are:
2025-03-15 - Milestone 1 - Transition and toolchain freeze
2025-04-15 - Milestone 2 - Soft Freeze
2025-05-15 - Milestone 3 - Hard Freeze - for key packages and
packages without autopkgtests
To be announced - Milestone 4 - Full Freeze
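As a rough illustration, the milestones above can be turned into a small lookup that tells which freeze phase a given date falls in. This is only a sketch: the `freeze_phase` helper and the phase labels are hypothetical names chosen here, the example dates are picked for illustration, and Milestone 4 is left out because its date has not been announced.

```python
from datetime import date

# Freeze milestones as listed above. Milestone 4 (full freeze) is
# "to be announced", so it is deliberately absent from this table.
MILESTONES = [
    (date(2025, 3, 15), "transition and toolchain freeze"),
    (date(2025, 4, 15), "soft freeze"),
    (date(2025, 5, 15), "hard freeze"),
]


def freeze_phase(day: date) -> str:
    """Return the latest milestone reached on `day` (hypothetical helper)."""
    phase = "no freeze"
    for start, name in MILESTONES:
        if day >= start:
            phase = name
    return phase


if __name__ == "__main__":
    # Illustrative dates only: one in late April, one in late May.
    for day in (date(2025, 4, 28), date(2025, 5, 26)):
        print(day.isoformat(), "->", freeze_phase(day))
```

With these assumptions, a date in the last week of April falls in the soft freeze and a date in the last week of May falls in the hard freeze.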
Even though we've just completed the Debian 11 ("bullseye") and 12
("bookworm") upgrades in late 2024, we feel it's a good idea to start
*and* complete the trixie upgrades in 2025. That way, we can hope to
have a year or two (2026-2027?) *without* any major upgrades.
This proposal is part of the [Debian 13 trixie upgrade milestone][],
itself part of the [2025 TPA roadmap][].
[Debian 13 trixie upgrade milestone]: https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/12
[2025 TPA roadmap]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2025
[freeze dates announced by the debian.org release team]: https://lists.debian.org/debian-devel-announce/2025/01/msg00004.html
# Proposal
As usual, we perform the upgrades in three batches, in increasing
order of complexity, starting in 2025Q2, hoping to finish by the end
of 2025.
Note that, this year, this proposal also includes upgrading the Tails
infrastructure. To help with merging rotations in the two teams, TPA
staff will upgrade Tails machines, with assistance from Tails folks,
and vice-versa.
## Affected users
All service admins are affected by this change. If you have shell
access on any TPA server, you want to read this announcement.
In the past, TPA has typically kept a page detailing notable changes,
and proposals like this one would link to the upstream release
notes. Unfortunately, at the time of writing, upstream hasn't yet
produced release notes (as we're still in testing).
TODO: well the above sounds bad. maybe we shouldn't upgrade during
freeze after all?
## Upgrade schedule
The upgrade is split in multiple batches:
- installer changes: TODO
- low complexity: mostly TPA services and less critical Tails servers
- moderate complexity: TPA "service admins" machines and remaining
  Tails physical servers and VMs running services from the official
  Debian repositories only
- high complexity: Tails VMs running services not from the official
  Debian repositories
- cleanup: TODO
The free time between the first two batches will also allow us to
cover for unplanned contingencies: upgrades that could drag on and
other work that will inevitably need to be performed.
The objective is to do the batches in collective "upgrade parties"
that should be "fun" for the team. This policy has proven to be
effective in the previous upgrades and we are eager to repeat it.
### Batch 1: low complexity, April-May 2025
This batch is actually scheduled over two separate weeks: TPA boxes
will be upgraded in the last week of April, and Tails in the first
week of May.
The idea is to start the upgrade long enough before the vacations to
give us plenty of time to recover, and some room to start the second
batch.
In April, Debian should also be in "soft freeze", not quite a fully
"stable" environment, but that should be good enough for simple
setups.
35 TPA machines:
```
archive-01.torproject.org
cdn-backend-sunet-02.torproject.org
chives.torproject.org
dal-rescue-01.torproject.org
dal-rescue-02.torproject.org
gayi.torproject.org
hetzner-hel1-02.torproject.org
hetzner-hel1-03.torproject.org
hetzner-nbg1-01.torproject.org
hetzner-nbg1-02.torproject.org
idle-dal-02.torproject.org
idle-fsn-01.torproject.org
lists-01.torproject.org
loghost01.torproject.org
mandos-01.torproject.org
media-01.torproject.org
minio-01.torproject.org
mta-dal-01.torproject.org
mx-dal-01.torproject.org
neriniflorum.torproject.org
ns3.torproject.org
ns5.torproject.org
palmeri.torproject.org
perdulce.torproject.org
srs-dal-01.torproject.org
ssh-dal-01.torproject.org
static-gitlab-shim.torproject.org
staticiforme.torproject.org
static-master-fsn.torproject.org
submit-01.torproject.org
vault-01.torproject.org
web-dal-07.torproject.org
web-dal-08.torproject.org
web-fsn-01.torproject.org
web-fsn-02.torproject.org
```
4 Tails machines:
```
ecours.tails.net
puppet.lizard
skink.tails.net
stone.tails.net
```
In the [first batch of bookworm machines][], we ended up taking 20
minutes per machine, done in a single day, but warned that the second
batch took longer.
It's probably safe to estimate 20 hours (30 minutes per machine) for
this work, in a single week.
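The arithmetic behind that estimate can be sketched as follows. The machine counts come from the two lists above; the `batch_hours` helper is a hypothetical name used only for this illustration.

```python
# Machine counts taken from the batch 1 lists above.
TPA_MACHINES = 35
TAILS_MACHINES = 4
# Padded estimate from the text (the bookworm batch took 20 minutes each).
MINUTES_PER_MACHINE = 30


def batch_hours(machines: int, minutes_each: float) -> float:
    """Total effort in hours for a batch (hypothetical helper)."""
    return machines * minutes_each / 60


if __name__ == "__main__":
    total = batch_hours(TPA_MACHINES + TAILS_MACHINES, MINUTES_PER_MACHINE)
    print(f"{total:.1f} hours")  # prints "19.5 hours"
```

That is 39 machines at 30 minutes each, which rounds up to the 20 hours quoted.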
[first batch of bookworm machines]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41251
Feedback and coordination of this batch happens in
[issue batch 1 TODO]().
### Batch 2: moderate complexity, May-June 2025
This is scheduled for the last week of May for TPA machines, and the
first week of June for Tails.
At this point, Debian testing should be in "hard freeze", which should
be more stable.
40 TPA machines:
```
anonticket-01.torproject.org
backup-storage-01.torproject.org
bacula-director-01.torproject.org
btcpayserver-02.torproject.org
bungei.torproject.org
carinatum.torproject.org
check-01.torproject.org
ci-runner-x86-02.torproject.org
ci-runner-x86-03.torproject.org
colchicifolium.torproject.org
collector-02.torproject.org
crm-int-01.torproject.org
dangerzone-01.torproject.org
donate-01.torproject.org
donate-review-01.torproject.org
forum-01.torproject.org
gitlab-02.torproject.org
henryi.torproject.org
materculae.torproject.org
meronense.torproject.org
metricsdb-01.torproject.org
metricsdb-02.torproject.org
metrics-store-01.torproject.org
onionbalance-02.torproject.org
onionoo-backend-03.torproject.org
polyanthum.torproject.org
probetelemetry-01.torproject.org
rdsys-frontend-01.torproject.org
rdsys-test-01.torproject.org
relay-01.torproject.org
rude.torproject.org
survey-01.torproject.org
tbb-nightlies-master.torproject.org
tb-build-02.torproject.org
tb-build-03.torproject.org
tb-build-06.torproject.org
tb-pkgstage-01.torproject.org
tb-tester-01.torproject.org
telegram-bot-01.torproject.org
weather-01.torproject.org
```
17 Tails machines:
```
apt-proxy.lizard
apt.lizard
bitcoin.lizard
bittorrent.lizard
bridge.lizard
dns.lizard
dragon.tails.net
gitlab-runner.iguana
iguana.tails.net
lizard.tails.net
mail.lizard
misc.lizard
puppet-git.lizard
rsync.lizard
teels.tails.net
whisperback.lizard
www.lizard
```
The [second batch of bookworm upgrades][] took 33 hours for 31
machines, so about one hour per box. Here we have 57 machines, so it
will likely take us 60 hours (or two weeks) to complete the upgrade.
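A minimal sketch of that extrapolation, assuming the per-machine rate from the bookworm batch carries over unchanged (the `extrapolate_hours` helper is a hypothetical name):

```python
# Figures quoted above for the second bookworm batch.
BOOKWORM_HOURS = 33
BOOKWORM_MACHINES = 31
# Machines in this batch: 40 TPA + 17 Tails.
TRIXIE_MACHINES = 40 + 17


def extrapolate_hours(past_hours: float, past_machines: int, machines: int) -> float:
    """Scale a past batch's effort linearly to a new machine count."""
    rate = past_hours / past_machines  # hours per machine
    return rate * machines


if __name__ == "__main__":
    estimate = extrapolate_hours(BOOKWORM_HOURS, BOOKWORM_MACHINES, TRIXIE_MACHINES)
    print(f"{estimate:.1f} hours")
```

The unrounded rate is slightly above one hour per machine, so the linear estimate lands just over the 60 hours quoted.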
[second batch of bookworm upgrades]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41252
Feedback and coordination of this batch happens in
[issue batch 2 TODO]().
### Batch 3: high complexity, 2025 Q3-Q4
Those machines are harder to upgrade, or more critical. In the case of
TPA machines, we typically group the Ganeti servers and all the
"snowflake" servers that are not properly Puppetized and full of
legacy, namely the LDAP, DNS, and Puppet servers.
That said, we waited a long time to upgrade the Ganeti cluster for
bookworm, and it turned out to be trivial, so perhaps those could
eventually be made part of the second batch.
15 TPA machines:
```
alberti.torproject.org
dal-node-01.torproject.org
dal-node-02.torproject.org
dal-node-03.torproject.org
fsn-node-01.torproject.org
fsn-node-02.torproject.org
fsn-node-03.torproject.org
fsn-node-04.torproject.org
fsn-node-05.torproject.org
fsn-node-06.torproject.org
fsn-node-07.torproject.org
fsn-node-08.torproject.org
nevii.torproject.org
pauli.torproject.org
puppetdb-01.torproject.org
```
It seems like the [bookworm Ganeti upgrade][] took roughly 10h of
work. We ballpark the rest of the upgrade to another 10h of work, so
possibly 20h.
11 Tails machines:
```
isoworker1.dragon
isoworker2.dragon
isoworker3.dragon
isoworker4.dragon
isoworker5.dragon
isoworker6.iguana
isoworker7.iguana
isoworker8.iguana
jenkins.dragon
survey.lizard
translate.lizard
```
[bookworm Ganeti upgrade]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41254
The challenge with Tails upgrades is the coordination with the Tails
team, in particular for the Jenkins upgrades.
Feedback and coordination of this batch happens in
[issue batch 3 TODO]().
## Upgrade automation
TODO: document we want to start automating upgrades more
# Alternatives considered
## Retirements or rebuilds
We do not plan any major upgrades or retirements in the third phase
this time.
In the future, we hope to decouple those as much as possible, as the
Icinga retirement and Mailman 3 became blockers that slowed down the
upgrade significantly for bookworm. In both cases, however, the
upgrades *were* challenging and had to be performed one way or
another, so it's unclear if we can optimize this any further.
We are clear, however, that we will not postpone an upgrade for a
server retirement. Dangerzone, for example, is scheduled for
retirement ([TPA-RFC-78][]) but is still planned as normal above.
[TPA-RFC-78]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-78-dangerzone-retirement
# Costs
The entire effort should amount to about four weeks of work, with
medium uncertainty.
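For a rough cross-check, the per-batch hour estimates quoted in the batch sections can be summed. This is a sketch only: the `total_hours` helper is a hypothetical name, the batch 3 figure is the "possibly 20h" ballpark, and the proposal does not state how many upgrade hours fit in a calendar week, so the conversion to weeks is left open.

```python
# Hour estimates quoted in the batch sections of this proposal.
BATCH_HOURS = {
    "batch 1 (low complexity)": 20,
    "batch 2 (moderate complexity)": 60,
    "batch 3 (high complexity)": 20,  # ballpark: "possibly 20h"
}


def total_hours(estimates: dict) -> int:
    """Sum the per-batch estimates (hypothetical helper)."""
    return sum(estimates.values())


if __name__ == "__main__":
    print(total_hours(BATCH_HOURS), "hours")  # prints "100 hours"
```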
# Approvals required
This proposal needs approval from TPA team members, but service admins
can request additional delay if they are worried about their service
being affected by the upgrade.
Comments or feedback can be provided in issues linked above, or the
general process can be commented on in issue [tpo/tpa/team#41990][].
[tpo/tpa/team#41990]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41990
# References
* [Debian 13 trixie upgrade milestone][]
* [discussion ticket][tpo/tpa/team#41990]
[TPA bookworm upgrade procedure]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/bookworm