Commit 99283b3a authored 1 year ago by anarcat

propose budget for a new backup server (team#41364)

parent 7a93952c
2 changed files, with 109 additions and 1 deletion:

* policy.md: 3 additions, 1 deletion
* policy/tpa-rfc-63-storage-server-budget.md: 106 additions, 0 deletions
policy.md (+3 −1), at 99283b3a

@@ -25,7 +25,9 @@ and add it to the above list.
 ## Proposed
-No policy is currently `proposed`.
+<!-- No policy is currently `proposed`. -->
+* [TPA-RFC-63: Storage server budget](policy/tpa-rfc-63-storage-server-budget)
 ## Standard
policy/tpa-rfc-63-storage-server-budget.md (new file, 0 → 100644, +106 −0), at 99283b3a
---
title: TPA-RFC-63: buy a new backup storage server (5k$ + 100$/mth)
---

[[_TOC_]]
Summary: a 5000USD purchase amortized over 6 years, plus 100USD/mth
hosting, so about 170USD/mth in total, for a new 80TB (4 drives,
expandable to 8) backup server in the secondary location, for disaster
recovery and the new metrics storage service. This is comparable to the
current Hetzner backup storage server (190USD/mth for 100TB).
# Background
Our backup system relies on a beefy storage server with a 90TB raw
disk capacity (72.6TiB). That server currently costs us 175EUR
(190USD) per month at Hetzner, on bare metal, and it is running out of
disk space. We've been having issues with it as [early as 2021][],
but have so far been able to work around them.

[early as 2021]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40477
Lately, however, this work has been getting more difficult, wasting
more and more engineering time as we try to fit more things on this
aging server. The last incident, in [October 2023][], used up all the
remaining spare capacity on the server, and we're now blocked from
expanding other machines.

[October 2023]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41361
This is a particular concern for the new metrics services, which are
pivoting towards a new storage solution. That solution will centralize
storage on one huge database server (5TiB with 0.5TiB growth per year),
which the current architecture cannot handle at all, especially at the
software level.
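As a rough illustration of those figures, here is a minimal sketch that projects the database size year by year. It assumes strictly linear growth from the 5TiB and 0.5TiB/year numbers above. The 6-year horizon is borrowed from the amortization period used elsewhere in this proposal, and the function names are hypothetical, not part of any existing tooling.

```python
# Sketch only: linear projection of the metrics database size.
# The function names and the 6-year horizon are assumptions for illustration.

TIB = 2**40   # bytes in one tebibyte
TB = 10**12   # bytes in one terabyte


def projected_size_tib(initial_tib: float, growth_tib_per_year: float, years: int) -> float:
    """Return the projected size in TiB after `years`, assuming linear growth."""
    return initial_tib + growth_tib_per_year * years


def tib_to_tb(tib: float) -> float:
    """Convert tebibytes (binary) to terabytes (decimal)."""
    return tib * TIB / TB


if __name__ == "__main__":
    for year in range(0, 7):
        size = projected_size_tib(5.0, 0.5, year)
        print(f"year {year}: {size:.1f} TiB ({tib_to_tb(size):.1f} TB)")
```

Under these assumptions, the database would reach 8TiB after 6 years, before counting any backup retention or write-ahead logs kept alongside it.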
There was also a [scary incident in December 2023][] where parts of
the main Ganeti cluster went down, taking down the GitLab server and
many other services for an [hour long outage][]. The recovery
prospects for this were dim, as an [estimate for a GitLab migration][]
says it would have taken 18 hours just to copy data over between the
two data centers.

[scary incident in December 2023]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41429
[hour long outage]: https://status.torproject.org/issues/2023-12-06-gitlab-collector-outage/
[estimate for a GitLab migration]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41431
A secondary storage server, responsible for backing up our Hetzner
infrastructure from outside of Hetzner, therefore seems like a crucial
step in handling such disaster recovery scenarios.
# Proposal
The proposal is to buy a new bare metal storage server from InterPRO,
the provider from which we recently bought the Tor Browser build
machines and Ganeti cluster.
We received an estimate of about 5000USD for an 80TB server (four 20TB
drives, expandable to eight). Amortized over 6 years, this comes to
about 70USD/mth.
Our colocation provider in the US has kindly offered us a 100USD/mth
hosting deal for this server, which brings the total to about
170USD/mth.
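The arithmetic behind those figures can be checked with a short sketch. It assumes straight-line amortization over 72 months with no interest, resale value, or drive replacements, none of which the proposal accounts for either. The function name is hypothetical.

```python
# Sketch only: straight-line amortization of the purchase price.
# `amortized_monthly_cost` is a hypothetical helper, not existing tooling.


def amortized_monthly_cost(
    purchase_usd: float, years: int, hosting_usd_per_month: float
) -> tuple[float, float]:
    """Return (hardware cost per month, total cost per month) in USD."""
    months = years * 12
    hardware = purchase_usd / months
    return hardware, hardware + hosting_usd_per_month


if __name__ == "__main__":
    hardware, total = amortized_monthly_cost(5000, 6, 100)
    # prints: hardware: 69.44 USD/mth, total: 169.44 USD/mth
    print(f"hardware: {hardware:.2f} USD/mth, total: {total:.2f} USD/mth")
```

The proposal rounds these results up to 70USD/mth and 170USD/mth.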
The server would be built with the same software stack as the current
storage server, with the exception of the PostgreSQL database backups,
for which we'd experiment with [pgbarman][].

[pgbarman]: https://pgbarman.org/
# Alternatives considered
## Replacement
An alternative to the above would be to completely replace the storage
server at Hetzner with the newer generation they offer, which is the
[SX134][] (the current server being an SX132). That server offers
160TiB of disk space for 208EUR/mth or 227USD/mth.

[SX134]: https://www.hetzner.com/dedicated-rootserver/sx134/configurator/#/
That would solve the storage issue, but would raise monthly costs by
37USD. It would also not address the vulnerability in the disaster
recovery plan, where the backup server is in the same location as the
main cluster.
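For reference, the cost difference of this alternative can be sketched as follows, using only the monthly prices quoted in this document. The 6-year horizon is an assumption borrowed from the amortization period of the main proposal, and the names are hypothetical.

```python
# Sketch only: cost increase of replacing the SX132 with an SX134.
# Prices are the USD figures quoted in this document; the 6-year
# horizon is an assumption for illustration.

CURRENT_SX132_USD_PER_MONTH = 190
REPLACEMENT_SX134_USD_PER_MONTH = 227


def monthly_increase(current: int, replacement: int) -> int:
    """Return the extra monthly cost of the replacement server in USD."""
    return replacement - current


def increase_over(years: int, current: int, replacement: int) -> int:
    """Return the cumulative extra cost in USD over `years`."""
    return monthly_increase(current, replacement) * years * 12


if __name__ == "__main__":
    extra = monthly_increase(CURRENT_SX132_USD_PER_MONTH, REPLACEMENT_SX134_USD_PER_MONTH)
    six_years = increase_over(6, CURRENT_SX132_USD_PER_MONTH, REPLACEMENT_SX134_USD_PER_MONTH)
    print(f"extra per month: {extra} USD, extra over 6 years: {six_years} USD")
```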
# Costs
5000USD one time plus 100USD/mth for hosting, or about 170USD/mth when
the purchase is amortized over 6 years.
# Approval
Isabela, Sue.
# Deadline
Ideally, this would be approved in March.
# Status
This proposal is currently in the `proposed` state.
# References
* [quote from provider](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41536)
* [discussion issue](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41364)