estimate storage requirements for metricsdb and backups
In #41424 (closed), we agreed to continue with the monolithic PostgreSQL design for the time being, more or less: collector will move to object storage, and other optimizations may be introduced later (#41416 (comment 2978071)), but for now that's the plan.
We'll need to scale up storage for metricsdb.
Right now, the storage usage is as follows:
machine | used | size |
---|---|---|
metricsdb-01 | 1.07TiB | 7.88TiB |
bungei pg | 1.36TiB | 2.96TiB |
total | 2.43TiB | 10.84TiB |
Sources:
- https://grafana.torproject.org/d/zbCoGRjnz/disk-usage?orgId=1&var-class=All&var-instance=bungei.torproject.org&from=now-90d&to=now&refresh=5s&var-Filters=mountpoint%7C%3D%7C%2Fsrv%2Fbackups%2Fpg
- https://grafana.torproject.org/d/zbCoGRjnz/disk-usage?orgId=1&var-class=All&var-instance=metricsdb-01.torproject.org&from=now-1y&to=now&refresh=5s
The specification is that we need weekly backups of the PostgreSQL database (not WAL), except for a subset of tables that need hourly or better backups (ideally WAL).
The estimate is that the database will be around 5TiB at launch and grow by about 500GiB per year.
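To turn that estimate into rough numbers, here is a minimal sketch of the projection. The 5TiB launch size and 500GiB/year growth come from the estimate above; the number of retained weekly backups (`RETAINED_BACKUPS`) and the planning horizon (`YEARS`) are placeholders that have not been decided, and the calculation ignores compression, deduplication and WAL volume, so treat the output as an upper-bound guess rather than a sizing decision.

```python
# Rough capacity projection for metricsdb and its full backups.
# Launch size and yearly growth are taken from the issue text.
# RETAINED_BACKUPS and YEARS are ASSUMPTIONS, not agreed values.

GIB_PER_TIB = 1024

RETAINED_BACKUPS = 4  # assumption: keep four weekly full backups
YEARS = 5             # assumption: planning horizon in years


def projected_db_size_tib(years, launch_tib=5.0, growth_gib_per_year=500.0):
    """Database size in TiB after `years`, assuming linear growth."""
    return launch_tib + years * growth_gib_per_year / GIB_PER_TIB


def backup_storage_tib(db_size_tib, retained_full_backups):
    """Space for uncompressed full backups, ignoring WAL and compression."""
    return db_size_tib * retained_full_backups


if __name__ == "__main__":
    for year in range(YEARS + 1):
        db = projected_db_size_tib(year)
        backups = backup_storage_tib(db, RETAINED_BACKUPS)
        print(f"year {year}: db ~{db:.2f} TiB, "
              f"{RETAINED_BACKUPS} full backups ~{backups:.2f} TiB")
```

Changing `RETAINED_BACKUPS` shows how quickly the retention policy, more than the database growth itself, drives the size of the backup server.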
This could involve building a new storage server to handle those backups (#41364 (closed)). We also feel it would be a good idea to start working with Barman for this system.
The output of this issue is an estimate for hardware needs, a rough architectural draft, and subsequent tickets to make necessary changes to reflect said architecture.
/cc @lavamind