From c82270cd5d37b0e6b55c06a7593026decc143322 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Antoine=20Beaupr=C3=A9?= <anarcat@debian.org>
Date: Mon, 9 Nov 2020 15:42:37 -0500
Subject: [PATCH] expand on possible solutions

---
 howto/static-component.md | 55 ++++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 12 deletions(-)

diff --git a/howto/static-component.md b/howto/static-component.md
index 2af5b66f8..85b535af0 100644
--- a/howto/static-component.md
+++ b/howto/static-component.md
@@ -469,34 +469,65 @@ present in Hiera, see [issue 30020](https://gitlab.torproject.org/tpo/tpa/team/-
 
 ## Goals
 
-TODO: document requirements
-
 ### Must have
 
+ * high availability: continue serving content even if one (or a few?)
+   servers go down
+ * atomicity: the deployed content must be coherent
+ * high performance: should be able to saturate a gigabit link and
+   withstand simple DDOS attacks
+
 ### Nice to have
 
+ * cache-busting: changes to a CSS or JavaScript file must be
+   propagated to the client reasonably quickly
+ * possibly host Debian and RPM package repositories
+
 ### Non-Goals
 
+ * implement our own global content distribution network
+
 ## Approvals required
 
 Should be approved by TPA.
 
 ## Proposed Solution
 
-TODO: propose improvements to the current static mirror system.
+The static mirror system certainly has its merits: it's flexible,
+powerful and provides a reasonably easy to deploy, high availability
+service, at the cost of some level of obscurity, complexity, and high
+disk space requirements.
 
-brainstorm:
+It should be possible to replace parts or the entirety of the system
+progressively, however. A few ideas:
 
- * replace source with gitlab CI/runners
- * get rid of master altogether? becomes gitlab pages?
- * replace mirrors with the caching system?
+ * the **mirror** hosts could be replaced by the [cache
+   system](cache). this would possibly require shifting the web service
+   from the **mirror** to the **master** or at least some significant
+   re-architecture
+ * the **source** hosts could be replaced by some parts of the [GitLab
+   Pages](https://docs.gitlab.com/ee/administration/pages/) system. unfortunately, that system relies on a custom
+   webserver, but it might be possible to bypass that and directly
+   access the on-disk files provided by the CI.
 
 One concern with using GitLab pages is that it uses a custom webserver
-(to get and issue TLS certs for the custom domains) and requires a
-shared filesystem to deploy content. GitLab.com uses NFS to decouple
-the pages host from the main GitLab host, maybe we could use CephFS
-instead? In any case it's a little clunky and doesn't immediately
-fulfill the high availability requirement.
+(to get and issue TLS certs for the custom domains).
+
+It also assumes the existence of a shared filesystem to deploy
+content. GitLab.com uses NFS to decouple the pages host from the main
+GitLab host, maybe we could use CephFS instead? In any case it's a
+little clunky and doesn't immediately fulfill the high availability
+requirement.
+
+The other downside of this approach is increased dependency on GitLab
+for deployments.
+
+Next steps:
+
+ 1. check if the GitLab Pages subsystem provides atomic updates
+ 2. see how GitLab Pages can be distributed to multiple hosts and how
+    scalable it actually is or if we'll need to run the cache frontend
+    in front of it
 
 ## Cost
 
-- 
GitLab