diff --git a/howto/static-component.md b/howto/static-component.md
index 9a8bb236596b8e1496b0bcf302ea60414de52d87..abe01a447a08fc70c1b6f9a46a30e6af7b29a463 100644
--- a/howto/static-component.md
+++ b/howto/static-component.md
@@ -793,8 +793,86 @@ would eventually be completely retired.
 
 ### Replacing Jenkins with GitLab CI as a builder
 
-See the [Jenkins documentation](service/jenkins#gitlab-ci-replacement)
-for more information on that front.
+NOTE: See also the [Jenkins documentation](service/jenkins#gitlab-ci) and [ticket 40364](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40364)
+for more information on the discussion of the different options that
+were considered on that front. We have settled on the "webhook"
+design, which is described below.
+
+NOTE: this is worded as if this was already implemented, but this
+implementation might be incomplete or even nonexistent. See [ticket
+40364](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40364) for more information on progress.
+
+GitLab can publish pages in the static component mirror network
+through the use of a "static shim" deployed on a static source. The
+workflow goes a little like this:
+
+ 1. user pushes a change to GitLab, which ...
+ 2. triggers a CI pipeline
+ 3. CI runner picks up the jobs and builds the website, pushes the
+    artifacts back to GitLab
+ 4. GitLab fires a [webhook](https://gitlab.torproject.org/help/user/project/integrations/webhooks#pipeline-events), typically on [pipeline events](https://docs.gitlab.com/ee/user/project/integrations/webhooks.html#pipeline-events)
+ 5. webhook receives the ping and authenticates against a
+    configuration, mapping to a given `static-component` (TODO: allow
+    list for gitlab?)
+ 6. after authentication, the webhook fires a script
+    (`static-gitlab-shim-pull`)
+ 7. `static-gitlab-shim-pull` parses the payload from the webhook and
+    finds the URL for the artifacts
+ 8. it extracts the artifacts in a temporary directory
+ 9. it runs `rsync -c` into the local static source, to avoid
+    resetting timestamps
+ 10. it fires the `static-update-component` command to propagate changes
+     to the rest of the static-component system
+
+A subset of those steps can be seen in the following design:
+
+
+
+The shim components run on a separate static-source, called
+`static-gitlab-shim-source`. This is done to avoid adding complexity
+to the already complicated, general purpose static source
+(`staticiforme`). This has the added benefit that the source can be
+hardened in the sense that access is restricted to TPA (which is not
+the case for `staticiforme`).
+
+The mapping between webhooks and static components is established in
+Puppet, which generates the secrets and writes them to the webhook
+configuration, along with the `site_url` which corresponds to the site
+URL in the `static-components.yaml` file. This is done to ensure that
+a given GitLab project only has access to a single site and cannot
+overwrite other sites.
+
+This means that each site configured in this way must have a
+secret token (in Trocla) and configuration (in Hiera) created by TPA
+in Puppet. The secret token must also be configured in the GitLab
+project. This could be automated by the judicious use of the GitLab
+API using admin credentials, but considering that new sites are not
+created very frequently, it could also be done by hand.
+
+We briefly considered using GitLab's [CI deployment mechanism](https://about.gitlab.com/blog/2021/02/05/ci-deployment-and-environments/)
+instead of webhooks, but decided against it for the following reasons:
+
+ * the complexity is similar: both need a shared token between GitLab
+   and the static source
+
+ * however, configuring the deployment variables takes more clicks
+   ([9](https://gitlab.torproject.org/tpo/tpa/status-site/-/settings/ci_cd) vs [5](https://gitlab.torproject.org/tpo/tpa/status-site/-/hooks) by my count), and is slightly more confusing
+   (e.g. what's "Protect variable"?) and possibly insecure
+   (e.g. private key leakage if user forgets to click "Mask variable")
+
+ * the deployment also requires custom code to be added to the
+   `.gitlab-ci.yml` file. In the context where we are considering
+   using GitLab pages to replace the static mirror system in the long
+   term, we prefer to avoid adding custom stuff to the CI config file
+   and "pretend" that this is "just like GitLab pages"
+
+ * we prefer to open an HTTPS port rather than an SSH port to GitLab, from a
+   security perspective, even if the SSH user would be protected by a
+   proper `authorized_keys`. In the context where I'm considering
+   locking down SSH access to only jump boxes, it would require an
+   exception and is more error prone (e.g. if we somehow forget the
+   `command=` override, we open full shell access)
 
 <!-- LocalWords: atomicity DDOS YAML Hiera webserver NFS CephFS TLS -->
diff --git a/howto/static-component/architecture-static-shim.dot b/howto/static-component/architecture-static-shim.dot
new file mode 100644
index 0000000000000000000000000000000000000000..ebd562474970316d6ebc85d3cbd3ec8097ecce06
--- /dev/null
+++ b/howto/static-component/architecture-static-shim.dot
@@ -0,0 +1,33 @@
+digraph static {
+    label="GitLab / static mirror integration architecture, torproject.org, september 2021"
+    subgraph "clustergitlab" {
+        label="GitLab components"
+        labelloc=bottom
+
+        CI [ label="CI runners" ]
+        GitLab [ label="GitLab rails\n app" shape=service ]
+        artifacts [ shape=cylinder ]
+        GitLab -> CI [ label="dispatches jobs" ]
+        CI -> artifacts [ label="publishes" ]
+    }
+    subgraph "clustersource" {
+        label="static source"
+        labelloc=bottom
+        webhook [ shape=box ]
+        webhook -> puller [ label="runs" ]
+        puller [ label="static-shim-pull" ]
+        puller -> artifacts [ label="pulls and extracts" ]
+        update [ label="static-update-component" ]
+        puller -> update [ label="runs" ]
+    }
+    subgraph "clusterlegend" {
+        service [ shape=box ]
+        files [ shape=cylinder ]
+        process [ shape=oval ]
+        label="legend"
+        labelloc=bottom
+    }
+    master [ label="static master\nand mirrors..." ]
+    update -> master [ label="notifies" ]
+    GitLab -> webhook [ label="notifies" ]
+}
diff --git a/howto/static-component/architecture-static-shim.png b/howto/static-component/architecture-static-shim.png
new file mode 100644
index 0000000000000000000000000000000000000000..4d969817d2a1f636606aaae842de27ac2bc3fba8
Binary files /dev/null and b/howto/static-component/architecture-static-shim.png differ
diff --git a/service/jenkins.md b/service/jenkins.md
index 441e4232c7f55c1f6de5d7cb5d7538d8bbeb68e8..cd717385f61a1e44827213ec46f20dd2e642cd20 100644
--- a/service/jenkins.md
+++ b/service/jenkins.md
@@ -421,14 +421,7 @@ perform the following:
  6. triggers `static-update-component`
 
 This would mean a new service, but would allow us to retire Jenkins
-without rearchitecturing the entire static mirroring system (see above
-for the idea of replacing it with GitLab pages).
+without rearchitecturing the entire static mirroring system.
 
-We should carefully look at the Jenkins jobs in existence and see
-which absolutely need to be migrated in this way, maybe there's a way
-to convert those to simply use GitLab pages and CI, with very few
-exceptions.
-
-### other jobs
-
-TODO: clarify how jobs will be replaced, presumably with GitLab CI
+UPDATE: the above design was expanded in [the static component
+documentation](howto/static-component#replacing-jenkins-with-gitlab-ci-as-a-builder).
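
Reviewer note on steps 5 to 7 of the workflow described in the `howto/static-component.md` hunk: the patch does not include the shim code itself, so the following is only a minimal sketch of how the token-to-site mapping and the artifacts lookup might work. It assumes the secret arrives as the value of GitLab's `X-Gitlab-Token` header, that the payload is a GitLab pipeline event, and that artifacts are fetched through the jobs API. The `SITE_TOKENS` table, the host names, the `pages` job name and both function names are hypothetical; in the design above the mapping would be generated by Puppet rather than hard-coded.

```python
import hmac
from typing import Optional

# Hypothetical site_url -> shared secret mapping. In the described design
# this would come from Puppet (Trocla/Hiera), not from source code.
SITE_TOKENS = {
    "status.example.org": "s3cret-status",
    "blog.example.org": "s3cret-blog",
}


def authenticate(token: str, site_tokens: dict) -> Optional[str]:
    """Return the site_url whose secret matches the token, or None."""
    matched = None
    for site_url, secret in site_tokens.items():
        # compare_digest avoids leaking the secret through timing differences
        if hmac.compare_digest(token.encode(), secret.encode()):
            matched = site_url
    return matched


def artifacts_url(payload: dict, api_base: str,
                  job_name: str = "pages") -> Optional[str]:
    """Build the artifacts download URL for a successful pipeline event.

    Returns None when the event is not a successful pipeline or when no
    successful job with the (assumed) name `job_name` is present.
    """
    if payload.get("object_kind") != "pipeline":
        return None
    if payload.get("object_attributes", {}).get("status") != "success":
        return None
    project_id = payload["project"]["id"]
    for build in payload.get("builds", []):
        if build.get("name") == job_name and build.get("status") == "success":
            return f"{api_base}/projects/{project_id}/jobs/{build['id']}/artifacts"
    return None
```

Because each token maps to exactly one `site_url`, a project that knows only its own secret cannot select another site, which is the isolation property the documentation asks for. Extraction, `rsync -c` and `static-update-component` (steps 8 to 10) are left out of the sketch.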