Implement a mechanism to deploy static components from GitLab CI instead of Jenkins
As part of the Jenkins retirement (#40218 (closed)), a key component is to enable GitLab CI to publish content into the static mirror system.
This is the critical-website-build part of the TPA-RF-10 Jenkins retirement proposal. In the proposal, we stated:
> Critical websites should be built by GitLab CI just like non-critical sites, but must be pushed to the static mirror system somehow. The GitLab Pages data source (currently the main GitLab server) should be used as a "static source" which would get triggered by a GitLab web hook after a successful job.
>
> The receiving end of that web hook would be a new service, also running on the GitLab Pages data source, which would receive hook notifications and trigger the relevant static component updates to rsync the files to the static mirror system.
>
> As an exception to the "users migrate their own jobs" rule, TPA and the web team will jointly oversee the implementation of the integration between GitLab CI and the static mirror system. Considering the complexity of both systems, it is unlikely the web team or TPA will be in a position to individually implement this solution.
In the Jenkins retirement ticket (#40218 (closed)) we expanded on this, and @lavamind suggested we use the webhook daemon to deploy things directly from GitLab CI. So we'd actually have two options to choose from.
The first:
> Webhook is a lightweight configurable tool written in Go, that allows you to easily create HTTP endpoints (hooks) on your server, which you can use to execute configured commands. You can also pass data from the HTTP request (such as headers, payload or query variables) to your commands. webhook also allows you to specify rules which have to be satisfied in order for the hook to be triggered.
The general idea is that we'd configure a webhook on each GitLab repository with a secret in the `X-GitLab-Token` header to fire when a job succeeds, and whose endpoint would be a `webhook` daemon running on the static master. That daemon's job would be to validate the header token and, on success, launch a predetermined command line with one or more arguments pulled in from the webhook payload.

The second:
We can also decide to go all-in with GitLab CI and set up a `deploy:` job that will push the build artifacts from CI directly to the server and trigger `static-update-component`. To pull this off, we'd have to put an SSH private key in GitLab (as a protected variable) and set up `authorized_keys` on the static master to permit only the necessary commands.
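As a rough illustration of this second option, a CI deploy job could look something like the following. The host name, user, destination path, and the `STATIC_DEPLOY_KEY` variable name are all made up for the sketch, not a real configuration:

```yaml
# Hypothetical deploy job for the "SSH deployment" option; host, user,
# paths, and the $STATIC_DEPLOY_KEY protected variable are placeholders.
deploy:
  stage: deploy
  only:
    - main
  script:
    # load the deploy key from a protected CI variable
    - eval $(ssh-agent -s)
    - echo "$STATIC_DEPLOY_KEY" | ssh-add -
    # push the built site to the static component source path
    - rsync -a --delete public/ staticdeploy@static-master:/srv/static/status.torproject.org/
    # trigger the sync to the static mirror system
    - ssh staticdeploy@static-master static-update-component status.torproject.org
```

Note that a single `command=` restriction in `authorized_keys` would only permit one of these two commands, so in practice a small server-side wrapper would probably be needed to allow both the rsync and the component update.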
The workflow, in the webhook scenario, would be:
- user pushes a change to git, which ...
- triggers a CI pipeline
- CI runner picks up the jobs and builds the website, pushes the artifacts back to GitLab
- GitLab fires a webhook
- webhook receives the ping and authenticates against a configuration, mapping to a given static-sync component
- hopefully the webhook has details of the artifacts in-band; otherwise, it needs to fetch the path to the artifacts here
- webhook downloads the artifact into the static component source path
- webhook fires the static-update-component command on the right project
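The last four steps above could be sketched in Python roughly as follows. This is only an illustration of the idea, not the actual implementation: the token-to-component mapping, the paths, and the way the webhook daemon passes the token and artifacts URL as arguments are all assumptions.

```python
"""Hypothetical sketch of the "puller" fired by the webhook daemon.

Assumes the daemon passes the X-Gitlab-Token value and the artifacts
download URL on the command line; all names here are illustrative.
"""
import hmac
import subprocess
import urllib.request
import zipfile

# Hypothetical token -> (component, source path) mapping; in practice
# this would live in a Puppet-managed configuration file.
COMPONENTS = {
    "SECRET-TOKEN-FOR-STATUS-SITE": (
        "status.torproject.org",
        "/srv/static/status.torproject.org",
    ),
}


def find_component(token):
    """Map a webhook token to a component, comparing in constant time."""
    for known, component in COMPONENTS.items():
        if hmac.compare_digest(known, token):
            return component
    return None


def pull(token, artifacts_url):
    """Validate the token, fetch the artifacts, and trigger the sync."""
    component = find_component(token)
    if component is None:
        raise SystemExit("unknown webhook token")
    name, srcdir = component
    # download the artifacts zip and unpack it into the source path
    path, _ = urllib.request.urlretrieve(artifacts_url)
    with zipfile.ZipFile(path) as archive:
        archive.extractall(srcdir)
    # fire the sync to the static mirror system
    subprocess.run(["static-update-component", name], check=True)
```

The constant-time comparison avoids leaking token contents through timing, and keeping all of this in a one-shot script (rather than a daemon) matches the "keep the webhook simple" idea discussed below in the concerns.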
A few concerns I have with this approach:
- we need a configuration file containing the GitLab webhook tokens
  - this is the webhook configuration file
- we need to figure out how to map webhook pings to components, maybe by adding a variable to the CI configuration?
  - this is also handled by the webhook configuration: each secret webhook token maps to one and only one site URL, which in turn maps to the right static component
  - this does mean that a new static component like this needs an entry in two places: the old `static-components.yaml` file, and Hiera
- we need to keep the webhook tokens in sync
  - this is a one-shot thing
- the webhook software needs to be configured to fire some command, and that command probably needs to parse the above configuration file, which, maybe, means we could just write a whole new daemon instead of reusing webhook?
  - we keep the webhook simple and move most business logic into a one-shot script, in the hope that not running a custom-made daemon will simplify the design
- all the above configuration is a manual snowflake just for us, which adds to an already very complicated static-sync design
  - that will be the case until we get rid of the static-component system, and hopefully it will be slightly simpler (and better documented) than the Jenkins setup
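To make the first two concerns concrete, a `webhook` configuration along these lines could encode both the token check and the token-to-component mapping. The hook id, token, and command path are invented for the sketch:

```json
[
  {
    "id": "status-torproject-org",
    "execute-command": "/usr/local/bin/static-gitlab-shim-pull",
    "pass-arguments-to-command": [
      { "source": "string", "name": "status.torproject.org" }
    ],
    "trigger-rule": {
      "match": {
        "type": "value",
        "value": "SECRET-TOKEN-FOR-STATUS-SITE",
        "parameter": { "source": "header", "name": "X-Gitlab-Token" }
      }
    }
  }
]
```

In this sketch, each site gets its own hook entry, so adding a component means adding one stanza here (plus the `static-components.yaml` entry), which is exactly the two-places duplication noted above.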
In any case, we need to implement this soonish. We have a spot check coming up in September for the retirement, and there has been little progress on the website front, so we need to start doing something here. It was agreed with @lavamind that we could use the static blog website as a test for this deployment, though status.torproject.org is also an option.
I favor the webhook + artifacts approach instead of the "SSH deployment" approach. My rationale is:
I have typically been uncomfortable with giving GitLab SSH access to servers, and particularly to the static mirror infrastructure, so we have not picked that approach so far. But maybe it should be reconsidered, especially with hardened `authorized_keys` files? It might certainly be simpler than writing a webhook, but it would still require some configuration on both ends, all mostly manual, which is kind of a drag...
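For what "hardened `authorized_keys`" could mean here, a forced-command entry is one sketch; the wrapper path and key material below are placeholders:

```
# one forced command per site; wrapper path and key are hypothetical
command="/usr/local/bin/static-deploy status.torproject.org",restrict ssh-ed25519 AAAAC3Nza...example gitlab-deploy
```

The `restrict` option (OpenSSH 7.2+) disables port and agent forwarding, X11, and PTY allocation, and the forced command ignores whatever the client asks to run, so the key can only ever trigger that one deployment.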
Launch checklist:
- figure out how to configure webhook and how to integrate (if at all?) with the static component configuration
- decide where to deploy the webhook component (new static source)
- name things ("puller", "notifier", "static source", the entire project needs a better name); names are:
  - static source hostname: gitlab-shim (we previously used static-gitlab-shim-source, but that broke Bacula, see #40416)
  - puller: static-gitlab-shim-pull
  - notifier: "the webhook" should be enough
  - class name: `staticsync::gitlab_shim`
- (re-)create a VM for the static source
- write the puller and deploy status.torproject.org locally
- write the notifier (it's the `webhook` package with a simple config managed by Puppet)
- add the new VM as a `staticsync::static_source` in Puppet
- sudo configuration
- deployment: replace the current status.torproject.org source with the new source (involves changing `static-components.yaml`) (status-site#14 (closed))
- at this point, status.torproject.org is managed by GitLab CI, celebrate!
- fix webhook timeout issues (e.g. change the timeout, daemonize the puller, or switch to SSH deployment)
- retire the webhook code
- send an announcement about the research.tpo migration
- create a GitLab CI template for the deployment script (see the template repo and the deploy template)
- migrate another existing Hugo site (research.tpo, tpo/web/research#40005 (closed))
- document how admins set up such sites
- migrate one Lektor site
- document how users set up such sites (see the static shim docs)
- send an announcement about the upcoming migration and documentation
- make tickets for the individual websites to migrate, as part of the roadmap
- finish design docs (clear all TODO items in the service/static-shim.md wiki page)
- deploy the test blog with the webhook system
- consider standardising the Lektor CI pipeline (tpo/web/team#4 (moved))
- migrate another existing Lektor site
- migrate all Hugo sites (`hugo.yaml` in Jenkins)
- migrate all Lektor sites (`lektor.yaml` in Jenkins)
- migrate www.torproject.org (`website.yaml` in Jenkins)
When this checklist is done:
- a solid procedure for migrating sites is implemented
- a few sites have been migrated as tests
- all the jobs are retired from Jenkins
- all the repositories are migrated to GitLab
- everything is clearly documented