Verified Commit 3fffb29b authored by anarcat

spellcheck and finish shim docs review

parent 05683fad
......@@ -141,7 +141,7 @@ it hasn't already been [migrated to GitLab](howto/gitlab#how-to-migrate-a-git-re
## Building a Hugo site
Normally, you should be able to deploy a hugo site by including the
Normally, you should be able to deploy a Hugo site by including the
template and setting a few variables. This `.gitlab-ci.yml` file,
taken from the [status.tpo .gitlab-ci.yml](https://gitlab.torproject.org/tpo/tpa/status-site/-/blob/main/.gitlab-ci.yml), should be sufficient:
......@@ -272,7 +272,7 @@ template][].
A typical failure will be that users complain that their
`deploy_static` job fails. We have yet to see such a failure occur,
but if if does, users should provide a link to the Job log, which
but if it does, users should provide a link to the Job log, which
should provide more information.
## Disaster recovery
......@@ -312,7 +312,7 @@ occur:
The static shim server itself should be fairly immune to compromise as
only TPA is allowed to login over SSH, apart from the private keys
configured in the GitLab projects. And those are very restricted in
what they can do (ie. only `rrsync` and `static-update-component`).
what they can do (i.e. only `rrsync` and `static-update-component`).
# Reference
......@@ -346,6 +346,30 @@ component.
![SSH deploy design of the static-shim](static-shim/architecture-static-shim-ssh.png)
The sites are deployed on a separate static-source to avoid adding
complexity to the already complicated, general purpose static source
(`staticiforme`). This has the added benefit that the source can be
hardened in the sense that access is restricted to TPA (which is not
the case of `staticiforme`).
The mapping between webhooks and static components is established in
Puppet, which writes the SSH configuration, hard-coding the target
directory which corresponds to the source directory in the
`static-components.yaml` file. This is done to ensure that a given
GitLab project only has access to a single site and cannot overwrite
other sites.
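
As a rough illustration of the "one project, one hard-coded directory" idea above, here is a minimal Python sketch that renders an OpenSSH `authorized_keys` entry with a forced command. It is not the actual Puppet code, which is not shown in this document: the function name, the example directory and the key are hypothetical, and the real forced command may well be a wrapper that also handles `static-update-component`. The only assumptions about real tools are the `command="..."` and `restrict` options of `authorized_keys` and the `-wo` (write-only) flag of `rrsync`.

```python
def authorized_keys_entry(public_key: str, site_dir: str) -> str:
    """Render an authorized_keys line that confines a key to one directory.

    Hypothetical helper: the directory is baked into the forced command,
    so whatever the client asks for, rrsync only writes under site_dir.
    """
    # Refuse characters that would break out of the quoted command string
    # or split the directory into several arguments.
    if any(ch in site_dir for ch in '"\\\n\r \t'):
        raise ValueError(f"unsafe directory name: {site_dir!r}")
    if "\n" in public_key or "\r" in public_key:
        raise ValueError("public key must be a single line")
    # `restrict` disables forwarding, PTY allocation and similar features;
    # `command=` overrides whatever command the client requests.
    return f'command="rrsync -wo {site_dir}",restrict {public_key.strip()}'


# Example with made-up values (assumptions, not real TPA paths or keys).
entry = authorized_keys_entry(
    "ssh-ed25519 AAAAC3Nexample gitlab-ci@site-a",
    "/srv/example/site-a",
)
print(entry)
```

Because the directory comes from the server-side configuration rather than from the client, a key leaked from one GitLab project could, under these assumptions, only write to that project's own site.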
This means that each site configured in this way must have a secret
token (in GitLab) and configuration (in Hiera) created by TPA in
Puppet. The secret token must also be configured in the GitLab
project. This could be automated by the judicious use of the GitLab
API using admin credentials, but considering that new sites are not
created very frequently, it is currently done by hand.
The SSH key is generated by the user, but that could also be managed
by Trocla, although only the newer versions support that
functionality, and that version is not currently available in Debian.
A [previous design](#webhook-deployment) involved a webhook written in Python, but now most
of the business logic resides in a [`static-shim-deploy.yml` template]
which is basically a shell script embedded in a YAML
......@@ -490,7 +514,7 @@ deployment.
* deploy sites from GitLab CI to the static mirror system
* site A cannot deploy to site B without being explicitly granted
permissions
* server-side (ie. in Puppet) access control (ie. user X can only
* server-side (i.e. in Puppet) access control (i.e. user X can only
deploy site B)
### Nice to have
......@@ -526,8 +550,10 @@ documentation](service/jenkins#gitlab-ci) and [ticket 40364](https://gitlab.torp
We considered using GitLab's [CI deployment mechanism](https://about.gitlab.com/blog/2021/02/05/ci-deployment-and-environments/) instead of
webhooks, but originally decided against it for the following reasons:
* the complexity is similar: both need a shared token between GitLab
and the static source
* the complexity is similar: both need a shared token (webhook secret
vs SSH private key) between GitLab and the static source (the
webhook design, however, does look way more complex than the deploy
design, when you compare the two diagrams)
* however, configuring the deployment variables takes more clicks
([9](https://gitlab.torproject.org/tpo/tpa/status-site/-/settings/ci_cd) vs [5](https://gitlab.torproject.org/tpo/tpa/status-site/-/hooks) in my count), and is slightly more confusing
......@@ -537,27 +563,21 @@ webhooks, but originally decided against it for the following reasons:
* the deployment also requires custom code to be added to the
`.gitlab-ci.yml` file. in the context where we are considering
using GitLab pages to replace the static mirror system in the long
term, we prefer to avoid adding custom stuff to the CI config file
and "pretend" that this is "just like GitLab pages"
term, we prefer to avoid adding custom stuff to the CI
configuration file and "pretend" that this is "just like GitLab
pages"
* we prefer to open an HTTPS port rather than an SSH port to GitLab, from a
security perspective, even if the SSH user would be protected by a
proper `authorized_keys`. in the context where i'm considering
proper `authorized_keys`. in the context where we could consider
locking down SSH access to only jump boxes, it would require an
exception and is more error prone (e.g. if we somehow forget the
`command=` override, we open full shell access)
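
The "forgotten `command=` override" risk mentioned in the last item could be caught by a small check over the generated `authorized_keys` file. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, it is not part of the TPA tooling, and the parser is simplified (it handles double-quoted option values and backslash escapes, and recognizes key types by the `ssh-`, `ecdsa-` and `sk-` prefixes).

```python
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def split_unquoted(text: str, separators: str) -> list[str]:
    """Split text on separator characters that are outside double quotes."""
    parts, current, in_quotes, i = [], [], False, 0
    while i < len(text):
        ch = text[i]
        # Keep backslash escapes (e.g. \") inside quoted values intact.
        if ch == "\\" and in_quotes and i + 1 < len(text):
            current.append(text[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        if ch in separators and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def has_forced_command(line: str) -> bool:
    """Return True if an authorized_keys line carries a command= option."""
    fields = [f for f in split_unquoted(line.strip(), " \t") if f]
    # No fields, or the line starts directly with the key type: no options.
    if not fields or fields[0].startswith(KEY_TYPE_PREFIXES):
        return False
    options = split_unquoted(fields[0], ",")
    # Option names are matched case-insensitively.
    return any(opt.lower().startswith("command=") for opt in options)


# Made-up example lines (assumptions, not real keys).
good = 'command="rrsync -wo /srv/example/site-a",restrict ssh-ed25519 AAAA ci'
bad = "ssh-ed25519 AAAA ci"
print(has_forced_command(good), has_forced_command(bad))
```

A check like this could run over every non-comment line of the file and fail loudly if any key would grant an unrestricted shell.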
After trying the webhook deployment mechanism (below), we decided to
go back to the deployment mechanism instead, because:
* the webhook implementation fails if sites take more than 10 seconds
to deploy.
* the webhook implementation doesn't provide much visibility on
failures or progress, to see the list of recent webhook calls, head
to Settings -> Webhooks -> Edit -> Recent deliveries
See below for details on that, and above for the full design of the
current deployment.
go back to the deployment mechanism instead. See below for details on
the reasoning, and above for the full design of the current
deployment.
### webhook deployment
......@@ -609,7 +629,7 @@ created very frequently, it could also be done by hand.
Unfortunately this design has two major flaws:
1. webhooks are designed to be fast and short-lived: most site
deployments take longer than the preconfigured webhook timeout (10
deployments take longer than the pre-configured webhook timeout (10
seconds) and therefore cannot be deployed synchronously, which
implies that...
......@@ -620,8 +640,9 @@ Unfortunately this design has two major flaws:
output is available to the user at all. running asynchronously is
even worse as deployment errors do not show up in GitLab at all
and would require special monitoring by TPA, instead of delegating
that management to users.
that management to users. It is possible to see the list of
recent webhook calls, in Settings -> Webhooks -> Edit ->
Recent deliveries. But that is rather well-hidden.
In the short term, the webhook system might be used asynchronously,
but in the long term, we're considering switching back to the
deployment system documented above.
In the short term, the webhook system has been used asynchronously,
but we have since moved to the deployment system documented above.