spellcheck and finish shim docs review

Verified Commit 3fffb29b authored 3 years ago by anarcat
--- a/service/static-shim.md
+++ b/service/static-shim.md
@@ -141,7 +141,7 @@ it hasn't already been [migrated to GitLab](howto/gitlab#how-to-migrate-a-git-re

 ## Building a Hugo site

-Normally, you should be able to deploy a hugo site by including the
+Normally, you should be able to deploy a Hugo site by including the
 template and setting a few variables. This `.gitlab-ci.yml` file,
 taken from the [status.tpo .gitlab-ci.yml](https://gitlab.torproject.org/tpo/tpa/status-site/-/blob/main/.gitlab-ci.yml), should be sufficient:

@@ -272,7 +272,7 @@ template][].

 A typical failure will be that users complain that their
 `deploy_static` job fails. We have yet to see such a failure occur,
-but if if does, users should provide a link to the Job log, which
+but if it does, users should provide a link to the Job log, which
 should provide more information.

 ## Disaster recovery
@@ -312,7 +312,7 @@ occur:
 The static shim server itself should be fairly immune to compromise as
 only TPA is allowed to login over SSH, apart from the private keys
 configured in the GitLab projects. And those are very restricted in
-what they can do (ie. only `rrsync` and `static-update-component`).
+what they can do (i.e. only `rrsync` and `static-update-component`).

 # Reference

@@ -346,6 +346,30 @@ component.

 ![SSH deploy design of the static-shim](static-shim/architecture-static-shim-ssh.png)

+The sites are deployed on a separate static-source to avoid adding
+complexity to the already complicated, general purpose static source
+(`staticiforme`). This has the added benefit that the source can be
+hardened in the sense that access is restricted to TPA (which is not
+the case of `staticiforme`).
+
+The mapping between webhooks and static components is established in
+Puppet, which writes the SSH configuration, hard-coding the target
+directory which corresponds to the source directory in the
+`static-components.yaml` file. This is done to ensure that a given
+GitLab project only has access to a single site and cannot overwrite
+other sites.
+
+This means that each site configured in this way must have a secret
+token (in GitLab) and configuration (in Hiera) created by TPA in
+Puppet. The secret token must also be configured in the GitLab
+project. This could be automated by the judicious use of the GitLab
+API using admin credentials, but considering that new sites are not
+created very frequently, it is currently done by hand.
+
+The SSH key is generated by the user. It could also be managed by
+Trocla, but only newer versions support that functionality, and
+those versions are not currently available in Debian.
+
 A [previous design](#webhook-deployment) involved a webhook written in Python, but now most
 of the business logic resides in a [`static-shim-deploy.yml` template]
 template which is basically a shell script embedded in a YAML
@@ -490,7 +514,7 @@ deployment.
 * deploy sites from GitLab CI to the static mirror system
 * site A cannot deploy to site B without being explicitly granted
   permissions
- * server-side (ie. in Puppet) access control (ie. user X can only
+ * server-side (i.e. in Puppet) access control (i.e. user X can only
   deploy site B)

 ### Nice to have
@@ -526,8 +550,10 @@ documentation](service/jenkins#gitlab-ci) and [ticket 40364](https://gitlab.torp
 We considered using GitLab's [CI deployment mechanism](https://about.gitlab.com/blog/2021/02/05/ci-deployment-and-environments/) instead of
 webhooks, but originally decided against it for the following reasons:

- * the complexity is similar: both need a shared token between GitLab
-   and the static source
+ * the complexity is similar: both need a shared token (webhook secret
+   vs SSH private key) between GitLab and the static source (the
+   webhook design, however, does look way more complex than the deploy
+   design, when you compare the two diagrams)

 * however, configuring the deployment variables takes more clicks
   ([9](https://gitlab.torproject.org/tpo/tpa/status-site/-/settings/ci_cd) vs [5](https://gitlab.torproject.org/tpo/tpa/status-site/-/hooks) by my count), and is slightly more confusing
@@ -537,27 +563,21 @@ webhooks, but originally decided against it for the following reasons:
 * the deployment also requires custom code to be added to the
   `.gitlab-ci.yml` file. in the context where we are considering
   using GitLab pages to replace the static mirror system in the long
-   term, we prefer to avoid adding custom stuff to the CI config file
-   and "pretend" that this is "just like GitLab pages"
+   term, we prefer to avoid adding custom stuff to the CI
+   configuration file and "pretend" that this is "just like GitLab
+   pages"

 * we prefer to open a HTTPS port than an SSH port to GitLab, from a
   security perspective, even if the SSH user would be protected by an
-   proper `authorized_keys`. in the context where i'm considering
+   proper `authorized_keys`. in the context where we could consider
   locking down SSH access to only jump boxes, it would require an
   exception and is more error prone (e.g. if we somehow forget the
   `command=` override, we open full shell access)

 After trying the webhook deployment mechanism (below), we decided to
-go back to the deployment mechanism instead, because:
-
- * the webhook implementation fails if sites take more than 10 seconds
-   to deploy.
- * the webhook implementation doesn't provide much visibility on
-   failures or progress, to see the list of recent webhook calls, head
-   to Settings -> Webhooks -> Edit -> Recent deliveries
-
-See below for details on that, and above for the full design of the
-current deployment.
+go back to the deployment mechanism instead. See below for details on
+the reasoning, and above for the full design of the current
+deployment.

 ### webhook deployment

@@ -609,7 +629,7 @@ created very frequently, it could also be done by hand.
 Unfortunately this design has two major flaws:

 1. webhooks are designed to be fast and short-lived: most site
-    deployments take longer than the preconfigured webhook timeout (10
+    deployments take longer than the pre-configured webhook timeout (10
    seconds) and therefore cannot be deployed synchronously, which
    implies that...

@@ -620,8 +640,9 @@ Unfortunately this design has two major flaws:
    output is available to the user at all. running asynchronously is
    even worse as deployment errors do not show up in GitLab at all
    and would require special monitoring by TPA, instead of delegating
-    that management to users.
+    that management to users. It is possible to see the list of
+    recent webhook calls, in Settings -> Webhooks -> Edit ->
+    Recent deliveries. But that is rather well hidden.

-In the short term, the webhook system might be used asynchronously,
-but in the long term, we're considering switching back to the
-deployment system documented above.
+In the short term, the webhook system has been used asynchronously,
+but we have since moved to the deployment system documented above.
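
As an illustration of the per-site restriction described in the added "Reference" paragraphs (Puppet writes the SSH configuration and hard-codes the target directory so a project can only reach one site), here is a minimal Python sketch of what such a generated `authorized_keys` entry could look like. It is not the actual Puppet code: the function name, the example key, the `/srv/static/...` path and the `rrsync` location are all assumptions, and it covers only the `rrsync` half of the restriction, not `static-update-component`.

```python
import shlex


def authorized_keys_line(public_key: str, site_dir: str,
                         rrsync: str = "/usr/bin/rrsync") -> str:
    """Build one authorized_keys entry pinned to a single directory.

    Hypothetical helper: the real configuration is written by Puppet.
    """
    # A double quote or newline would break out of the command="..." option.
    if '"' in site_dir or "\n" in site_dir:
        raise ValueError("unsafe character in target directory")
    # rrsync confines rsync to the given directory; -wo makes it write-only.
    forced = f"{rrsync} -wo {shlex.quote(site_dir)}"
    # "restrict" disables forwarding, PTY allocation and similar features;
    # command= forces the restricted command whatever the client requests.
    return f'restrict,command="{forced}" {public_key.strip()}'


if __name__ == "__main__":
    # Placeholder key and path, for illustration only.
    print(authorized_keys_line(
        "ssh-ed25519 AAAAexample deploy@example",
        "/srv/static/status.example.org",
    ))
```

Because the directory is fixed on the server side, a key leaked from one GitLab project can write only to that project's site, which is the property the diff text describes.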