@@ -6,40 +7,22 @@ Donate-review is a gitlab runner and review-app deployment target for [donate-ne
## Starting a review app
1. Push some changes to a branch (other than main) on tpo/web/donate-neo
1. Create an MR
1. Wait for the `lint`, `mypy`, and `test` jobs to finish
1. Run the `deploy-review` job, and wait for it to finish
1. Click the "View app" button
1. When done reviewing, click "Stop environment" or run the `stop-review` job
1. Merge changes
Pushing a commit on a non-main branch in the project repository will trigger a
CI pipeline that includes `deploy-review` job. This job will deploy a review
app hosted at `<branchname>.donate-review.torproject.net`.
# How-to
<!-- more in-depth procedure that may require interpretation -->
## Pager playbook
### deploy-review job fails with "The user who started this CI job does not have permission to deploy"
The donate-review runner is a shell runner, it doesn't use any kind of sandboxing or containerization. To try to mitigate potential security issues with created by letting any contributor run a shell script on the donate-review machine, a very simple "access control" was implemented in the gitlab CI script.
echo The user who started this CI job does not have permission to deploy
echo"Required level: 50. Actual level: $access_level"
exit 1
fi
```
Commits to the `main` branch will be deployed to a review app by the
`deploy-staging` job. The deployment process is similar except the app will be
hosted at `staging.donate-review.torproject.net`.
This script checks if the user is an owner of the repository (access level 50). Anyone who is *not an owner* won't be allowed to run the deploy job. If a user's jobs are failing with the above error message, make sure they're an owner, not a maintainer, developer, reporter, or guest.
All review apps are automatically stopped and cleaned up once the associated
branch is deleted.
### Static assets don't load on deployed review apps
# How-to
An issue we've seen several times is that static assets sometimes don't load. The issue was tracked back to donate-neo handling its own static assets. To fix this, the gitlab CI script was changed to run `manage.py collectstatic` and configure donate-review apache vhosts to serve the static files directly. This issue shouldn't rear its head again, but in the event that it does, start by ensuring that django's collectstatic is being run, that the static files are deployed to the correct location, and that the review app vhost is configured to serve the static files from the correct directory, with `Require all granted`.
## Pager playbook
## Disaster recovery
# Disaster recovery
In the event that the box is compromised, it should be rebuilt from scratch.
See [Installation](#installation) below.
@@ -49,68 +32,64 @@ See [Installation](#installation) below.
## Installation
1. bootstrap a new vm, and set up puppet
1. install gitlab runner on the machine and configure a [shell executor][]
1. add the `role: donate_review` parameter to the new machine in hiera
1. add the `role: donate_review` parameter to the new machine in hiera-enc
The only relevant logs are from apache, logrotate rotates them every 14 days.
The review app logs may be accessed using the `podman logs <container>` command
as the `gitlab-runner` user.
## Backups
donate-review has no special backup needs. All the donate-neo instances are
This service has no special backup needs. All the donate-neo instances are
ephemeral, and a new system can be bootstrapped solely from puppet.
## Overview
donate-review was created as part of tpo/web/donate-neo#6 and tpo/tpa/team#41108. Donate-review's purpose is to provide a review app deploy target for tpo/web/donate-neo. Most of the other tpo/web sites are static lektor sites, and can be easily deployed to a review app target as simple static sites fronted by Apache. But because donate-neo is a django application, it needs a specially-created deploy target for review apps. donate-review clones the donate-neo codebase, checks out the right commit, sets up the applications virtuel environment, dependencies, database and static files, and then deploys an Apache vhost for the new review app.
donate-review was created as part of tpo/web/donate-neo#6,
tpo/tpa/team#41108 and refactored as part of tpo/web/donate-neo#21.
Donate-review's purpose is to provide a review app deploy target for
tpo/web/donate-neo. Most of the other tpo/web sites are static lektor sites,
and can be easily deployed to a review app target as simple static sites
fronted by Apache. But because donate-neo is a django application, it needs a
specially-created deploy target for review apps.
## Security and risk assessment
@@ -148,39 +137,16 @@ real sandboxing or containerization. There was an attempt to set up the runner
using systemd-nspawn, but it was taking too long and we
[eventually decided against it][].
The current security measures include the basic things we get for free from
debian (file permissions, apparmor), as well as some access control in the
gitlab-ci file. The deploy-review job has to be run manually, and the job's
script starts by [checking if the user who ran the job is a repo maintainer][].
A proper security and risk assessment should be prioritized to find and
mitigate potential security problems.
Currently, project members with `Developper` permission or above in the
donate-neo project may edit the CI configuration to execute arbitrary commands
as the `gitlab-runner` user on the machine. Since these users are all trusted
contributors, this should pose no problem. However, care should be taken to
ensure no untrusted party is allowed to gain this privilege.
[eventually decided against it]:<https://gitlab.torproject.org/tpo/tpa/team/-/issues/41108#note_2913481>
[checking if the user who ran the job is a repo maintainer]:<https://gitlab.torproject.org/tpo/web/donate-neo/-/blob/2344c59a/.gitlab-ci.yml#L46>
## Technical debt and next steps
donate-neo instances (files, databases, and gunicorn instances) are not cleaned
up automatically. Cleanup only happens when someone runs the `stop-review` job.
If this job isn't run, These instances will not be cleaned up. Currently, there
is no solution for this. There's an [open Gitlab ticket][] to implement running
an `after_script` when a job is cancelled, and as of January 2024 is "targeting near term resolution" for the ticket. A potential solution is to write a cron job that just cleans up
### stop-review sometimes doesn't remove old deployments
The stop-review job is supposed to read the `gunicorn.pid` file, kill the gunicorn process, tear down the apache config, and remove the deployed code. Sometimes gunicorn dies early though, and the stop-review job fails early, leaving old configs and code lying around.
### Outdated databases
Donate Review has a known issue with updating existing MRs. The updated code
will get pulled to the review app target, but necessary setup won't be re-run.
This can lead to situations where a new Django app is installed, but the app's
migrations won't be run, leading to an outdated database. This bug was
discovered in tpo/tpa/team#41491.
The next step here is to make the donate-review service fully generic to allow
other web projects with special runtime requirements to deploy review apps in