A "status" dashboard is a simple website that allows service admins to clearly and simply announce down times and recovery.
Note that this could be considered part of the documentation system, but it is documented separately.
Local development environment
To install the development environment for the status site, you should have a copy of the Hugo static site generator and the git repository:
    sudo apt install hugo
    git clone --recursive -b main firstname.lastname@example.org:project/web/status-site
    cd status-site
WARNING: the URL of the Git repository changed! It used to be hosted at GitLab, but is now hosted at Gitolite. The repository is mirrored to GitLab, but pushing there will not trigger build jobs.
Then you can start a local development server to preview the site with:
    hugo serve --baseUrl=http://localhost/
    firefox http://localhost:1313/
The content can also be built in the
public/ directory with, simply:
Creating new issues
Issues are stored in content/issues/. You can create a new issue with hugo new, for example:
hugo new issues/2021-02-03-testing-cstate-again.md
This creates the file from a pre-filled template (called an archetype in Hugo) and puts it in
If you do not have hugo installed locally, you can also copy the
template directly (from
copy an existing issue and use it as a template.
Otherwise the upstream guide on how to create issues is fairly thorough and should be followed.
In general, keep in mind that the date field is when the issue started, not when you posted the issue. See this feature request asking for an explicit "update" field.
Also note that you can add
draft: true to the front-matter (the
block on top) to keep the post from being published on the front page
before it is ready.
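If hugo is not installed, a draft issue can also be written by hand. The following is a minimal sketch, not the project's actual archetype: the helper names issue_path and write_draft_issue are hypothetical, and the front-matter is reduced to title, date and draft. Real issues most likely need more fields, so compare with an existing issue before publishing.

```shell
# Build the file name used in the example above: content/issues/DATE-slug.md
# (hypothetical helper; the slug rules are an assumption)
issue_path() {
    day=$1
    shift
    slug=$(printf '%s' "$*" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-*//; s/-*$//')
    printf 'content/issues/%s-%s.md\n' "$day" "$slug"
}

# Write a minimal draft issue and print its path.
# $1 is when the issue STARTED ("YYYY-MM-DD HH:MM:SS"), the rest is the title.
write_draft_issue() {
    started=$1
    shift
    path=$(issue_path "${started%% *}" "$@")
    mkdir -p "$(dirname "$path")"
    cat > "$path" <<EOF
---
title: "$*"
date: $started
draft: true
---
EOF
    printf '%s\n' "$path"
}
```

For example, `write_draft_issue "2021-02-03 14:00:00" Testing cState again` creates content/issues/2021-02-03-testing-cstate-again.md. Remove the draft line once the post is ready.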
Uploading site to the static mirror system
Uploading the site is automated by continuous integration, so you simply need to commit and push:
    git commit -a -myolo
    git push
Note that only the
webwml group has access to the repository for
You will see progress of the Jenkins jobs:
If all goes well, the changes should propagate to the mirrors within about 5 to 10 minutes, depending on how busy Jenkins is.
If the jobs did not trigger, make sure you are pushing to the Gitolite
server (git-rw.torproject.org) and NOT the GitLab server, which is
just a mirror and cannot currently trigger Jenkins jobs.
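One way to check where a push will go before committing is to look at the push URL of the remote. This is a small sketch under the assumption that the remote is called origin; the function names are made up for illustration, and only the git-rw.torproject.org host name comes from the text above.

```shell
# Succeeds if the given URL points at the Gitolite server.
is_gitolite_url() {
    case "$1" in
        *git-rw.torproject.org*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report where "git push" would send commits for a remote (default: origin).
check_push_remote() {
    remote=${1:-origin}
    # "git remote get-url --push" prints the URL used for pushing
    url=$(git remote get-url --push "$remote") || return 2
    if is_gitolite_url "$url"; then
        echo "ok: $remote pushes to $url"
    else
        echo "warning: $remote pushes to $url, which will not trigger build jobs" >&2
        return 1
    fi
}
```

Run `check_push_remote` from inside the status-site checkout; a warning means the clone probably came from the GitLab mirror.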
Merge requests may also be issued from the mirror of the repository on GitLab:
... but will need to be merged into the
git-rw server by someone in
the above group to take effect. More people have access to the GitLab
repository and should therefore be able to collaborate there.
See also the disaster recovery options below.
Keep in mind that this is a public website. You might want to talk
to the comms@ people before publishing big or sensitive
cState relies on "systems" which live inside a "category". For example,
the "v3 onion services" are in the "Tor network" category. Those are
defined in the
config.yml file, and each issue (in
refers to one or more "systems" that are affected by it.
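Because an issue refers to systems by name, a typo in an issue can silently detach it from the system defined in config.yml. The sketch below prints the systems an issue file refers to, so they can be compared by eye with the configuration. It assumes that the list lives under an affected: key in the front-matter, as in upstream cState examples; check an existing issue to confirm. The function name is hypothetical.

```shell
# Print the systems listed under "affected:" in an issue's front-matter.
# Assumes YAML front-matter delimited by "---" lines and one "- name" per line.
affected_systems() {
    awk '
        /^---[ \t]*$/ { fences++; next }      # count front-matter delimiters
        fences != 1   { next }                # only look between the first two
        /^affected:/  { in_list = 1; next }   # start of the list
        in_list && /^[ \t]*-[ \t]+/ {
            sub(/^[ \t]*-[ \t]+/, "")         # strip the list marker
            print
            next
        }
        { in_list = 0 }                       # any other key ends the list
    ' "$1"
}
```

For example, `affected_systems content/issues/2021-02-03-testing-cstate-again.md` prints one system name per line.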
The logo lives in
static/logo.png. Some colors are defined in
config.yml, search for
Colors throughout cState.
The only Nagios warning that can come out of this service is if the static synchronisation fails. See the static site system for more information on diagnosing those.
It should be possible to deploy the static website anywhere that supports plain HTML, assuming you have a copy of the git repository.
The instructions below assume you have a copy of the git repository (or of the example one). Make sure you follow the installation instructions to also clone the submodules! If the git repository is not available, you could start from scratch using the example repository.
These procedures have not been tested.
Manual deployment to the static mirror system
If git-rw is down, you can upload the
public/ folder content under
The canonical source for the static websites rotation is defined in
modules/roles/misc/static-components.yaml) and is
currently set to
should be enough:
rsync -rtP public/ email@example.com:/srv/status.torproject.org/htdocs
NOTE: there is a copy of the git repository in
well. Ignore it: it's out of date but could be used to build the
website in a pinch.
Then the new source material needs to be synchronized to the mirrors, with:
sudo -u torwww static-update-component status.torproject.org
This requires membership to the
Don't forget to push the changes to the git repository, once that is available. It's important so that the next people can start from your changes:
    git commit -a -myolo
    git push
- Build command:
- Publish directory:
- Add one build environment variable
Then, of course, DNS needs to be updated to point there.
GitLab pages deployment
A site could also be deployed on another GitLab server with "GitLab pages" enabled. For example, if the repository is pushed to https://gitlab.com/, the GitLab CI/CD system there will automatically pick it up and publish it.
Then DNS needs to be tweaked to point there as well.
This service should be highly available. It should survive the failure of one or all points of presence: if all fail, it should be easy to deploy it to a third-party provider.
The status site is part of the static mirror system and is built with Jenkins jobs, from a git repository on the git server. It was set up this way because that is how every other static website is currently built.
- a new static component owned by
- a new build script in the jenkins/tools.git repository
- a new build job in the jenkins/jobs.git repository
- a new entry in the ssh wrapper in the admin/static-builds.git repository
- a new gitolite repository with hooks to ping the Jenkins server and mirror to GitLab
We also considered using GitLab CI for deployment but (a) GitLab pages is not yet set up and (b) it doesn't integrate well with the static mirror system for now. See the broader discussion of the static site system improvements.
Upstream issues can be found and filed in the GitHub issue tracker.
Monitoring and testing
The site, like other static mirrors, is monitored by Nagios with the
dsa_check_staticsync check, which ensures all mirrors are up to date.
Logs and metrics
There are no logs or metrics specific to this service; see the static site service for details.
This project comes from two places:
- during the 2020 TPA user survey, some respondents suggested documenting "down times of 1h or longer" and communicating better about service statuses
- separately, following a major outage in the Tor network due to a DDoS, the network team and network health teams asked for a dashboard to inform Tor users about such problems in the future
This is therefore a project spanning multiple teams, with different stakeholders. The general idea is to have a site (say status.torproject.org) that simply shows users how things are going, in an easy to understand form.
In general, the goal is a simple interface that provides users with status updates.
- user-friendly: the public website must be easy to understand by the wider Tor community of users (not just TPI/TPA)
- status updates and progress: "post status problem we know about so the world can learn if problems are known to the Tor team."
- example: "[recent] v3 outage where we could have put out a small FAQ right away (go static HTML!) and then update the world as we figure out the problem but also expected return to normal."
- multi-stakeholder: "easily editable by many of us namely likely the network health team and we could also have the network team to help out"
- simple to deploy and use: pushing an update shouldn't require complex software or procedures. Editing a text file, committing and pushing, or building with a single command and pushing the HTML, for example, is simple enough. Installing a MySQL database and PHP server, for example, is not simple enough.
- keep it simple
- free-software based
Nice to have
- deployment through GitLab (pages?), with contingency plans
- separate TLD to thwart DNS-based attacks against torproject.org
- same tool for multiple teams
- per-team filtering
- RSS feeds
- integration with social media?
- responsive design
- automation: updating the site is a manual process. There are no automatic reports from sensors, metrics or Nagios, as this tends to complicate the implementation and cause false positives
TPA, network team, network health team.
We're experimenting with cState because it's the only static website generator we could find with such a nice template out of the box.
Just research and development time. Hosting costs are negligible.
These are the status dashboards we know about that are still somewhat in active development:
- cstate, Hugo-based static site generator, tag-based RSS feeds, easy setup on Netlify, GitLab CI integration, badges, read only API
- MySQL database
- email notifications
- not distributed
- no Nagios integration
- no Twitter notifications
- user-friendly - seems to be even nicer than Cachet, as there are links to individual announcements and notifications
- no LDAP support
- similar performance problems as Cachet
These were evaluated in a previous life but ended up being abandoned upstream:
- Overseer - used at Disqus.com, Python/Django, user-friendly/simple, administrator non-friendly, twitter integration, Apache2 license, development stopped, Disqus replaced it with Statuspage.io
- Stashboard - used at Twilio, MIT license, demo, Twitter integration, REST API, abandon-ware, no authentication, no Unicode support, depends on Google App engine, requires daily updates
- Baobab - previously used at Gandi, replaced with statuspage.io, Django based
Those were discarded because they do not provide an "out of the box" experience:
- use Jenkins to run jobs that check a bunch of things and report a user-friendly status?
- just use a social network account (e.g. Twitter)
- "just use the wiki"
- use Drupal ("there's a module for that")
- roll our own with Lektor, e.g. using this template
- Amazon Service Health Dashboard
- Disqus - based on statuspage.io
- GitLab - based on status.io
- GitHub - "Battle station fully operational", auto-refresh, twitter-connected, simple color coded (see this blog post for more details), not open-source (confirmed in personal email between GitHub support and anarcat on 2013-05-02)
- Potager.org - ikiwiki based
- Riseup.net - RSS feeds
- Signal - simple, plain HTML page
- sr.ht - cState
- Twilio - email, slack, RSS subscriptions, lots of services shown
- Wikimedia - based on proprietary nimsoft software, deprecated in favor of Grafana