
This page describes how we manage documentation inside TPA. It also touches on other wikis and possibly other documentation systems inside TPO.

Note that there is a different service called status for the status page at https://status.torproject.org.

The palest ink is better than the most capricious memory.

-- ancient Chinese proverb

Tutorial

Editing the wiki

If you have the right privileges (currently: being part of TPA, but we hope to improve this), you should have an Edit button at the top-right of pages in the wiki here:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/

If not, you need to issue a merge request in the wiki replica.

Because the wiki is actually mirrored to a git repository, it might be preferable to edit the Git repository and push there, so that the replica is up to date. See further instructions below.

How-to

Editing the wiki through Git

It is preferable to edit the wiki through the wiki replica. This ensures both the replica and the wiki are in sync, as the replica is configured to mirror its changes to the wiki. (See the GitLab documentation for how this was set up.)

To make changes there, just clone and push to this git repository:

git clone git@gitlab.torproject.org:tpo/tpa/wiki-replica.git

Make changes, and push. Note that a GitLab CI pipeline will check your changes and might warn you if you work on a file with syntax problems. Feel free to ignore warnings that were already present, but be careful not to add new ones.

Ideally, you should also set up linting locally; see below.

Local linting configuration

While the wiki replica has continuous integration checks, it might be good to run those locally, to make sure you don't add any new warnings when making changes.

We currently only lint Markdown syntax, with markdownlint. You can install it using the upstream instructions, but anarcat prefers to run it under Docker, with a wrapper script like:

#!/bin/sh

exec docker run --volume "$PWD:/data/" --rm -i markdownlint/markdownlint "$@"

Drop this somewhere in your path as mdl and it will behave just as if it was installed locally.

Then you should drop this in .git/hooks/pre-commit (if you want to enforce checks):

#!/bin/bash

${GIT_DIR:-.git}/../bin/mdl-wrapper $(git diff --cached --name-only HEAD)

... or .git/hooks/post-commit (if you just want warnings):

#!/bin/sh

${GIT_DIR:-.git}/../bin/mdl-wrapper $(git diff-tree --no-commit-id --name-only -r HEAD)
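The hooks above pass every changed file to the linter, including deleted files and files that are not Markdown. A possible refinement, sketched here, restricts the list to staged Markdown files that still exist. The function names are made up for this example; `--diff-filter=d` and the `'*.md'` pathspec are standard git features, and `bin/mdl-wrapper` is the same wrapper the hooks above call.

```shell
#!/bin/sh
# Sketch of a stricter pre-commit check. Function names are
# hypothetical; bin/mdl-wrapper is the wrapper used by the hooks above.

# List staged Markdown files, skipping deleted ones (--diff-filter=d).
staged_markdown_files() {
    git diff --cached --name-only --diff-filter=d -- '*.md'
}

# Lint only when at least one Markdown file is staged.
lint_staged() {
    files=$(staged_markdown_files)
    [ -z "$files" ] && return 0
    # Word splitting is intended: one argument per file name.
    # Like the original hooks, this breaks on file names with spaces.
    "${GIT_DIR:-.git}/../bin/mdl-wrapper" $files
}
```

To use it as a hook, call `lint_staged` on the last line of `.git/hooks/pre-commit`.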

Accepting merge requests on wikis

It's possible to work around the limitation of Wiki permissions by creating a mirror of the git wiki backing the wikis. This way more users can suggest changes to the wiki by submitting merge requests. It's not as easy as editing the wiki, but at least provides a way for outside contributors to participate.

To do this, you'll need to create project access tokens in the Wiki and use the repository mirror feature to replicate the wiki into a separate project.

  1. In the Wiki project, head for the Settings: Access Tokens page and create a new token with write_repository access

  2. Optionally, create a new project for the wiki, for example called wiki-replica. You can also use the same project as the wiki if you do not plan to host other source code specific to that project there. We'll call this the "wiki replica" in either case

  3. In the wiki replica, head for the Settings: Mirroring repositories section and fill in the details for the wiki HTTPS clone URL:

    • Git repository URL: the HTTPS URL of the Git repository (which you can find in the Clone repository page on the top-right of the wiki) Important: Make sure you add a username to the HTTPS URL, otherwise mirroring will fail. For example, this wiki URL:

       https://gitlab.torproject.org/tpo/tpa/team.wiki.git

      should actually be:

       https://wiki-replica@gitlab.torproject.org/tpo/tpa/team.wiki.git
    • Mirror direction: push (only "free" option, pull is non-free)

    • Authentication method: Password (default)

    • Password: the Access token you created in the first step

    • Keep divergent refs: unchecked (optional, should make sure sync works in some edge cases)

    • Mirror only protected branches: checked (to keep merge requests from being needlessly mirrored to the wiki)

When you click the Mirror repository button, a sync will be triggered. Refresh the page to see the status: the Last successful update column should be updated. From then on, when you push to the replica, the wiki should be updated.

Naturally, because of limitations of GitLab, you cannot pull changes from the wiki to the replica. But considering only a limited set of users have access to the wiki in the first place, this shouldn't be a problem as long as everyone pushes to the replica.
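One way to confirm that a push to the replica reached the wiki is to compare the commit each repository reports for its default branch. This is a minimal sketch, not part of the documented setup: the function names are made up, and it assumes both repositories can be read with `git ls-remote`.

```shell
#!/bin/sh
# Sketch: compare the default branch of the replica and of the wiki.
# Function names are made up for this example.

# Print the commit ID that HEAD points to in a remote repository.
remote_head() {
    git ls-remote "$1" HEAD | head -n 1 | cut -f1
}

# Succeed only if both remotes answer and report the same commit.
mirror_in_sync() {
    replica_head=$(remote_head "$1")
    wiki_head=$(remote_head "$2")
    [ -n "$replica_head" ] && [ "$replica_head" = "$wiki_head" ]
}
```

For example, `mirror_in_sync git@gitlab.torproject.org:tpo/tpa/wiki-replica.git git@gitlab.torproject.org:tpo/tpa/team.wiki.git && echo "in sync"` prints "in sync" only when both repositories point at the same commit.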

Pager playbook

Wiki unavailable

If the GitLab server is down, the wiki will be unavailable. For that reason, it is highly preferable to keep a copy of the git repository backing the wiki on your local computer.
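Keeping that local copy fresh can be scripted. The following is a sketch only: the function name and the `~/wiki` destination are assumptions, and it could be run by hand or from a scheduled job.

```shell
#!/bin/sh
# Sketch: create or update a local copy of the wiki repository.
# The function name and destination directory are hypothetical.

refresh_wiki_copy() {
    url="$1"  # clone URL of the wiki repository
    dir="$2"  # where to keep the local copy
    if [ -d "$dir/.git" ]; then
        # Existing copy: fast-forward only, never rewrite local work.
        git -C "$dir" pull --quiet --ff-only
    else
        git clone --quiet "$url" "$dir"
    fi
}
```

For this wiki, that would be `refresh_wiki_copy https://gitlab.torproject.org/tpo/tpa/team.wiki.git ~/wiki`.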

If for some reason you do not have such a copy, it is extremely unlikely you will be able to read this page in the first place. But, if for some reason you are able to, you should find the GitLab documentation to restore that service and then immediately clone a copy of this repository:

git@gitlab.torproject.org:tpo/tpa/team.wiki.git

or:

https://gitlab.torproject.org/tpo/tpa/team.wiki.git

If you can't find the GitLab documentation in the wiki, you can try to read the latest copy in the wayback machine.

Disaster recovery

If GitLab disappears in a flaming ball of fire, it should be possible to build a static copy of this website somehow. Originally, GitLab's wiki was based on Gollum, a simple Git-based wiki. In practice, GitLab's design has diverged wildly and is now a separate implementation.

The GitLab instructions still say you can run gollum to start a server rendering the source git repository to HTML. Unfortunately, that is done dynamically and cannot be done as a one-time job, or as a post-update git hook, so you would have to set up gollum as a service in the short term.

In the long term, it might be possible to migrate back to ikiwiki or another static site generator.

Reference

Installation

"Installation" was trivial insofar as we consider the GitLab step to be abstracted away: just create a wiki inside the team and start editing/pushing content.

In practice, the wiki was migrated from ikiwiki (see issue 34437) using anarcat's ikiwiki2hugo converter, which happened to be somewhat compatible with GitLab's wiki syntax.

The ikiwiki repository was archived inside GitLab in the wiki-archive and wiki-infra-archive repositories. History of those repositories is, naturally, also available in the history of the current wiki.

SLA

This service should be as available as GitLab or better, assuming TPA members keep a copy of the documentation cloned on their computers.

Design

Documentation for TPA is hosted inside a git repository, which is hosted inside a GitLab wiki. It is replicated inside a git repository at GitLab to allow external users to contribute by issuing merge requests.

GitLab wikis support Markdown, RDoc, AsciiDoc, and Org formats.

Scope

This documentation mainly concerns the TPA wiki, but there are other wikis on GitLab which are not directly covered by this documentation and may have a different policy.

Structure

The wiki has a minimalist structure: we try to avoid deeply nested pages. Any page inside the wiki should be reachable within 2 or 3 clicks from the main page. Flat is better than tree.

All services running at torproject.org MUST have a documentation page in the service directory which SHOULD at least include a "disaster recovery" and "pager playbook" section. It is strongly encouraged to follow the documentation template for new services.
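This rule lends itself to a simple automated check. The sketch below reports pages that lack either recommended section. It assumes pages are Markdown files using `#`-style headings and that service pages live under a `service/` directory; the function name is made up for this example.

```shell
#!/bin/sh
# Sketch: report service pages lacking the recommended sections.
# Assumes Markdown pages with "#"-style headings; the function name
# is hypothetical.

missing_sections() {
    page="$1"
    for section in "Disaster recovery" "Pager playbook"; do
        # Match a heading line of any level, ignoring case.
        if ! grep -qiE "^#+[[:space:]]+${section}[[:space:]]*\$" "$page"; then
            printf '%s: missing "%s" section\n' "$page" "$section"
        fi
    done
}
```

It could be run over all service pages with `for page in service/*.md; do missing_sections "$page"; done`.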

This documentation is based on the Grand Unified Theory of Documentation, by Daniele Procida. To quote that excellent guide (which should, obviously, be self-documenting):

There is a secret that needs to be understood in order to write good software documentation: there isn’t one thing called documentation, there are four.

They are: tutorials, how-to guides, technical reference and explanation. They represent four different purposes or functions, and require four different approaches to their creation. Understanding the implications of this will help improve most documentation - often immensely.

We express this structure in a rather odd way: each service page has that structure embedded. This is partly due to limitations in the tools we use to manage the documentation -- GitLab wikis do not offer much in terms of structure -- but also because we have a large variety of services being documented. To give a concrete example, it would not make much sense to have a top-level "Tutorials" section with tutorials for GitLab, caching, emails, followed by "How to guides" with guides for... exactly the same list! So instead we flip that structure around and the top-level structure is by service: within those pages we follow the suggested structure.

Style

Writing style in the documentation is currently loose and not formally documented. But we should probably settle on some English-based, official, third-party style guide to provide guidance and resources. The Vue documentation has a great writing & grammar section which could form a basis here, as well as Jacob Kaplan-Moss's Technical Style article.

Authentication

The entire wiki is public and no private or sensitive information should be committed to it.

People

Most of the documentation has been written by anarcat, who may be considered the editor of the wiki, but all other contributors are strongly encouraged to contribute to the knowledge accumulating in the wiki.

Linting

There is a basic linting check deployed in GitLab CI on the wiki replica, which will run on merge requests and normal pushes. Naturally, it will not run when someone edits the wiki directly, as the replica does not pull automatically from the wiki (because of limitations in the free GitLab mirror implementation).

Those checks are set up in the .gitlab-ci.yml file. There is a basic test job that will run whenever a Markdown (.md) file gets modified. There is a rather convoluted pipeline to ensure that it runs only on those files, which requires a separate Docker image and job to generate that file list, because the markdownlint/markdownlint Docker image doesn't ship with git (see this discussion for details).

There's a separate job (testall) which runs every time and checks all markdown files.

Because GitLab has this... unusual syntax for triggering the automatic table of contents display ([[_TOC_]]), we need to jump through some hoops to silence those warnings. This implies that the testall job will always fail, as long as we use that specific macro.

Those linting checks could eventually be expanded to do more things, like spell-checking and checking links outside of the current document. See the alternatives considered section for a broader discussion on the next steps here.

Issues

There is no issue tracker specifically for this project. File or search for issues in the team issue tracker.

Notable issues:

See also the limitations section below.

Monitoring and testing

There is no monitoring of this service, outside of the main GitLab monitoring systems.

There are no continuous tests of the documentation.

See the "alternatives considered" section for ideas on tests that could be run.

Logs and metrics

No logs or metrics specific to the wiki are kept, other than what GitLab already does.

Backups

Backed up alongside GitLab, and hopefully in git clones on all TPA members' machines.

Other documentation

Discussion

Documentation is a critical part of any project. Without documentation, things lose their meaning, training is impossible, and memories are lost. Updating documentation is also hard: things change after documentation is written and keeping documentation in sync with reality is a constant challenge.

This section talks about the known problems with the current documentation (systems) and possible solutions.

Limitations

Redundancy

The current TPA documentation system is a GitLab wiki, but used to be a fairly old ikiwiki site, part of the static site system.

As part of the ikiwiki migration, that level of redundancy was lost: if GitLab goes down, the wiki goes down, along with the documentation. This is mitigated by the fact that the wiki is backed by a Git repository. So TPA members are strongly encouraged to keep a copy of the Git repository locally to not only edit the content (which makes sure the copy is up to date) but also consult it in case of an infrastructure failure.

Unity

We have lots of documentation spaces. There's this wiki for TPA, but there are also different wikis for different teams. There's a proposal to create a community hub which could help. But that idea assumes people will know about the hub, which adds an extra layer of indirection.

It would be better if we could have group wikis, which were published as part of the 13.5 release but, unfortunately, only in the commercial version. So we're stuck with our current approach of having the "team" projects inside each group to hold the wiki.

It should also be noted that we have documentation scattered outside the wiki as well: some teams have documentation in text files, others are entire static websites. The above community hub could benefit from linking to those other resources as well.

Testing

There is no continuous testing/integration of the documentation. Typos frequently show up in documentation, and probably tons of broken links as well. Style is incoherent at best, possibly unreadable at worst. This is a tough challenge in any documentation system, due to the complexity and ambiguity of language, but it shouldn't deter us from running basic tests on the documentation.

This would require hooking up the wiki in GitLab CI, which is not currently possible within GitLab wikis. We'd need to switch the wiki to a full Git repository, possibly pushing to the wiki using a deploy key on successful runs. But then why would we keep the wiki?

Structure

Wikis are notorious for being hard to structure. They can quickly become a tangled mess with oral tradition the only memory to find your way inside of the forest. The GitLab wikis are especially vulnerable to this as they do not offer many tools to structure content: no includes, limited macros and so on.

That said, there is a mechanism to add a sidebar in certain sections, which can help quite a bit in giving a rough structure. But restructuring the wiki is hard: renaming a page breaks all links pointing to it, and there is no way to do redirects, which is a major regression from ikiwiki.

Using a static site generator (SSG) could help here: many of them support redirections (and so does GitLab Pages, although in a very limited way). Many SSGs also support more "structure" features like indexes, hierarchical (and automatic) sidebars (based on structure, e.g. Sphinx or mkdocs), paging, per-section RSS feeds (for "blog" or "news" type functionality) and so on.

The "Tutorial/Howto/Reference/Discussion" structure is not as intuitive as one author might like to think. We might be better off reframing this in the context of a service, for example merging the "Discussion" and "Reference" sections, and moving the "Goals/alternatives considered" section into an (optional?) "Migration" section, since that is really what the discussion section is currently used for (planning major service changes and improvements).

The "Howto" section could be more meaningfully renamed "Guides", but this might break a lot of URLs.

Syntax

Markdown is great for jotting down notes, filing issues and so on, but it has been heavily criticised for use in formal documentation. One of the problems with Markdown is its lack of standardized syntax: there is CommonMark but it has yet to see wider adoption.

This makes Markdown not portable across different platforms supposedly supporting markdown.

It also lacks special mechanisms for more elaborate markups like admonitions (or generally: "semantic meanings") or "quick links" (say: bug#1234 pointing directly to the bug tracker).

It has to be said, however, that Markdown is widely used, much more than the alternatives (e.g. asciidoc or rst), for better or for worse. So it might be better to stick with it than to force users to learn a new markup language, however good it is supposed to be.

Editing

Since few people are currently contributing to the documentation, few people review changes done to it. As Jacob Kaplan-Moss quipped:

All good writers have a dirty little secret: they’re not really that good at writing. Their editors just make it seem that way.

In other words, we'd need a technical writer to review our docs, or at least set up a self-editing process the way Kaplan-Moss suggests above.

Templating

The current "service template" has one major flaw: when it is updated, the editor needs to manually go through all services and update those. It's hard to keep track of which service has the right headings (and is up to date with the template).

One thing that would be nice would be to have a way to keep the service pages in sync with the template. I asked for suggestions in the Hugo forum, where a simple suggestion was to version the template and add that to the instances, so that we can quickly see when a dependency needs to be updated.
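The versioning suggestion could look something like the sketch below. Everything about it is an assumption made for illustration: the `<!-- template-version: N -->` marker, the function names, and the idea that each service page copies the marker from the template when it is brought up to date.

```shell
#!/bin/sh
# Sketch: find pages whose recorded template version differs from the
# template's. The marker format and function names are hypothetical.

# Print the first version number recorded in a file, if any.
template_version() {
    sed -n 's/^<!-- template-version: \([0-9][0-9]*\) -->$/\1/p' "$1" | head -n 1
}

# Usage: outdated_pages TEMPLATE PAGE...
outdated_pages() {
    template="$1"
    shift
    current=$(template_version "$template")
    for page in "$@"; do
        found=$(template_version "$page")
        if [ "$found" != "$current" ]; then
            printf '%s: template version %s, expected %s\n' \
                "$page" "${found:-none}" "$current"
        fi
    done
}
```

This only shows which pages are behind; it does not say which headings changed between template versions.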

To do a more complete comparison between templates and instances, I suspect I will have to roll my own, maybe something like mdsaw but using a real parse tree.

Note that there's also emd which is a "Markdown template processor", which could prove useful here (untested).

See also scaraplate and cookiecutter.

Goals

Note: considering we just migrated from ikiwiki to GitLab wikis, it is unlikely we will make any major change on the documentation system in the short term, unless one of the above issues becomes so critical it needs to immediately be fixed.

That said, improvements or replacements to the current system should include...

Must have

  • highly available: it should be possible to have read-only access to the documentation even in case of a total catastrophe (global EMP catastrophe excluded)

  • testing: the documentation should be "testable" for typos, broken links and other quality issues

  • structure: it should be possible to structure the documentation in a way that makes things easy to find and new users easily orient themselves

  • discoverability: our documentation should be easy to find and navigate for new users

  • minimal friction: it should be easy to contribute to the documentation (e.g. the "Edit" button on a wiki is easier than "make a merge request", as a workflow)

Nice to have

  • offline write: it should be possible to write documentation offline and push the changes when back online. a git repository is a good example of such functionality

  • nice-looking, easily themable

  • coherence: documentation systems should be easy to cross-reference between each other

  • familiarity: users shouldn't have to learn a new markup language or tool to work on documentation

Non-Goals

  • repeat after me: we should not write our own documentation system

Approvals required

TPA, although it might be worthwhile to synchronize this technology with other teams so we have coherence across the organisation.

Proposed Solution

We currently use GitLab wikis.

Cost

Staff hours, hosting costs shadowed by GitLab.

Alternatives considered

Static site generators

  • 11ty: picked by mozilla, javascript
  • hugo: golang
  • ikiwiki: previously used, old, Perl, hard to setup, occult templating system, slow
  • lektor: used at Tor for other public websites
  • Mkdocs: also supported by Read the docs, similar to Sphinx, but uses Markdown instead of RST
  • Nanoc: used by GitLab
  • pelican: watch out for pelican; another user reports that generating a 500-page site takes 30 seconds with caching and 2 minutes without
  • Sphinx: used by Read the docs, enforces more structure and more formal (if less familiar) markup (ReStructured Text, RST), see also rstfmt a linter/formatter for RST
  • zola: rust

mkdocs

I did a quick test of mkdocs to see if it could render the TPA wiki without too many changes. The result is not so bad! I am not a fan of the mkdocs theme, but it does work, and has prev/next links like a real book which is a nice touch (although maybe not useful for us, outside of meetings maybe). Navigation is still manual (defined in the configuration file instead of a sidebar).

Syntax is not entirely compatible, unfortunately. The GitLab wiki has this unfortunate habit of expecting "semi-absolute" links everywhere, which means that to link to (say) this page, we do:

[documentation service](service/documentation)

... from anywhere in the wiki. It seems like mkdocs expects relative links, so this would be the same from the homepage, but from the service list it should be:

[documentation service](documentation)

... and from a sibling page:

[documentation service](../documentation)
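The three cases above follow one rule: the link is the target path expressed relative to the linking page. A conversion could therefore be scripted along these lines. This is a sketch assuming GNU coreutils `realpath` (the `-m` and `--relative-to` options are GNU extensions); the function name is made up, and `service/gitlab` stands in for any sibling page.

```shell
#!/bin/sh
# Sketch: rewrite a GitLab "semi-absolute" wiki link as a link relative
# to the page that contains it. Requires GNU coreutils realpath.
# The function name is hypothetical.

relative_link() {
    page="$1"    # path of the linking page, "." for the homepage
    target="$2"  # link target as written in the GitLab wiki
    # -m: do not require the paths to exist on disk
    realpath -m --relative-to="$page" "$target"
}

relative_link . service/documentation               # prints service/documentation
relative_link service service/documentation         # prints documentation
relative_link service/gitlab service/documentation  # prints ../documentation
```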

Interestingly, mkdocs warns us about broken links directly, which is a nice touch. It found this:

WARNING -  Documentation file 'howto.md' contains a link to 'old/new-machine.orig' which is not found in the documentation files. 
WARNING -  Documentation file 'old.md' contains a link to 'old/new-machine.orig' which is not found in the documentation files. 
WARNING -  Documentation file 'howto/new-machine.md' contains a link to 'howto/install.drawio' which is not found in the documentation files. 
WARNING -  Documentation file 'howto/rt.md' contains a link to 'howto/org/operations/Infrastructure/rt.torproject.org' which is not found in the documentation files. 
WARNING -  Documentation file 'policy/tpa-rfc-1-policy.md' contains a link to 'policy/workflow.png' which is not found in the documentation files. 
WARNING -  Documentation file 'policy/tpa-rfc-9-proposed-process.md' contains a link to 'policy/workflow.png' which is not found in the documentation files. 
WARNING -  Documentation file 'service/forum.md' contains a link to 'service/team@discourse.org' which is not found in the documentation files. 
WARNING -  Documentation file 'service/lists.md' contains a link to 'service/org/operations/Infrastructure/lists.torproject.org' which is not found in the documentation files. 

A full rebuild of the site takes 2.18 seconds. Incremental rebuilds are not faster, which is somewhat worrisome.

hugo

Tests with hugo were really inconclusive. We had to do hugo new site --force . for it to create the necessary plumbing to have it run at all. And then it failed to parse the front matter of many pages, particularly in the policy section, because they are not quite valid YAML blobs (because of the colons). After fixing that, it ran, but completely failed to find any content whatsoever.

Lektor

Lektor is similarly challenging: all files would need to be re-written to add a body: tag on top and renamed to .lr.

Testing

To use those tests, we'd need to switch the wiki into a Git repository, as it is not (currently) possible to run CI on changes in GitLab wikis.

  • GitLab has a test suite for their documentation which:
    • runs the nodejs markdownlint: checks the Markdown syntax
    • runs vale: grammar, style, and word usage linter for the English language
    • checks the internal anchors and links using Nanoc
  • Danger systems has a bunch of plugins which could be used to check documentation
  • textlint: pluggable text linting approach recognizing markdown
  • proselint: grammar and style checking
  • languagetool: Grammar, Style and Spell Checker
  • anorack: spots errors based on phonemes
  • redpen: huge JAR, can be noisy
  • linkchecker: can check links in HTML (anarcat is one of the maintainers), has many alternatives, see for example lychee

Note that we currently use markdownlint, the Ruby version, not the Node version. This was primarily because anarcat dislikes Node more than Ruby, but it turns out the Ruby version also has more features. Notably, it can warn about Kramdown compilation errors, for example finding broken Markdown links.

See also this LWN article.