title: TPA-RFC-38: Setting Up a Wiki Service
costs: TODO
approval: TODO
affected users: TODO
deadline: TODO
status: draft
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40909
Summary: This RFC aims to identify the problems with our current GitLab wikis and the best solution to them.
Background
Currently, our projects that require a wiki use GitLab wikis. GitLab wikis are rendered with a fork of gollum and editing is controlled by GitLab's permission system.
Problem statement
GitLab's permission system only allows maintainers to edit wiki pages, meaning that normal users (anonymous or signed in) don't have the permissions required to actually edit the wiki pages.
One solution adopted by TPA was to create a separate wiki-replica repository so that people without edit permission can at least propose edits for TPA maintainers to accept. The problem with that approach is that it's done through a merge request workflow which adds much more friction to the editing process, so much that the result cannot really be called a wiki anymore.
GitLab wikis are not searchable in the community edition: wiki search requires advanced search, which is not part of the free edition. This makes it extremely hard to find content in the wiki. It could be mitigated by adopting GitLab Ultimate.
The wikis are really disorganized. There are a lot of wikis in GitLab. Out of 1494 publicly accessible projects:
- 383 are without wikis
- 1053 have empty wikis
- 58 have non-empty wikis
They collectively have 3516 pages in total, but nearly half of these (1619 pages, about 46%) are in the legacy/trac wiki. The top 10 wikis by size:
wiki | page count |
---|---|
legacy/trac | 1619 |
tpo/team | 1189 |
tpo/tpa/team | 216 |
tpo/network-health/team | 56 |
tpo/core/team | 39 |
tpo/anti-censorship/team | 35 |
tpo/operations/team | 32 |
tpo/community/team | 30 |
tpo/applications/tor-browser | 29 |
tpo/applications/team | 29 |
Excluding legacy/trac, more than half (63%) of the wiki pages are in the tpo/team wiki. If we count only the first three wikis, that ratio goes up to 77%, and 85% of all pages live in the top 10 wikis, again excluding legacy/trac.
In other words, there's a very long tail of wikis (Backlog) that account for less than 15% of the page count. We should probably look at centralizing this, as it will make all further problems easier to solve.
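The ratios above can be re-derived from the table. The following is a minimal sketch in Python, assuming the page counts in the table and the 3516 total are exact; the names `TOP_WIKIS` and `share_excluding_legacy` are illustrative only. Computed from the rows as listed, the nine non-legacy wikis of the top 10 come out closer to 87% than the 85% quoted, so that last figure is best read as approximate.

```python
# Page counts copied from the top-10 table above.
TOP_WIKIS = {
    "legacy/trac": 1619,
    "tpo/team": 1189,
    "tpo/tpa/team": 216,
    "tpo/network-health/team": 56,
    "tpo/core/team": 39,
    "tpo/anti-censorship/team": 35,
    "tpo/operations/team": 32,
    "tpo/community/team": 30,
    "tpo/applications/tor-browser": 29,
    "tpo/applications/team": 29,
}
TOTAL_PAGES = 3516  # pages across all non-empty wikis, as stated above


def share_excluding_legacy(top_n: int) -> float:
    """Share of non-legacy pages held by the top_n largest non-legacy wikis."""
    active = {name: n for name, n in TOP_WIKIS.items() if name != "legacy/trac"}
    active_total = TOTAL_PAGES - TOP_WIKIS["legacy/trac"]
    largest = sorted(active.values(), reverse=True)[:top_n]
    return sum(largest) / active_total


for n in (1, 3, 9):
    print(f"top {n} non-legacy wikis: {share_excluding_legacy(n):.0%}")
```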
Goals
The goals of this proposal are as follows:
- Identify requirements for a wiki service
- Propose modifications to the wiki service, or a new implementation of it, that fit these requirements
Requirements
Must have
- Users can edit wiki pages without being given extra permissions ahead of time
- Content must be searchable
- Users should be able to read and edit pages over a hidden service
- High availability for some documentation: if GitLab or the wiki website is unavailable, administrators should still be able to access the documentation needed to recover the service
- A clear transition plan from GitLab to this new wiki: markup must continue to work as is (or be automatically converted), and links must not break during the transition
- Folder structure: current GitLab wikis have a page/subpage structure (e.g. TPA's howto/ has all the howtos, service/ has all the service documentation, etc.) which needs to be implemented as well. This includes having "breadcrumbs" to walk back up the hierarchy, or (ideally) automatic listing of sub-pages
- Single dynamic site, if not static (e.g. we have a single MediaWiki or DokuWiki, not one MediaWiki per team), because applications need constant monitoring and maintenance to function properly, so we need to reduce the maintenance burden
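To make the folder-structure requirement concrete, here is a minimal sketch in Python of what "breadcrumbs" and automatic sub-page listing could compute from slash-separated page paths. It is not tied to any candidate wiki engine; the function names and the example page names are hypothetical.

```python
from pathlib import PurePosixPath


def breadcrumbs(page: str) -> list[str]:
    """Return the ancestors of a page path, from the top level down to the page."""
    parts = PurePosixPath(page).parts
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def subpages(page: str, all_pages: list[str]) -> list[str]:
    """List the direct children of a page among all known page paths."""
    prefix = page.rstrip("/") + "/"
    return sorted(
        p for p in all_pages
        if p.startswith(prefix) and "/" not in p[len(prefix):]
    )


# Hypothetical page names, for illustration only.
pages = [
    "howto",
    "howto/upgrades",
    "howto/upgrades/bookworm",
    "service",
    "service/gitlab",
]
print(breadcrumbs("howto/upgrades/bookworm"))
print(subpages("howto", pages))
```

A wiki engine that stores pages under such paths can derive both views without extra metadata, which is one reason to preserve the existing hierarchy during migration.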
Nice to have
- Minimal friction for contribution, for example a "merge request" might be too large a barrier for entry
- Namespaces: different groups under TPO (e.g. TPA, anti-censorship, comms) must have their own namespace, for example /tpo/tpa/wiki_page_1 or /tpo/core/tor/wiki_page_2, or MediaWiki's namespace system where each team could have its own namespace (e.g. TPA:, Anti-censorship:, Community:, etc.)
- Search must work across namespaces
- Integration with anon_ticket
- Integration with existing systems (GitLab, LDAP, etc.) as an identity provider
- Support offline reading and editing (e.g. with a git repository backend)
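As an illustration of the namespace idea, the sketch below maps pages of existing GitLab project wikis to MediaWiki-style namespaced titles, and builds the old wiki path that a redirect would need to cover so that links keep working. This is a sketch under stated assumptions, not a migration plan: the `NAMESPACES` table, the function names, and the example page are hypothetical, and the `/-/wikis/` path layout is the one GitLab uses for project wikis.

```python
# Hypothetical mapping from GitLab project paths to wiki namespaces.
NAMESPACES = {
    "tpo/tpa/team": "TPA",
    "tpo/anti-censorship/team": "Anti-censorship",
    "tpo/community/team": "Community",
}


def namespaced_title(project: str, page: str) -> str:
    """Build a 'Namespace:page' title for a page of a GitLab project wiki."""
    try:
        namespace = NAMESPACES[project]
    except KeyError:
        raise ValueError(f"no namespace configured for {project!r}") from None
    return f"{namespace}:{page.strip('/')}"


def old_wiki_path(project: str, page: str) -> str:
    """Path of the page in the current GitLab wiki, usable as a redirect source."""
    return f"/{project}/-/wikis/{page.strip('/')}"


# Example page name is made up for illustration.
source = old_wiki_path("tpo/tpa/team", "howto/upgrades")
target = namespaced_title("tpo/tpa/team", "howto/upgrades")
print(f"{source} -> {target}")
```

Failing loudly on an unmapped project, rather than guessing a namespace, keeps the long tail of small wikis from being migrated into the wrong place by accident.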
Non-Goals
- Localization: more important for user-facing content, and https://support.torproject.org is already translated
- Confidential content: best served by Nextcloud (e.g. TPI folder) or other services; content for the "wiki" is purely public data
- Software-specific documentation: e.g. Stem, Arti, little-t-tor documentation (those use their own build systems like a static site generator), although we might still want to recommend a single program for documentation (e.g. settle on MkDocs or Hugo or Lektor)
Proposals
Separate wiki service
The easiest solution to GitLab's permission issues is to use a wiki service separate from GitLab. This wiki service can be one that we host, or a service hosted for us by another organization.
Examples or Personas
Examples:
Bob: non-technical person
Bob is a non-technical person who wants to fix some typos and add some resources to a wiki page.
With the current wiki, Bob needs to make a GitLab account and be given developer permissions to the wiki repository, which is unlikely. Alternatively, Bob can open a ticket with the proposed changes, and hope a developer gets around to making them. If the wiki has a wiki-replica repository, then Bob could also git clone the wiki, make the changes, and then create a merge request, or edit the wiki-replica through the web interface. Bob is unlikely to want to go through such a hassle, and will probably just not contribute.
With a new wiki system fulfilling the "must-have" goals: Bob only needs to make a wiki account before being able to edit a wiki page.
Alice: a developer
Alice is a developer who helps maintain a TPO repository.
With current wiki: Alice can edit any wiki they have permissions for. However, if Alice wants to edit a wiki they don't have permission for, they need to go through the same merge request or issue workflow as Bob.
With the new wiki: Alice will need to make a wiki account in addition to their GitLab account, but will be able to edit any page afterward.
Anonymous cypherpunk
The "cypherpunk" is a person who wants to contribute to a wiki anonymously.
With current wiki, the cypherpunk will need to follow the same procedure as Bob.
With a new wiki: with only the must-have features, cypherpunks can only contribute pseudonymously. If the new wiki supports anonymous contributions, cypherpunks will have no barrier to contribution.
Spammer
1337_spamlord is a non-contributor who likes to make spam edits for fun.
With current wiki: spamlord will also need to follow the same procedure as Bob. This makes spamlord unlikely to try to spam much, and any attempts to spam are easily stopped.
With new wiki: with only must-have features, spamlord will have the same barriers, and will most likely not spam much. If anonymous contributions are supported, spamlord will have a much easier time spamming, and the wiki team will need to find a solution to stop spamlord.
Potential Candidates
- MediaWiki: PHP/MySQL wiki platform, supports Markdown via extension, used by Wikipedia
- MkDocs: Python-based static site generator, Markdown, built-in dev server
- Hugo: popular Go-based static site generator, documentation-specific themes exist such as GeekDocs
- ikiwiki: a git-based wiki with a CGI web interface
mediawiki
Advantages
- Polished web-based editor (VisualEditor).
- Supports sub-pages, but not in the Main namespace by default. We could use namespaces for teams and subpages as needed in each namespace?
- Possible support for Markdown with this extension: https://www.mediawiki.org/wiki/Extension:WikiMarkdown (status unknown)
- "Templating", e.g. for adding informative banners to pages or sections
- Supports private pages (per-user or per-group permissions).
- Basic built-in search, and supports advanced search plugins (ElasticSearch, SphinxSearch).
- Packaged in Debian
Downsides:
- limited support for our normal database server, PostgreSQL: https://www.mediawiki.org/wiki/Manual:PostgreSQL key quotes:
  - second-class support, and you may likely run into some bugs
  - Most of the common maintenance scripts work with PostgreSQL; however, some of the more obscure ones might have problems.
  - While support for PostgreSQL is maintained by volunteers, most core functionality is working.
  - migrating from MySQL to PostgreSQL is possible; the reverse is harder
  - they are considering removing the plugin from core, see https://phabricator.wikimedia.org/T315396
- full-text search requires Elasticsearch which is ~non-free software
  - one alternative is SphinxSearch which is considered unmaintained but works in practice (lavamind has maintained/deployed it until recently)
- no support for offline workflow (there is a git remote, but it's not well maintained and does not work for authenticated wikis)
mkdocs
internationalization status unclear, possibly a plugin, untested
used by onion docs, could be useful as a software-specific documentation project
major limitation is web-based editing, which requires either a GitLab merge request workflow or a custom app.
hugo
used for research.tpo, the developer portal.
same limitation as mkdocs for web-based editing
mdbook
used by arti docs, to be researched.
ikiwiki
Not really active upstream anymore, build speed not great, web interface is plain CGI (slow, editing uses a global lock).