This is the discussion ticket for TPA-RFC-38: Setting Up a Wiki Service. This ticket serves as a place where people can suggest changes to the RFC, as well as goals and must-have features for the new wiki service.
Thanks kez for working on this! My only comment is about "Users can edit wiki pages without logging in". I'm not sure about that. We also want to have a way to protect the wiki against spam.
agreed, i think the spec is more like "people with an account can edit any wiki", which is the main limitation in gitlab now. we don't necessarily want a "wiki" wiki where anyone can edit...
one of the goals i would like to have is offline support. i do a lot of editing in my own text editor and push changes over git, changing this workflow to a web interface would be extremely irritating for me. it would also be a liability for TPA because if the website is down, our docs are down.
so i guess a "nice to have" would be something like "offline support / git editing". but a must have is "high availability for TPA documentation", which "git" is only one of many implementations i guess...
another "must have", in my opinion, is a clear transition plan from the current wikis. we MUST be able to either convert the existing markup or use it as is (e.g. the wiki must have markdown support or there should be a converter). we've already got bitten by this in the Trac transition, and it was not fun.
in general, this should be meshed with the service/documentation.md page, which has a lot of thoughts on this matter as well. maybe move the alternatives section into the RFC? there's a (redundant now?) goals section in there too:
so i guess a "nice to have" would be something like "offline support / git editing". but a must have is "high availability for TPA documentation", which "git" is only one of many implementations i guess...
funnily enough, all of the wikis that i've test-driven and liked were git (or mercurial/DARCS) based. i was concerned that a VCS-based wiki might not scale well, but it sounds like the availability is more important, so that solves that problem :p
More granularity regarding permissions for editing the wiki is fine, but I am not sure I want a situation where anyone can just edit the wiki, logged in or not. In the past I had to worry about trac being edited by random people, and I don't want to go back to that.
As a side note, if you are looking into static site generators there is also jekyll which is quite good, although not in a stack that people like around here (ruby w sinatra).
I absolutely love jekyll, it's the SSG I used the most before lektor! I didn't realize you could integrate it with sinatra, that's really handy. I'll definitely add it as a possibility, but I worry that static generators might not scale well.
I thought it might be a nice feature to have in case we ever want to enable it down the road (most wikis with anonymous editing let you turn it off) but it sounds like it's not a feature we'd ever really want to support. Which is good to me, it's one less requirement to worry about!
@kez are you leading this effort or you want me to? no one is assigned this ticket (yet). i've put it into Next for now at least, please take the ticket or assign it to me according to how you want to run this.
It's actually a hard requirement for the migration away from GitLab
wiki so, as part of this design process, we should make a prototype that
would show whether it's even possible at all to redirect the current
wikis (or any GitLab URL, really) to some other site.
Otherwise any wiki migration is going to hellishly break all sorts of
links everywhere.
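
A prototype for this could start from a pure URL-mapping function, before any web server configuration is touched. The sketch below is a minimal one, assuming GitLab's usual wiki path layout (`/<project path>/-/wikis/<page slug>`, with `home` as the wiki's root page). The target host `wiki.example.org` and the function name `wiki_redirect_target` are hypothetical placeholders, not something decided in this ticket.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical destination; the real target site is not decided in this ticket.
NEW_WIKI_BASE = "https://wiki.example.org"
WIKI_MARKER = "/-/wikis"


def wiki_redirect_target(old_url: str) -> Optional[str]:
    """Return the new URL for a GitLab wiki URL, or None if it is not a wiki URL."""
    path = urlsplit(old_url).path
    project, marker, slug = path.partition(WIKI_MARKER)
    if not marker or not project.strip("/"):
        return None
    # The slug must be empty (wiki root) or start at a path boundary.
    if slug and not slug.startswith("/"):
        return None
    # Assumption: the wiki root maps to the "home" page on the new site too.
    slug = slug.strip("/") or "home"
    return f"{NEW_WIKI_BASE}/{project.strip('/')}/{slug}"
```

Whether the web server in front of GitLab can actually issue such redirects is exactly what the prototype would need to show; the function only pins down which old URL should land where.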
...
On 2022-10-06 14:24:16, Silvio Rhatto (@rhatto) wrote:
Another requirement suggestion:
Support for page redirections:
Pros: avoid dead links when reorganizing page structures.
Cons: may leave dangling documents in the wiki tree.
Would it be possible to consider just unifying GitLab wikis under a single project? I think the main problem we have is the fragmentation across various projects and namespaces, making discovery and search much more difficult than it has to be (looking at you, GitLab Enterprise...). Because besides that issue, GitLab wikis do check a lot of boxes, like it or not.
Agreed, we should make it one of the alternatives evaluated. for me the
biggest downside is it's not searchable... but then again, most of our
sites aren't, that's a global problem we have thought of fixing
before. see #33106 (closed) (SolR), tpo/web/manual#90 (closed) (www.tpo
search), tpo/web/team#25 (search.tpo), etc...
...
On 2022-10-11 13:32:11, Jérôme Charaoui (@lavamind) wrote:
Would it be possible to consider just unifying GitLab wikis under a single project? I think the main problem we have is the fragmentation across various projects and namespaces, making discovery and search much more difficult than it has to be (looking at you, GitLab Enterprise...). Because besides that issue, GitLab wikis do check a lot of boxes, like it or not.
--
Antoine Beaupré
torproject.org system administration
for me the biggest downside is it's not searchable.
In fact, wikis in GitLab are searchable. When you navigate to a project, say tpo/tpa/team, and enter a search query in the top left search box, the results page has a "Wiki" tab that shows results for that query, but only in the context of that one project. So if we merge all our wikis under a single project, searching for things in that one project should surface more useful information, as opposed to the current setup where one first has to figure out in which wiki to submit a search query (TPA? Anti-censorship? Metrics? etc.)
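
The same per-project search can be scripted. This is a minimal sketch, assuming the instance exposes the standard GitLab v4 REST API at `/api/v4` and its project search endpoint with the `wiki_blobs` scope; the helper name `wiki_search_url` is made up for illustration, and the sketch only builds the URL rather than sending the request.

```python
from urllib.parse import quote, urlencode

# Assumption: the standard REST API root for this GitLab instance.
GITLAB_API = "https://gitlab.torproject.org/api/v4"


def wiki_search_url(project_path: str, query: str) -> str:
    """Build the REST URL for a wiki search limited to one project."""
    # The API identifies a project by numeric ID or URL-encoded full path.
    project_id = quote(project_path, safe="")
    params = urlencode({"scope": "wiki_blobs", "search": query})
    return f"{GITLAB_API}/projects/{project_id}/search?{params}"


print(wiki_search_url("tpo/tpa/team", "ldap"))
```

Because the project path is part of the URL, a query only ever covers one wiki, which is the limitation described above.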
I don't think we can automatically give a user access to a repo. We'd
need to hook in the user registration process with a bot or something,
not very practical. We also do not have a group that is "everyone", as
far as I know.
The "not everyone can edit the wiki" bug is an old GitLab bug:
One thing to keep in mind here is that if we go with an external wiki,
something that's not bound to GitLab user permissions, we're going to
end up with a similar situation as this, in the sense that we'll need to
grant access to new people (and revoke old accesses) on
onboarding/offboarding.
Having a single wiki here would actually make things slightly easier
because user/passwords will have already been created through the regular
GitLab onboarding process. We'd just need to grant users access to the
Wiki. We could also grant access to teams so that once a person joins a
team, they automatically have wiki access as well...
nice. mkdocs is definitely one of my prime "git-based" workflows I would
like to use. i managed to convert the TPA wiki to it and it wasn't
completely crap, which is better than everything else i looked at so
far. :p
...
--
Antoine Beaupré
torproject.org system administration
we had a good conversation between @rhatto, @lavamind and me today. we worked a bit on the TPA-RFC-38 document. conversation may continue this afternoon, see this pad:
This issue has been waiting for information two weeks or more. It needs attention. Please take care of this before the end of 2023-05-26. ~"Needs Information" tickets will be moved to the Icebox after that point.
(Any ticket left in Needs Review, Needs Information, Next, or Doing without activity for 14 days gets such notifications. Make a comment describing the current state of this ticket and remove the Stale label to fix this.)
one of the reasons i'm stalled on this is because I don't like the conclusion we reached when we talked about this. i didn't like it when we reached it either, but we were really converging towards setting up a mediawiki as a replacement for gitlab wikis.
but now i feel this is not the right direction. i can't articulate this clearly yet, so i don't want to go into details, but i feel the wiki is a kind of trap we keep walking into, and we should instead look at how we operate our documentation more globally.
i'll just dump those links that are one of the things that made me really question the conclusions we reached:
so maybe one problem we have here is scope: i wouldn't want TPA's wiki to end up in mediawiki, but maybe it's okay to have one wiki where we can dump things?
but then why not treat that wiki as a proper documentation space and do it correctly as well?
anyways, lots of questions at this point, and few answers, i'm pushing this back because i don't have time to think about it properly right now.
The other issue to consider is that documentation is not only curated by TPI but also is (or it was before GitLab) a place where the Tor community (core contributors and volunteers in general) could contribute. We need a place that helps us to have good documentation and at the same time, people in the wider community can edit/add/improve. The big problem of the documentation right now is that it is spread over teams and tools and siloed to only some people able to edit some parts of the documentation.
one of the reasons i'm stalled on this is because I don't like the conclusion we reached when we talked about this. i didn't like it when we reached it either
+1
i can't articulate this clearly yet
+1
We need a place that helps us to have good documentation and at the same time, people in the wider community can edit/add/improve. The big problem of the documentation right now is that it is spread over teams and tools and siloed to only some people able to edit some parts of the documentation.
I think this is a good statement of the problem.
What was most interesting in the discussion at Costa Rica was that:
Part of the solution would involve a better way to include people in the editing process (by having better UX and user/permission management).
But part of the solution is about having a search engine indexing GitLab and all documentation sites (including the upcoming Development Portal).
It was also discussed that it's hard to converge the documentation needs of different projects/teams into a single, centralized solution. There are software projects or teams with their own self-contained documentation, while there's common TPO/TPI documentation that could stay under a single "umbrella".
It may then be worth thinking more about the problem space, and having a look at the existing documentation to see which parts would be relevant to bring into an "official" wiki system.
Another thing to consider is whether to have an official central wiki as well as a community-driven one (but this also has trade-offs).
I have tried https://github.com/MkDocsEditor/ and it didn't work at all. Its web interface couldn't create or save pages (delete works), and it does not have multi-user support. Additionally, it has 67 npm vulnerabilities (1 low, 17 moderate, 37 high, 12 critical) and 42 vulnerabilities in the container running it (4 critical, 27 high, 9 medium, 1 low, 1 unknown).
i was writing about the problems with wikis and realized we haven't talked about the wiki proliferation problem, so i went crazy and wrote another python program to do an inventory of wikis. i think i have done this before but can't find the code anymore.
i'm running the script now and will report back here when it's done, but we have the problem that we have hundreds of empty wikis, and dozens of wikis with only a handful of pages. part of the transition here should probably involve fixing that by converging into a smaller set of wikis, either One Big Wiki or One Wiki Per Team or something.
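
The inventory described above boils down to counting pages per project and bucketing the results. The original script is not included in this ticket, so the following is only a sketch of that bucketing step, assuming the page counts have already been collected (for example through the GitLab API). `classify_wikis`, the bucket names, and the convention that `None` means "no wiki at all" are illustrative choices, and the `example/...` projects are made up.

```python
from collections import Counter
from typing import Mapping, Optional


def classify_wikis(page_counts: Mapping[str, Optional[int]]) -> Counter:
    """Count projects per category; None means the project has no wiki at all."""
    summary = Counter()
    for project, pages in page_counts.items():
        if pages is None:
            summary["without wikis"] += 1
        elif pages == 0:
            summary["empty wikis"] += 1
        else:
            summary["non-empty wikis"] += 1
        # Projects without a wiki contribute zero pages to the total.
        summary["pages total"] += pages or 0
    summary["projects"] = len(page_counts)
    return summary


# Hypothetical sample data; only tpo/tpa/team is a real project path.
sample = {"tpo/tpa/team": 216, "example/empty": 0, "example/disabled": None}
for key, value in classify_wikis(sample).items():
    print(f"INFO: {value} {key}")
```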
update: here's the summary of the script:
INFO: top 10 largest wikis: [('tpo/applications/team', 29), ('tpo/applications/tor-browser', 29), ('tpo/community/team', 30), ('tpo/operations/team', 32), ('tpo/anti-censorship/team', 35), ('tpo/core/team', 39), ('tpo/network-health/team', 56), ('tpo/tpa/team', 216), ('tpo/team', 1189), ('legacy/trac', 1619)]
INFO: 1494 projects
INFO: 383 without wikis
INFO: 1053 empty wikis
INFO: 58 non-empty wikis
INFO: 3516 pages total
Excluding legacy/trac, more than half (63%) of the wiki pages are in the tpo/team wiki. If we count only the first three wikis, that ratio goes up to 77%, and 85% of all pages live in the top 10 wikis.
In other words, there's a very long tail of wikis (~40) that account for less than 15% of the page count. We should probably look at centralizing this, as it will make all further problems easier to solve, e.g. search and discovery.
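
The first two ratios can be re-derived from the summary. This sketch assumes that "the first three wikis" means the three largest once legacy/trac is set aside (tpo/team, tpo/tpa/team and tpo/network-health/team). The 85% figure is not recomputed here, since it depends on which wikis are counted in the top 10.

```python
# Page counts copied from the "top 10 largest wikis" line of the summary.
TOP_WIKIS = {
    "tpo/team": 1189,
    "tpo/tpa/team": 216,
    "tpo/network-health/team": 56,
}
TOTAL_PAGES = 3516
LEGACY_TRAC_PAGES = 1619


def share(pages: int, total: int) -> int:
    """Return pages/total as a percentage rounded to the nearest integer."""
    return round(100 * pages / total)


total_without_trac = TOTAL_PAGES - LEGACY_TRAC_PAGES  # 1897
print(share(TOP_WIKIS["tpo/team"], total_without_trac))  # 63
print(share(sum(TOP_WIKIS.values()), total_without_trac))  # 77
```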
We have tested the Web IDE as a tool to allow users to more easily edit Git repositories, as a replacement for "a wiki", and... it's not great. Here's me editing the sandbox.md markdown file:
You'll notice the preview, on the right, doesn't work. But even if it did, the interface is rather overwhelming and doesn't help the user very much getting through the day. Here, for example, is what happens when you try to "commit" your changes (assuming that you even know you need to do that):
Here is the "normal", non-IDE editor:
... but then this one looks so good because I can commit directly to the main branch. If I take some random project (https://gitlab.torproject.org/anarcat-admin/test), then it quickly gets ugly:
This is because I already have a test project in my workspace. So total UX fail here. But even if you manage to create a fork, then you first make a commit, and then you are served with another UI element:
... the MERGE REQUEST (dum dum dum). At this point, most users unfamiliar with this workflow will have likely given up.
And that's assuming users could fork: most GitLab users have a low project limit and wouldn't be able to fork anyway.
So I think it's fair to assume this doesn't work as a replacement for a proper wiki workflow.
we also tested the Web IDE search, it just plain doesn't work. it doesn't even seem like it's issuing any network request at all. at least it can't find the string ldap in the TPA wiki, which is clearly wrong.
i finished reviewing TPA-RFC-38, as a first pass. I reviewed the requirements, the personas, and other sections, and generally shook up the document a bit. i'm not sure what the next step is, we made good progress in analyzing mkdocs as a possible solution in tpo/community/hackweek#13 (closed) but there are serious blockers there as well (search is a 10MB static asset downloaded every 10 minutes, no good edit workflow, compatibility issues).