The idea here is for TPO to have a status page where we can post problems we know about, so the world can tell whether an issue is already known to the Tor team.
A great example would have been the recent v3 outage, where we could have put out a small FAQ right away (go static HTML!) and then updated the world as we figured out the problem, along with the expected return to normal.
I propose a static website for now, so that we can get this page going very soon and have it easily editable by many of us, most likely the network health team, with the network team helping out as well.
Over time we can then think of better, automated ways to get status onto that page, like automatic reports from the sensors/metrics we have and so on. But for the moment, just a DNS entry + static HTML would be grand.
i really like that idea, and even started planning for it in the roadmap (#40105 (closed)) as this was a request by one survey respondent: communicate better when we (i guess TPA?) have a downtime.
so if i could make a request here, it is to be able to piggyback TPA services on this as well. i would also strongly argue to make a static website and keep it static. there's nothing worse than having your status page down because it's too complicated.
in the roadmap brainstorm, i suggested we try cstate: it's a static site generator, based on hugo, that is designed exactly to make cute status pages. it's already used by the https://sr.ht website (see https://status.sr.ht/) so it has got to mean something good! (this would break policy a bit: we currently use lektor for static websites, but i think we do use hugo for some others, and i'm not aware of a good "out of the box" system like this in lektor).
this project could be managed through gitlab at first, and deployed with gitlab CI (and, why not, gitlab pages! or just be in the normal static mirror system), with a contingency plan on how to deploy it elsewhere if that fails. many teams could have access to it, so that it has minimal friction.
i would suggest status.torproject.org unless we want to have a different TLD. we have briefly looked at other TLDs in #40121 and (sorry @pastly), tor.network does not seem to be available. but it would be great to have a distinct TLD because attacks against infrastructure have this pesky tendency to attack TLDs as well. this is why many status pages sit on domain names completely unrelated to their primary services.
but maybe we can start simple here and just take status.torproject.org (which i would argue could be linked from check.tpo as well).
We do have control over tor.network; it's just that it has a complicated story where every time we use/mention it, somehow a very angry, confused person is summoned from the Internet and we get some backlash. Also, that person gave us that domain in the first place ;).
I do think status.tor.network would be glorious but at this moment, I'm personally more interested in getting that page up than in bikeshedding the name. I guess we can have the easy status.tpo in the meantime and argue later for a better domain.
Awesome @anarcat! Thanks for your help! Whatever static HTML tech you want to use, I'm behind it :).
something that @arma mentioned on IRC is that this might have originally been targeted at only the network team, or at least the network-health stuff. i do think the status page should encompass all teams, to reduce the confusion (wait, do i want network.status.torproject.org? or tpa.status.torproject.org? or metrics.tpa.torproject.org? now i'm confused, and that is bad). so I would make only one grand status page.
but it's something we can discuss. in particular, @arma was worried about being able to filter past events so that "when folks start doing performance experiments" we can ask questions like "wait what was going on in january", and I'm assuming "only on the network health stuff". this might be something that cstate can do because it supports tagging issues, but it's something to keep in mind.
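to make the "what was going on in january" question concrete, here is a rough Python sketch of filtering past incidents by month and tag. everything in it is an assumption for illustration: the `Incident` record, the `incidents_in_month` helper and the sample entries are made up, and this is not how cstate stores or queries its issues.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; NOT cstate's actual data model."""
    title: str
    day: date
    tags: frozenset = frozenset()


def incidents_in_month(incidents, year, month, tag=None):
    """Return incidents from the given month, optionally limited to one tag."""
    return [
        incident for incident in incidents
        if incident.day.year == year
        and incident.day.month == month
        and (tag is None or tag in incident.tags)
    ]


# Made-up sample data, for illustration only.
SAMPLE = [
    Incident("example onion service outage", date(2021, 1, 10), frozenset({"network-health"})),
    Incident("example mail server downtime", date(2021, 1, 20), frozenset({"tpa"})),
    Incident("example relay issue", date(2021, 2, 3), frozenset({"network-health"})),
]

if __name__ == "__main__":
    # "what was going on in january", network health only
    for incident in incidents_in_month(SAMPLE, 2021, 1, tag="network-health"):
        print(incident.day.isoformat(), incident.title)
```

the point is only that one grand status page can still answer per-team questions, as long as every entry carries a date and a tag.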
i'd also warn that I would keep this thing super simple. one git repo, one command to build, plain HTML, no interaction, no hooks, no automation. maybe RSS feeds. it's a manually curated site, to be used in emergencies, so it needs to get out of our way and just work. no thinking, just dump and run.
so if I would make a suggestion here: no ponies. simplest thing possible.
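as a rough sketch of how small "plain HTML, no interaction" can be, here is a Python function that turns a hand-written list of incidents into a single script-free page. the function name `render_status_page`, the tuple layout and the sample entry are all hypothetical; this is not what cstate or hugo do internally, just an illustration of the level of simplicity i have in mind.

```python
from html import escape


def render_status_page(incidents):
    """Build a complete, script-free HTML page from (date, title, body) tuples."""
    if incidents:
        items = "\n".join(
            f"<li><strong>{escape(day)}: {escape(title)}</strong>"
            f"<p>{escape(body)}</p></li>"
            for day, title, body in incidents
        )
        content = f"<ul>\n{items}\n</ul>"
    else:
        content = "<p>No known issues.</p>"
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8">'
        "<title>Status</title></head>\n"
        f"<body><h1>Status</h1>\n{content}\n</body></html>\n"
    )


if __name__ == "__main__":
    # Made-up entry; redirect the output to index.html and serve it as a plain file.
    print(render_status_page([
        ("2021-01-01", "example outage", "We are investigating."),
    ]))
```

all text is escaped before it lands in the page, so whoever edits the incident list in a hurry cannot accidentally break the markup.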
oh, and sorry for flooding this issue, but what I would typically do next is create a service page detailing the requirements (as brainstormed here) and possible implementations. then people can comment on that and add/remove stuff... makes sense?
Yes, I agree with that. I am in the "no ponies" camp for the whole web, so definitely for that status page. :)
Thanks for getting this started @dgoulet! I'd love to see this page up and running sooner rather than later, for the network-health parts at least (and I am in the pro tor.network camp, too).
don't publicise too widely, that's my home server, literally hosted in my basement. :p
obviously, the service list and categories must be tweaked a bit more (that's in config.yaml).
@duncan i didn't really try to theme this, other than changing the logo on the front page. some theming seems to happen inside config.yaml... but i'm not super familiar with hugo theming, let me know if you get stuck. i'd say don't go crazy theming this just yet too, maybe we won't pick this tool if people don't like it...
Thanks @anarcat, it looks good! I spotted the noscript on TB Safest too but nothing looks to have broken.
There are some basic customization options in config.yml (header, color) but beyond that it's probably not worth customizing any further. The CSS is pretty scattered and the developer seems to frown upon customization.
Alternatively (or for v2 of the status page) we could take a regular content template from tpo.org and add a style for statuses, which would be more on brand, could include the normal site header/footer, and would be relatively straightforward to integrate into lektor and add other components like FAQs.