Website issues
https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues
2021-09-28T14:44:47Z

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/30612
Replace timeouts-and-failures graph with errorcodes graph
2021-09-28T14:44:47Z
Karsten Loesing
The `ERRORCODE` field discussed in legacy/trac#29787 is going to add much more detailed information about timeouts and failures of OnionPerf measurements. We should just throw out our timeouts-and-failures graph and replace it with a graph similar to [this one suggested on #29787](https://trac.torproject.org/projects/tor/ticket/29787#comment:23). This ticket is for adding that graph to the website.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/30303
Add "archived" badge to graphs that are not updated anymore
2021-09-28T14:44:23Z
Karsten Loesing
Yesterday's discussion on legacy/trac#29772 made me think about a process for archiving Tor Metrics graphs. The background there is that we need a defined way to remove graphs when we think they're not as useful anymore. On the one hand we don't want to surprise our users that their favorite graph is gone, but on the other hand we simply cannot keep updating everything forever. Last but not least, we need confidence that we can remove an experimental graph, or we might decide to rather not add it in the first place.
Note that this topic has also come up a while ago in the context of removing the Tor Messenger graph (legacy/trac#26030, legacy/trac#26047). We just didn't find a good place to archive that graph and the underlying data. Maybe we can find one now.
How about we add a new badge "archived" for graphs that are not updated anymore? (Or if that's too much UX, we could simply write "(archived)" next to the graph title.)
Whenever we archive a graph, its URL will not change. Whoever has the graph page bookmarked will just end up on a page where the graph doesn't show the latest three months and where they cannot update the graph anymore. But users can still download the graph or data or learn about the CSV data format and how to ~~re~~produce the graph.
One technical detail we need to think about is that we'll have to put the graphs (PNG and PDF) and data (CSV) somewhere. I think we were blocking on that the last time we talked about this, but we shouldn't. I suggest simply adding them to Git and including them in the .war file that we deploy. The image files will always be small, and if the CSV file were too big, we could compress it. As an example, the Tor Messenger files are 24K for the CSV, 32K for the PDF and 32K for the PNG. Still, this seems easier than making sure that all relevant files are present in the file system of the server. And for comparison, with all the required libraries, our .war file is currently at 22M.
Something else we'll need to consider is _which_ graph we're putting on the graph page. For example, if the graph had a country parameter, we'll have to choose one country as a sample, and the user will not be able to customize the country on the website anymore. However, the CSV file will still contain all countries, so that the user can plot their own graph with whatever country they want to see.
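To illustrate how a user could still get at a single country from an archived CSV file, here is a minimal Python sketch. The column names `date`, `country` and `users` are assumptions made for this example only; the actual column layout depends on the graph being archived.

```python
import csv
import io


def rows_for_country(csv_text, country, country_column="country"):
    """Return all rows of the CSV text whose country column equals `country`.

    The column name is hypothetical; adjust it to the archived file's header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[country_column] == country]


# Made-up sample data in the assumed layout.
sample = (
    "date,country,users\n"
    "2019-01-01,de,100\n"
    "2019-01-01,us,200\n"
    "2019-01-02,de,110\n"
)
german_rows = rows_for_country(sample, "de")
```

The filtered rows can then be handed to whatever plotting tool the user prefers.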
Regarding timing, it would still be nice to give a two-week heads-up so that people can make their favorite customized graph and tell us why we're wrong to archive that graph. Basically, we'd write "(deprecated)" (or add a "deprecated" badge) next to the graph first, and two weeks later change that to "archived".
How does this sound?
Setting priority to high, because it would be great to have a plan for removing graphs before we add the more experimental ones for legacy/trac#29772 and legacy/trac#29773. We don't have to implement this idea first, but it would be good to agree whether this would be a viable solution.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/29837
Include Tor Browser data from Google Play
2023-01-23T14:23:54Z
Matthew Finkel
Statistics of app distribution are available for export from Google Play using Google's special tool. We should use this and provide public data from this.
https://support.google.com/googleplay/android-developer/answer/6135870#export

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/29835
Include Android as a Tor Browser platform
2022-12-19T08:27:17Z
Matthew Finkel
There should be some web server log entries now for the tor-browser*.apk files (over the past ~6 months). It'd be nice seeing these included in the graphs. Tor Browser on Android does not automatically execute update pings or download mar files, so initial downloads and updates likely look identical.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40032
Write integration tests for data-processing modules
2021-11-15T14:38:09Z
Karsten Loesing
We discussed in Brussels that we'll need at least integration tests for metrics-web in order to make code changes like the Java 8 Date/Time API update.
I started working on this. Here's what I did:
- Pick a small set of descriptors as test data that are sufficient to produce at least something as .csv files.
- Write a script that runs all data-processing modules.
- Run the script once to get output that we would expect from future test runs.
The result is too big (IMHO) to add to the Git repository. That's why I uploaded it here:
https://people.torproject.org/~karsten/volatile/metrics-web-integ-tests.tar
```
shasum -a 256 metrics-web-integ-tests.tar
728c4e4ee184f2260cd30286f2925aa627d7b3d572a236b646021cc1b461de10 metrics-web-integ-tests.tar
```
It would be great if somebody else besides me tries this out and verifies that their run produces the same output.
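For anyone who would rather check the download from a script than with `shasum`, a minimal Python sketch follows. It only uses the standard library; the function names are made up for this example, and the expected digest is the one quoted above.

```python
import hashlib

# SHA-256 digest quoted in this ticket for metrics-web-integ-tests.tar.
EXPECTED_SHA256 = "728c4e4ee184f2260cd30286f2925aa627d7b3d572a236b646021cc1b461de10"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("metrics-web-integ-tests.tar")` after the download should return `True` if the file is intact.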
A good next step after that would be to talk about where/how to put this under version control. If that's impossible, we might be able to reduce the test data size a bit more, but maybe not as substantially as we'd want.
Oh, and we could probably fetch libs from Debian rather than shipping them. I didn't bother for now.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/29330
Do something with advertised bandwidth distribution graphs
2021-09-28T14:41:02Z
Karsten Loesing
We currently have two graphs on advertised bandwidth distribution on Tor Metrics, [Advertised bandwidth distribution](https://metrics.torproject.org/advbwdist-perc.html) and [Advertised bandwidth of n-th fastest relays](https://metrics.torproject.org/advbwdist-relay.html).
Unfortunately, the aggregation code that produces the data behind these graphs has always been somewhat painful to maintain. It was never written for the long term; it was rather a one-off analysis that we then made available on Tor Metrics. And now it's blocking a refactoring project where we want to share code between modules (legacy/trac#28342). This is not a good situation to be in.
We discussed this briefly in Brussels, and I put some more thought into this today. Basically, I can see four ways for moving forward from here:
- Retain: We accept that this code is hard to maintain, but we retain it. We exclude the module from the refactoring project and keep it as a legacy module. This seems like an ugly solution from a bit-rot perspective, unless we're only doing it for a limited time before removing the graphs, in which case this could work.
- Rewrite: We rewrite this module by designing a new database schema that is more flexible than the current approach. This is an awful amount of work, and we should only do it if we really think that these graphs are useful and will stay around for a long time.
- Remove: We remove the graph, because we don't see the need for it anymore. We can do this with a few weeks of warning, and we can archive the .csv files and graphs and put them into an attic kind of thing just like we're planning to do with Tor Messenger graphs (legacy/trac#26030).
- Replace: We replace these two graphs with two that are much easier to provide, namely with consensus weight distribution graphs. I'm going to attach two samples shortly. The code changes are almost trivial, and the resulting code will be much easier to maintain with regard to the refactoring project mentioned earlier.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/26030
Delete "Tor Messenger downloads and updates" section
2023-01-23T15:01:00Z
cypherpunks
https://metrics.torproject.org/webstats-tm.html

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25744
Resolve RS license issues
2023-01-23T12:12:32Z
iwakeh
Please refer to the branch in [this comment of the parent ticket](https://trac.torproject.org/projects/tor/ticket/25392#comment:5).
There are two license files in RS. These should be integrated into the main LICENSE file.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25625
Make CollecTor's file structure description part of Metrics-Web's CollecTor docs
2022-03-21T15:58:31Z
iwakeh
Transfer [PROTOCOL](https://gitweb.torproject.org/collector.git/tree/src/main/resources/docs/PROTOCOL) text file to some useful place and format in Metrics-Web.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25589
Add test to ensure that output of ATOM feed is valid XML
2021-09-28T14:37:32Z
irl
Most clients will refuse to parse the feed if it's not valid XML. This could be a unit test or it could be part of the ant task for updating the news.json file.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25571
Add an iCalendar feed of metrics events
2021-09-28T14:37:18Z
irl
legacy/trac#23854 added an ATOM feed for the news page. It also laid down enough framework that adding an iCalendar feed would be relatively easy.
I'm adding this with a low priority, but if there are requests for it to be increased to medium (indicating that it would be used) then we should change the priority to medium.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25570
Allow autodiscovery of the news atom feed
2021-09-28T14:37:09Z
irl
A `<link rel="alternate">` needs to be added to the head of the news page to allow autodiscovery of the feed. As this will require changes to the templates used in all pages, I've not done this as part of adding the feed but will instead do it when we update templates for Bootstrap 4.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/25221
Add heap space recommendation to tutorial
2022-03-21T15:58:31Z
iwakeh
The [tutorial 'Prerequisites'](https://metrics.torproject.org/metrics-lib.html#prerequisites) section should mention the possible heap space requirements of metrics-lib (see also [metrics-team post](https://lists.torproject.org/pipermail/metrics-team/2018-February/000667.html)).
Also keep in mind the results of legacy/trac#20395.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/24858
Further tweak metrics timeline events underneath graphs and on news page
2022-03-01T20:01:35Z
Karsten Loesing
In legacy/trac#24260 we added metrics timeline events underneath graphs and tweaked those on the news page. While discussing those changes we came up with a couple of improvements that we were unable to implement, because we didn't have the time. Here's the list, in case we find more time for this later on. We should probably create child tickets when starting on one of these items.
- Add tooltips to tags explaining them a little more, like "Onion-Routing protocol" for "<OR>".
- Make tags clickable by linking them to a page with all events related to that tag.
- Maybe make column headers clickable and as a result sort entries accordingly.
- Maybe change the filtering to only show entries for all countries or all transports on graphs showing users from all countries or using all transports.
- Add annotations to the graph or even mouseovers.
- Add optional JavaScript magic that only displays the first entries and that lets the user expand the list if there are more.
- Find categories or standards for link texts for a more consistent presentation.
- Extend events to graphs in other categories than Users and then add this table there.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/24229
Provide BGP Data Collection on Tor Metrics
2021-09-28T14:29:33Z
iwakeh
Yixin's data description:
[ data&spec](http://raptor.princeton.edu/tor_metrics/)
* Data (06/2016 - 08/2017):
We put each month's BGP updates into a single txt file, compressed with `xz -9e` into [year]-[month]-updates.txt.xz. These are the files under all-updates/.
The all-updates.tar is basically a tarball of the all-updates/ directory.
* Description and spec (counter-raptor.html):
This is the HTML file that we put up. It starts with the `<h1>` tag, following the https://metrics.torproject.org/metrics-lib.html page.
* Software (detection.py):
This is our script to analyze the data (also linked in the html page).
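As a rough illustration of the data format described above, here is a minimal Python sketch that writes and reads one monthly updates file with the standard `lzma` module. The helper names are made up for this example, the zero-padded `YYYY-MM` form of the file name is an assumption (the description only gives `[year]-[month]-updates.txt.xz`), and the sketch treats the content as plain text lines without interpreting the BGP updates themselves.

```python
import lzma


def monthly_filename(year, month):
    """Build a `[year]-[month]-updates.txt.xz` name (zero-padding is assumed)."""
    return f"{year:04d}-{month:02d}-updates.txt.xz"


def write_updates(path, lines, preset=9 | lzma.PRESET_EXTREME):
    """Write text lines to an .xz file; the default preset corresponds to `xz -9e`."""
    with lzma.open(path, "wt", encoding="utf-8", preset=preset) as out:
        for line in lines:
            out.write(line + "\n")


def read_updates(path):
    """Yield the lines of an .xz-compressed text file without trailing newlines."""
    with lzma.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Reading line by line keeps memory use low, which matters if a month of updates is large once decompressed.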
====
Next steps (semi-random order):
* find a place for the data, i.e., a path on CollecTor
* determine a data update process
* integrate html description into Metrics' site

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/23983
Add middle-only line to Relays by relay flag graph
2021-09-28T14:29:15Z
cypherpunks
When looking at the declining number of tor relays [1]
I also looked at the relays by relay flag graph.
On that graph one can notice that guard relays are declining but exits are not,
but middle relays (no guard or exit flag) are declining even more and are unfortunately not explicitly on that graph.
Could you add them?
[1] https://lists.torproject.org/pipermail/tor-relays/2017-October/013364.html

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/23973
Add "Analysis" as new first category next to "News"
2021-09-28T14:28:50Z
Karsten Loesing
From [#23716](https://trac.torproject.org/projects/tor/ticket/23716#comment:10):
> irl, let's talk more about your suggestion regarding "swapping out the secondary-nav depending on the section at the top of the page and adding an Analysis section for the things currently on the home page." Do you mean adding "Analysis" as new first category next to "News", with "Users", "Servers", etc. as subitems under "Analysis" and a separate (possibly empty) set of subitems under "News", "Sources", etc.? Happy to talk about that, but let's move to a new ticket, as it's somewhat unrelated to the renaming question here. Note that we'll likely want to talk to a/our web designer and that we might want to wait and see what the main Tor website redesign produces.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/23413
Add a draft with "Relay and bridge users by country" and maybe even make it the default for the "Users" section
2021-09-28T14:27:30Z
Karsten Loesing
As [discussed with dcf on metrics-team@](https://lists.torproject.org/pipermail/metrics-team/2017-August/000438.html), we might consider adding a graph with "Relay and bridge users by country", which shows two time plots on separate y scales and which allows comparing trends between these two user groups per country.
We might even consider making this graph the new default in the "Users" section to make bridge user numbers more visible.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40034
Take descriptor upload overlap into account when estimating version 3 onion address counts
2023-01-23T14:47:40Z
teor
Based on this tor metrics paper:
https://research.torproject.org/techreports/extrapolating-hidserv-stats-2015-01-31.pdf
We ignore descriptor upload overlap periods (they're not even mentioned in the paper).
During an overlap period, descriptors are published to twice as many
HSDirs (v2 & v3). If we ignore this, we will double-count:
* v2: uploads and unique onion addresses and descriptor ids for 1 hour per day,
* v3: uploads and unique descriptor ids for 12 hours per day.
I'm not sure if assuming that descriptors are seen by 2 sets of HSDirs per day covers this, because in v2 they are actually seen by 3 sets of HSDirs with probability 1/24, when the service address starts with 00-0B (assuming stats are collected from 00:00 UTC and relay clocks are accurate).
And in v3, for half the day (00:00 + typical client consensus download delay of 1-2 hours) they are seen by 2 HSDirs, and half the day they are seen by 1 HSDir. Not that we measure v3 yet.

https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/22236
Add new graph showing directory traffic as percentage of all traffic
2021-09-28T14:26:44Z
Karsten Loesing
Mike asked me a while ago to make a graph showing what fraction of network capacity is spent on directory traffic as an easy way to measure directory overhead. The idea was that this graph should trend down as we make directory fetches more efficient, but that it could trend up if we add a bunch of clients that don't do much other than download the consensus.
We already have two related graphs, [Total relay bandwidth](https://metrics.torproject.org/bandwidth.html) and [Bandwidth spent on answering directory requests](https://metrics.torproject.org/dirbytes.html), and this new graph would basically combine those two.
I made a sample graph that I'll attach to this ticket together with the code to make that graph.
Let's use this ticket to discuss whether such a graph would be useful to have on Tor Metrics, and if so, let's use it to write a useful description and work on a metrics-web patch.
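To make the proposed combination concrete, here is a minimal Python sketch of the calculation, not the code attached to this ticket. It assumes that both existing data sets have already been loaded into dictionaries keyed by date and that both use the same unit; the function name and input shape are made up for this example.

```python
def directory_traffic_percentage(total_by_date, dir_by_date):
    """Return directory traffic as a percentage of all traffic, per date.

    Both inputs map a date string to a traffic value in the same unit.
    Dates missing from either input, or with a total of zero, are skipped.
    """
    percentages = {}
    for date, total in total_by_date.items():
        dir_value = dir_by_date.get(date)
        if dir_value is None or not total:
            continue
        percentages[date] = 100.0 * dir_value / total
    return percentages


# Made-up values, only to show the shape of the inputs and the result.
example = directory_traffic_percentage(
    {"day-1": 200.0, "day-2": 0.0, "day-3": 50.0},
    {"day-1": 10.0, "day-2": 1.0},
)
```

Skipping dates with a missing or zero total avoids division by zero and keeps gaps in either data set visible as gaps in the resulting graph.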