Calculate effectiveness score for individual bridges and then aggregate scores for bridge strategies

I put the ticket in relay-search component for now, but maybe it makes more sense in onionoo (which is labeled as not getting new development attention yet so I hesitated) or somewhere else.

We will also want to use this situation to figure out the right balance between "metrics team helps with analysis" vs "metrics team collects and maintains data sets and auto runs scripts but the requesting team (anti-censorship in this case) is going to need to write and maintain those scripts".

I put the ticket in relay-search component for now, but maybe it makes more sense in onionoo (which is labeled as not getting new development attention yet so I hesitated) or somewhere else.

@arma: Right. I'd assume Onionoo is the right place for what you want, rather than Relay Search or the website in general. But we can move this ticket around once the work to be done is properly understood.

Re: who is belling the cat: my gut feeling right now is that we (network-health/metrics) provide the data the anti-censorship team needs, and once that's done the anti-censorship folks use whatever (scripting) means they see fit to get the useful things out of it. But maybe we need some kind of hybrid approach where we do some work directly on the Onionoo side, e.g. exposing a score for each bridge that way, and the anti-censorship team proceeds with just the aggregation part.

To get started and get an understanding of what is actually needed: what is preventing you/the anti-censorship team from downloading daily snapshots of Onionoo data, going over them, and calculating whatever you need for all the bridges? Is it just the handling of those snapshots? Or do you actually need additional data exposed via Onionoo? See https://metrics.torproject.org/onionoo.html for what we expose already. E.g. there are dedicated uptime and clients documents for bridges, and of course details documents.
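As a rough illustration of the "go over a snapshot" step, here is a sketch that expands one history object from a downloaded clients document into per-day values. It assumes the graph history layout described on the Onionoo protocol page (`first`, `interval`, `factor`, `values`, with `null` for missing points) and a daily interval; the `decode_history` name and the sample values are made up for illustration.

```python
from datetime import datetime, timedelta

def decode_history(history):
    """Expand one graph history object into {date: value}.

    Assumes the fields `first` (UTC "YYYY-MM-DD hh:mm:ss"), `interval`
    (seconds), `factor` and `values` (None where data is missing).
    With an interval shorter than a day, later points on the same date
    would overwrite earlier ones, so this sketch expects daily data.
    """
    first = datetime.strptime(history["first"], "%Y-%m-%d %H:%M:%S")
    step = timedelta(seconds=history["interval"])
    factor = history["factor"]
    decoded = {}
    for index, raw in enumerate(history["values"]):
        if raw is None:
            continue  # leave gaps out; the caller decides how to treat them
        decoded[(first + index * step).date()] = raw * factor
    return decoded

# Made-up sample, not real bridge data.
sample = {
    "first": "2021-03-01 12:00:00",
    "interval": 86400,
    "factor": 0.5,
    "values": [200, None, 400],
}
daily = decode_history(sample)
```

Missing days are simply absent from the result, so whatever consumes it can decide whether to treat them as zero.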

added For Anticensorship Team Needs Information Roadmap::Future labels

Specifically, from the metrics side I want the ability to loop through a bridge's past, computing a daily score for it (that is, a score for each of its past days).

I think that part is not easily doable right now. Onionoo keeps data for the past 7 days or so; beyond that, it comes down to dealing with descriptors, which might even be sanitized in a way that makes it extra hard to get what we want. To overcome the "past 7 days" problem we need to wait until we have set up our database, which we plan to do during the ongoing S112 work.

Ok, that's useful to know. So it sounds like if we make the score be only a function of info from the past at-most-7 days, this is much easier to do in the current onionoo / metrics architecture.

With that in mind, here is a concrete idea for a utility function that it sounds like we could implement with available onionoo info: "sum of the number of users the bridge reported having, over the past 5 days, using 0's for days where we don't have data".
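A minimal sketch of that utility function, assuming the per-day user counts have already been pulled out into a mapping from date to reported users. The `bridge_score` name and the input shape are hypothetical, and so is the choice to count the five full days before "today" rather than including today's partial data.

```python
from datetime import date, timedelta

def bridge_score(daily_users, today, window_days=5):
    """Sum reported users over the `window_days` days before `today`.

    daily_users: dict mapping datetime.date -> reported user count
    (hypothetical input shape). Days that are missing, or whose value
    is None, contribute 0 to the score.
    """
    total = 0.0
    for offset in range(1, window_days + 1):
        value = daily_users.get(today - timedelta(days=offset))
        total += value if value is not None else 0.0
    return total

# Made-up numbers: two days with data, one explicit gap, two missing days.
reported = {
    date(2021, 3, 1): 12.0,
    date(2021, 3, 2): 8.5,
    date(2021, 3, 4): None,
}
print(bridge_score(reported, date(2021, 3, 5)))  # 20.5
```

Note that the function never looks at a Running flag; it only uses what the bridge itself reported.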

We would want to ignore whether we think the bridge is Running, since there are edge cases where bridges are running but metrics thinks they're down, e.g. from the bridges that firewall their ORPort. If they publish stats about having users, they probably had users.

I think we would accomplish two goals here: (1) gamification for the bridge operator. If we put the bridge's score on its relay-search page, then they can see whether they're doing 'well' or not compared to other relays. (2) feedback to the community team and the anti-censorship team and the external Tor community about which distribution strategies are succeeding lately. For 2, we would want to add up the scores of all the bridges in each strategy, and graph today's sums for an easy "which one is working recently" reference. Bonus points if we record and graph those strategy-sum-scores over time, but if that's hard then it can wait until later.
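For (2), the aggregation step could look roughly like this. It is only a sketch: it assumes we already have a per-bridge score and a mapping from bridge fingerprint to distribution strategy, and the `strategy_scores` name, the fingerprints, and the strategy labels in the example are illustrative.

```python
from collections import defaultdict

def strategy_scores(bridge_scores, bridge_strategy):
    """Add up per-bridge scores for each distribution strategy.

    bridge_scores:   dict fingerprint -> score
    bridge_strategy: dict fingerprint -> strategy label
    Bridges with no known strategy are grouped under "unknown".
    """
    totals = defaultdict(float)
    for fingerprint, score in bridge_scores.items():
        totals[bridge_strategy.get(fingerprint, "unknown")] += score
    return dict(totals)

# Made-up fingerprints and labels.
scores = {"AAAA": 20.5, "BBBB": 3.0, "CCCC": 1.5, "DDDD": 4.0}
strategies = {"AAAA": "https", "BBBB": "moat", "CCCC": "https"}
totals = strategy_scores(scores, strategies)
```

Recording one such dictionary per day would be enough to graph the strategy sums over time later.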

Once that initial score is in place, we'll want to start tracking more scores. The first ones that come to mind are "same thing as before, but only counting users from country X" where the set X starts as {China, Iran, Russia}. But, one step at a time, I accept this. :)
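The per-country variant is the same sum with a filter. A sketch, assuming per-day counts broken down by lowercase two-letter country code; the `country_score` name and the input shape are hypothetical.

```python
from datetime import date, timedelta

def country_score(daily_by_country, today, countries=("cn", "ir", "ru"), window_days=5):
    """Sum users from `countries` over the `window_days` days before `today`.

    daily_by_country: dict mapping datetime.date -> {country code: users}
    (hypothetical input shape). Missing days contribute 0.
    """
    wanted = set(countries)
    total = 0.0
    for offset in range(1, window_days + 1):
        per_country = daily_by_country.get(today - timedelta(days=offset)) or {}
        total += sum(users for cc, users in per_country.items() if cc in wanted)
    return total

# Made-up numbers.
by_country = {
    date(2021, 3, 4): {"cn": 4.0, "us": 10.0},
    date(2021, 3, 2): {"ir": 2.0, "ru": 1.0, "de": 7.0},
}
print(country_score(by_country, date(2021, 3, 5)))  # 7.0
```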

Now, you raise a good point about which team should be tracking and displaying these metrics. It sounds like the team leads, plus gaba, micah, etc., should try to sort that out. If I were picking it, I would pick "network health / metrics team computes the score, puts it in the data set, visualizes it on the metrics webpages" and "anti-censorship team looks at visualizations and data to understand what is going on" -- that is, the ongoing data is a network health team thing, but analyzing it and understanding it and identifying bugs and surprises is an anti-censorship team thing.

We can define a score that is updated without keeping the incoming data. This is how the current "status" of a node is defined anyway. The problem is that architectural limitations in onionoo restrict what can be added to the status. At the moment the status is computed from the current status from the bridge authority and the extra-infos. Because of how onionoo was written, adding any information from server descriptors, for example, would probably make the hourly updates too slow, with the consequence that onionoo would overwrite its own status documents while it was still reading them.

The per-country aggregates can be done on the relay-search side, as we do not provide aggregates on the onionoo side at the moment (except for the openmetrics endpoint https://onionoo.torproject.org/network/metrics [1]).

As a note, if we are not interested in the gamification part of this proposal and we do not care whether the score is displayed on relay-search, we can calculate the score and provide it, along with the aggregates, on an openmetrics endpoint that the anti-censorship team can import into prometheus and visualize in grafana.

[1] https://gitlab.torproject.org/tpo/network-health/team/-/wikis/metrics/services/monitoring

mentioned in issue tpo/network-health/team#250 (closed)

mentioned in issue tpo/anti-censorship/team#113 (closed)

removed Needs Information label

mentioned in issue tpo/anti-censorship/team#143 (closed)

marked this issue as related to tpo/anti-censorship/rdsys#113
