# Trac issues
https://gitlab.torproject.org/legacy/trac/-/issues

## #34394: Test net keys expired
https://gitlab.torproject.org/legacy/trac/-/issues/34394
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final · assignee: juga

The integration tests fail because the test net dirauths don't generate consensuses. Travis error: https://travis-ci.org/github/torproject/sbws/jobs/693690384, though that doesn't show the dirauths' tor log.

One more problem that could be avoided by implementing #33150.

## #34393: Maybe vote on the relays that have few or close measurements
https://gitlab.torproject.org/legacy/trac/-/issues/34393
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final

To vote on approximately the same number of relays as Torflow.

Torflow does not do it.

## #34309: Check that relay_recent_measurement_attempt_count and relay_recent_priority_list_count are correct
https://gitlab.torproject.org/legacy/trac/-/issues/34309
Reported by juga · updated 2021-03-09 · milestone: sbws: 1.1.x-final · assignee: juga

This was a comment in https://trac.torproject.org/projects/tor/ticket/30905#comment:9.

## #33947: Compare sbws and Torflow
https://gitlab.torproject.org/legacy/trac/-/issues/33947
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final

gk and I were talking about what to review in #30375, and we thought it'd be useful to create a ticket to check whether the bugfixes we have been working on to deploy sbws in all bwauths (https://trac.torproject.org/projects/tor/query?keywords=~sbws-roadmap&status=closed) are working, i.e. making sbws behave very close to Torflow.
I think we should document what to check, where/how to check it, and which ticket(s) were intended to fix it.

I also think we should add this as documentation in sbws itself, because these are important questions that have been blockers to deploying sbws in all bwauths, and it would help avoid regressions in the future.

Some of the main things to check, which should be further explained:

- whether sbws "failures" are "low" (#30719)
- whether the number of relays sbws reports to vote on is "similar" to the number of relays reported by Torflow (#30727, #30735)
- whether sbws relay descriptors are updated (#30733)
- whether sbws router statuses (relay info from the consensus) are updated (#30733)
- whether the sbws consensus bandwidth total sum is similar to Torflow's (#33871, #33009, #33350)
- whether changes in a relay's consensus bandwidth take effect in a similar way as in Torflow (#33871)

## #33871: Scale exactly as torflow does?
https://gitlab.torproject.org/legacy/trac/-/issues/33871
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final · assignee: juga

#33775 shows that if sbws calculates a low consensus bandwidth because of missing descriptors, and the rest of the bwauths are also using sbws, it would enter a spiral in which it keeps measuring low.
I think this is because torflow multiplies the calculated ratio by the descriptor observed bandwidth [0], while sbws multiplies the ratio by the minimum of all the descriptor bandwidth values *and* the consensus bandwidth, which was added in #28598.
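Under this reading of the two pipelines, the difference can be sketched as follows (all names and numbers are illustrative, not sbws's actual code or values):

```python
# Hypothetical values for one relay.
measured_ratio = 1.5          # relay's measurement vs. network average
observed_bw = 10_000_000      # descriptor observed bandwidth (bytes/s)
average_bw = 12_000_000       # descriptor bandwidth average
burst_bw = 20_000_000         # descriptor bandwidth burst
consensus_bw = 500_000        # previous consensus bandwidth (already low)

# Torflow-style: ratio times the descriptor observed bandwidth only.
torflow_scaled = measured_ratio * observed_bw

# sbws-style (per #28598): ratio times the minimum of the descriptor
# values *and* the previous consensus bandwidth, so a low consensus
# value keeps the next result low as well.
sbws_scaled = measured_ratio * min(observed_bw, average_bw, burst_bw,
                                   consensus_bw)

print(torflow_scaled)  # 15000000.0
print(sbws_scaled)     # 750000.0
```

With these numbers the sbws-style result stays pinned to the stale consensus value, which is the spiral the ticket describes.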
So maybe the new consensus bandwidth should not depend on the previous one, or at least not as the minimum.

For a description of how the bandwidth values depend on each other, see [1].
[0] https://onbasca.readthedocs.io/en/latest/torflow_aggr.html
[1] https://onbasca.readthedocs.io/en/latest/bandwidth_tor.html#bandwidth-values-origin

## #33832: For relays that change IP, only the measurements with the last IP are kept
https://gitlab.torproject.org/legacy/trac/-/issues/33832
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final · assignee: Georg Koppen

This makes those relays more likely not to be "eligible", because they don't have enough results, and therefore sbws votes on fewer relays.

## #33831: Relays without descriptors are not scaled, but still added to the bwlines without vote=0
https://gitlab.torproject.org/legacy/trac/-/issues/33831
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final · assignee: juga

As can be seen in https://gitweb.torproject.org/sbws.git/tree/sbws/lib/v3bwfile.py?h=maint-1.1#n1317, some relays (~800 in sample data) are included in the bandwidth file without their bandwidth scaled, which could be quite different (higher or lower) from the scaled bandwidth.

This is one of the several reasons for #33775.

## #33775: sbws measures some relays 100x lower than Torflow
https://gitlab.torproject.org/legacy/trac/-/issues/33775
Reported by teor · updated 2020-06-13 · milestone: sbws: 1.1.x-final

Hi,
I've received multiple reports from relay operators who are concerned their relays are being measured too low.
At the moment, we only have 2 Torflow instances in the network, and 3 sbws instances. So bad sbws measurements can lower any relay's bandwidth, regardless of the Torflow measurement.
Here is one report:
https://lists.torproject.org/pipermail/tor-relays/2020-March/018321.html
At the moment, the bandwidths are:
moria1 494
maatuska 4
longclaw 5
https://consensus-health.torproject.org/consensus-health-2020-03-31-21-00.html#6E1DA4C0B0C05FB721B42329C47A20DA22908AEB
I have also received a similar report privately from a large relay operator.
This bug could be related to #33009 or #30733.

## #33572: Add the tor version key to the bandwidth file specification
https://gitlab.torproject.org/legacy/trac/-/issues/33572
Reported by juga · updated 2020-06-13 · milestone: Tor: unspecified

It has already been done and reviewed in https://trac.torproject.org/projects/tor/ticket/30196#comment:16, but the component of that ticket is sbws, not Tor.

GH PR: https://github.com/torproject/torspec/pull/107.

## #33570: Correct the relays to keep after retrieving new consensuses
https://gitlab.torproject.org/legacy/trac/-/issues/33570
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final · assignee: juga

When removing the old consensus timestamps, the oldest date is calculated with timedelta without passing seconds as a keyword, so the value is taken as days, making the oldest date very far in the past.
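That first bug can be reproduced with a minimal sketch (the five-day period is an illustrative value, not sbws's actual configuration):

```python
from datetime import datetime, timedelta

# Assumed illustrative value: keep data for 5 days, expressed in seconds.
period_seconds = 5 * 24 * 60 * 60

now = datetime(2020, 3, 1)

# Buggy: a bare positional argument to timedelta is *days*,
# so 432000 seconds becomes 432000 days in the past.
oldest_buggy = now - timedelta(period_seconds)

# Fixed: pass the value explicitly as seconds.
oldest_fixed = now - timedelta(seconds=period_seconds)

print(oldest_buggy.year)  # centuries in the past
print(oldest_fixed)       # 2020-02-25 00:00:00
```

Because the buggy "oldest date" lands centuries in the past, every stored timestamp counts as recent and nothing is ever pruned.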
Recent timestamps are calculated as those greater than the oldest date, instead of less than it, which would make it store many timestamps.
The most recent timestamp for a relay was not being taken into account, because the router status was being updated after updating the timestamps.

## #33472: Document that bwauths should check out stable versions when installing sbws from git
https://gitlab.torproject.org/legacy/trac/-/issues/33472
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final

I think that if some bwauths are installing sbws from git, they should check out a tag or a bugfix branch. We might want to have one bwauth run a development branch, but it should be only one.

## #33350: Is sbws weighting some relays too high?
https://gitlab.torproject.org/legacy/trac/-/issues/33350
Reported by teor · updated 2022-02-07 · milestone: sbws: 1.1.x-final

Before we deploy sbws to the rest of the bandwidth authorities, we should check whether it is weighting some relays (or some ASes) much higher than Torflow.

We should also check for bugs in sbws that weight existing large ASes too high:
https://metrics.torproject.org/rs.html#aggregate/as

## #33199: Fix lint error after merging #30733
https://gitlab.torproject.org/legacy/trac/-/issues/33199
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final

Because there was a merge conflict when merging #30733 after merging #30727, we accepted changes from both, but a newline is missing and flake8 complains (https://travis-ci.org/juga0/sbws/jobs/648108391#L2024).

## #33198: Check changes related to descriptors in a bandwidth file created by a bwauth before next release
https://gitlab.torproject.org/legacy/trac/-/issues/33198
Reported by juga · updated 2020-06-13 · milestone: sbws: 1.1.x-final

In #30733 we did not create tests for the changes updating descriptors, nor for dormant mode.
Until those tests are implemented, we could have a bwauth run the latest maint-1.1 branch and look at the bandwidth files produced.

This way we could catch some bugs before releasing a new version.

Maybe #30899 should be a child of this, so we know whether the bwauth is running a released version or a git version.

## #33121: Deploy sbws in the rest of the bwauths
https://gitlab.torproject.org/legacy/trac/-/issues/33121
Reported by juga · updated 2022-02-17 · milestone: sbws: 1.1.x-final

When we're ready to deploy sbws in all the remaining bwauths (3/6), send an email to the dirauths to start the transition process.

## #33077: Graph results from the torflow to sbws transition
https://gitlab.torproject.org/legacy/trac/-/issues/33077
Reported by Mike Perry · updated 2020-06-13

After we get some solid graph methodology that we like from #33076, we should use those graphs to closely compare the sbws consensus votes to the TorFlow consensus votes.

After that, when we try to switch to sbws, we should get before and after graphs of onionperf data, like we decided on for #33076.

## #33076: Graph consensus and vote information from Rob's experiments
https://gitlab.torproject.org/legacy/trac/-/issues/33076
Reported by Mike Perry · updated 2022-03-04

This is a ticket for the work to graph the historical onionperf data from Rob's relay flooding experiment.
Some discussion threads:
https://lists.torproject.org/pipermail/tor-scaling/2019-December/000077.html
https://lists.torproject.org/pipermail/tor-scaling/2020-January/000081.html
Basically, we want to have a standard way to graph results from key metrics from before, during, and after the experiment.
In this case, we want CDF-TTFB, CDF-DL from onionperf results.
We also want CDF-Relay-Stream-Capacity and CDF-Relay-Utilization for the consensus, as well as from the votes, to see if the votes from TorFlow drastically differ from sbws during the experiment.
https://trac.torproject.org/projects/tor/wiki/org/roadmaps/CoreTor/PerformanceMetrics
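An empirical CDF over such samples can be computed in a few lines (the TTFB values below are made up for illustration; this is not OnionPerf's actual code):

```python
# Hypothetical time-to-first-byte samples, in seconds.
ttfb_samples = [0.8, 1.2, 0.5, 2.0, 1.1, 0.9, 3.5, 0.7]

def empirical_cdf(samples):
    """Return (x, y) points of the empirical CDF: for each sorted
    sample x, y is the fraction of samples less than or equal to x."""
    xs = sorted(samples)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys

xs, ys = empirical_cdf(ttfb_samples)
# Fraction of requests with TTFB <= 1.2s:
print(ys[xs.index(1.2)])  # 0.75
```

Plotting `xs` against `ys` as a step function gives the CDF-TTFB graph; the same helper applies to download times, relay stream capacity, or utilization.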
**Update from June 10, 2020: We finished the CDF-TTFB and CDF-DL portions by adding these graphs to OnionPerf's visualize mode. The remaining parts are the CDF-Relay-* graphs that are based on consensuses and votes. Keep this in mind when reading comments up to June 10, 2020.**

## #33033: sbws stuck thinking a destination is dead
https://gitlab.torproject.org/legacy/trac/-/issues/33033
Reported by Tom Ritter (tom@ritter.vg) · updated 2020-06-13 · milestone: sbws: 1.1.x-final

It looks like around Jan 19th we couldn't talk to our (singular) destination:
```
grep "The last 9 times the destination" * | cut -d ":" -f 2 | sort | uniq -c
507 Jan 19 20
3132 Jan 19 21
3121 Jan 19 22
2809 Jan 19 23
3120 Jan 20 00
3110 Jan 20 01
2800 Jan 20 02
3128 Jan 20 03
3130 Jan 20 04
3139 Jan 20 05
2834 Jan 20 06
3151 Jan 20 07
3158 Jan 20 08
2844 Jan 20 09
3174 Jan 20 10
3180 Jan 20 11
2862 Jan 20 12
3192 Jan 20 13
3200 Jan 20 14
3212 Jan 20 15
2878 Jan 20 16
3180 Jan 20 17
3181 Jan 20 18
2846 Jan 20 19
3136 Jan 20 20
3130 Jan 20 21
2823 Jan 20 22
3150 Jan 20 23
3150 Jan 21 00
2835 Jan 21 01
3150 Jan 21 02
3152 Jan 21 03
3162 Jan 21 04
2854 Jan 21 05
3180 Jan 21 06
3180 Jan 21 07
2862 Jan 21 08
3190 Jan 21 09
3190 Jan 21 10
2862 Jan 21 11
3180 Jan 21 12
3171 Jan 21 13
2853 Jan 21 14
3171 Jan 21 15
3170 Jan 21 16
2846 Jan 21 17
3153 Jan 21 18
3136 Jan 21 19
2805 Jan 21 20
3110 Jan 21 21
3110 Jan 21 22
2794 Jan 21 23
3100 Jan 22 00
3100 Jan 22 01
2790 Jan 22 02
3108 Jan 22 03
3124 Jan 22 04
2817 Jan 22 05
3136 Jan 22 06
3150 Jan 22 07
2837 Jan 22 08
3160 Jan 22 09
3162 Jan 22 10
2853 Jan 22 11
3170 Jan 22 12
3172 Jan 22 13
2871 Jan 22 14
3190 Jan 22 15
3190 Jan 22 16
2871 Jan 22 17
3190 Jan 22 18
3154 Jan 22 19
2835 Jan 22 20
3142 Jan 22 21
3140 Jan 22 22
2812 Jan 22 23
3120 Jan 23 00
3120 Jan 23 01
2814 Jan 23 02
3130 Jan 23 03
2812 Jan 23 04
3130 Jan 23 05
2198 Jan 23 06
```
However we aren't retrying it properly (and the log message isn't complete):
```
grep "was not tried" scanner.log* | cut -d ":" -f 2 | sort | uniq -c
1 Jan 19 20
3 Jan 19 21
1 Jan 19 22
1 Jan 19 23
1 Jan 20 02
1 Jan 20 07
1 Jan 20 18
1 Jan 21 15
```
```
grep "was not tried" scanner.log*
scanner.log.18:Jan 21 15:32:03 INFO Thread-2 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.28:Jan 20 18:15:50 INFO Thread-2 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.32:Jan 20 07:32:22 INFO Thread-2 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.35:Jan 20 02:10:50 INFO Thread-4 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.36:Jan 19 23:30:03 INFO Thread-4 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.37:Jan 19 20:55:41 INFO Thread-3 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.37:Jan 19 21:01:49 INFO Thread-3 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.37:Jan 19 21:14:09 INFO Thread-2 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.37:Jan 19 21:32:37 INFO Thread-4 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
scanner.log.37:Jan 19 22:15:55 INFO Thread-4 destination.py:251 - is_functional - The destination %s was not tried for %s hours, it is going to by tried again.
```sbws: 1.1.x-finalhttps://gitlab.torproject.org/legacy/trac/-/issues/33009sbws bandwidth scans should require a minimum exit bandwidth2020-06-13T16:16:20Zteorsbws bandwidth scans should require a minimum exit bandwidthWhen sbws is constructing a two-hop measurement circuit to run a test, it tries to pick an exit that has at least twice the consensus weight of the current relay-under-test:
https://github.com/torproject/sbws/blob/master/sbws/core/scanner.py#L216
So this means that in this case, sbws would have picked any exit that was not a BadExit, has an acceptable ExitPolicy, and has a consensus weight of at least, well, 2. That's not a lot.
As it turns out, something like 10% of exits have under a 600Kbyte/sec advertised bandwidth. So it seems pretty easy from this weight=1 bootstrap scenario to get paired with an exit that will give poor test results.
Perhaps bwauth path selection should also choose a testing pair from exits/relays with a certain absolute minimum of weight or advertised bandwidth?
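The suggested absolute floor could look roughly like this (the function name, data shape, and the 2000 threshold are hypothetical, not sbws's actual code):

```python
MIN_EXIT_WEIGHT = 2000  # hypothetical absolute floor

def pick_helper_exits(relay_weight, exits):
    """Exits usable to measure a relay of consensus weight `relay_weight`.

    Each exit is a (nickname, consensus_weight, is_bad_exit) tuple.
    """
    candidates = []
    for nickname, weight, is_bad_exit in exits:
        if is_bad_exit:
            continue
        # Relative rule sbws already applies: at least 2x the relay
        # under test.
        if weight < 2 * relay_weight:
            continue
        # Suggested absolute minimum, so a weight=1 relay under test
        # cannot be paired with a weight=2 exit.
        if weight < MIN_EXIT_WEIGHT:
            continue
        candidates.append(nickname)
    return candidates

exits = [("tiny", 2, False), ("bad", 90000, True), ("fast", 50000, False)]
print(pick_helper_exits(1, exits))  # ['fast']
```

With only the relative rule, `tiny` (weight 2) would qualify to measure a weight=1 relay; the floor excludes it.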
Reported by Jimmy on tor-relays:
https://lists.torproject.org/pipermail/tor-relays/2020-January/018027.html

## #30899: Include the commit hash in the sbws version
https://gitlab.torproject.org/legacy/trac/-/issues/30899
Reported by teor · updated 2020-06-13 · milestone: sbws: 1.1.x-final

When we put the sbws version in the bandwidth file, we should include the git commit hash.
That helps us tell dev versions apart.
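One way to do this could be sketched as follows (assuming the scanner runs from a git checkout; the `__version__` value is a stand-in for sbws's real version string):

```python
import subprocess

__version__ = "1.1.0"  # stand-in for sbws's real version string

def version_with_commit():
    """Append the short git commit hash to the version, if available."""
    try:
        commit = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            stderr=subprocess.DEVNULL,
        ).decode().strip()
        return f"{__version__}+{commit}"
    except (subprocess.CalledProcessError, FileNotFoundError, OSError):
        # Not a git checkout (e.g. installed from a release tarball).
        return __version__

print(version_with_commit())
```

Writing this string into the bandwidth file's `version` header line would let readers distinguish a release from a development checkout.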