Trac issues
https://gitlab.torproject.org/legacy/trac/-/issues
2020-06-13T15:37:53Z

Protect heartbeat logs using PrivCount
https://gitlab.torproject.org/legacy/trac/-/issues/29317
2020-06-13T15:37:53Z, teor
Sometimes, we ask relay operators to share their heartbeat logs on mailing lists or the bug tracker. Other relay operators keep graphs of their heartbeat logs (or other tor statistics).
Once we have implemented PrivCount, we can add noise to these stats, round them, and then log them:
https://lists.torproject.org/pipermail/tor-dev/2019-February/013659.html
Tor: unspecified

PrivCount proof of concept: use pubsub for events that PrivCount cares about
https://gitlab.torproject.org/legacy/trac/-/issues/29271
2020-06-13T15:37:44Z, teor
Once the pubsub API is finished (#28226), we should use it for PrivCount events:
* new consensus
* start/end stats collection
* infrequent statistical events (not bandwidth)
Tor: unspecified

Work out the config and control interfaces to PrivCount
https://gitlab.torproject.org/legacy/trac/-/issues/29270
2020-06-13T15:37:44Z, teor
Once #29201 and #29211 are done, we should work out the config and control interfaces to PrivCount, and implement them.
Tor: unspecified

PrivCount proof of concept: Use the subsystems API for PrivCount
https://gitlab.torproject.org/legacy/trac/-/issues/29268
2020-06-13T15:37:43Z, teor
We should use the subsystems API for the PrivCount proof of concept.
Tor: unspecified

PrivCount proof of concept: round results to hide usage over time
https://gitlab.torproject.org/legacy/trac/-/issues/29266
2020-06-13T15:37:43Z, teor
We need to round the results to hide usage over time. We should work out how much to round each result, and how to implement the rounding.
Here is one possible scheme:
* we want to round the result to the closest multiple of N
* each tally reporter rounds their blinded subtotal to the closest multiple of N
* the final un-blinded result is rounded to the nearest multiple of N (plus an arbitrary value between 0 and N, due to the unblinding), as long as at least one Tally Reporter is honest
Tor: unspecified

PrivCount proof of concept: Re-implement the C code in Rust
https://gitlab.torproject.org/legacy/trac/-/issues/29265
2020-06-13T15:37:42Z, teor
We'll create some C code with a debugging version of PrivCount. Then we need to create libprivcount (#29264) and re-implement the code in Rust.
Tor: unspecified

Create a libprivcount repository containing a rust crate
https://gitlab.torproject.org/legacy/trac/-/issues/29264
2020-06-13T15:37:42Z, teor
We want the tor-independent parts of PrivCount to be usable by researchers and other groups. Let's create a Rust crate, a git repository, and a trac component for libprivcount.
Tor: unspecified

Create consensus parameters to turn off all stats and individual stats
https://gitlab.torproject.org/legacy/trac/-/issues/29261
2020-06-13T15:37:41Z, teor
When we implement statistics in PrivCount, we'll want to turn off our existing statistics. Let's create consensus parameters for each statistic, and maybe also ExtraInfoStatistics.
We want to make this change as early as possible, so that Tor versions don't publish less safe statistics forever.
Tor: unspecified

Split rephist.c into modules for each type of statistic
https://gitlab.torproject.org/legacy/trac/-/issues/29131
2020-06-13T15:37:00Z, teor
Let's split up rephist.c by statistic. We can also split out the stat-specific structs at the same time.
If we do this in 0.4.1, it will help us remove the bandwidth stats as part of our Sponsor V work.
Tor: unspecified

PrivCount proof of concept: decide if we want to remove bandwidth statistics
https://gitlab.torproject.org/legacy/trac/-/issues/29129
2020-06-13T15:36:59Z, teor
If we're really happy with the proof of concept, we could remove the bandwidth statistics from the extrainfo document once most relays are running that code.
Alternately, we could remove bandwidth statistics when we move the proof of concept code to the generic PrivCount config framework.
Tor: unspecified

PrivCount proof of concept: put the PrivCount statistics in a stats/ file
https://gitlab.torproject.org/legacy/trac/-/issues/29027
2020-06-13T15:36:31Z, teor
Maybe we should put all the stats in files while we're at it.
Tor: unspecified

Make relays report bandwidth usage more often in test networks
https://gitlab.torproject.org/legacy/trac/-/issues/29019
2020-06-13T15:36:29Z, teor
Let's put something more interesting in this line:
```
pc-rw-bytes 0
```
Tor: unspecified

Make all statistics depend on ExtraInfoStatistics
https://gitlab.torproject.org/legacy/trac/-/issues/29018
2020-06-13T15:42:53Z, teor
Like #29017, when a user sets ExtraInfoStatistics 0, they probably don't want any statistics in their extra-info document.
Tor: 0.4.1.x-final
teor

PrivCount proof of concept: work out how to do integration testing without chutney
https://gitlab.torproject.org/legacy/trac/-/issues/29011
2020-06-13T15:36:24Z, teor
Tor and chutney can be unreliable, so we might need to do integration tests on PrivCount without chutney.
Here's one possible design:
* split PrivCount into generic rust modules, tor-specific rust modules, and tor interface modules
* create a command-line interface to the generic rust modules and tor-specific rust modules, which takes events, writes results, and aggregates results from files
* use these file-based interfaces to integration test the non-tor, non-network PrivCount code
Tor: 0.4.1.x-final

PrivCount proof of concept: work out how to do CI for PrivCount using chutney
https://gitlab.torproject.org/legacy/trac/-/issues/29010
2020-06-13T15:36:23Z, teor
We can do CI for PrivCount using chutney:
* launch a standard chutney network
* launch the Tally Reporter (TR)
* run some data through the network
* check the results using a chutney script
If we want to check extrainfo descriptors against PrivCount statistics, we'll need to publish them more often in test networks.
Tor: unspecified

PrivCount proof of concept: use blinding and Shamir secret sharing
https://gitlab.torproject.org/legacy/trac/-/issues/29009
2020-06-13T15:36:23Z, teor
We should split PrivCount secrets and aggregate them using privcount_shamir. We'll need to merge privcount_shamir into tor, and use it on the Data Collector and Tally Reporter.
Tor: unspecified

PrivCount proof of concept: add a PrivCount module, torrc option, and protocol
https://gitlab.torproject.org/legacy/trac/-/issues/29008
2020-06-13T15:36:22Z, teor
We should make PrivCount into an optional module, with a compile-time option. If Rust is off, or the PrivCount option is off, PrivCount will be disabled.
When PrivCount is enabled, it will advertise the PrivCount protocol versions and statistics versions it supports.
Tor: unspecified

PrivCount proof of concept: implement debugging for PrivCount
https://gitlab.torproject.org/legacy/trac/-/issues/29007
2020-06-13T15:36:21Z, teor
Let's debug PrivCount by splitting statistics into multiple categories:
* public statistics: information that is already public
* private statistics: information that is sensitive
We can safely log public statistics at any log level. We can publish public statistics' unencrypted, per-relay values and noise amounts in extrainfo descriptors.
We need a debug mode to log private statistics above info level. When we are in debug mode, and not using any public authorities, we can publish private statistics' unencrypted, per-relay values and noise amounts in extrainfo descriptors.
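The logging and publishing rules above can be sketched as two small predicates. This is only an illustration: the names `LogLevel`, `may_log`, and `may_publish_raw` are hypothetical, the level ordering mirrors tor's log severities, and the sketch assumes that private statistics may be logged at info level or below without debug mode, which is one reading of the rule.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    # Ordered like tor's log severities, from least to most visible.
    DEBUG = 0
    INFO = 1
    NOTICE = 2
    WARN = 3
    ERR = 4


def may_log(is_private: bool, level: LogLevel, debug_mode: bool) -> bool:
    """Decide whether a statistic may be written to the log at this level."""
    if not is_private:
        # Public statistics are already public: any level is fine.
        return True
    if level <= LogLevel.INFO:
        # Assumption: info and debug logs are already treated as sensitive.
        return True
    # Private statistics above info level need the debug mode.
    return debug_mode


def may_publish_raw(is_private: bool, debug_mode: bool,
                    uses_public_authorities: bool) -> bool:
    """Decide whether unencrypted per-relay values and noise amounts
    may go into the extrainfo descriptor."""
    if not is_private:
        return True
    return debug_mode and not uses_public_authorities
```

Keeping both decisions in one place would make it harder to leak a private statistic through a new log line or descriptor field.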
We'll work out other debugging mechanisms during the proof of concept process.
Tor: unspecified

PrivCount proof of concept: add noise to counters
https://gitlab.torproject.org/legacy/trac/-/issues/29006
2020-06-13T15:36:21Z, teor
We need to add noise to PrivCount counters to protect user activity. We split this noise between the Data Collectors (DCs), so that the final aggregate count includes enough noise to protect at least one user's activity.
For consumed bandwidth, we can calculate an average user's activity by dividing the total consumed bandwidth by the total number of users. Then, we split this noise across all the DCs (the relays that support version 1 of the PrivCount protocol), using each relay's consensus weight fraction.
DCs need to add noise when we start, and when each consensus arrives, based on our own consensus weight fraction in that consensus. (How do we deal with the off-by-one error here? Weight by the time between the last consensus, the round end/start, and the time we'll try to fetch the next consensus?)
We'll need to add excess noise to compensate for relay failures, and malicious relays.
We should just use the easiest Gaussian sampling method available. Adding any noise is an improvement for almost all of our statistics - we can deal with floating point issues later.
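Here is a minimal sketch of the noise split, in Python for brevity. It assumes the target noise is given as a total standard deviation (`total_sigma`, a hypothetical parameter), and that each DC takes a share of the total variance equal to its consensus weight fraction, so that the shares of independent DCs sum back to the total. The names `NoisyCounter`, `dc_noise_variance`, and `average_user_activity` are made up for this example, and `random.gauss` is not a secure sampler.

```python
import math
import random
from dataclasses import dataclass


def average_user_activity(total_bytes: float, total_users: float) -> float:
    """Average consumed bandwidth per user: total bandwidth / total users."""
    return total_bytes / total_users


def dc_noise_variance(total_sigma: float, weight_fraction: float) -> float:
    """Variance this DC should add, given its consensus weight fraction.

    Assumption: variance is split in proportion to weight, because the
    variances of independent noise samples add up.
    """
    if not 0.0 <= weight_fraction <= 1.0:
        raise ValueError("weight_fraction must be between 0 and 1")
    return (total_sigma ** 2) * weight_fraction


@dataclass
class NoisyCounter:
    value: float = 0.0
    noise_variance: float = 0.0  # running total of the variance added so far

    def add_noise(self, total_sigma: float, weight_fraction: float,
                  rng: random.Random) -> None:
        """Add one Gaussian sample and record how much variance it carried."""
        variance = dc_noise_variance(total_sigma, weight_fraction)
        self.value += rng.gauss(0.0, math.sqrt(variance))
        self.noise_variance += variance
```

A DC would call `add_noise` at startup and again for each new consensus, with its weight fraction from that consensus. Excess noise for failed or malicious relays could be modelled by scaling `total_sigma` up before the split.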
We should keep track of:
* ConsensusCount - the number of consensuses we've seen
* PrivCountConsensusWeightFraction - the consensus weight of this relay, divided by the consensus weight of all relays supporting PrivCount
For each counter, we should keep track of:
* NoiseVarianceAmount - the total variance (standard deviation squared) of all noise added to this counter. We use variance because it's additive. (And standard deviation is not.)
Tor: unspecified

PrivCount proof of concept: implement consumed bandwidth counters
https://gitlab.torproject.org/legacy/trac/-/issues/29005
2020-06-13T15:36:20Z, teor
We want to add a copy of the current consumed bandwidth statistic to PrivCount:
* consumed bandwidth by relay flags
https://metrics.torproject.org/bandwidth-flags.html
We can defer these statistics, because they are not as sensitive to manipulation by hostile clients or services:
* Bandwidth spent on answering directory requests
* Fraction of connections used uni-/bidirectionally
We will need counters for:
* read-history
* write-history
Split each counter into 4 (G, E, G and E, not G and not E) using these flags:
* Guard
* Exit and not BadExit
We can do the G/E split on the Data Collector (DC).
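A rough sketch of the four-way split as it might run on the DC. The group labels and the `flag_group` and `record` helpers are hypothetical names for this example; only the flag names (Guard, Exit, BadExit) and the two history names come from the description above.

```python
from collections import Counter
from typing import Iterable


def flag_group(flags: Iterable[str]) -> str:
    """Map a relay's consensus flags to one of the four counter groups."""
    flags = set(flags)
    is_guard = "Guard" in flags
    # An exit only counts as E if it is not also marked BadExit.
    is_exit = "Exit" in flags and "BadExit" not in flags
    if is_guard and is_exit:
        return "guard_and_exit"
    if is_guard:
        return "guard_only"
    if is_exit:
        return "exit_only"
    return "neither"


def record(counters: Counter, history: str, flags: Iterable[str],
           nbytes: int) -> None:
    """Add nbytes to the (history, group) counter for the relay's flags."""
    counters[(history, flag_group(flags))] += nbytes
```

With two histories (read-history and write-history) and four groups, this gives eight counters per DC.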
We can also do a smaller check aggregation for each group on the Tally Reporter (TR).
Check that relays are Running.
We can do the Running check on the DC.
We should also check Running on the TR, and exclude DCs that weren't Running at all during the period.
How often should we update the bandwidth counters?
If we update counters at the end of the day, we can match the current statistics exactly:
"we only include bandwidth histories for a given day if a relay was listed as running in a consensus at least once on that day. We attribute bandwidth to guards and/or exits if a relay was a guard and/or exit at least in one consensus on a day."
But we would be storing some sensitive data in memory for the whole day.
Instead, we could update the counters whenever we queue data, based on the flags in the current consensus we have. (The difference will likely be minimal.)
If updating the counters multiple times per second is too CPU-intensive, we can update them every few seconds. If that's too often, we can update them just before we delete an old consensus.
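One way to picture the trade-off between these update schedules is a counter that batches queued bytes and attributes them to a group only when it flushes. This is a sketch under assumptions: `BatchedBandwidthCounter`, `flush_interval`, and the group labels are hypothetical, and the caller supplies the clock value so the behaviour is easy to test.

```python
class BatchedBandwidthCounter:
    """Buffer queued bytes and attribute them to a flag group on flush."""

    def __init__(self, flush_interval: float) -> None:
        self.flush_interval = flush_interval  # seconds between flushes
        self.pending = 0                      # bytes not yet attributed
        self.totals = {}                      # group label -> bytes
        self.last_flush = None                # time of the last flush

    def queue(self, nbytes: int, now: float, current_group: str) -> None:
        """Record queued bytes; flush if the interval has elapsed."""
        if self.last_flush is None:
            self.last_flush = now
        self.pending += nbytes
        if now - self.last_flush >= self.flush_interval:
            self.flush(now, current_group)

    def flush(self, now: float, current_group: str) -> None:
        """Attribute all pending bytes to the group from the current consensus."""
        self.totals[current_group] = (
            self.totals.get(current_group, 0) + self.pending
        )
        self.pending = 0
        self.last_flush = now
```

Setting `flush_interval` to 0 updates on every queue call, a few seconds gives the cheaper variant, and calling `flush` just before an old consensus is deleted gives the least frequent one. Longer intervals keep unattributed bytes in memory for longer.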
Sources:
https://metrics.torproject.org/reproducible-metrics.html#traffic
https://trac.torproject.org/projects/tor/wiki/org/meetings/2018MexicoCity/Notes/PrivCount
Tor: unspecified