Design overview for reducing directory download volume, especially for low-bandwidth clients
Here I'm going to discuss a bunch of ideas for making directories use less bandwidth. See Sponsor4Plan for the overview of what we do when.
We're not necessarily going to do all of these ideas. First we're going to gather metrics to figure out the impact of each one, and then we'll decide what to actually implement. This page will guide us as we proceed.
Idea 1: Consensus diffs
Proposal 140. Ticket #13339.
Idea: Instead of fetching a consensus, clients could fetch a diff from the most recent consensus they have.
Measurement questions to answer: How (in)frequently do clients actually fetch a new consensus? For busy clients that do it once an hour, a diff is a clear win. But do typical clients want a day-by-day diff? Week-by-week?
Status: Implemented in Tor 0.3.1.x.
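To get a rough feel for the measurement question above, here is a minimal sketch that compares the size of a whole document against the size of a diff from the previous one. It is only an illustration: the document layout is synthetic, `make_fake_consensus` and `diff_size_report` are hypothetical helper names, and Python's unified diff stands in for the diff format that Proposal 140 actually specifies.

```python
import difflib
import zlib


def make_fake_consensus(n_relays, bw_offset=0, changed_every=10):
    """Build a synthetic, consensus-like document (NOT the real format)."""
    lines = ["network-status-version 3 microdesc\n"]
    for i in range(n_relays):
        bw = 1000 + 10 * i
        if i % changed_every == 0:
            bw += bw_offset  # only some relays change between versions
        lines.append(f"r relay{i:05d} 10.0.{i // 256}.{i % 256} 9001\n")
        lines.append(f"w Bandwidth={bw}\n")
    return lines


def diff_size_report(old_lines, new_lines):
    """Return raw and zlib-compressed sizes of the full document and the diff."""
    full = "".join(new_lines).encode()
    # n=0 drops context lines, which keeps the diff as small as possible.
    diff = "".join(difflib.unified_diff(old_lines, new_lines, n=0)).encode()
    return {
        "full": len(full),
        "full_zlib": len(zlib.compress(full)),
        "diff": len(diff),
        "diff_zlib": len(zlib.compress(diff)),
    }


if __name__ == "__main__":
    old = make_fake_consensus(500)
    new = make_fake_consensus(500, bw_offset=5)
    print(diff_size_report(old, new))
```

Running the same comparison over real consensuses that are an hour, a day, or a week apart would show where a diff stops being a win.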
Idea 2: Negotiate better compression algorithms
Ticket #21211.
Idea: There are other algorithms like LZMA/LZMA2 or bzip2 that deliver better compression than zlib. We could use HTTP encoding negotiation to select them when available.
Measurement/design questions to answer:
- How do these algorithms perform in space and CPU? Is it realistic to do them on-line? (See Idea 3 if not.)
- Which of these algorithms has the best code that implements it?
- Which of these algorithms compresses the best?
- Would it make sense to provide a dictionary to the compression algorithms?
- How much would this help?
A small benchmarking tool should help us answer the points above. The tool can be found on Gitlab.
An initial set of data, from running the bench tool on the documents saved by my local Tor daemon, is in a Google Spreadsheet.
- What about using a binary encoding such as BSON instead of ASCII before the compression step?
TBD.
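The space and CPU questions above can be explored with a few lines of Python. This is a minimal sketch using only standard-library codecs (zlib, bz2, lzma); it is not the bench tool mentioned above, the function names are hypothetical, and the sample input is a stand-in for real directory documents. The second half shows zlib's preset-dictionary support, which is one way to try the dictionary idea.

```python
import bz2
import lzma
import time
import zlib


def bench(data):
    """Return {codec name: (compressed size, seconds)} for stdlib codecs."""
    codecs = {
        "zlib-9": lambda d: zlib.compress(d, 9),
        "bz2-9": lambda d: bz2.compress(d, 9),
        "lzma": lambda d: lzma.compress(d),
    }
    results = {}
    for name, fn in codecs.items():
        start = time.perf_counter()
        out = fn(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results


def compress_with_dict(data, zdict):
    """Compress with zlib using a preset dictionary shared by both sides."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob, zdict):
    """Inverse of compress_with_dict; needs the very same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    # Stand-in input; replace with a cached consensus or microdescriptors.
    sample = b"onion-key\n-----BEGIN RSA PUBLIC KEY-----\n" * 200
    for name, (size, secs) in bench(sample).items():
        print(f"{name}: {size} bytes in {secs:.4f}s")
```

A dictionary mostly helps with small documents, where the compressor otherwise has no earlier text to refer back to.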
Proposal: Proposal 278, written by ahf.
Status: Implemented in Tor 0.3.1.x.
Idea 3: Better (micro)descriptor selection queries
Ticket #21212 is germane?
Idea: It's predictable that a client who just asked for consensus-diff X:Y will want the microdescriptors which appear in Y but not X. It's also predictable that a client who just asked for consensus Y, having had no previous consensus, will want all the microdescriptors in consensus Y. But right now, clients ask for microdescriptors by SHA256 digest. If instead they ask in some terser format, we can deliver two benefits: reduced query size, and the opportunity to spend more CPU compressing these queries, since we expect to serve them more frequently.
Measurement/design questions:
- Same as for Idea 1, in terms of "how often do clients want to get a new consensus."
- How much would this idea help?
- Are there any new attack opportunities here? Any new defense opportunities?
Status: Proposal 281 describes and analyzes a solution here; it is not yet implemented, because most of the benefit came from other proposals in this area.
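To illustrate the query-size part of this idea, the sketch below builds a request that names every microdescriptor by its base64-encoded SHA256 digest and compares it with a terser request. The terse path and the consensus identifiers are purely hypothetical and are not what Proposal 281 specifies; the microdescriptor bodies are placeholders.

```python
import base64
import hashlib


def digest_query(microdescs):
    """Build a request path naming each microdescriptor by SHA256 digest."""
    digests = [
        base64.b64encode(hashlib.sha256(md).digest()).decode().rstrip("=")
        for md in microdescs
    ]
    return "/tor/micro/d/" + "-".join(digests)


def terse_query(old_consensus_id, new_consensus_id):
    """HYPOTHETICAL terse form: 'everything in new that was not in old'."""
    return f"/tor/micro/diff/{old_consensus_id}/{new_consensus_id}"


if __name__ == "__main__":
    # Placeholder microdescriptor bodies, just to have something to hash.
    mds = [f"fake microdescriptor {i}".encode() for i in range(50)]
    print(len(digest_query(mds)), "bytes when listing digests")
    print(len(terse_query("X", "Y")), "bytes for the hypothetical terse form")
```

The digest form grows by 44 bytes per microdescriptor (43 base64 characters plus a separator), while a terse form stays constant no matter how many microdescriptors changed.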
Idea 4: A TAP-less consensus/microdesc flavor?
Proposal 245 is relevant.
Idea: The TAP onion-key field is only used today for some legacy hidden service stuff. But it is the largest component of microdescriptors. If we could remove it or something, all microdescriptors would get much much smaller. We might do this with a new consensus flavor and microdescriptor type? We'd still need to support legacy hidden services somehow. We might need to start laying the groundwork early.
TAP keys are only used by legacy hidden services and legacy hidden service clients:
- Introduction point (usually 3, up to 10) TAP keys are placed in the hidden service descriptor by the hidden service.
- Hidden service clients use intro point TAP keys from the hidden service descriptor (not the consensus or microdesc) to authenticate the intro point.
- The rendezvous point (1) TAP key is placed in the INTRODUCE cell by the hidden service client.
- Hidden services use the rendezvous point TAP key from the INTRODUCE cell (not the consensus or microdesc) to authenticate the rend point.
Here's a sketch of a design:
- Hidden Services can choose their intro points (or slightly more than the desired number in case some don't work), then download TAP microdescriptors for those relays over a 3-hop circuit,
- Clients can choose a rendezvous point, then download a TAP microdescriptor for that relay over a 3-hop circuit,
- Clients and hidden services can stop using the TAP microdescriptor flavour entirely as soon as legacy hidden services are turned off.
Clients can't download the TAP flavour on-demand (it leaks too much info, or it takes too long, or both)
Are there any other fields we'd want to remove in a new flavour?
(Idea 7 answers this: maybe published date, bandwidth.)
Status: Migrating consensus flavors is a great deal of work, and much of the benefit here would come out of simply generating microdescriptors less often, which we did for idea 5 below. The logical time to do this is when we no longer need TAP keys.
Idea 5: Change microdescriptors less frequently?
Idea: How often do microdescriptors change? Do some relays change their microdescriptors much more (or less) frequently than others? How can we make them change less?
Questions:
- Let's analyze whether the microdescriptor change rate is constant or not. (In progress; nickm is doing this. Preliminary results suggest that 99% of relays change their microdescs no more than you'd expect given onion key rotation. A small number flap back and forth, though.)
- What if we made onion keys rotate much less often?
Results:
- scripts and results
- The most frequent reason for a microdesc to change is that the onion-key changed. This accounts for nearly all of the changes. (Roger thinks it would be safe to increase the rotation interval. Nick does too. So does teor, FWIW.)
- In the case of multiple relays running with the same ID, the behavior can be pathological.
- Key pinning might fix some of these cases, and if not, we should exclude relays that change too frequently, or not often enough (#20055).
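Since onion-key rotation accounts for nearly all of the changes, a back-of-the-envelope model shows what a longer rotation interval would buy. This is a minimal sketch that assumes every relay rotates on a fixed period with a uniformly random phase; the 7-day and 28-day intervals and the relay count are illustrative inputs, not measurements.

```python
def expected_changed_fraction(rotation_days, window_hours):
    """Fraction of relays expected to rotate their onion key in the window.

    Assumes a fixed rotation period and uniformly distributed rotation times.
    """
    period_hours = rotation_days * 24
    return min(1.0, window_hours / period_hours)


def expected_changed_relays(n_relays, rotation_days, window_hours):
    """Expected number of microdescriptors a client must refetch."""
    return n_relays * expected_changed_fraction(rotation_days, window_hours)


if __name__ == "__main__":
    n_relays = 7000  # illustrative network size
    for days in (7, 28):  # illustrative rotation intervals
        changed = expected_changed_relays(n_relays, days, window_hours=24)
        print(f"rotate every {days} days: ~{changed:.0f} changed per day")
```

Under this model, a client that has been offline for one day refetches a quarter as many microdescriptors when the rotation interval is four times longer.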
Proposal:
- Draft proposal 1: https://people.torproject.org/~nickm/volatile/proposals/xxx-rotate-onion-keys-less.txt
- Draft proposal 2: https://people.torproject.org/~nickm/volatile/proposals/xxx-detect-id-sharing.txt
Status: These became proposals 274 and 277. 274 is implemented in 0.3.1.x. 277 does not require changes in the Tor network, but it hasn't received an implementation yet.
Idea 6: Combined consensus/microdescriptor diff
Idea: The microdescriptors for a consensus are generated at the same time as the consensus. What if we just concatenated them, and had clients download diffs from one of those combined documents to the next? This would use the machinery from idea 1 and idea 2 to solve idea 3.
Note that this is not an obvious win -- it would underperform if microdescriptors tended to flip back and forth (like, one router having MD 1, then 2, then 1 again.) It would also enable an attack where one or more relays' microdescriptors deliberately flip back and forth every hour.
Questions:
- As for 1 and 3.
- Would this help? How much? Are other schemes for 3 better?
Status: Superseded by 3; there is no benefit to this over doing just 3 in my analysis.
Idea 7: Minimize consensus diff size
Idea: If we do consensus diffs (idea 1), what makes them change so much? Can we do anything to make the consensus change less so the diffs are smaller?
Questions: What changes, typically, from one consensus to the next?
Results:
- scripts and results
- The single most frequent change in a consensus is a variation in the bandwidth measure.
- Bandwidths are currently 3 significant figures. We could make this 2 (or 1) without much loss of accuracy.
- The 'published' date also changes a lot, and serves no real good purpose.
- Doesn't it protect relays from having their old descriptors replayed to the authorities?
- The microdescriptor field changes as often as the microdescriptor does.
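The effect of coarser bandwidth values can be estimated by rounding to fewer significant figures and counting how many entries still differ between two consensuses. This is a minimal sketch: `round_sig` and `changed_count` are hypothetical helpers, the bandwidth values are made up, and round-half-up is an assumption rather than what the draft proposal specifies.

```python
def round_sig(value, figures):
    """Round a non-negative integer to `figures` significant figures."""
    if value <= 0:
        return 0
    digits = len(str(value))
    step = 10 ** max(0, digits - figures)
    # Integer arithmetic avoids floating-point surprises; rounds half up.
    return (value + step // 2) // step * step


def changed_count(old_bw, new_bw, figures):
    """Count relays whose rounded bandwidth differs between two consensuses."""
    return sum(
        1
        for relay in old_bw
        if relay in new_bw
        and round_sig(old_bw[relay], figures) != round_sig(new_bw[relay], figures)
    )


if __name__ == "__main__":
    old = {"relayA": 1234, "relayB": 5000}  # made-up values
    new = {"relayA": 1236, "relayB": 5100}
    for figures in (3, 2, 1):
        print(figures, "figures:", changed_count(old, new, figures), "changed")
```

Every entry that rounds to the same value in both consensuses is a line that no longer appears in the diff.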
Proposal:
- Draft proposal 1: https://people.torproject.org/~nickm/volatile/proposals/xxx-lower-bw-granularity.txt
- Draft proposal 2: https://people.torproject.org/~nickm/volatile/proposals/xxx-md-published-time-is-silly.txt
Status: These turned into proposals 276 and 275. Both have received discussion. For 275 (published times), we found at least one use of published time that prevents it from being implemented now. For 276, we've been discussing tradeoffs in different schemes, and need to simulate the effects on path selection correctness.
Idea 8: Update directory information less frequently (client-side)
Idea: What if clients simply try less hard to stay up-to-date? What would the anonymity impact be?
Status: This doesn't seem worthwhile, given consensus diffs.
Idea 9: Update directory information less frequently (server side)
Idea: How bad would it be if there were a consensus once every 2 hours? every 3?
Note: The shared random interval depends on the consensus interval, and is currently hardcoded at 1 hour * 12 rounds * 2 phases = 24 hours. We'd need to implement something like #18295 if we still wanted shared random every 24 hours.
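The arithmetic in the note above can be written out as follows. This is a minimal sketch with hypothetical function names; the 12 rounds and 2 phases come from the note, and the second function only shows how many rounds per phase would keep a 24-hour period if the round count were made configurable.

```python
def shared_random_period_hours(consensus_interval_hours, rounds=12, phases=2):
    """Length of one shared-random protocol run, in hours."""
    return consensus_interval_hours * rounds * phases


def rounds_for_period(consensus_interval_hours, target_hours=24, phases=2):
    """Rounds per phase needed to hit target_hours, or None if it doesn't divide."""
    rounds, remainder = divmod(target_hours, consensus_interval_hours * phases)
    return rounds if remainder == 0 else None


if __name__ == "__main__":
    for interval in (1, 2, 3):
        print(
            f"{interval}h consensus: period "
            f"{shared_random_period_hours(interval)}h with 12 rounds, "
            f"{rounds_for_period(interval)} rounds needed for 24h"
        )
```

With the hardcoded 12 rounds, a 2-hour consensus interval stretches the period to 48 hours and a 3-hour interval stretches it to 72 hours.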
Status: This doesn't seem worthwhile, given consensus diffs.
Idea 10: Avoid requesting microdescriptors in small batches
Idea: Currently when we want e.g. 21 microdescriptors, we'll probably request them in 3 batches of 7. This makes our compression ratios worse than they would be otherwise. But it ensures that a single relay can't selectively send us microdescriptors.
Note: There's not much reason to make this change if we also make the change of idea 3 above.
Status: Implementation pending review in #23220.
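The compression cost of small batches is easy to demonstrate. This is a minimal sketch, assuming zlib and synthetic microdescriptor-like documents whose key bodies are random; `fake_microdesc` and `compressed_size` are hypothetical helpers, and real microdescriptors will give different numbers.

```python
import base64
import random
import zlib


def fake_microdesc(rng):
    """Synthetic microdescriptor-like text; the key body is random bytes."""
    key = base64.b64encode(bytes(rng.getrandbits(8) for _ in range(140))).decode()
    return (
        "onion-key\n-----BEGIN RSA PUBLIC KEY-----\n"
        + key
        + "\n-----END RSA PUBLIC KEY-----\n"
    ).encode()


def compressed_size(docs, batch_size):
    """Total compressed bytes when docs are sent in batches of batch_size."""
    total = 0
    for start in range(0, len(docs), batch_size):
        batch = b"".join(docs[start:start + batch_size])
        total += len(zlib.compress(batch))
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    docs = [fake_microdesc(rng) for _ in range(21)]
    print("3 batches of 7:", compressed_size(docs, 7), "bytes")
    print("1 batch of 21: ", compressed_size(docs, 21), "bytes")
```

Each separate batch pays again for stream overhead and for the boilerplate that a single larger stream would only encode once.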
Idea 11: Stop caching old consensus flavours
Idea: Currently directory caches download both the default (also called "ns") and "microdesc" flavours. But recent relays and clients only use the microdesc flavour. We should work out a way to stop caching old flavours once no supported versions need them. This reduces the load on the authorities.
We could also work out a way to cache new flavours even if caches can't parse them.
One way to do this is to add a line to the consensus that lists the flavours that should be cached.
Note: We might need to serve empty documents to older client versions, so they don't panic and overload the network.
Status: We can't implement this until 2020, when 0.2.9 and earlier are obsolete.