Trac issues
https://gitlab.torproject.org/legacy/trac/-/issues

Issue #22051: Use streaming compression API in the non-streaming compression API
https://gitlab.torproject.org/legacy/trac/-/issues/22051
2020-06-13T15:08:09Z | Alexander Færøy (ahf@torproject.org)

We currently have some code duplication in tor's compress/uncompress API and the streaming compression/decompression API.
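One way to remove that duplication is the classic one-shot-on-top-of-streaming pattern. A minimal sketch in Python, with zlib's streaming object standing in for tor's streaming API (the names here are not tor's):

```python
import zlib

def one_shot_compress(data: bytes, chunk_size: int = 4096) -> bytes:
    """A one-shot compress implemented as a loop over the streaming API.

    Mirrors the ticket's idea: tor_gzip_compress() could drive the
    streaming compressor to completion instead of duplicating its logic.
    """
    stream = zlib.compressobj()              # streaming compressor state
    out = []
    for i in range(0, len(data), chunk_size):
        out.append(stream.compress(data[i:i + chunk_size]))
    out.append(stream.flush())               # finalize the stream
    return b"".join(out)

# With the same parameters, the output is byte-identical to the
# dedicated one-shot entry point:
data = b"fee fie foe fum " * 500
assert one_shot_compress(data) == zlib.compress(data)
```

The same shape would apply to the LZMA and Zstandard backends, which would shrink each one-shot entry point to a few lines.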
It could make sense to check if the streaming API could be used to implement `tor_gzip_compress()` and `tor_gzip_uncompress()` (and the new LZMA & Zstandard implementations when they land).
Milestone: Tor: 0.3.1.x-final | Assignee: Nick Mathewson

Issue #21215: Lower the directory overhead for low-bandwidth clients
https://gitlab.torproject.org/legacy/trac/-/issues/21215
2020-06-13T15:05:26Z | Nick Mathewson

This is a parent ticket for the actual implementation work of sponsor4. As we complete measurement (#21205) and design (#21209), we'll add child tickets here for the particular work we are going to accomplish.
Milestone: Tor: 0.3.2.x-final | Assignee: Nick Mathewson

Issue #21214: Based on measurement of #21205, write/analyze additional proposals and tickets for lowering bw usage for directory stuff
https://gitlab.torproject.org/legacy/trac/-/issues/21214
2020-06-13T15:05:26Z | Nick Mathewson

Based on the measurements we get from #21205, we'll probably learn more about some actual bandwidth needs, and the circumstances when dir BW is overused. We should add tickets to fix the bugs, possibly with proposals, based on what we find out.
Milestone: Tor: 0.3.2.x-final

Issue #21213: Write and analyze proposals for fetching consensuses/microdescriptors less frequently?
https://gitlab.torproject.org/legacy/trac/-/issues/21213
2020-06-13T15:05:25Z | Nick Mathewson
**The idea**: Our current algorithm for deciding whether you need a new consensus is ad hoc; we just picked an interval more or less at random.
Depending on the results from #21205, we may learn that it's not as necessary as we had thought for a client to fetch consensuses and microdescriptors so often. If that's the case, we should have proposals and analyses for (optionally?) decreasing the frequency of our downloads.
There may be different results here for "busy" and "not so busy" clients.
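One concrete way a busy/not-so-busy split could look, purely as a hypothetical sketch (the window-halving policy and all names here are invented for illustration, not anything tor implements):

```python
import random
from datetime import datetime, timedelta

def pick_next_consensus_fetch(fresh_until: datetime, valid_until: datetime,
                              busy: bool, rng: random.Random) -> datetime:
    """Pick a randomized time to fetch the next consensus.

    Hypothetical policy: a busy client refetches somewhere in the first
    half of the (fresh-until, valid-until) window; an idle client may
    wait anywhere in the whole window.
    """
    window = (valid_until - fresh_until).total_seconds()
    limit = window / 2 if busy else window
    return fresh_until + timedelta(seconds=rng.uniform(0, limit))
```

Randomizing within the window also spreads the fetch load across directory caches instead of producing synchronized spikes.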
Of course, the analysis needs to include the security impact.
Milestone: Tor: 0.3.2.x-final | Assignee: Nick Mathewson

Issue #21211: Write and analyze proposals for compressing consensus (diff)s with better algorithms
https://gitlab.torproject.org/legacy/trac/-/issues/21211
2020-06-13T15:05:24Z | Nick Mathewson

**The idea:** Consensus documents are compressed with zlib, but nobody has to compress any given consensus more than once. Therefore, we can safely use more CPU compressing them, and save bandwidth on consensus downloads by switching to something else instead of zlib for consensuses.
This same analysis also applies to consensus diffs.
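The potential savings are easy to estimate offline. A rough sketch using only Python's standard library: lzma stands in for the "more CPU, better ratio" candidates (Zstandard is not in the stdlib), and the sample document is synthetic, not a real consensus:

```python
import zlib
import lzma

def compressed_sizes(doc: bytes) -> dict:
    """Compressed size of one document under a few schemes."""
    return {
        "raw": len(doc),
        "zlib-6": len(zlib.compress(doc, 6)),         # roughly today's setup
        "zlib-9": len(zlib.compress(doc, 9)),         # more CPU, same format
        "lzma-9": len(lzma.compress(doc, preset=9)),  # much more CPU
    }

# A synthetic consensus-like document: many near-identical lines.
sample = b"".join(b"r relay%d 203.0.113.%d 9001 0 9030\n" % (i, i % 250)
                  for i in range(5000))
sizes = compressed_sizes(sample)
```

Running this kind of comparison over real consensuses and diffs is the measurement the ticket asks for; the interesting number is the ratio between the zlib and non-zlib columns, weighed against the extra code and CPU.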
For this ticket, we should look at the code complexity and potential bandwidth savings here, and decide whether they are worth it.
Milestone: Tor: 0.3.1.x-final | Assignee: Alexander Færøy

Issue #21209: Write, revise, analyze proposals for ways to use less directory bandwidth
https://gitlab.torproject.org/legacy/trac/-/issues/21209
2020-06-13T15:05:23Z | Nick Mathewson

We have a bunch of ideas about how to use less bandwidth for directory stuff. But most of them need to be expanded into proposals, and some of the ones that *are* proposals need better analysis -- informed in part by the information we hope to get from #21205.
This is a parent ticket. Each child ticket will be for one particular proposal.
Milestone: Tor: 0.3.2.x-final | Assignee: Nick Mathewson

Issue #21208: Measure overall client bandwidth usage and circuit counts
https://gitlab.torproject.org/legacy/trac/-/issues/21208
2020-06-13T15:05:23Z | Nick Mathewson

See parent ticket for context.
To put directory requests into context, we should analyze non-directory usage in a similar way, for bandwidth and circuit usage. This part doesn't need to be so fine-grained though.
Milestone: Tor: unspecified

Issue #21207: Test scenarios for clients that are idle for large periods of time
https://gitlab.torproject.org/legacy/trac/-/issues/21207
2020-06-13T15:05:22Z | Nick Mathewson

See parent ticket for context.
We could use some automated tests that exercise clients in a certain predictable way (as described in the parent ticket), and record the directory bandwidth usage and non-directory bandwidth usage.
I think that we should do this on the real network, and not on a test network: otherwise, there are far too many ways that we can get it wrong.
Milestone: Tor: unspecified

Issue #21206: Measure client up/down bandwidth for directory requests, split by type
https://gitlab.torproject.org/legacy/trac/-/issues/21206
2020-06-13T15:05:22Z | Nick Mathewson

See parent ticket for context.
We need a way to measure, over time, split up by type of directory request, how much bandwidth a client uses for requests and for responses.
We should include both directory lookups that are successful and those that are not.
We should, if possible, measure this with and without circuit overhead. But not if it's too hard.
We should, if possible, count failed circuits that are opened only for directory requests. But not if it's too hard.
Milestone: Tor: unspecified

Issue #21205: Instrument clients to measure directory usage
https://gitlab.torproject.org/legacy/trac/-/issues/21205
2020-06-13T15:05:21Z | Nick Mathewson
We want to reduce the directory overhead needed for low-bandwidth clients. To do this well, we should make it so a client can measure how much stuff it downloads, and how much of what kind of traffic it uses to download it.
In general, this should be local-only measurements for use while testing: there's no reason we need a complex infrastructure for this one, since the results should be pretty uniform given the same client software in the same circumstances.
Here are some strawman scenarios for client usage which we might want to think about:
* Client is completely unused.
* Client is completely unused, but restarted or HUPed once every N hours.
* Client is used to fetch a .onion website once every N hours.
* Client is used to fetch an HTTP website once every N hours.
* Client is turned on, and connected to (say) IRC all the time.
The relevant values of "N" are probably something like: .5 hours, 1 hour, 4 hours, 8 hours, ... up to a week?
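The per-scenario bookkeeping could be as simple as a byte counter keyed by request type, local-only as described above. A sketch (the type labels and method names are made up for illustration, not tor's internals):

```python
from collections import Counter

class DirUsageCounter:
    """Local-only tally of directory bytes, split by request type."""

    def __init__(self):
        self.up = Counter()    # bytes sent, per request type
        self.down = Counter()  # bytes received, per request type

    def note_request(self, kind: str, sent: int, received: int) -> None:
        self.up[kind] += sent
        self.down[kind] += received

    def report(self) -> str:
        kinds = sorted(set(self.up) | set(self.down))
        return "\n".join("%s: %d up / %d down" % (k, self.up[k], self.down[k])
                         for k in kinds)

usage = DirUsageCounter()
usage.note_request("consensus", 250, 600000)
usage.note_request("microdesc", 180, 45000)
```

Dumping the report at exit (or on SIGHUP) would be enough for the strawman scenarios above, since the results should be reproducible given the same client software in the same circumstances.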
This is a parent ticket. Child tickets will focus on particular measurements we want.
Milestone: Tor: unspecified

Issue #16844: Slow clients can't bootstrap because they expire their consensus fetch but then receive all the bytes from it anyway, making them expire their next fetch, putting them in a terrible loop
https://gitlab.torproject.org/legacy/trac/-/issues/16844
2020-06-13T15:42:21Z | Roger Dingledine
If you start your Tor client with no cached directory info, and on a slow (high latency) link, you get:
```
$ tail -f tordebug-log|grep connected
Aug 09 16:17:12.299 [info] circuit_handle_first_hop(): Next router is $7EA6EAD6FD83083C538F44038BBFA077587DD755~7EA6EAD6FD83083C538 at 194.109.206.212: Not connected. Connecting.
Aug 09 16:17:12.826 [info] connection_edge_process_relay_cell_not_open(): 'connected' received for circid 2747423797 streamid 16685 after 0 seconds.
Aug 09 16:21:57.298 [info] circuit_handle_first_hop(): Next router is $9695DFC35FFEB861329B9F1AB04C46397020CE31~9695DFC35FFEB861329 at 128.31.0.39: Not connected. Connecting.
Aug 09 16:21:59.099 [info] connection_edge_process_relay_cell_not_open(): 'connected' received for circid 4248917890 streamid 42612 after 1 seconds.
Aug 09 16:22:09.711 [info] circuit_handle_first_hop(): Next router is $332CD489177F202570A7021328A17A91BF823889~332CD489177F202570A at 192.150.94.49: Not connected. Connecting.
Aug 09 16:22:09.711 [info] circuit_handle_first_hop(): Next router is $90E9E44FD74B98F87F7573F917AE4AF651B86B4C~90E9E44FD74B98F87F7 at 5.102.146.106: Not connected. Connecting.
Aug 09 16:22:09.712 [info] circuit_handle_first_hop(): Next router is $547C1CDB516798EC66A01F04A5884DCE1A151919~547C1CDB516798EC66A at 87.72.85.217: Not connected. Connecting.
Aug 09 16:22:12.499 [info] connection_edge_process_relay_cell_not_open(): 'connected' received for circid 2850575558 streamid 43745 after 0 seconds.
Aug 09 16:23:33.901 [info] connection_edge_process_relay_cell(): 'connected' received on circid 2850575558 for streamid 43746, no conn attached anymore. Ignoring.
Aug 09 16:24:11.599 [info] connection_edge_process_relay_cell_not_open(): 'connected' received for circid 4148503990 streamid 17036 after 0 seconds.
Aug 09 16:25:34.306 [info] connection_edge_process_relay_cell(): 'connected' received on circid 4148503990 for streamid 17037, no conn attached anymore. Ignoring.
Aug 09 16:26:29.559 [info] connection_edge_process_relay_cell_not_open(): 'connected' received for circid 2948078868 streamid 42748 after 0 seconds.
[...]
```
Oh hey, what's this "no conn attached anymore" issue?
```
$ grep 43746 tordebug-log
Aug 09 16:22:12.302 [info] connection_ap_handshake_send_begin(): Sending relay cell 1 on circ 2850575558 to begin stream 43746.
Aug 09 16:22:22.299 [debug] circuit_detach_stream(): Removing stream 43746 from circ 2850575558
Aug 09 16:23:33.901 [debug] connection_edge_process_relay_cell(): Now seen 1433 relay cells here (command 4, stream 43746).
Aug 09 16:23:33.901 [info] connection_edge_process_relay_cell(): 'connected' received on circid 2850575558 for streamid 43746, no conn attached anymore. Ignoring.
Aug 09 16:23:35.799 [debug] connection_edge_process_relay_cell(): Now seen 1434 relay cells here (command 2, stream 43746).
Aug 09 16:23:35.799 [info] connection_edge_process_relay_cell(): data cell dropped, unknown stream (streamid 43746).
[...]
```
We're hitting the 10-second stream timeout in connection_ap_expire_beginning(), and detaching the stream from the circuit, and presumably trying it again elsewhere.
But that last line above, "data cell dropped", is especially bad news for us -- we get the whole answer, and ignore it all, since we sent an 'end' cell changing our mind but the answer was already coming at us.
This situation comes up because of the optimistic data feature -- we get the answer to our request bundled with the 'connected' cell, which is a feature except in the case where we canceled (and then forgot about) the stream.
For people trying to bootstrap their Tor on a low-bandwidth high-latency network connection, I bet this landmine will be especially frustrating, since you will be clogging your network connection with directory information that you will discard, which in turn will delay the receipt of the other directory information.
You can reproduce the bug on your own, even if you have a great network connection, by starting your Tor with "bandwidthrate 2000 bandwidthburst 5000".
Milestone: Tor: unspecified

Issue #13339: Prop140: Complete Consensus diffs / Merge GSoC project
https://gitlab.torproject.org/legacy/trac/-/issues/13339
2020-06-13T14:39:24Z | mvdan
Google Summer of Code finished over a month ago, and during this time I've been tidying up my code a bit and reading it for the merge. You will find it on github:
https://github.com/mvdan/tor
This ticket is for the sole purpose of following the merge process and its progress. But as always I'm on IRC and mail if you want to contact me directly.
I just rebased against master this morning. Nick and Sebastian have been reviewing my code over the summer, but of course more sets of eyes are needed.
The test coverage for the diff generation and application is fine (see test_consdiff.c), but there aren't any tests for the stuff I wrote to wire it into serving and fetching consensus diffs. Not really sure how to go about that, can't really promise I'd have the time to dive into it.
And regarding commit messages and changelog entries, I pretty much went with my instinct. Chances are they can be improved - the commit messages for future reference and the changelog entries for future release changelogs - so criticism is welcome.
Milestone: Tor: 0.3.1.x-final | Assignee: Nick Mathewson

Issue #10871: Download more microdescriptors with a shorter request
https://gitlab.torproject.org/legacy/trac/-/issues/10871
2020-06-13T14:34:07Z | Nick Mathewson
In a comment on #9969, karsten said:
"""
A few thoughts:
* Would it help if we implemented /tor/micro/all which is mentioned in dir-spec section 5.2 "Downloading router descriptors or microdescriptors" but which is not implemented yet? Of course, then clients would download the bulk of microdescriptors from a single directory.
* Do we have to include full digests in requests, or would it be sufficient to ask for the first few digest bytes? Assuming that clients would only accept descriptors matching locally stored full digests. For example, requests could contain only the first 4 (or 8) base64 chars representing the first 3 (or 6) digest bytes. Directories could accept any multiple of 4 base64 chars.
* Mixing the two ideas, how about we add a way to ask for 1/2, 1/4, etc. of all microdescriptors in a single request? The request could be /tor/micro/all/<base64-prefix>/<bits>, so that /tor/micro/all/A/1 means all digests starting with 0 binary, /tor/micro/all/w/2 means all digests starting with 11 binary, etc. Clients could decide how many requests to send from the number of descriptors they need, which may change over time.
Each of these ideas requires us to upgrade authorities and caches before clients will be able to use them.
"""
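The third idea's bucketing is straightforward to prototype. A sketch of the prefix test a cache would apply for /tor/micro/all/&lt;base64-prefix&gt;/&lt;bits&gt; (the function names are ours, for illustration, not dir-spec's):

```python
B64_ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz0123456789+/")

def leading_bits(b64: str, n: int) -> int:
    """First n bits of a base64-encoded digest, as an integer."""
    value = 0
    for ch in b64:
        value = (value << 6) | B64_ALPHABET.index(ch)
    return value >> (6 * len(b64) - n)

def in_bucket(digest_b64: str, prefix_b64: str, bits: int) -> bool:
    """Would this digest be served for /tor/micro/all/<prefix>/<bits>?"""
    return leading_bits(digest_b64, bits) == leading_bits(prefix_b64, bits)

# The examples from the comment: /tor/micro/all/A/1 is every digest whose
# first bit is 0; /tor/micro/all/w/2 is every digest starting with binary 11.
assert leading_bits("A", 1) == 0b0
assert leading_bits("w", 2) == 0b11
```

Since each base64 character carries 6 bits, requests in whole characters ("any multiple of 4 base64 chars" in the comment) keep the arithmetic byte-aligned on the wire, but the bit-level form above shows why arbitrary bucket sizes are also cheap to support.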
I'm giving this a separate ticket since it's going to need analysis which #9969 won't.
Milestone: Tor: unspecified

Issue #7986: Lengthen the consensus validity interval
https://gitlab.torproject.org/legacy/trac/-/issues/7986
2020-06-13T14:26:36Z | Nick Mathewson
Right now, the 24 hour "build a consensus in this interval or die horribly" time is chosen completely arbitrarily. I know of no reason to not allow 48 or 72 hours.
See comments on #2681 for more discussion of this point. There's some discussion there of choosing an optimal time, but I suggest we just have some clients try 48 or 72 hours and see how it goes.
I'm boldly going with 72.
cc'ing arma because he remembers where the pain points are on the directory info download food chain, and mike because he has felt the pain.
Milestone: Tor: unspecified

Issue #2681: brainstorm ways to let Tor clients use yesterday's consensus more safely
https://gitlab.torproject.org/legacy/trac/-/issues/2681
2022-03-22T13:28:40Z | Roger Dingledine

Right now Tor clients won't use a consensus that's 25 hours old. But if the directory authorities don't agree on a consensus for a day, things can go bad. We need to investigate other tradeoffs in this space than the one we've currently picked.
For instance: if you got your directory consensus info when it was valid, but you haven't been able to get any new consensus, perhaps you should be more forgiving about the timestamp on the consensus you have. That's a slightly different scenario than believing a new consensus that's 48 hours old.
Another option is just to change 24 to 48, which probably doesn't put clients at much greater harm, but gives us a lot more breathing room for mistakes.
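The two options sketch out to a small predicate. For illustration only (the constant and the once-valid clause are placeholders for whatever the analysis picks, not tor's actual rule):

```python
from datetime import datetime, timedelta

CONSENSUS_GRACE = timedelta(hours=24)  # the knob this ticket debates (24 vs 48)

def consensus_usable(valid_until: datetime, now: datetime,
                     fetched_while_valid: bool) -> bool:
    """Sketch of a more forgiving liveness rule.

    Accept a consensus until it has been expired for longer than the
    grace period; additionally, per the first suggestion above, be more
    forgiving about a consensus we fetched while it was still valid.
    """
    if now <= valid_until + CONSENSUS_GRACE:
        return True
    return fetched_while_valid
```

The second clause captures the distinction drawn above: trusting your own once-valid copy for longer is a weaker step than believing a brand-new consensus that is already 48 hours old.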
The implementation side of this will be tricky, because we'll need to make sure that clients can handle descriptors that are 36 hours out of date too. We started implementing that feature several times, but I think we've never finished it.
Milestone: Tor: unspecified

Issue #2149: new 'extra dormant' mode for people who never use their tor
https://gitlab.torproject.org/legacy/trac/-/issues/2149
2020-06-13T15:35:57Z | Roger Dingledine
If you stop using your Tor as a client (and it doesn't have dirport or orport open), after an hour you'll stop fetching descriptors, but you'll continue to fetch consensuses.
Periodically we hear plans like http://wiki.debian.org/FreedomBox/DesignAndToDos that will involve installing Tor on things by default.
We should teach Tor that if a lot of time passes (e.g. 24 hours), it should stop fetching consensuses also. It should also remember in its state file that it's doing so, so the dormancy survives across reboots.
Milestone: Tor: unspecified
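A sketch of the requested behavior, with the state-file format and names invented for illustration (tor's real state file is line-oriented, not JSON):

```python
import json
import os
import tempfile

DORMANT_AFTER_SECS = 24 * 3600  # "a lot of time passes (e.g. 24 hours)"

def update_dormancy(state_path: str, last_activity: float, now: float) -> bool:
    """Decide whether to stop fetching consensuses, and persist the answer.

    After 24 idle hours go extra-dormant, and record both the flag and
    the last-activity time in the state file so the dormancy survives a
    restart.
    """
    dormant = (now - last_activity) >= DORMANT_AFTER_SECS
    with open(state_path, "w") as f:
        json.dump({"dormant": dormant, "last_activity": last_activity}, f)
    return dormant

def load_dormancy(state_path: str) -> bool:
    """On startup, restore the persisted dormancy flag."""
    with open(state_path) as f:
        return json.load(f)["dormant"]

# Demo against a throwaway state file:
state = os.path.join(tempfile.mkdtemp(), "state")
now = 1600000000.0
went_dormant = update_dormancy(state, now - 25 * 3600, now)
```

Persisting the last-activity time (not just the flag) matters: a box that reboots daily would otherwise reset its idle clock and never reach the threshold.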