Starting up 0.4.3.0-alpha-dev (git-71daad16) without any cached-* files in my DataDirectory, I get:
Oct 20 04:44:56.026 [notice] Bootstrapped 30% (loading_status): Loading networkstatus consensus
Oct 20 04:44:56.636 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Oct 20 04:44:56.758 [notice] Bootstrapped 40% (loading_keys): Loading authority key certs
Oct 20 04:44:56.936 [notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
Oct 20 04:44:56.936 [notice] Bootstrapped 45% (requesting_descriptors): Asking for relay descriptors
Oct 20 04:44:56.936 [notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/5841, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
Oct 20 04:44:57.337 [notice] Bootstrapped 50% (loading_descriptors): Loading relay descriptors
Oct 20 04:44:57.592 [notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
Oct 20 04:44:58.178 [notice] Bootstrapped 58% (loading_descriptors): Loading relay descriptors
It's that "The current consensus has no exit nodes." line that is out of place.
It looks like compute_frac_paths_available() prints that line as a log_notice whenever it is called and count_usable_descriptors() returns "np" (number present) of 0.
Specifically, count_usable_descriptors() counts up how many of the exit relays in the consensus are "present" and how many are "usable". At the beginning, when we've gotten the consensus but no microdescriptors, many of the exit relays are "usable" but none of them are "present" yet. But! The logic inside count_usable_descriptors() uses the opposite meaning: it says to itself that many of the exit relays are "present" in the consensus but none of them are "usable" by this Tor yet (because we don't have a microdescriptor for them yet).
So the simple fix is that it needs to check if nu > 0, not np.
And the broader fix is that maybe we need better words for these two notions.
I based my branch on maint-0.4.2 since I don't think there's a need to go earlier than that -- it is just a "misleading log" problem so far as I can tell.
For renaming them, I would propose "present" -> "ready", and "usable" -> "listed". If we like these better names, I or somebody could go through and make a new commit changing them.
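The np versus nu check discussed above can be illustrated with a toy Python model. This is only a sketch, not Tor's code: the `Relay` class and the `count_exits` and `consensus_has_exits` helpers are hypothetical names made up for this example. It assumes nu counts the exit relays listed in the consensus and np counts those we also hold a microdescriptor for.

```python
from dataclasses import dataclass


@dataclass
class Relay:
    """Hypothetical stand-in for a consensus entry; not a Tor data structure."""
    is_exit: bool          # the consensus lists this relay as an exit
    have_descriptor: bool  # we already hold a microdescriptor for it


def count_exits(relays):
    """Return (nu, np): exits listed in the consensus, and those we hold a descriptor for."""
    nu = sum(1 for r in relays if r.is_exit)
    np = sum(1 for r in relays if r.is_exit and r.have_descriptor)
    return nu, np


def consensus_has_exits(relays, check_present=False):
    """Decide whether the consensus has exits.

    check_present=True models the misleading check (np > 0);
    the default models the proposed check (nu > 0).
    """
    nu, np = count_exits(relays)
    return (np if check_present else nu) > 0


# Fresh start: consensus fetched, but no microdescriptors downloaded yet.
fresh = [Relay(True, False), Relay(True, False), Relay(False, False)]
print(consensus_has_exits(fresh, check_present=True))  # the np > 0 check
print(consensus_has_exits(fresh))                      # the nu > 0 check
```

In this model the np > 0 check reports no exits on a fresh start (it prints False), while the nu > 0 check prints True, which matches the misleading "has no exit nodes" notice followed shortly by "contains exit nodes" in the log.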
We've switched this check a few times, and it breaks different things each time.
So we need to be very careful about this change.
I'm not sure if it is correct.
Please check the history of changes to this line.
Ok, there have been three changes to this line:
The original commit, 9b2d106e, which documented np and nu clearly (yay) but got their meanings reversed (oops).
Nick's fix in legacy/trac#14918, which went in as commit 8eb3d81e: it clarified the function comment in count_usable_descriptors (yay), changed np to nu as it should have, but left the (still wrong) comments explaining np and nu in place (oops).
I just checked num_present vs num_usable again, and I am confident that the fix in 94cb4f8 (which was the same fix as 8eb3d81e) is right: num_usable is how many are in the consensus that we would use, and num_present is how many of those we actually have descriptors for right now. So on first boot, when we have no descriptors present, we must check nu, not np, or we will wrongly conclude that we have a consensus with no exits.
Now, legacy/trac#27236 speaks of internal-only onion service networks and chutney. But I don't see how commit 3ebbc1c8 does anything about those -- it essentially just reverts Nick's fix. I'm guessing it has something to do with the nearby commit 588c7767. It looks like that commit is trying to make sure that at least one relay in the consensus can exit to at least one port. Makes sense -- but it appears to accidentally also be checking that we have a local descriptor for such a relay. Can you help me understand what exactly we need to check for, in the "internal-only onion service networks and chutney" case? We shouldn't merge this new patch until we understand the chutney requirements better.
Yes, I agree. I'll give this a go if you like my proposed name changes above: "present" -> "ready" and "usable" -> "listed". (I figure going through and doing it for the wrong names isn't a good use of anybody's time.)
It's not just first boot, it's also "if tor is launched after most descriptors have expired".
Whatever change you make, we need all the chutney tests in CI to pass. And we also need to avoid breaking Tor Browser.
I think there were some weird complexities here last time. But let's have a go :-)
Chutney runs some networks without exits, to make sure we are testing onion service circuits. (And to speed up the tests, by removing redundant relays.) Chutney doesn't care much about the difference between the exit flag and exit policy, because those networks don't have any of either.
However, some tor code does care about the difference, because it's trying to predict whether any relays with exit policies will appear once it downloads descriptors. Maybe it shouldn't care so much. Maybe it needs to care just enough to tell the user what is going on.
We want to avoid confusion. So let's call the variables what they are, not how we are using them. So we should end up using at least two names like:
n_exit_flag
n_exit_policy
n_exit_flag_and_policy
(I can't easily work out what present, ready, usable, or listed mean. And that's part of the problem here.)
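As a rough illustration of the naming suggested above, the three counters could be computed like this. The sketch is a toy Python model, not Tor's code: the `Relay` record and the `count_exit_relays` helper are hypothetical, and it assumes a relay's exit policy is only known once a descriptor for it has been downloaded (modelled here as `None` until then).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Relay:
    """Hypothetical relay record for illustration; not a Tor data structure."""
    has_exit_flag: bool                 # Exit flag set in the consensus
    policy_allows_exit: Optional[bool]  # None until a descriptor is downloaded


def count_exit_relays(relays):
    """Count relays by what they are, using the counter names from the thread."""
    counts = {"n_exit_flag": 0, "n_exit_policy": 0, "n_exit_flag_and_policy": 0}
    for r in relays:
        allows = r.policy_allows_exit is True  # unknown (None) counts as "not yet"
        if r.has_exit_flag:
            counts["n_exit_flag"] += 1
        if allows:
            counts["n_exit_policy"] += 1
        if r.has_exit_flag and allows:
            counts["n_exit_flag_and_policy"] += 1
    return counts


# A network with no exits at all, as in some chutney test networks.
no_exits = [Relay(False, False), Relay(False, False)]
# A fresh start: Exit flags are visible in the consensus, policies not yet known.
fresh_start = [Relay(True, None), Relay(True, None), Relay(False, None)]

print(count_exit_relays(no_exits))
print(count_exit_relays(fresh_start))
```

In this model the two cases are distinguishable: the exit-free network has all three counters at zero, while the fresh start has a non-zero `n_exit_flag` and zero for the two policy-based counters.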
Launching chutney using Python 2.7.6
NOTE: Registering send-data-1
NOTE: Registering check-2
Verifying data transmission: (retrying for up to 60 seconds)
Connecting: HS to fjqckbajebsxujm6.onion:5858 (127.0.0.1:4747) via client localhost:9005
Transmitting Data:
NOTE: Status:
{'send-data-1': 'connected, sending socks handshake'}