I agree that it would be useful to have such a check box, and I really want to implement it, because I'm interested in the results, too.
But I'm more and more convinced that it's not possible to group relays by family, or at least not in an unambiguous way. What is possible is looking up the relays in the same family as a given relay. But looking at all relays and grouping them by family seems hard, if not impossible.
Here's an example. Assume we have three relays: A, B, and C. These relays state the following family relationships:
A: A, B
B: A, B, C
C: B, C
We require mutual agreement about being in the same family, so we could either come up with family A, B or with family B, C. Which one is correct?
Of course, we could apply fancy heuristics to find largest families and break ties in favor of higher overall consensus weights, smaller fingerprints, or something. But that still sounds like hacking to me. Before we enter that stage, I'd like to know if this problem can be solved otherwise. Got any ideas?
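To make the ambiguity concrete, here is a minimal sketch of the mutual-agreement check on the three-relay example. The function names and the dictionary layout are illustrative assumptions, not taken from Compass or Onionoo.

```python
from itertools import combinations

# Declared families from the example above (relay -> relays it lists).
declared = {
    "A": {"A", "B"},
    "B": {"A", "B", "C"},
    "C": {"B", "C"},
}

def mutual_pairs(declared):
    """Return all pairs of relays that list each other."""
    return {
        frozenset((x, y))
        for x, y in combinations(sorted(declared), 2)
        if y in declared[x] and x in declared[y]
    }

def is_strict_family(group, declared):
    """True if every member of the group lists every other member."""
    pairs = mutual_pairs(declared)
    return all(frozenset(p) in pairs for p in combinations(sorted(group), 2))

pairs = mutual_pairs(declared)
```

Under strict mutual agreement, `pairs` contains A-B and B-C, while `is_strict_family({"A", "B", "C"}, declared)` is false because A and C do not list each other. B therefore belongs to two valid families, and nothing in the data says which one to prefer.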
In this situation we have two overlapping families.
family1 = A,B
family2 = B,C
For Compass (even if this is not the case in real path selection by Tor clients) I would choose the simple approach and merge overlapping families:
family (as seen/interpreted by compass) = A, B, C
If you go down this path (merging such overlapping families), you could still add an option that requires strict mutual agreement. (Then you would have two separate families: family1 & family2.)
What do you think about this approach (merging overlapping families)?
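One way to read "merging overlapping families" is as finding the connected components of the graph whose edges are mutual family declarations. The following is only a sketch under that assumption; `extended_families` is a hypothetical name, and relays without any mutual partner are left out of the result.

```python
from collections import defaultdict

def extended_families(declared):
    """Merge overlapping families into connected components.

    declared maps each relay to the set of relays it lists. An edge
    exists only where two relays list each other.
    """
    neighbors = defaultdict(set)
    for relay, listed in declared.items():
        for other in listed:
            if other != relay and relay in declared.get(other, ()):
                neighbors[relay].add(other)

    seen = set()
    families = []
    for relay in sorted(neighbors):
        if relay in seen:
            continue
        # Depth-first walk over mutual edges starting at this relay.
        component, stack = set(), [relay]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        families.append(component)
    return families

example = {"A": {"A", "B"}, "B": {"A", "B", "C"}, "C": {"B", "C"}}
merged = extended_families(example)
```

For the example above, `merged` holds the single family A, B, C. A strict mode would instead keep A-B and B-C as separate families.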
Hey, I like both suggestions. I was so focused on resolving overlapping families and coming up with nicely separated families that I didn't think of either merging them into "extended families" or accepting the fact that they're overlapping. :) I'll experiment with both family definitions and let you know what I come up with. Compass integration will then be the next step.
Trac: Status: new to accepted; Owner: N/A to karsten
I just attached early results of overlapping/extended families as defined by you. These results look plausible to me, but I didn't confirm them as carefully as I'd like to, and I gotta run now and don't know if I have time today to continue working on this. Maybe you want to have a look?
I did some manual checks of your results, because the lines with only two family members were not clear to me,
but after checking them manually it was clear that the "overlapping node" was down and therefore not showing up in the second part (merged families) of the txt file.
I'm also going to contact these relay operators, so they can fix their family settings.
Configuring families with a large number of relays that are regularly extended is a PITA. Why not use this definition of families in Tor directly? This would reduce the number of nodes to reconfigure to 2, regardless of how many nodes you have, but I would be surprised if no one else has had that thought already. Anyway, I'll propose it on tor-dev.
I'm wondering why the following family is showing up in the results, because it seems to be a mutually set up family (one relay has been down since 2012-03, though).
> I did some manual checks of your results, because the lines with only two family members were not clear to me,
> but after checking them manually it was clear that the "overlapping node" was down and therefore not showing up in the second part (merged families) of the txt file.
That's true, we cannot confirm mutual family relationships if one of the nodes is down. That means that the two families A-B and B-C cannot be merged to A-B-C if B is down.
But that being said, I found a bug in my code where mutual checks were not performed correctly. I attached a fixed list based on the same consensus.
> I'm wondering why the following family is showing up in the results, because it seems to be a mutually set up family (one relay has been down since 2012-03, though).
I'm not sure what you mean. I don't see where this is wrong, but it could be that the issue is fixed in the newly attached document. If not, can you explain where the problem lies?
Also, if you can, please go through the fixed list and let me know if there are any other problems. I looked through the list once or twice and it looked plausible (though the first one did that, too). Thanks!
> That's true, we cannot confirm mutual family relationships if one of the nodes is down. That means that the two families A-B and B-C cannot be merged to A-B-C if B is down.
I wouldn't mind declaring the family A-B-C even if B is down, as long as we are able to find a valid descriptor for B within the last X days.
> I'm not sure what you mean. I don't see where this is wrong, but it could be that the issue is fixed in the newly attached document. If not, can you explain where the problem lies?
OK, there was probably a misunderstanding on my side about what your list actually contains. I thought it only contained "imperfect" families that were merged afterwards, but if it contains all families (even those with complete and mutual agreement), then it is clear.
> I wouldn't mind declaring the family A-B-C even if B is down, as long as we are able to find a valid descriptor for B within the last X days.
Oh yes, we can and probably should do that. We have the descriptors of relays that have been running in the past seven days. I attached a new output file.
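A rough sketch of that idea: keep the family declaration of every relay whose descriptor is at most seven days old, whether or not the relay is currently running. The `descriptors` layout and the name `recent_declarations` are assumptions made for illustration and do not reflect the actual data format.

```python
from datetime import datetime, timedelta

def recent_declarations(descriptors, now, max_age=timedelta(days=7)):
    """Keep family declarations from descriptors published within max_age.

    descriptors maps relay -> (published time, set of relays it lists).
    """
    return {
        relay: family
        for relay, (published, family) in descriptors.items()
        if now - published <= max_age
    }

now = datetime(2012, 6, 15)
descriptors = {
    "A": (now - timedelta(hours=2), {"A", "B"}),
    "B": (now - timedelta(days=3), {"A", "B", "C"}),  # down, but seen recently
    "C": (now - timedelta(hours=5), {"B", "C"}),
    "D": (now - timedelta(days=30), {"C", "D"}),      # too old to count
}
usable = recent_declarations(descriptors, now)
```

Here B's declaration is still usable although B is down, so A-B and B-C can be confirmed and merged. D's descriptor is older than the window and is dropped.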
In #23517 (moved) it is planned to merge Compass functionality with Relay Search (formerly known as Atlas). These tickets may be relevant to that work and so these are being reassigned to the Metrics/Atlas component.
A details view with aggregated graphs may be added in #23509 (moved).
I think in general this addresses most use cases. A true grouping by family may be enabled by using a database to back Onionoo but until we get there, this is just not implementable.
Trac: Status: assigned to closed; Resolution: N/A to wontfix