Reach a security decision about version-fingerprinting on network documents
Often we would like to limit which network directory documents we process. If we want to reject currently generated documents, that's a compatibility issue. But sometimes we would like to reject documents that no current well-behaved supported Tor implementation will generate.
The question is: how safe is it to do so?
Let's consider the long-deprecated "opt" keyword line prefix. Right now, our implementations accept "opt" in directory documents, but nobody generates it. If we were to start rejecting "opt" tomorrow, then there would be possible network documents that our current implementations reject, but which older versions do not.
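To make the divergence concrete, here is a minimal sketch (not real Tor code; the function names are invented) of how two parser generations could treat the same "opt"-prefixed line differently:

```python
# Hypothetical sketch of two parser generations handling the
# deprecated "opt" keyword-line prefix.

def parse_line_old(line: str) -> tuple[str, str]:
    """Older behavior: silently strip a leading "opt " prefix."""
    if line.startswith("opt "):
        line = line[len("opt "):]
    keyword, _, args = line.partition(" ")
    return keyword, args

def parse_line_new(line: str) -> tuple[str, str]:
    """Hypothetical newer behavior: reject "opt" outright."""
    if line.startswith("opt "):
        raise ValueError('deprecated "opt" prefix is no longer accepted')
    keyword, _, args = line.partition(" ")
    return keyword, args

# The same document line is accepted by one version and rejected by the other:
line = "opt hidden-service-dir"
print(parse_line_old(line))   # ('hidden-service-dir', '')
try:
    parse_line_new(line)
except ValueError:
    print("rejected by new parser")
```

Any document containing such a line is accepted by the old implementation and rejected by the new one, which is exactly the observable difference an adversary could probe for.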
An adversary could use this difference to generate a document that only some parties would accept, and use this property to partition the client set by implementation version.
Let's consider an even harder-to-avoid situation. Suppose we add a new keyword "serenity" to some network document, and say that it takes a boolean argument. After this change, implementations will start rejecting "serenity foo", since "foo" is not a boolean. Older implementations will still accept "serenity foo", however, since unrecognized keywords are ignored.
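The asymmetry can be sketched as follows (a toy model, not real Tor code; "serenity" and the boolean encoding are the hypothetical example from the text):

```python
# Toy model: older parsers ignore unrecognized keywords entirely, while
# newer parsers know "serenity" and validate its boolean argument.

KNOWN_BOOL_KEYWORDS = {"serenity"}  # hypothetical new keyword

def accepts_old(document_lines) -> bool:
    # Old rule: unrecognized keywords are ignored, so these lines
    # never cause a rejection.
    return True

def accepts_new(document_lines) -> bool:
    for line in document_lines:
        keyword, _, args = line.partition(" ")
        if keyword in KNOWN_BOOL_KEYWORDS and args not in ("0", "1"):
            return False  # known keyword with a malformed argument
    return True

doc = ["serenity foo"]
print(accepts_old(doc), accepts_new(doc))  # True False
```

The same document parses cleanly on old implementations and fails on new ones, so each keyword added this way creates another observable version boundary.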
Thus, our current usual practice for adding keywords means that every time we do so, we introduce another version partitioning.
Now let's analyze the impact of these attacks.
As far as I can see (argument omitted), the only documents where these attacks matter are a bridge's relay descriptor and an HsDesc.
With a bridge's relay descriptor, a bridge can, over time, use these attacks to learn the exact versions of each of their clients. This is probably not a super fast attack. I'm not sure how bad the impact is, either: the bridge can already partition their clients by IP.
With an HsDesc, the attack seems stronger, but I'm having a hard time analyzing it. In the strongest version I can think of, a hostile operator can upload a different HsDesc to each of their 6 HsDirs, and put different introduction points in each. They can then partition their clients into up to 6 sets based on implementation versions.
Mitigations
-
We can mitigate these attacks somewhat by ensuring that they do not fail silently. If a bridge operator or HSDesc operator knows that the clients that do reject their netdocs will get loud errors, and identify the party that's doing these attacks, then their ability to mount these attacks silently will decrease.
-
Network health and Anticensorship could probably do some kind of scanning to help with the bridge descsriptors; scanning isn't possible for HsDescs.
-
We could continue to encourage more rapid updating by clients and relays.
-
If we want, we can use consensus parameters to manage when implementations change their behavior. (This approach adds complexity and delays changes.)
-
We could do a two-stage process where at first we warn on newly invalid netdocs, and only later do we reject them. (The switch-over could be based on a version, or on a consensus parameter.)
Conclusion
So, what do we think here?
The first thing to decide is, "how bad is this kind of attack?" To my current analysis it seems "not so bad", but I don't think I really have a good intuition for how bad the contemplated HsDesc partitioning is.
The second thing to do is make recommendations based on what we decide.