Should we stop sanitizing nicknames in bridge descriptors?

added actualpoints::6 component::metrics/collector priority::medium resolution::implemented status::closed type::enhancement labels

Some options:

Do not sanitize the nicknames

counting Tor Cloud bridges would be easy
might reveal bridges whose nicknames correlate with public relays
reveals approximate network location (AWS)

More than half of all bridges have a unique nickname. How do we estimate the risk of not sanitizing nicknames?

Mention Tor Cloud in the extra-info 'platform' string

counting Tor Cloud bridges would be easy
could indicate approximate network location (AWS)
Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?

Trac:
Cc: N/A to aagbsn

Replying to aagbsn:

Mention Tor Cloud in the extra-info 'platform' string

counting Tor Cloud bridges would be easy

could indicate approximate network location (AWS)

Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?

This would require a patch to Tor -- possibly a patch to the Tor package used in the Tor Cloud images, or possibly an upstream patch to add an ExtraPlatformInformationString torrc option.

Replying to aagbsn:

Some options:

Do not sanitize the nicknames

counting Tor Cloud bridges would be easy

might reveal bridges whose nicknames correlate with public relays

reveals approximate network location (AWS)

More than half of all bridges have a unique nickname. How do we estimate the risk of not sanitizing nicknames?

The typical approach in the past was to describe the suggested change on tor-dev, ask people if they think it's a bad idea and why, and if nobody objects, make the new data available one or two weeks later. If there are no general concerns about the idea, I'll move the discussion to tor-dev.

Mention Tor Cloud in the extra-info 'platform' string

counting Tor Cloud bridges would be easy

could indicate approximate network location (AWS)

Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?

Yes, this is possible by doing what Robert suggests. And it's probably even the cleaner approach to encode this information in the platform string instead of the nickname. Drawbacks are that we won't learn about previously deployed EC2 bridges, and that status websites like Atlas wouldn't benefit from this solution. Maybe we should do both.

This is to CC myself and to ask something that I would have asked on the list otherwise.

Couldn't you add a setting for bridges like sanitize0/1, where 1 is the default?

For the ones that you wanna count you can set it to 0. Bridge operators that want to use something like Atlas can set it to 0 as well.

Trac:
Cc: aagbsn to aagbsn, bastik.public@gmx-topmail.de

Replying to bastik:

Couldn't you add a setting for bridges like sanitize0/1, where 1 is the default?

For the ones that you wanna count you can set it to 0. Bridge operators that want to use something like Atlas can set it to 0 as well.

Adding an option only because the devs don't want to make a decision is in general a pretty bad idea. How can a user decide if a dev cannot? I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason. No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

Replying to karsten:

Replying to bastik:

Couldn't you add a setting for bridges like sanitize0/1, where 1 is the default?

For the ones that you wanna count you can set it to 0. Bridge operators that want to use something like Atlas can set it to 0 as well.

Adding an option only because the devs don't want to make a decision is in general a pretty bad idea.

Might be true, but maybe the devs can't decide.

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

I thought the Tor people would create the Tor image and can control the setting.

I don't love the idea, but wanted to add it for discussion. Maybe I should have just added myself to Cc.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

As I understand it, you, the Tor people, can't do that. Names can be changed. How do you define safe? Please don't feel "forced" to reply. I really don't want to start a discussion here.

Replying to bastik:

Replying to karsten:

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

In order to make this decision, operators would have to understand that they should use a different scheme for naming their bridges than for their relays. As I said on tor-dev, that's yet one more thing to tell them, and it's likely going to generate support requests for no good reason.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

This is the case for newly created EC2 images. It doesn't apply to existing EC2 images which are not updated. We'd also not learn about past statistics, and this wouldn't help Atlas at all. All in all, this config option is a usability nightmare that leaves us with mostly useless statistics.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

As I understand it, you, the Tor people, can't do that. Names can be changed. How do you define safe?

I think this is something developers have to decide, not users. Note that this isn't about a single bridge that can be located via nickname similarity. It's about not letting the attack become successful enough to make it attractive. If the adversary could locate 1% of bridges via nickname similarity, they probably wouldn't care. Also, if we can double the number of bridges by getting more funding for EC2 bridges and making it easier for operators to check how their bridge is doing via Atlas, that's a win.
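One way to put a number on the "1% of bridges" question is to measure how many bridge nicknames actually resemble relay nicknames. The following is only a rough sketch under stated assumptions: the function name `nickname_risk`, the prefix-length heuristic `min_prefix`, and the sample nicknames are all made up for illustration, and real input would have to come from the unsanitized bridge descriptors and the public relay list. It skips bridges that use Tor's default nickname `Unnamed`, since those reveal nothing.

```python
from collections import Counter

DEFAULT_NICKNAME = "Unnamed"  # Tor's default when no Nickname is configured


def nickname_risk(bridge_nicks, relay_nicks, min_prefix=6):
    """Return (unique_share, exact_share, prefix_share) over all bridges.

    unique_share: bridges whose nickname no other bridge uses
    exact_share:  bridges whose nickname equals some relay nickname
    prefix_share: bridges sharing only a leading prefix with a relay
                  (a crude, hypothetical stand-in for "similar naming scheme")
    """
    total = len(bridge_nicks)
    if total == 0:
        return 0.0, 0.0, 0.0

    relay_set = {n.lower() for n in relay_nicks if n != DEFAULT_NICKNAME}
    relay_prefixes = {n[:min_prefix] for n in relay_set if len(n) >= min_prefix}
    counts = Counter(n.lower() for n in bridge_nicks)

    unique = exact = prefix = 0
    for nick in bridge_nicks:
        if nick == DEFAULT_NICKNAME:
            continue  # default nickname carries no information
        key = nick.lower()
        if counts[key] == 1:
            unique += 1
        if key in relay_set:
            exact += 1
        elif len(key) >= min_prefix and key[:min_prefix] in relay_prefixes:
            prefix += 1
    return unique / total, exact / total, prefix / total


if __name__ == "__main__":
    # Made-up sample data, not real bridge or relay nicknames.
    bridges = ["Unnamed", "Unnamed", "alicebridge1", "bobnode", "ec2bridgeabc"]
    relays = ["alicerelay1", "bobnode", "Unnamed"]
    print(nickname_risk(bridges, relays, min_prefix=5))
```

If the exact and prefix shares together stay around a percent or two, that would support the "not attractive enough" argument; a much larger share would be a reason to keep sanitizing.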

Please don't feel "forced" to reply. I really don't want to start a discussion here.

Oh, discussion is good. Please feel free to post any thoughts you have either here or on tor-dev. I'm not at all trying to kill the discussion.

Replying to karsten:

Replying to bastik:

Replying to karsten:

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

In order to make this decision, operators would have to understand that they should use a different scheme for naming their bridges than for their relays. As I said on tor-dev, that's yet one more thing to tell them, and it's likely going to generate support requests for no good reason.

True.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

This is the case for newly created EC2 images. It doesn't apply to existing EC2 images which are not updated. We'd also not learn about past statistics, and this wouldn't help Atlas at all. All in all, this config option is a usability nightmare that leaves us with mostly useless statistics.

I hadn't thought about that. I agree about the usability. Sure, the statistics would be mostly useless.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

As I understand it, you, the Tor people, can't do that. Names can be changed. How do you define safe?

I think this is something developers have to decide, not users. Note that this isn't about a single bridge that can be located via nickname similarity. It's about not letting the attack become successful enough to make it attractive. If the adversary could locate 1% of bridges via nickname similarity, they probably wouldn't care. Also, if we can double the number of bridges by getting more funding for EC2 bridges and making it easier for operators to check how their bridge is doing via Atlas, that's a win.

I agree that only the devs can make the decision. It has to be a "global"/general decision. I should have been more verbose. My point was (and I should have made that clear) that an adversary may learn about bridge locations via nickname similarity. That couldn't be called safe, but it has to be put in relation to the other things an adversary can do. It might be safe enough. So I agree on the adversary point.

Please don't feel "forced" to reply. I really don't want to start a discussion here.

Oh, discussion is good. Please feel free to post any thoughts you have either here or on tor-dev. I'm not at all trying to kill the discussion.

I agree here as well. I picked this "channel" because it creates less noise.

Sanitized bridge descriptors now contain bridge nicknames. Closing.
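With nicknames present, counting Tor Cloud bridges could be as simple as scanning the `router` lines of the sanitized server descriptors. This is a minimal sketch, not the metrics code: the function name `count_by_nickname_prefix` is hypothetical, the `ec2bridge` prefix is an assumption about how Tor Cloud images name their bridges, and the sample descriptor lines are invented. It relies only on the `router <nickname> <address> ...` line format of server descriptors.

```python
def count_by_nickname_prefix(descriptor_text, prefix):
    """Count descriptors whose router-line nickname starts with `prefix`.

    Returns (matching, total). This counts descriptors, not distinct
    bridges; a bridge that published several descriptors is counted
    several times unless the input is deduplicated first.
    """
    total = matching = 0
    wanted = prefix.lower()
    for line in descriptor_text.splitlines():
        parts = line.split()
        # A server descriptor starts with: router <nickname> <address> ...
        if len(parts) >= 2 and parts[0] == "router":
            total += 1
            if parts[1].lower().startswith(wanted):
                matching += 1
    return matching, total


if __name__ == "__main__":
    # Invented sample input; addresses and nicknames are placeholders.
    sample = "\n".join([
        "router ec2bridgeabc 10.1.2.3 443 0 0",
        "platform Tor 0.2.2.35 on Linux x86_64",
        "router Unnamed 10.4.5.6 9001 0 0",
        "router ec2bridgexyz 10.7.8.9 443 0 0",
    ])
    print(count_by_nickname_prefix(sample, "ec2bridge"))
```

This only sees bridges that kept the image's nickname; an operator who renamed a bridge would be missed, and anyone could pick a matching nickname by hand.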

Trac:
Resolution: N/A to implemented
Status: new to closed
Actualpoints: N/A to 6

closed

added 48h of time spent

mentioned in issue #5807 (moved)

mentioned in issue #9854 (moved)
