Draft: Simplify BridgeLine struct and make it variable length
Closes #60
This is a big change, all for the purpose of removing the fixed length limit from the BridgeLine
struct. While I was making the change, I took the liberty of modifying the BridgeLine
into something more usable for the client.
Note: this change is not backwards compatible, but it was never going to be. Even if all we changed was the BRIDGE_BYTES
value, old clients wouldn't be able to decrypt/decode the bridge table entries properly because they assumed an outdated value. However, this change should provide a path towards backwards compatibility if changes are made later.
To break down the changes into pieces:
-
The new
BridgeLine
struct was modified to:pub struct BridgeLine { pub uid_fingerprint: u64, pub line: String, }
We didn't need the other fields for anything Lox related, and all the client really needs is the bridge line string. Note also that there is no fixed length for the
line
field. -
Modified the encoding of
Bucket
andBridgeLine
to use protobufsThe previous encoding relied on both structs having fixed-length fields. I went with protobufs for a few reasons:
- There is a nice library for it, called prost, that is popular and maintained by the tokio-rs project.
- It's interoperable, meaning that we're using a deterministic protobuf schema, so any other client implementation that uses the same schema should be able to decode and decrypt the bridge table entries just fine.
- It allows for forward and backward compatibility, though we have to be careful here. The prost library allows us to annotate struct fields with tags, which help maintain a deterministic schema even if we reorder or remove struct fields. We have to be careful though if we rename the fields or add new ones, to use a different tag for it.
-
The
Resource
struct in the rdsys-backend-api crate changed the type of theparams
field:pub struct Resource { pub r#type: String, pub blocked_in: HashMap<String, bool>, pub test_result: TestResults, pub protocol: String, pub address: String, pub port: u16, pub fingerprint: String, #[serde(rename = "or-addresses")] pub or_addresses: Option<Vec<String>>, pub distribution: String, pub flags: Option<HashMap<String, bool>>, #[serde_as(as = "Option<Map<_,_>>")] pub params: Option<Vec<(String, String)>>, }
This change was necessary to create a deterministic bridge line string when parsing the resources in the lox-distributor.
-
Lots of new new tests
Merge request reports
Activity
assigned to @cohosh
requested review from @meskio
- Resolved by meskio
I have one more change I want to make to this. We got some feedback from Ian and Vecna that it would be nice if the
uid_fingerprint
was in fact the full bridge fingerprint. If we go that route, I want to make the change before merging/deploying this MR to avoid creating and tagging a new protobug entry.Just added one commit to deal with this (the rest is unchanged): 287ee238
I'd like @onyinyang's opinion on this, since I don't recall the reason for going with a shortened hash of the fingerprint in the first place. Any reason not to just stick with the original fingerprint string we get from the resource parser?
The fingerprint should be 20 bytes, right? (So a 40-character hex string?) Should they be generated as 20-byte arrays converted to hex strings in the
random()
functions in bridge_table and request_handler?I'm going to just double check this but if I recall correctly, I was trying to avoid using strings/changing the bridgeline structure too much at the time and also ran into issues translating the fingerprint from go to rust, but it didn't matter for the
uid_fingerprint
if it was an exact match, as long as it would be unique for each bridge. Theuid_fingerprint
is important for identifying bridges in the bridgetable for updates/replacements/blockages but as long as this works as expected, I'm pretty sure it's fine to use the original fingerprint string as theuid_fingerprint
.
added 9 commits
-
90b0228c...c7aa2d2e - 2 commits from branch
tpo/anti-censorship:main
- 20f90313 - Add tests for parsing bridgelines
- 80d146da - Decouple bucket encoding from table entry preparation
- 65cfdd2b - Simplify BridgeLine struct
- 0bb9603a - Modify BridgeLine and EncryptedTableEntry for protobufs
- 9359e266 - Use protobufs for encrypted bridge table
- 8135fb0f - Add test to check bridge table encryption
- 1a76fd79 - Use Vec<(_,_)> instead of HashMap<_,_> for bridge params
Toggle commit list-
90b0228c...c7aa2d2e - 2 commits from branch
added 1 commit
- 287ee238 - Keep uid_fingerprint as String from resource parser
added 11 commits
-
287ee238...39531bd9 - 3 commits from branch
tpo/anti-censorship:main
- f1fc5f45 - Add tests for parsing bridgelines
- cd43dd9a - Decouple bucket encoding from table entry preparation
- c49cafe3 - Simplify BridgeLine struct
- d8842075 - Modify BridgeLine and EncryptedTableEntry for protobufs
- 29ba0956 - Use protobufs for encrypted bridge table
- d7e568d6 - Add test to check bridge table encryption
- aaf0972d - Use Vec<(_,_)> instead of HashMap<_,_> for bridge params
- cc734bed - Keep uid_fingerprint as String from resource parser
Toggle commit list-
287ee238...39531bd9 - 3 commits from branch
I wasn't able to give feedback on the MR earlier due to being afk for a while. In general, I support this idea and think the simplification of the bridge line and move to prost are good changes, however, I'm a bit concerned about the amount of information that will be leaked about Lox's bridge allocation and bucket status with the change from fixed length bridge lines.
Ian mentioned (in the aforementioned feedback) that this change would reveal the types of bridges in the bridge table, how they are allocated across buckets in the bridge table and the distribution of open-entry/trusted buckets. This will still not allow an adversary to learn which buckets are distributed to which users, but certainly leaks more information than we had originally considered and opens some additional side channels that we hadn't previously considered the security impacts of.
The way this is currently implemented, anyone that receives the encrypted bridge table will also be able to see the status of each bucket since the bucket reachability token is appended only to unblocked buckets. A patient adversary could collect this information from the lox authority on a regular basis to see how reachability of buckets, as well as the number of bridges/bucket, changes over time. This could provide useful feedback to an adversary about how their blocking strategy is working, as well as how long it takes us to notice and update a blockage. I think at least this issue could be mitigated by appending a dummy reachability token to each blocked bucket but we may also want to consider a different strategy for the bucket encoding itself to osbfuscate the makeup of buckets.
This may not be the extent of what an adversary can do with this information but I think we probably would want to have a bit more control over how we publicly share this information.
added 184 commits
-
cc734bed...ca30ff27 - 171 commits from branch
tpo/anti-censorship:main
- ca30ff27...6cd83025 - 3 earlier commits
- d9cfd937 - Early continue to remove else block indentation
- d8d6489b - De-duplicate some code in bridge_blocked
- 05aabdd6 - Add tests for parsing bridgelines
- 9023f53f - Decouple bucket encoding from table entry preparation
- 46fa4b6b - Simplify BridgeLine struct
- 507fd0b3 - Modify BridgeLine and EncryptedTableEntry for protobufs
- 01cc50ae - Use protobufs for encrypted bridge table
- f865c813 - Add test to check bridge table encryption
- 034d8b67 - Use Vec<(_,_)> instead of HashMap<_,_> for bridge params
- a240d14b - Keep uid_fingerprint as String from resource parser
Toggle commit list-
cc734bed...ca30ff27 - 171 commits from branch
requested review from @onyinyang and removed review request for @meskio
mentioned in issue #84
mentioned in merge request !305