Decide how to sanitize pluggable transport lines in bridge descriptors
We provide sanitized versions of bridge descriptors in near real time on the metrics website and via rsync.
Once we enable bridges to include pluggable transport information in their server and/or extra-info descriptors, we need to come up with a way to sanitize the sensitive parts. At a minimum, we'll want to remove any keys contained in pluggable transport lines. But maybe the fact that a bridge offers a specific pluggable transport is already sensitive? Or even the fact that it offers any pluggable transport at all?
This problem came up in #3589. Bridge clients aren't supposed to learn about the pluggable transports listed in a bridge's extra-info descriptor, and neither the bridge authority nor the bridge itself gives out extra-info descriptors. But once a client knows the bridge's server descriptor, it can easily look up the matching sanitized extra-info descriptor in the metrics archives. If we don't want clients to learn about a bridge's transports this way, the sanitizing process needs to take that into account. If it helps, we can define separate sanitizing rules for each pluggable transport.
So there's a trade-off between revealing too much information and being able to analyze pluggable transport deployment. We'll probably want to run such analyses, and we can only do that if the information is contained in the sanitized versions of bridge descriptors, because we don't use the original descriptors for analysis at all.
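To make the options concrete, here is a rough sketch of what per-transport sanitizing rules could look like. It is not a proposal for the actual sanitizer. It assumes that a transport line in an extra-info descriptor looks like `transport <name> <address:port> [args]`, and every name in it (`keep_name`, `hide_name`, `drop`, `RULES`, `sanitize_extra_info`, and the transport name `exampletransport`) is made up for illustration. The three rules correspond to the three questions above: strip keys and addresses but keep the transport name, keep only the fact that some transport is offered, or hide the line entirely.

```python
from typing import Callable, Dict, List, Optional

# A rule takes the whitespace-split fields of a "transport" line and returns
# the sanitized line, or None if the line should be dropped entirely.
Rule = Callable[[List[str]], Optional[str]]


def keep_name(fields: List[str]) -> Optional[str]:
    """Keep the transport name; strip address, port, and any arguments/keys."""
    return "transport " + fields[1]


def hide_name(fields: List[str]) -> Optional[str]:
    """Reveal only that the bridge offers some pluggable transport."""
    return "transport"


def drop(fields: List[str]) -> Optional[str]:
    """Reveal nothing: remove the line from the sanitized descriptor."""
    return None


# Hypothetical policy: one default rule plus per-transport overrides.
DEFAULT_RULE: Rule = keep_name
RULES: Dict[str, Rule] = {
    "exampletransport": drop,  # made-up transport whose presence is sensitive
}


def sanitize_extra_info(lines: List[str]) -> List[str]:
    """Apply the transport rules; pass all other descriptor lines through."""
    sanitized: List[str] = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "transport":
            sanitized.append(line)
            continue
        if len(fields) < 2:
            # Malformed transport line: fail closed and drop it.
            continue
        rule = RULES.get(fields[1], DEFAULT_RULE)
        result = rule(fields)
        if result is not None:
            sanitized.append(result)
    return sanitized
```

With `keep_name` as the default we could still count how many bridges offer each transport. Switching the default to `hide_name` or `drop` would protect more but would make that kind of deployment analysis coarser or impossible.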