Sanitize TCP ports in bridge descriptors

We should consider sanitizing TCP ports in bridge descriptors. Let's add a new sanitizing step between steps 3 and 4 of the process described here:

https://collector.torproject.org/#bridge-descriptors

  1. Replace TCP port with TCP port hash: It may be less obvious that TCP ports need to be sanitized, but an unusual TCP port used by a high-value bridge might still stand out and provide yet another way to locate and block the bridge.
  • Each non-zero TCP port is replaced with H(port | bridge identity | secret)[:2] % 65535 + 1, written as a decimal number. The input port is the 2-byte binary representation of the TCP port. The bridge identity is the 20-byte binary representation of the bridge's long-term identity fingerprint. The secret is a 33-byte secure random string that changes once per month and is shared by all descriptors and statuses published in that month. H() is SHA-256. The [:2] operator means that we pick the 2 most significant bytes of the result. TCP ports that are 0 in the original descriptor are left unchanged.
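
For illustration, the step above could be sketched in Python roughly as follows. This is only a sketch under a few assumptions that the description does not spell out: "|" is taken to mean byte concatenation, the port is encoded in big-endian (network) byte order, and the 2 most significant bytes of the hash are read as an unsigned big-endian integer. The function name sanitize_port is made up for this example and is not taken from the CollecTor code.

```python
import hashlib
import os


def sanitize_port(port: int, identity: bytes, secret: bytes) -> int:
    """Return the sanitized TCP port for one bridge (hypothetical helper).

    Assumptions not stated in the ticket: '|' is byte concatenation and
    all integers are big-endian.
    """
    if port == 0:
        # Ports that are 0 in the original descriptor are left unchanged.
        return 0
    if not 0 < port <= 65535:
        raise ValueError("TCP port out of range")
    if len(identity) != 20:
        raise ValueError("bridge identity must be 20 bytes")
    if len(secret) != 33:
        raise ValueError("secret must be 33 bytes")

    port_bytes = port.to_bytes(2, "big")  # 2-byte binary representation
    digest = hashlib.sha256(port_bytes + identity + secret).digest()
    # [:2] picks the 2 most significant bytes of the SHA-256 result.
    return int.from_bytes(digest[:2], "big") % 65535 + 1


if __name__ == "__main__":
    # Example values only: a random monthly secret and a dummy fingerprint.
    monthly_secret = os.urandom(33)
    fingerprint = bytes.fromhex("00" * 20)
    print(sanitize_port(443, fingerprint, monthly_secret))
```

The same port on the same bridge maps to the same sanitized value for as long as the secret stays the same, and the result always lies in the range 1 to 65535. Note that with "% 65535 + 1" the hash prefixes 0 and 65535 both map to port 1, so that output is very slightly more likely than the others.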

To make this change, we'll need to write and test the code and then re-process all bridge descriptors published since 2008. The re-processing is going to take at least a week, maybe longer.