Upgrade tor on snowflake-01 to 0.4.7
snowflake-01 is currently running 0.4.5.16, which is end of life and may get removed from the network as early as next week:
- https://mastodon.social/@torproject/110256009579078954
- https://forum.torproject.net/t/tor-relays-psa-tor-0-4-5-reaches-end-of-life-eol-on-2023-02-15/6338
We need to upgrade to the 0.4.7 series. This probably means switching to the torproject.org deb repository rather than the Debian one.
But we need to be careful with this one and be ready to revert the upgrade quickly if necessary.
0.4.7.13 has new support for the IP_BIND_ADDRESS_NO_PORT socket option when using OutboundBindAddress in torrc (as we currently do, as an attempted mitigation for anti-DDoS scripts at middle relays).
Past analysis has shown that different processes interact badly with one another when they bind sockets to a particular address and differ in whether they use IP_BIND_ADDRESS_NO_PORT: a situation where everything uses the option is stable, as is one where nothing uses the option; but when one process uses IP_BIND_ADDRESS_NO_PORT and others do not, the one that does may have its ephemeral ports "stolen" by those that do not, leading to EADDRNOTAVAIL errors.
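To make the difference concrete, here is a minimal Python sketch of the two ways a process can pre-bind a source address before connecting. It is only an illustration, not code from tor or haproxy. The helper name bound_socket is hypothetical, and the fallback value 24 for the option is taken from the Linux headers in case the Python build does not expose the constant.

```python
import socket

# Linux value from <linux/in.h>; used as a fallback when this Python build
# does not expose the constant (an assumption of this sketch).
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def bound_socket(source_addr, no_port):
    """Return a TCP socket bound to source_addr with port 0.

    With no_port=False the kernel picks an ephemeral port at bind() time.
    With no_port=True (Linux only) the port choice is deferred to connect().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if no_port:
            sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((source_addr, 0))
    except OSError:
        sock.close()
        raise
    return sock
```

Without the option, getsockname() reports a nonzero ephemeral port immediately after bind(). With the option, on Linux, the port should remain unassigned until connect(), which is the behavior that mixes badly with processes that reserve their port at bind() time.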
This is why we force a source port range in haproxy.cfg: doing so has the side effect of disabling IP_BIND_ADDRESS_NO_PORT in haproxy.
If tor's new use of IP_BIND_ADDRESS_NO_PORT starts giving EADDRNOTAVAIL, the immediate remedy is to downgrade to the older version.
Another possible quick fix is to remove OutboundBindAddress from the torrc files: if the socket is not pre-bound, there's no problem.
The longer-term solution is to make the other programs on the system that bind to a source address also use IP_BIND_ADDRESS_NO_PORT.
For haproxy it's easy: just remove the source port ranges from haproxy.cfg.
The other programs to worry about are snowflake-server and extor-static-cookie, both of which bind to a random localhost IP address in a range to avoid running out of localhost 4-tuples (which is a separate issue from the IP_BIND_ADDRESS_NO_PORT one).
Last time I checked, net.Dialer with LocalAddr set did not use IP_BIND_ADDRESS_NO_PORT.
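As a rough picture of what those programs would need to do, the following Python sketch picks a random source address from a loopback range and sets IP_BIND_ADDRESS_NO_PORT before binding and connecting. It is an analogue for illustration only, not the Go code in snowflake-server or extor-static-cookie. The range 127.0.4.0/24 and the function names are hypothetical, and the fallback value 24 for the option is the Linux header value.

```python
import ipaddress
import random
import socket

# Linux value from <linux/in.h>; fallback if this Python build lacks the name.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def random_source_addr(cidr="127.0.4.0/24", rng=random):
    """Pick a random host address inside cidr (hypothetical range)."""
    net = ipaddress.ip_network(cidr)
    # Skip the first and last addresses of the range.
    offset = rng.randrange(1, net.num_addresses - 1)
    return str(net[offset])


def dial_from_random_source(dest, cidr="127.0.4.0/24", no_port=True):
    """Connect to dest (host, port) from a random source address in cidr."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if no_port:
            # Defer the ephemeral port choice to connect() (Linux only).
            sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((random_source_addr(cidr), 0))
        sock.connect(dest)
    except OSError:
        sock.close()
        raise
    return sock
```

Spreading connections over many source addresses addresses the 4-tuple exhaustion problem; the setsockopt call is the separate piece that would bring such a program in line with tor's new behavior. In Go, the equivalent hook would presumably be the Control callback of net.Dialer, but that has not been verified against the current snowflake code.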
Background on the IP_BIND_ADDRESS_NO_PORT issue:
- #40201 (comment 2868367)
- https://forum.torproject.net/t/tor-relays-inet-csk-bind-conflict/5757/16
- https://blog.cloudflare.com/the-quantum-state-of-a-tcp-port/
snowflake-02 is already running 0.4.7.13 without any apparent ill effects, though it has only about 15% as many clients as snowflake-01.
/cc @linus