Exit relay "DoS" using FTP
In the last one to two weeks, we have been receiving alerts about some of our exit relays being "down". When we investigated the hosts, we found pairs of exit instances (running on the same IP address) consuming 100% CPU and spamming "Error binding network socket: Address already in use" to the log. A look with netstat showed tens of thousands of sockets from these IP addresses to port 21 in CLOSE_WAIT state.
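For anyone who wants to check their own exits for the same pattern, here is a minimal sketch of how the netstat output could be tallied. It assumes the usual Linux `netstat -tan` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the function name and the sample addresses are made up for illustration and are not taken from our hosts.

```python
from collections import Counter


def count_states_to_port(netstat_output: str, port: int = 21) -> Counter:
    """Count TCP socket states for connections whose remote port is `port`.

    Assumes `netstat -tan` style lines:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # Take everything after the last ':' as the remote port,
        # which works for both IPv4 and IPv6 style addresses.
        _, _, remote_port = foreign.rpartition(":")
        if remote_port == str(port):
            counts[state] += 1
    return counts


# Hypothetical sample using documentation address ranges.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:40000        198.51.100.5:21         CLOSE_WAIT
tcp        1      0 192.0.2.10:40001        198.51.100.6:21         CLOSE_WAIT
tcp        0      0 192.0.2.10:40002        198.51.100.7:443        ESTABLISHED
tcp        0      0 192.0.2.10:40003        198.51.100.8:21         ESTABLISHED
tcp        0      0 0.0.0.0:9001            0.0.0.0:*               LISTEN
"""

if __name__ == "__main__":
    for state, n in count_states_to_port(SAMPLE).most_common():
        print(f"{state}: {n}")
```

On a live host, the same function could be fed the real output, for example via `subprocess.run(["netstat", "-tan"], capture_output=True, text=True).stdout`. A CLOSE_WAIT count in the tens of thousands for port 21 would match what we saw.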
Screenshot showing number of open sockets on the affected hosts (times in UTC):
When we disallowed port 21 for these exits, the problem went away immediately.
It's unclear to me whether someone is deliberately creating this problem via the FTP protocol (but why kill only a few relays?), is trying to do credential stuffing on FTP servers in a very inefficient manner, or whether tor is creating this problem itself once it hits the per-instance file handle limit (65536 according to the systemd service config).
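To test the file handle theory, one could compare how many descriptors a tor process has open against its limit. The following is a rough, Linux-only sketch that reads `/proc/<pid>/limits` and counts the entries in `/proc/<pid>/fd`. The helper names and the sample text are illustrative assumptions, and the script needs to run as root or as the same user as the tor process.

```python
import os


def parse_nofile_soft_limit(limits_text: str):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: Max, open, files, <soft>, <hard>, files
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid: int):
    """Return (open_fd_count, soft_limit) for a process on Linux."""
    with open(f"/proc/{pid}/limits") as fh:
        limit = parse_nofile_soft_limit(fh.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, limit


# Hypothetical excerpt of /proc/<pid>/limits, using the 65536 limit
# mentioned above.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            65536                65536                files
"""

if __name__ == "__main__":
    print(parse_nofile_soft_limit(SAMPLE_LIMITS))
```

Calling `fd_usage(pid)` with the PID of an affected instance while the problem is occurring would show whether the open descriptor count is actually close to the limit.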