We've received an abuse report from Hetzner on October 26th about multiple ssh connections originating from our IP for web-fsn-01 towards different IPs in the same block.
anarcat says that there is indeed traffic for port 22 flowing out of port 22 from the VM. there should not be anything on that host that permits some kind of forwarding. we need to investigate what's happening
so i didn't find a trace of that specific address or /16 netblock in my traces, but i do find a lot of RST packets going out of that particular web server. right now i have this running:
tcpdump -n dst port 22 and '!' dst host 116.202.120.165 > resets.txt
that found over 7000 resets over the course of 5 minutes, most from unique IP addresses. so i can't confirm if this is the same issue as what the plaintiff is describing, but it seems we are going through some sort of DOS attack or attempt at one anyways. because the webservers' SSH port is firewalled, it's getting dropped early and harmlessly, but i suspect the backscatter might be what's tripping those IDS...
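To put numbers on a capture like resets.txt, something along these lines could count the resets and the distinct destinations. This is only a sketch: the function name `summarize_resets`, the regular expression and the sample lines (which use documentation-range addresses, not real traffic) are made up for illustration, and it only handles the IPv4 TCP line format that `tcpdump -n` prints.

```python
import re
from collections import Counter

# Matches IPv4 TCP lines as printed by `tcpdump -n`, for example:
# 14:02:11.000001 IP 192.0.2.10.40000 > 203.0.113.7.22: Flags [R.], seq 0, ...
LINE_RE = re.compile(
    r"^\S+ IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
)


def summarize_resets(lines):
    """Return (number of RST packets, Counter of destination IPs)."""
    resets = 0
    destinations = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not an IPv4 TCP line (IPv6, truncated output, ...)
        if "R" not in match.group("flags"):
            continue  # tcpdump prints "R" in the flags field for RST
        resets += 1
        destinations[match.group("dst")] += 1
    return resets, destinations


# Hypothetical sample lines, not taken from the real capture.
SAMPLE = [
    "14:02:11.000001 IP 192.0.2.10.40000 > 203.0.113.7.22: Flags [R.], seq 0, ack 1, win 0, length 0",
    "14:02:11.000002 IP 192.0.2.10.40001 > 203.0.113.8.22: Flags [R.], seq 0, ack 1, win 0, length 0",
    "14:02:11.000003 IP 192.0.2.10.40002 > 203.0.113.7.22: Flags [R], seq 5, win 0, length 0",
    "14:02:11.000004 IP 192.0.2.10.51234 > 198.51.100.5.22: Flags [S], seq 1, win 64240, length 0",
]

resets, destinations = summarize_resets(SAMPLE)
print(resets, "resets towards", len(destinations), "distinct addresses")
```

On a real capture, the same function can be fed an open file object, since it only iterates over lines.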
i've replied to the two reports, we'll see what flies back, if anything. Here's the response I sent to one of them:
Hi,
We were forwarded this report about supposedly "unusual activity"
originating from 116.202.120.165, one of the web servers in our main
rotation.
From our analysis, it seems the webserver is a victim of an attempt at a
distributed denial of service (DDoS) attack. We haven't found any trace of
the other IP addresses (in the 202.91/16 netblock) mentioned in the
report in our (temporary) probes, but lots of other hosts seem to try
to connect over the SSH port (22) to our server, which is otherwise
firewalled.
So what might be happening is that you're receiving RST traffic from our
firewall which is (rightly) refusing those connections.
Could you confirm whether or not the traffic you're receiving has the RST
flag set, and whether it matches outgoing SYN traffic?
I responded to Hetzner as well, who threatened to cut us off if we don't make a statement, with:
Hi,
We are finding it difficult to investigate this specific complaint because cert.br has redacted the actual IP addresses of the alleged victims of this presumed attack.
What we can say is that we are under attack from a distributed set of IP addresses that are repeatedly trying to open the SSH port on this server. In a 5-minute sample, we saw over 7000 attempts from mostly distinct IP addresses. Our firewall is blocking SSH connection attempts apart from a small set of ACL'd IP addresses, and is responding to that traffic with RST packets.
So this might be what those complaints are about, misinterpreting our RST responses as an attack.
We have two theories:
the plaintiffs, in this case, are the attackers and they are hosting components of a botnet that's attacking us
something is spoofing the plaintiff's IP addresses
I've asked both plaintiffs (from this complaint, AbuseID:EBAEC4:1A, and another, AbuseID:EBA1A6:19) for more information about the traffic from their perspective, but it would be nice if hetzner could confirm or invalidate the IP spoofing theory, something we can't do without access to a BGP edge router, I believe.
it seems i might have gotten at least one bit backwards here: i was assuming our RST packets were in response to a connect attempt on our SSH servers, but we're actually responding to port 22, not from port 22, as correctly pointed out by @arma.
so it seems someone is spoofing our webserver and using that to harass people on the internet. in that case, this is more clearly a case of someone spoofing Hetzner's IP space, and there's actually very little we can do about this.
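The "to port 22" versus "from port 22" distinction is easy to get backwards, so here is a small sketch of how an outgoing reset could be read depending on which side carries the SSH port. The `TcpPacket` type, the function name and the labels are hypothetical, and the interpretation in the comments is just the reasoning from this thread, not a definitive diagnosis.

```python
from dataclasses import dataclass

SSH_PORT = 22


@dataclass(frozen=True)
class TcpPacket:
    """Header fields of a packet leaving our host (illustrative only)."""
    src_port: int
    dst_port: int
    rst: bool


def explain_outgoing_rst(pkt: TcpPacket) -> str:
    """Guess what an outgoing RST is answering, based on port direction."""
    if not pkt.rst:
        return "not-a-reset"
    from_ssh = pkt.src_port == SSH_PORT
    to_ssh = pkt.dst_port == SSH_PORT
    if from_ssh and to_ssh:
        return "ambiguous"
    if from_ssh:
        # Someone tried to reach *our* SSH port and we refused.
        return "refusing-inbound-ssh"
    if to_ssh:
        # A remote SSH port sent us something we never asked for, which is
        # what a reply to a SYN spoofed with our address would look like.
        return "answering-remote-ssh"
    return "unrelated"


print(explain_outgoing_rst(TcpPacket(src_port=22, dst_port=51234, rst=True)))
print(explain_outgoing_rst(TcpPacket(src_port=51234, dst_port=22, rst=True)))
```

The first call models the original assumption (rejecting connections to our own SSH port), the second the corrected reading (resets sent towards someone else's port 22).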
and there's actually very little we can do about this.
Yep! Two follow-up thoughts for completeness:
(A) There is one thing you could do, which is to add some filter rules to not actually send out the rst or synack packets in response to incoming mistaken packets. It won't really change anything, but it would make it more clear that you're not the source of any problem (because you are no longer sending any packets related to the problem). Some of the hosts in tpo/network-health/analysis#85 decided to take this step.
(B) If it comes to it, the hetzner people could easily verify that we're not sending out connections to port 22, by watching our outbound traffic and noting that there aren't any. But... yeah this is the place that's already illegally watching traffic it shouldn't be watching, sharing it with people it shouldn't be sharing it with, etc. So maybe there is a way to turn this into a good idea, or maybe there is no way. :)
a) add some filter rules to not actually send out the rst or synack packets in response to incoming mistaken packets
i'd be curious to see what those rules would look like. from what i understand, i'd need to start tracking state in the firewall for this, and for a box with the traffic handled by our mirrors, that could be a significant CPU overhead.
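On the state-tracking worry, one possible reading of the suggestion is a rule that matches only on header fields of the outgoing packet itself, which would not need connection tracking. The following is a Python model of that decision, not firewall syntax: the names `OutgoingTcp` and `stateless_verdict` are hypothetical, and whether the firewall tooling in use here can express the rule this way is an assumption that has not been checked.

```python
from typing import NamedTuple


class OutgoingTcp(NamedTuple):
    """Header fields a stateless rule could look at (illustrative only)."""
    dst_port: int
    flags: frozenset  # e.g. frozenset({"RST", "ACK"})


def stateless_verdict(pkt: OutgoingTcp, blocked_port: int = 22) -> str:
    """Decide using only this packet's own headers, with no per-flow state."""
    if "RST" in pkt.flags and pkt.dst_port == blocked_port:
        return "drop"
    return "accept"


# A reset heading to a remote port 22 is dropped...
print(stateless_verdict(OutgoingTcp(22, frozenset({"RST", "ACK"}))))
# ...while ordinary web traffic replies are left alone.
print(stateless_verdict(OutgoingTcp(51234, frozenset({"ACK"}))))
```

One caveat with a rule this blunt: it would also swallow legitimate resets the host sends when it aborts its own outbound SSH connections.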
b) maybe there is a way to turn this into a good idea, or maybe there is no way. :)
normally, BCP38 is what takes care of that, so there's someone, somewhere in the path between hetzner and the victims that is not properly handling spoofed packets.
it looks like 49.12.57.133 (relay-01) and 95.216.163.36 (hetzner-hel1-03, another mirror) have been marked for abuse by hetzner as well, in two of those 5 emails.
I'm not sure what the end game is here. It seems we're just getting automated messages from people and no one is actually responding at the other end. From our side, the impact is minimal, we might need some mitigations to deal with disk space, but otherwise it's a big shrug. I'm tempted to just close this issue unless we really do have a problem with disk space.
looking at grafana, there does seem to be an impact, but it looks like our garbage collection policies are dealing with it, even though alerting sometimes yells at us:
yesterday before going afk monitoring alerted that web-fsn-01 was nearly full so I manually launched logrotate in order to get one of the big logs compressed and free up some space.
today's usage looks OK, but we might end up back in the "nearly full" situation in a day or two. so I think we might want to keep our eyes on disk alerts and maybe short-circuit or temporarily remove the firewall's logging before rejecting.
network-wise though, there's not much that we can do. we could flip our firewall to drop instead of reply with reject (rst) but that just means the other end will stop being informed that faulty connections should be closed. for us, in the eyes of Hetzner it might help showing that we're actually just receiving crap and not participating in any measure.. and who knows if the RST responses are "double"-tripping the IPS systems that send us abuse reports, but I somehow doubt that, after seeing what's being reported
fwiw I've just deployed new firewall rules everywhere to drop packets with source or destination ports set to 0 since that's bogus and should not happen on any network (the only case where that field can be set to 0 is when asking the kernel to choose a port number for us, so before anything can be sent outbound).
the amount of such packets we were seeing was very small so it won't fix anything about the network issue, but that's one weird case less. the firewall was logging that it was sending rejection responses to those, I'm not sure if linux actually does send something in those cases but now it won't even have to consider the case.
update on this front: it seems we're still getting some of that trash, but the rate has diminished a bit. but it's still there. we should probably figure out a way to not log those queries, @lelutin is that something i could interest you in?
anarcat changed title from unexpected outgoing ssh connections originating from web-fsn-01 to unexpected outgoing ssh traffic on web-fsn-01
@anarcat I've sent a branch to the tor-puppet repository named disable_firewall_reject_logging with a very simplistic fix for the logging (i.e. just commenting out the log line in the chain log_and_reject). if that change makes sense to you, I'll merge and start applying everywhere
people watching this issue will be super excited to know we've received more useless warnings from hetzner about this, the "attack" still seems to be in progress, now expanding to more hosts:
web-fsn-01 (not new)
submit-01
web-fsn-02
gitlab-02
check-01
polyanthum
archive-01
hetzner-nbg1-01
weather-01
relay-01 (not new)
meronense
media-01
eugeni
hetzner-hel1-03
this list is not exhaustive, we keep getting more (harmless) reports.
also, over the weekend, the folks at watchdogcyberdefense.com eventually replied to my request for more information, but failed to answer the questions I have asked (whether or not the traffic they are receiving matches ours) and instead sent me a copy-paste of the virustotal.com report on the web-fsn-01 IP address reputation (?!).
I have replied, again, suggesting that they look into the AS path and suggesting they just stop sending complaints because they are creating collateral damage.