IPsec is deployed with strongswan on multiple servers throughout the architecture. It interconnects many of the KVM hosts, and also the monitoring server, because it can be used as a NAT bypass mechanism for some machines.
How-to
Hooking up a new node to the IPsec network
TODO: This is the old way of configuring Puppet nodes. There's now an ipsec module which does that more easily.
This is managed through Puppet, so it's basically a matter of adding the hostname to the ipsec role in modules/torproject_org/misc/local.yaml and adding the network configuration block to modules/ipsec/misc/config.yaml. For example, this was the diff for the new monitoring server:
diff --git c/modules/ipsec/misc/config.yaml w/modules/ipsec/misc/config.yaml
index e4367c38..3b724e77 100644
--- c/modules/ipsec/misc/config.yaml
+++ w/modules/ipsec/misc/config.yaml
@@ -50,3 +49,9 @@ hetzner-hel1-01.torproject.org:
subnet:
- 95.216.141.241/32
- 2a01:4f9:c010:5f1::1/128
+
+hetzner-nbg1-01.torproject.org:
+ address: 195.201.139.202
+ subnet:
+ - 195.201.139.202/32
+ - 2a01:4f8:c2c:1e17::1/128
diff --git c/modules/torproject_org/misc/local.yaml w/modules/torproject_org/misc/local.yaml
index 703254f4..e2dd9ea3 100644
--- c/modules/torproject_org/misc/local.yaml
+++ w/modules/torproject_org/misc/local.yaml
@@ -163,6 +163,7 @@ services:
- scw-arm-par-01.torproject.org
ipsec:
- hetzner-hel1-01.torproject.org
+ - hetzner-nbg1-01.torproject.org
- kvm4.torproject.org
- kvm5.torproject.org
- macrum.torproject.org
Then Puppet needs to run on the various peers, and the new peer should be rebooted; otherwise it will not be able to load the new IPsec kernel modules.
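For example, a minimal sketch of those two steps, assuming Cumin is available (the same recipe is used in the Debugging section below) and using the new peer from the example diff:
# run Puppet on all IPsec peers
cumin 'C:ipsec' 'puppet agent -t'
# reboot the new peer so it can load the IPsec kernel modules
ssh hetzner-nbg1-01.torproject.org reboot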
Special case: Mikrotik server
Update: we don't have a Mikrotik server anymore. This documentation is kept for historical reference, in case such a manual configuration is required elsewhere.
The Mikrotik server is a special case that is not configured in Puppet, because Puppet can't run on its custom OS. To configure such a pairing, you first need to configure it on the normal server end, using something like this:
conn hetzner-nbg1-01.torproject.org-mikrotik.sbg.torproject.org
ike = aes128-sha256-modp3072
left = 195.201.139.202
leftsubnet = 195.201.139.202/32
right = 141.201.12.27
rightallowany = yes
rightid = mikrotik.sbg.torproject.org
rightsubnet = 172.30.115.0/24
auto = route
forceencaps = yes
dpdaction = hold
The left part is the public IP of the "normal server". The right part has the public and private IPs of the Mikrotik server. Then a secret should be generated:
printf '195.201.139.202 mikrotik.sbg.torproject.org : PSK "%s"' $(base64 < /dev/urandom | head -c 32) > /etc/ipsec.secrets.d/20-local-peers.secrets
In the above, the first field is the IP of the "left" side, the second field is the hostname of the "right" side, and then it's followed by a secret, the "pre-shared key" (PSK) that will be reused below.
That's for the "left" side. The "right" side, the Mikrotik one, is a little more involved. The first step is to gain access to the Mikrotik SSH terminal, details of which are stored in tor-passwords, in hosts-extra-info. A good trick is to look at the output of /export for an existing peer and copy-paste the good stuff. Here is how the nbg1 peer was configured on the "right" side:
[admin@mtsbg] /ip ipsec> peer add address=195.201.139.202 exchange-mode=ike2 name=hetzner-nbg1-01 port=500 profile=profile_1
[admin@mtsbg] /ip ipsec> identity add my-id=fqdn:mikrotik.sbg.torproject.org peer=hetzner-nbg1-01 secret=[REDACTED]
[admin@mtsbg] /ip ipsec> policy add dst-address=195.201.139.202/32 proposal=my-ipsec-proposal sa-dst-address=195.201.139.202 sa-src-address=0.0.0.0 src-address=172.30.115.0/24 tunnel=yes
[admin@mtsbg] /ip firewall filter> add action=accept chain=from-tor-hosts comment=hetzner-hel1-01 src-address=195.201.139.202
[admin@mtsbg] /system script> print
Flags: I - invalid
0 name="ping_ipsect_tunnel_peers" owner="admin" policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon
,,
[admin@mtsbg] /system script> remove 0
[admin@mtsbg] /system script> add dont-require-permissions=no name=ping_ipsect_tunnel_peers owner=admin policy=\
\... ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon source="/ping count=1 src-address=172.30.115.1 172.30.134.1 ; \
"\... \n/ping count=1 src-address=172.30.115.1 94.130.28.193 ; \
"\... \n/ping count=1 src-address=172.30.115.1 94.130.38.33 ; \
"\... \n/ping count=1 src-address=172.30.115.1 95.216.141.241 ; \
"\... \n/ping count=1 src-address=172.30.115.1 195.201.139.202 ; \
"\... \n"
[admin@mtsbg] /ip firewall nat> add action=accept chain=srcnat dst-address=195.201.139.202 src-address=172.30.115.0/24
The [REDACTED] part should be the PSK field defined on the left side (what is between quotes).
More information about how to configure IPsec on Mikrotik routers is available in the upstream documentation.
Special case: roaming clients
To set up a client, you will first need to do part of the ipsec configuration done in Puppet by hand, which involves:
sudo apt install strongswan libstrongswan-standard-plugins
Then you will need to add something like this to a configuration file in /etc/ipsec.conf.d/ (strings with $ are variables that should be expanded, see below for an example):
conn $hostname
# left is the client (local)
left = $peer_ipaddress
leftid = $peer_id
leftsubnet = $peer_networks
# right is our peer (remote, the server where this resource is used)
right = $local_ipaddress
rightsubnet = $local_networks
rightid = $local_id
auto=route
For example, anarcat configured a tunnel to chi-node-01 successfully by adding this configuration on chi-node-01:
ipsec::client { 'curie.anarc.at':
peer_ipaddress_firewall => '216.137.119.51',
peer_networks => ['172.30.141.242/32'],
}
Note that the following is configured in the resource block above:
local_networks => ['172.30.140.0/24'],
... but will be used as the rightsubnet below.
Then on "curie", the following configuration was added to
/etc/ipsec.conf
:
conn chi-node-01
# left is us (local)
left = %any
leftid = curie.anarc.at
leftsubnet = 172.30.141.242/32
# right is our peer (remote, chi-node-01)
right = 38.229.82.104
rightsubnet = 172.30.140.0/24
rightid = chi-node-01
auto=route
authby=secret
keyexchange=ikev2
(Note that you can also add a line like this to ipsec.conf:
include /etc/ipsec.conf.d/*.conf
and store the configurations in /etc/ipsec.conf.d/20-chi-node-01.torproject.org.conf instead.)
The secret generated on chi-node-01 for the roaming client (in /etc/ipsec.secrets.d/20-curie.anarc.at.secrets) was copied over to the roaming client, in /etc/ipsec.secrets (by default, AppArmor refuses access to /etc/ipsec.secrets.d/, which is why we use the other path). The rightid name needs to be used here:
chi-node-01 : PSK "[CENSORED]"
Whitespace is important here.
Then the magic IP address (172.30.141.242) was added to the external interface of curie:
ip a add 172.30.141.242/32 dev br0
Puppet was applied on chi-node-01 and ipsec reloaded on curie, and curie could ping 172.30.140.1 and chi-node-01 could ping 172.30.141.242.
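That sanity check amounts to something like this (addresses taken from the example above):
# on curie
ping -c 3 172.30.140.1
# on chi-node-01
ping -c 3 172.30.141.242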
To get access to the management network, forwarding can be enabled with:
sysctl net.ipv4.ip_forward=1
This should only be a temporary solution, obviously, because of the security implications. It is only used for rescue and bootstrap operations.
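Once the rescue or bootstrap operation is done, the setting can presumably be reverted with:
sysctl net.ipv4.ip_forward=0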
Debugging
To diagnose problems, you can check the state of a given connexion with, for example:
ipsec status hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org
This will show summary information about the current connexion. For example, here is an established and working connexion:
root@hetzner-nbg1-01:/home/anarcat# ipsec status hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org
Routed Connections:
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{6}: ROUTED, TUNNEL, reqid 6
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{6}: 195.201.139.202/32 2a01:4f8:c2c:1e17::1/128 === 95.216.141.241/32 2a01:4f9:c010:5f1::1/128
Security Associations (3 up, 2 connecting):
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org[4]: ESTABLISHED 9 minutes ago, 195.201.139.202[195.201.139.202]...95.216.141.241[95.216.141.241]
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{7}: INSTALLED, TUNNEL, reqid 6, ESP SPIs: [redacted]_i [redacted]_o
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{7}: 195.201.139.202/32 2a01:4f8:c2c:1e17::1/128 === 95.216.141.241/32 2a01:4f9:c010:5f1::1/128
As a comparison, here is a connexion that is failing to complete:
root@hetzner-hel1-01:/etc/ipsec.secrets.d# ipsec status hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org
Routed Connections:
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{6}: ROUTED, TUNNEL, reqid 6
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org{6}: 95.216.141.241/32 2a01:4f9:c010:5f1::1/128 === 195.201.139.202/32 2a01:4f8:c2c:1e17::1/128
Security Associations (7 up, 1 connecting):
hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org[18]: CONNECTING, 95.216.141.241[%any]...195.201.139.202[%any]
The following messages are then visible in /var/log/daemon.log on that side of the connexion:
Apr 4 21:32:58 hetzner-hel1-01/hetzner-hel1-01 charon[14592]: 12[IKE] initiating IKE_SA hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org[17] to 195.201.139.202
Apr 4 21:35:44 hetzner-hel1-01/hetzner-hel1-01 charon[14592]: 05[IKE] initiating IKE_SA hetzner-hel1-01.torproject.org-hetzner-nbg1-01.torproject.org[18] to 195.201.139.202
In this case, the other side wasn't able to start the charon daemon properly because of missing kernel modules:
Apr 4 21:38:07 hetzner-nbg1-01/hetzner-nbg1-01 ipsec[25243]: charon has quit: initialization failed
Apr 4 21:38:07 hetzner-nbg1-01/hetzner-nbg1-01 ipsec[25243]: charon refused to be started
Apr 4 21:38:07 hetzner-nbg1-01/hetzner-nbg1-01 ipsec[25243]: ipsec starter stopped
Note that the ipsec statusall command can also be used for more detailed status information. The ipsec up <connexion> command can be used to start a connexion manually, and ipsec down <connexion> to stop one. Connexions are defined in /etc/ipsec.conf.d.
The traceroute command can be used to verify a host is well connected over IPsec. For example, this host is directly connected:
root@hetzner-nbg1-01:/home/anarcat# traceroute hetzner-hel1-01.torproject.org
traceroute to hetzner-hel1-01.torproject.org (95.216.141.241), 30 hops max, 60 byte packets
1 hetzner-hel1-01.torproject.org (95.216.141.241) 23.780 ms 23.781 ms 23.851 ms
Another example: this host is configured through IPsec, but somehow unreachable:
root@hetzner-nbg1-01:/home/anarcat# traceroute kvm4.torproject.org
traceroute to kvm4.torproject.org (94.130.38.33), 30 hops max, 60 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
That was because Puppet hadn't run on that other end. This Cumin recipe fixed that:
cumin 'C:ipsec' 'puppet agent -t'
The first run "failed" (as in, Puppet returned a non-zero status because it performed changes) but another run "succeeded").
If everything connects and everything seems to work, but traffic still doesn't flow and you're using a roaming client, it's very likely that the IP address on your side of the tunnel is not correctly configured. This can happen if NetworkManager cycles your connection or something. The fix is simple: just add the IP address locally again. In my case:
ip a add 172.30.141.242/32 dev br0
You also need to down/up the tunnel after adding that IP.
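Assuming the connexion name from the roaming client example above, that would be something like:
ipsec down chi-node-01
ipsec up chi-node-01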
Another error that frequently occurs on the gnt-chi cluster is that the chi-node-01 server gets rebooted and the IP forwarding setting gets lost; just run this again to fix it:
sysctl net.ipv4.ip_forward=1
Finally, never forget to "try to turn it off and on again". Simply rebooting the box can sometimes do wonders:
reboot
In my case, it seems the configuration wasn't being re-read by strongswan and rebooting the machine fixed it.
How traffic gets routed to ipsec
It might seem magical how traffic gets encrypted by the kernel to do ipsec, but there's actually a system that defines what triggers the encryption. In the Linux kernel, this is done by the xfrm framework. The ip xfrm policy command will list the current policies defined, for example:
root@chi-node-01:~# ip xfrm policy
src 172.30.140.0/24 dst 172.30.141.242/32
dir out priority 371327 ptype main
tmpl src 38.229.82.104 dst 216.137.119.51
proto esp spi 0xc16efcf5 reqid 2 mode tunnel
src 172.30.141.242/32 dst 172.30.140.0/24
dir fwd priority 371327 ptype main
tmpl src 216.137.119.51 dst 38.229.82.104
proto esp reqid 2 mode tunnel
src 172.30.141.242/32 dst 172.30.140.0/24
dir in priority 371327 ptype main
tmpl src 216.137.119.51 dst 38.229.82.104
proto esp reqid 2 mode tunnel
src 0.0.0.0/0 dst 0.0.0.0/0
socket in priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
socket out priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
socket in priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
socket out priority 0 ptype main
src ::/0 dst ::/0
socket in priority 0 ptype main
src ::/0 dst ::/0
socket out priority 0 ptype main
src ::/0 dst ::/0
socket in priority 0 ptype main
src ::/0 dst ::/0
socket out priority 0 ptype main
This will encrypt packets going to or coming from 172.30.141.242. Specific states can be looked at with the ip xfrm state command:
root@chi-node-01:~# ip xfrm state
src 38.229.82.104 dst 216.137.119.51
proto esp spi 0xc16efcf5 reqid 2 mode tunnel
replay-window 0 flag af-unspec
auth-trunc hmac(sha256) [...] 128
enc cbc(aes) [...]
encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
anti-replay context: seq 0x0, oseq 0x9, bitmap 0x00000000
src 216.137.119.51 dst 38.229.82.104
proto esp spi 0xcf47e426 reqid 2 mode tunnel
replay-window 32 flag af-unspec
auth-trunc hmac(sha256) [...] 128
enc cbc(aes) [...]
encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
anti-replay context: seq 0xc, oseq 0x0, bitmap 0x00000fff
Here we can see the two-way association for the tunnel defined above.
You can also see the routing rules installed by IPsec with:
ip rule
For example, here it sets up routing table 220:
# ip r show table 220
172.30.140.0/24 via 192.168.0.1 dev eth1 proto static src 172.30.141.244
It's not yet clear to me how to use this to debug problems, but at least it should make it clear what IP addresses are expected by the stack. In my case, I realized I hadn't assigned 172.30.141.242 on the remote end, so packets were never being encrypted. It's therefore good to double-check that the IP addresses defined in the policy are actually allocated on the interfaces, otherwise traffic will not flow properly.
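A quick sanity check on the roaming client, using the address and interface from the example above (your interface name will likely differ):
ip addr show dev br0 | grep 172.30.141.242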
Note: those commands were found in this excellent blog post, which might have a thing or two to teach us about ipsec routing as well.
Traffic inspection
You may need to legitimately inspect the cleartext of an IPsec connexion, for example to diagnose what's taking up all that bandwidth between two nodes. It seems the state of the art is to decrypt the ESP packets with Wireshark.
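As a starting point, a capture suitable for later analysis in Wireshark could be taken with something like this (a sketch: the interface name is an assumption, and the udp port 4500 filter matches the espinudp encapsulation seen in the ip xfrm state output above):
# capture raw ESP packets and NAT-T (UDP-encapsulated) ESP traffic
tcpdump -ni eth0 -w ipsec-capture.pcap 'esp or udp port 4500'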