btrack_orconn slowly leaks memory
Over about 6 days, the memory attributed to bto_find_or_new on a busy relay, as reported by gperftools, grew from ~13 MB to ~53 MB: roughly 6.5 MB/day, or ~200 MB/month. Over the same period, the number of open connections rose from about 6000 to 8000. On another relay, manually calling bto_clear_maps and bto_init_maps freed 120 MB, as measured by or_metrics. That relay had been up for 14 days, so btrack_orconn was using about 8.5 MB/day, a figure that includes connections still open when the maps were cleared.
I don't know why this memory is leaked. In some quick tests with ss -K on a client, bto_delete seems to be called appropriately. I suppose there is some case where either bto_delete is not called, or bto_delete is called but does not free all the memory that bto_clear_maps does.
More broadly, though, it seems like btrack could stop tracking connections once bootstrapping is completed. That's probably easier than fixing the memory leak. I have written this hack to do that:
diff --git a/src/feature/control/btrack_orconn.c b/src/feature/control/btrack_orconn.c
index 8b1b5788d0..fbee374693 100644
--- a/src/feature/control/btrack_orconn.c
+++ b/src/feature/control/btrack_orconn.c
@@ -121,6 +121,9 @@ bto_state_rcvr(const msg_t *msg, const orconn_state_msg_t *arg)
(void)msg;
bto = bto_find_or_new(arg->gid, arg->chan);
+ if (!bto)
+ return;
+
log_debug(LD_BTRACK, "ORCONN gid=%"PRIu64" chan=%"PRIu64
" proxy_type=%d state=%d",
arg->gid, arg->chan, arg->proxy_type, arg->state);
@@ -162,6 +165,9 @@ bto_chan_rcvr(const msg_t *msg, const ocirc_chan_msg_t *arg)
(void)msg;
bto = bto_find_or_new(0, arg->chan);
+ if (!bto)
+ return;
+
if (!bto->is_orig || (bto->is_onehop && !arg->onehop)) {
log_debug(LD_BTRACK, "ORCONN LAUNCH chan=%"PRIu64" onehop=%d",
arg->chan, arg->onehop);
diff --git a/src/feature/control/btrack_orconn_cevent.c b/src/feature/control/btrack_orconn_cevent.c
index 525f4f5d0d..be7162c9aa 100644
--- a/src/feature/control/btrack_orconn_cevent.c
+++ b/src/feature/control/btrack_orconn_cevent.c
@@ -20,6 +20,7 @@
#include "core/or/orconn_event.h"
#include "feature/control/btrack_orconn.h"
#include "feature/control/btrack_orconn_cevent.h"
+#include "feature/control/btrack_orconn_maps.h"
#include "feature/control/control_events.h"
/**
@@ -147,6 +148,7 @@ bto_cevent_apconn(const bt_orconn_t *bto)
break;
case OR_CONN_STATE_OPEN:
control_event_bootstrap(BOOTSTRAP_STATUS_AP_HANDSHAKE_DONE, 0);
+ bto_clear_maps();
break;
default:
break;
diff --git a/src/feature/control/btrack_orconn_maps.c b/src/feature/control/btrack_orconn_maps.c
index 2b458d5826..400fe0024b 100644
--- a/src/feature/control/btrack_orconn_maps.c
+++ b/src/feature/control/btrack_orconn_maps.c
@@ -68,6 +68,9 @@ bto_gid_clear_map(void)
{
bt_orconn_t **elt, **next, *c;
+ if (!bto_gid_map)
+ return;
+
for (elt = HT_START(bto_gid_ht, bto_gid_map);
elt;
elt = next) {
@@ -90,6 +93,9 @@ bto_chan_clear_map(void)
{
bt_orconn_t **elt, **next, *c;
+ if (!bto_chan_map)
+ return;
+
for (elt = HT_START(bto_chan_ht, bto_chan_map);
elt;
elt = next) {
@@ -188,6 +194,9 @@ bto_new(const bt_orconn_t *key)
bt_orconn_t *
bto_find_or_new(uint64_t gid, uint64_t chan)
{
+ if (!bto_gid_map)
+ return NULL;
+
bt_orconn_t key, *bto = NULL;
tor_assert(gid || chan);
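
To make the intent of the hack easier to see outside the Tor tree, here is a minimal standalone sketch of the same pattern: once the map is cleared, the map pointer is NULL, the lookup returns NULL, and the receiver returns early instead of allocating a new entry. Every name here (toy_conn_t, toy_map_t, toy_init_map, toy_clear_map, toy_find_or_new, toy_state_rcvr, toy_tracked) is hypothetical and only loosely mirrors the btrack_orconn functions; it uses a linked list rather than Tor's hash tables and is not the real implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for bt_orconn_t; not Tor's real struct. */
typedef struct toy_conn {
  uint64_t gid;
  struct toy_conn *next;
} toy_conn_t;

/* Hypothetical stand-in for the tracking maps. A NULL toy_map means
 * "tracking is disabled", as the hack assumes for bto_gid_map. */
typedef struct toy_map {
  toy_conn_t *head;
  size_t count;
} toy_map_t;

static toy_map_t *toy_map = NULL;

/* Allocate the map if it does not exist yet. */
void toy_init_map(void)
{
  if (!toy_map)
    toy_map = calloc(1, sizeof(*toy_map));
}

/* Free every entry and the map itself; safe to call more than once. */
void toy_clear_map(void)
{
  if (!toy_map)
    return;
  toy_conn_t *c = toy_map->head;
  while (c) {
    toy_conn_t *next = c->next;
    free(c);
    c = next;
  }
  free(toy_map);
  toy_map = NULL;
}

/* Return the entry for gid, creating it if needed.
 * Returns NULL once tracking has been disabled. */
toy_conn_t *toy_find_or_new(uint64_t gid)
{
  if (!toy_map)
    return NULL;
  assert(gid != 0);
  for (toy_conn_t *c = toy_map->head; c; c = c->next) {
    if (c->gid == gid)
      return c;
  }
  toy_conn_t *c = calloc(1, sizeof(*c));
  if (!c)
    return NULL;
  c->gid = gid;
  c->next = toy_map->head;
  toy_map->head = c;
  toy_map->count++;
  return c;
}

/* Receiver: returns 1 if the event was tracked, 0 if it was ignored. */
int toy_state_rcvr(uint64_t gid)
{
  toy_conn_t *c = toy_find_or_new(gid);
  if (!c)
    return 0; /* tracking disabled: nothing allocated, nothing to leak */
  return 1;
}

/* Number of entries currently tracked (0 when tracking is disabled). */
size_t toy_tracked(void)
{
  return toy_map ? toy_map->count : 0;
}
```

In this sketch, calling toy_clear_map() plays the role of the bto_clear_maps() call added after BOOTSTRAP_STATUS_AP_HANDSHAKE_DONE: events that arrive afterwards are dropped by the early return rather than growing the map.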