Lox distributor hangs and does not respond to requests
While testing the Lox distributor functions today, I noticed that it sometimes gets into a bad state in which it hangs indefinitely and stops responding to requests. When I tried to curl the open invitation endpoint, I got a 504 (Gateway Timeout) response from nginx:
$ curl -I -X POST https://rdsys-frontend-01.torproject.org/lox/invite
HTTP/2 504
server: nginx
date: Wed, 17 Jan 2024 01:21:18 GMT
content-type: text/html
content-length: 160
Looking at the logs, I don't see anything unusual:
Jan 16 23:39:16 rdsys-frontend-01 lox-distributor[1209121]: Writing context to the db with key: "context_2024-01-16_23:39:16"
Jan 16 23:41:16 rdsys-frontend-01 lox-distributor[1209121]: BridgeLine [scrubbed] no longer in bridge table.
Jan 16 23:41:16 rdsys-frontend-01 lox-distributor[1209121]: BridgeLine [scrubbed] no longer in bridge table.
Jan 16 23:41:16 rdsys-frontend-01 lox-distributor[1209121]: BridgeLine [scrubbed] NOT replaced, saved for next update!
The distributor responds again once it has been restarted.
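To narrow down when the hang starts, a small probe along these lines might help. This is only a sketch and not part of the Lox or rdsys tooling: the function names `classify_status` and `probe` and the 10-second timeout are my own assumptions, and the URL is the one from the curl command above. Note that each probe is a real POST to the open invitation endpoint, so it should not be run too often.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code to a coarse state.
# curl reports 000 via %{http_code} when no response arrived at all
# (for example when --max-time expired).
classify_status() {
    case "$1" in
        2??) echo "ok" ;;
        504|000) echo "hung" ;;
        *) echo "other" ;;
    esac
}

# Hypothetical helper: send one POST to the endpoint and classify the result.
# The default URL is the one used in this report; the timeout is an assumption.
probe() {
    url="${1:-https://rdsys-frontend-01.torproject.org/lox/invite}"
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 -X POST "$url")
    echo "$(date -u '+%Y-%m-%d %H:%M:%S') $code $(classify_status "$code")"
}
```

Calling `probe` from a timer and comparing the first "hung" timestamp with the distributor log should show which log entry was the last one written before it stopped responding.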