Jenkins is broken after some upgrades, until the Jenkins services are restarted
After some upgrades (IIRC libc6 was one of those, and the latest failure makes me guess that openjdk is another one), some Jenkins operations fail until the relevant Jenkins services are restarted:
- Jenkins workers can't start `git`, so they fail all jobs
- Jenkins orchestrator can't run `/bin/sh`, so it fails to reboot nodes after test suite runs
The error message one can see in the Jenkins job logs when this happens is: `Failed to exec spawn helper`
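For anyone who wants to check a saved job log for this symptom, here is a minimal Python sketch. Only the error string comes from the job logs; the function name and the idea of passing in the log text as a string are assumptions made for illustration.

```python
from __future__ import annotations

# Error string as it appears in the Jenkins job logs.
SPAWN_ERROR = "Failed to exec spawn helper"


def find_spawn_failures(log_text: str) -> list[int]:
    """Return the 1-based numbers of the log lines that contain the error.

    Hypothetical helper: it only does a plain substring match, so it does
    not tell you which upgrade triggered the failure.
    """
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if SPAWN_ERROR in line
    ]
```

An empty result means the log does not show this particular failure; a non-empty one suggests the machine that ran the job needs its Jenkins services restarted.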
This is happening regularly. The fix is to restart the relevant Jenkins services on the affected machines, or simply reboot them.
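The restart could be scripted along these lines. This is only a sketch: it assumes the services are managed as systemd units, and the unit name `jenkins.service` in the usage comment is hypothetical, so substitute whatever the affected machine actually runs. It defaults to a dry run that returns the commands without executing them.

```python
import subprocess


def restart_units(units, dry_run=True):
    """Build one `systemctl restart` command per unit and optionally run it.

    Assumption: the Jenkins services are systemd units. With dry_run=True
    (the default) nothing is executed; the commands are only returned.
    """
    commands = [["systemctl", "restart", unit] for unit in units]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if the restart fails.
            subprocess.run(command, check=True)
    return commands


# Hypothetical usage, unit name is a placeholder:
# restart_units(["jenkins.service"], dry_run=False)
```

Keeping the dry run as the default makes it easy to review what would be restarted before touching a machine that is in the middle of running jobs.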
I'm not sure there's a good way to solve this with a reasonable amount of resources, and that's not my call, but I wanted to document this problem so that the next person who searches for this error message in GitLab finds a clue and knows they have to poke sysadmins :)
cc @foundations-team