prevent Puppet from restarting docker (and gitlab-runner?)
I had a job die mysteriously this morning:
https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/jobs/39943#L7771
ERROR: Job failed (system failure): aborted: terminated
And at the top of the page:
There has been a runner system failure, please try again
@anarcat mentioned this might have been related to @lavamind doing some Puppet work, which could have triggered a restart of gitlab-runner or docker.
If possible, could we confirm this is what happened? Is there some safeguard we could put in place to prevent such restarts while a job is running? I feel like this might be another pain point of shoe-horning Shadow sims into CI jobs: for most CI jobs it's probably no big deal to get killed and have to restart, but in this case we lost 20 hours of computation.
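To make the kind of safeguard I have in mind more concrete, here is a rough sketch of a pre-restart check. It assumes that "a job is running" can be approximated by "docker has at least one running container", which may be too coarse on a host that runs other containers. The function names are hypothetical, and how (or whether) such a check could be hooked into Puppet is not something I know.

```python
import subprocess
import sys


def parse_container_ids(ps_output: str) -> list[str]:
    """Return the non-empty lines of `docker ps -q` output (one container ID per line)."""
    return [line.strip() for line in ps_output.splitlines() if line.strip()]


def safe_to_restart(ps_output: str) -> bool:
    """Hypothetical policy: a restart is only safe when no containers are running."""
    return not parse_container_ids(ps_output)


def running_containers_output() -> str:
    """Ask docker for the IDs of running containers; `-q` prints only the IDs."""
    result = subprocess.run(
        ["docker", "ps", "-q"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Exit 0 when a restart looks safe, 1 when something is still running.
    sys.exit(0 if safe_to_restart(running_containers_output()) else 1)
```

Something along these lines could gate the restart, so that it is skipped or deferred while the script exits non-zero.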
Edited by anarcat