kernel NULL deref regression with linux kernel 6.1.0-34-amd64
This was the scenario earlier today:
root@dal-node-01:~# gnt-node list
Node                       DTotal  DFree MTotal MNode  MFree Pinst Sinst
dal-node-01.torproject.org   5.2T 783.5G 503.5G 54.1G 224.2G    11    10
dal-node-02.torproject.org      ?      ?      ?     ?      ?    14    12
dal-node-03.torproject.org   5.2T   2.8T 503.5G 13.2G 406.8G    13    12After rebooting -02, it came back to the cluster, but then several drdb disks were not syncing properly (WFConnection and WFBitMapT in /proc/drbd) and there were kernel errors all around, for example:
[ 3461.048094] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 3461.055550] #PF: supervisor instruction fetch in kernel mode
[ 3461.061610] #PF: error_code(0x0010) - not-present pageWe noticed there was a kernel upgrade recently and reverted from 6.1.0-34 to -33, and now we don't see sync issues or kernel errors anymore.
Update: it also happens on tb-build-03.
Edited  by anarcat