  • [[_TOC_]]
    
    # Software RAID
    
    ## Replacing a drive
    
    
    If a drive fails in a server, the procedure is essentially to open a
    ticket, wait for the drive to be replaced, then partition the new drive
    and re-add it to the RAID array. The following procedure assumes that
    `sda` failed and `sdb` is good in a RAID-1 array, but it can vary with
    other RAID configurations or drive models.
    
     1. file a ticket upstream
    
        [Hetzner Support](https://robot.your-server.de/support/), for example, has an excellent service which
        asks for the disk serial number (available in the SMART email
        notification) and the SMART log (output of `smartctl -x
        /dev/sda`). They will then turn off the machine, replace the disk,
        and start it up again.
    
     2. wait for the server to return with the new disk
    
        Hetzner will send an email to the tpa alias when that is done.
    
     3. partition the new drive (`sda`) to match the old (`sdb`):
    
            sfdisk -d /dev/sdb | sfdisk --no-reread /dev/sda --force
    
     4. re-add the new disk to the RAID array:
    
            mdadm /dev/md0 -a /dev/sda
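
    Once the disk is re-added, the kernel resynchronizes the array and
    reports progress in `/proc/mdstat`. As a minimal sketch, assuming the
    usual `mdstat` layout where a degraded array shows an underscore in its
    `[UU]` status field, the following helper (`degraded_arrays` is a
    hypothetical name, not an existing tool) lists arrays that are still
    missing a member:

    ```shell
    # Print the name of every md array whose member status (e.g. [_U])
    # contains an underscore, meaning a member is missing or rebuilding.
    # Reads /proc/mdstat-formatted text on standard input.
    degraded_arrays() {
        awk '
            /^md[0-9]+ :/ { name = $1 }
            /\[[U_]*_[U_]*\]/ { print name }
        '
    }
    ```

    For example, `degraded_arrays < /proc/mdstat` should print nothing once
    the resync has completed.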
    
    Note that Hetzner also has [pretty good documentation on how to deal
    with SMART output](https://wiki.hetzner.de/index.php/Seriennummern_von_Festplatten_und_Hinweise_zu_defekten_Festplatten/en).
    
    
    # Hardware RAID
    
    ## MegaCLI operation
    
    
    Some TPO machines -- particularly [at cymru](howto/new-machine-cymru) -- have hardware RAID with `megaraid`
    controllers. Those are controlled with the `MegaCLI` command, which is
    rather hard to use.
    
    First, alias the megacli command because the package (derived from the
    upstream RPM by Alien) installs it in a strange location:
    
        alias megacli=/opt/MegaRAID/MegaCli/MegaCli
    
    This will confirm you are using hardware RAID:
    
        root@moly:/home/anarcat# lspci | grep -i megaraid
        05:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
    
    This will show the RAID level of each logical drive. For example, this
    is RAID-10:
    
        root@moly:/home/anarcat# megacli -LdPdInfo -aALL | grep "RAID Level"
        RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
    
    
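    This is an example of a simple RAID-1 setup. Note that its `RAID Level`
    line is identical to the RAID-10 example above, so this line alone does
    not tell the two apart: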
    
        root@chi-node-04:~# megacli -LdPdInfo -aALL | grep "RAID Level"
        RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
    
    
    This lists a summary of all the disks, for example the first disk has
    failed here:
    
        root@moly:/home/anarcat# megacli -PDList -aALL | grep -e '^Enclosure' -e '^Slot' -e '^PD' -e '^Firmware' -e '^Raw' -e '^Inquiry'
        Enclosure Device ID: 252
        Slot Number: 0
        Enclosure position: 0
        PD Type: SAS
        Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
        Firmware state: Failed
        Inquiry Data: SEAGATE ST3600057SS     [REDACTED]
        Enclosure Device ID: 252
        Slot Number: 1
        Enclosure position: 0
        PD Type: SAS
        Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
        Firmware state: Online, Spun Up
        Inquiry Data: SEAGATE ST3600057SS     [REDACTED]
        Enclosure Device ID: 252
        Slot Number: 2
        Enclosure position: 0
        PD Type: SAS
        Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
        Firmware state: Online, Spun Up
        Inquiry Data: SEAGATE ST3600057SS     [REDACTED]
        Enclosure Device ID: 252
        Slot Number: 3
        Enclosure position: 0
        PD Type: SAS
        Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
        Firmware state: Online, Spun Up
        Inquiry Data: SEAGATE ST3600057SS     [REDACTED]
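
    The listing above gets long on machines with many disks. As a rough
    sketch, assuming the `-PDList` field names shown above, the following
    helper (`failed_drives` is a hypothetical name) prints only the drives
    whose firmware state does not start with `Online`, in the
    `[enclosure:slot]` form used by the other commands:

    ```shell
    # Read `megacli -PDList -aALL` output on standard input and print one
    # line per drive that is not online, as "[enclosure:slot] state".
    failed_drives() {
        awk -F': *' '
            /^Enclosure Device ID/ { encl = $2 }
            /^Slot Number/         { slot = $2 }
            /^Firmware state/ && $2 !~ /^Online/ {
                print "[" encl ":" slot "] " $2
            }
        '
    }
    ```

    It would typically be used as `megacli -PDList -aALL | failed_drives`.
    Note that it also reports drives in states like `Rebuild` or
    `Unconfigured(good)`, since those are not `Online` either.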
    
    This will make the drive blink (slot number 0 in enclosure 252):
    
        megacli -PdLocate -start -physdrv[252:0] -aALL
    
    
    Take the disk offline:
    
        megacli -PDOffline -PhysDrv '[252:0]' -a0
    
    Mark the disk as missing:
    
        megacli -PDMarkMissing -PhysDrv '[252:0]' -a0
    
    Prepare the disk for removal:
    
        megacli -PDPrpRmv -PhysDrv '[252:0]' -a0
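
    The three removal steps above always target the same drive, so it is
    easy to mistype one of them. As a sketch only, the following helper
    (`megacli_removal_plan` is a hypothetical name) prints the three
    commands for a given enclosure and slot without running anything, so
    they can be reviewed before being pasted into a shell. The adapter
    number defaults to 0, as in the examples above:

    ```shell
    # Print (but do not execute) the offline / mark-missing / prepare-removal
    # commands for one drive.
    # Usage: megacli_removal_plan ENCLOSURE SLOT [ADAPTER]
    megacli_removal_plan() {
        drive="[$1:$2]"
        adapter="${3:-0}"
        for action in -PDOffline -PDMarkMissing -PDPrpRmv; do
            printf "megacli %s -PhysDrv '%s' -a%s\n" "$action" "$drive" "$adapter"
        done
    }
    ```

    For example, `megacli_removal_plan 252 0` prints the same three
    commands as listed above.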
    
    Reboot the machine, replace the disk, then inspect the status again.
    The new disk may show up as "Unconfigured(good)":
    
        root@moly:~# megacli -PDList -aALL | grep -e '^Enclosure Device' -e '^Slot' -e '^Firmware' 
        Enclosure Device ID: 252
        Slot Number: 0
        Firmware state: Unconfigured(good), Spun Up
        [...]
    
    Then you need to re-add the disk to the array:
    
        megacli -PdReplaceMissing -PhysDrv[252:0] -Array0 -row0 -a0
        megacli -PDRbld -Start -PhysDrv[252:0] -a0
    
    Example output:
    
        root@moly:~# megacli -PdReplaceMissing -PhysDrv[252:0] -Array0 -row0 -a0
                                             
        Adapter: 0: Missing PD at Array 0, Row 0 is replaced.
    
        Exit Code: 0x00
        root@moly:~# megacli -PDRbld -Start -PhysDrv[252:0] -a0
                                             
        Started rebuild progress on device(Encl-252 Slot-0)
    
        Exit Code: 0x00
    
    Then the rebuild should have started:
    
        root@moly:~# megacli -PDList -aALL | grep -e '^Enclosure Device' -e '^Slot' -e '^Firmware' 
        Enclosure Device ID: 252
        Slot Number: 0
        Firmware state: Rebuild
        [...]
    
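    To follow progress (the full path is spelled out because `watch` does
    not expand shell aliases; adjust the binary name if yours is installed
    as `MegaCli` rather than `MegaCli64`):
    
        watch /opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv[252:0] -a0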
    
    
    ### Rebuilding the Debian package
    
    The Debian package is based on a binary RPM provided by upstream ([LSI
    Corporation](https://en.wikipedia.org/wiki/LSI_Corporation)). Unfortunately, upstream was acquired by
    [Broadcom](https://en.wikipedia.org/wiki/Broadcom_Inc.) in 2014, after which MegaCLI development
    seems to have stopped. Since then the `lsi.com` domain redirects to
    `broadcom.com` and those packages -- which were already hard to find --
    are getting even harder to find.
    
    The [Broadcom search page](https://www.broadcom.com/support/download-search?pg=&pf=&pn=&pa=&po=&dk=megacli&pl=) seems to be the best place to find the
    MegaRAID software. That link should show search results, and under
    "Management Software and Tools" there should be a link to some
    "MegaCLI". The latest is currently (as of 2021) 5.5 P2 (dated
    2014-01-19!). Note that this version number differs from the actual
    version number of the megacli binary (8.07.14). A direct link to the
    package is currently:
    
        https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/8-07-14_MegaCLI.zip
    
    Upstream does not seem to mind breaking those links at any time, so
    you might have to redo the search to find it. In any case, the package
    is based on an RPM buried in the ZIP file, so this should get you a
    package:
    
        unzip 8-07-14_MegaCLI.zip
        fakeroot alien Linux/MegaCli-8.07.14-1.noarch.rpm
    
    This gives you a `megacli_8.07.14-2_all.deb` package (`alien` bumps the
    release number by one by default), which normally gets uploaded to the
    proprietary archive on `alberti`.
    
    An alternative is to use existing packages like the ones from
    [le-vert.net](https://hwraid.le-vert.net/wiki/DebianPackages). In particular, `megactl` is a free software
    alternative that works on `chi-node-13`, but it is not packaged in Debian
    and is therefore currently not in use:
    
        root@chi-node-13:~# megasasctl
        a0       PERC 6/i Integrated      encl:1 ldrv:1  batt:good
        a0d0       465GiB RAID 1   1x2  optimal
        a0e32s0     465GiB  a0d0  online   errs: media:0  other:819
        a0e32s1     465GiB  a0d0  online   errs: media:0  other:819
    
    
    
    ## Pager playbook
    
    Nagios should be monitoring hardware RAID on servers that support
    it. This is normally auto-detected by Puppet (in the `raid`
    module/class) but grep around for `megaraid` otherwise. The `raid`
    module should have a good README file describing how it works.
    
    
    ### Failed disk
    
    
    A normal RAID-1 Nagios check output looks like this:
    
        OK: 0:0:RAID-1:2 drives:465.25GB:Optimal Drives:2
    
    A failed RAID-10 check output looks like this:
    
        CRITICAL: 0:0:RAID-10:4 drives:1.089TB:Degraded Drives:3
    
    It actually has the numbers backwards: in the above situation, there
    was only *one* degraded drive, and 3 healthy ones. See above for how
    to restore a drive in a MegaRAID array.
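
    Since the check output is easy to misread, here is a small sketch that
    computes the number of missing drives from such a line. It assumes,
    based only on the two samples above, that the number before `drives:`
    is the array size and the number after `Drives:` is the count of
    healthy drives. The function name `failed_drive_count` is made up for
    this example:

    ```shell
    # Print how many drives are missing from the array described by a
    # Nagios RAID check line (array size minus healthy drive count).
    # Returns non-zero if the line does not have the expected shape.
    failed_drive_count() {
        counts=$(printf '%s\n' "$1" |
            sed -n 's/.*:\([0-9][0-9]*\) drives:.*Drives:\([0-9][0-9]*\).*/\1 \2/p')
        [ -n "$counts" ] || return 1
        total=${counts% *}
        healthy=${counts#* }
        echo $((total - healthy))
    }
    ```

    With the degraded RAID-10 line above, this yields one missing drive,
    which matches what was actually observed.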
    
    
    ### Disks with "other" errors
    
    The following warning may seem innocuous but actually reports that
    drives have "errors":
    
        WARNING: 0:0:RAID-1:2 drives:465.25GB:Optimal Drives:2 (1530 Errors: 0 media, 0 predictive, 1530 other) 
    
    The `1530 Errors` part is the key here: they are "other" errors, and
    the total appears to be the sum over both drives (765 each). This can
    be reproduced with the `megacli` command:
    
        # megacli -PDList -aALL | grep -e '^Enclosure Device' -e '^Slot' -e '^Firmware' -e "Error Count"
        Enclosure Device ID: 32
        Slot Number: 0
        Media Error Count: 0
        Other Error Count: 765
        Firmware state: Online, Spun Up
        Enclosure Device ID: 32
        Slot Number: 1
        Media Error Count: 0
        Other Error Count: 765
        Firmware state: Online, Spun Up
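
    To compare the per-drive counters with the total in the Nagios
    warning, the counters can be added up. This is a minimal sketch that
    assumes the `Other Error Count` field name shown above;
    `other_error_total` is a hypothetical helper name:

    ```shell
    # Sum the "Other Error Count" fields found in `megacli -PDList` output
    # read from standard input, and print the total (0 if none are found).
    other_error_total() {
        awk -F': *' '
            /^Other Error Count/ { sum += $2 }
            END { print sum + 0 }
        '
    }
    ```

    Typical usage would be `megacli -PDList -aALL | other_error_total`.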
    
    The actual error should also be visible in the logs:
    
        megacli -AdpEventLog -GetLatest 100 -f events.log -aALL
    
    ... then in `events.log`, the key part is:
    
        Event Description: Unexpected sense: PD 00(e0x20/s0) Path 1221000000000000, CDB: 4d 00 4d 00 00 00 00 00 20 00, Sense: 5/24/00
    
    The `Sense` field is a [Key Code Qualifier][] ("an error-code returned
    by a SCSI device") which, for 5/24/00, means "Illegal Request - invalid
    field in CDB (Command Descriptor Block)". According to [this
    discussion][], it seems that *newer* versions of the `megacli` binary
    trigger those errors when older drives are in use. Those errors can be
    safely ignored.
    
    [this discussion]: https://serverfault.com/questions/482705/megacli-causes-drive-other-error
    [Key Code Qualifier]: https://en.wikipedia.org/wiki/Key_Code_Qualifier
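
    When going through a long `events.log`, it can help to pull out just
    the sense triplets. The sketch below assumes the `Sense: K/C/Q` layout
    seen in the event above and splits it into the sense key, additional
    sense code (ASC) and qualifier (ASCQ); `sense_fields` is a hypothetical
    name:

    ```shell
    # Read event log lines on standard input and, for each line carrying a
    # "Sense: K/C/Q" triplet, print its three hexadecimal components.
    sense_fields() {
        sed -n 's|.*Sense: \([0-9A-Fa-f]*\)/\([0-9A-Fa-f]*\)/\([0-9A-Fa-f]*\).*|key=\1 asc=\2 ascq=\3|p'
    }
    ```

    For instance, `sense_fields < events.log | sort | uniq -c` would show
    how often each triplet occurs.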
    
    
    # SMART monitoring
    
    
    Some servers will fail to properly detect disk drives in their SMART
    configuration. In particular, `smartd` does not support:
    
     * virtual disks (e.g. `/dev/nbd0`)
     * MMC block devices (e.g. `/dev/mmcblk0`, commonly found on ARM
       devices)
     * out of the box, CCISS raid devices (e.g. `/dev/cciss/c0d0`)
    
    The latter can be configured with the following snippet in
    `/etc/smartd.conf`:
    
        #DEVICESCAN -d removable -n standby -m root -M exec /usr/share/smartmontools/smartd-runner
        DEFAULT -n standby -m root -M exec /usr/share/smartmontools/smartd-runner
        /dev/cciss/c0d0 -d cciss,0
        /dev/cciss/c0d0 -d cciss,1
        /dev/cciss/c0d0 -d cciss,2
        /dev/cciss/c0d0 -d cciss,3
        /dev/cciss/c0d0 -d cciss,4
        /dev/cciss/c0d0 -d cciss,5
    
    Notice how the `DEVICESCAN` is commented out to be replaced by the
    CCISS configuration. One line for each drive should be added (and no,
    it does not autodetect all drives unfortunately). This hack was
    deployed on `listera` which uses that hardware RAID.
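
    Since one line per drive is needed, the device lines can be generated
    rather than typed by hand. This is only a sketch: the helper name
    `cciss_smartd_lines` is made up, and the number of drives has to be
    supplied by the operator because it is not detected automatically:

    ```shell
    # Print one smartd.conf device line per physical drive behind a CCISS
    # controller.
    # Usage: cciss_smartd_lines DEVICE DRIVE_COUNT
    cciss_smartd_lines() {
        device="$1"
        count="$2"
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '%s -d cciss,%s\n' "$device" "$i"
            i=$((i + 1))
        done
    }
    ```

    Running `cciss_smartd_lines /dev/cciss/c0d0 6` prints the six device
    lines from the snippet above, which can then be reviewed and pasted
    into `/etc/smartd.conf`.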
    
    Other hardware RAID controllers are better supported. For example, the
    `megaraid` controller on `moly` was correctly detected by `smartd`
    which accurately found a broken hard drive.
    
    
    ## References
    
    
    Here are some external documentation links:
    
     * <https://cs.uwaterloo.ca/twiki/view/CF/MegaRaid>
     * <https://raid.wiki.kernel.org/index.php/Hardware_Raid_Setup_using_MegaCli>
    
     * <https://sysadmin.compxtreme.ro/how-to-replace-an-lsi-raid-disk-with-megacli/>
    
     * <https://wikitech.wikimedia.org/wiki/MegaCli>