RAID 5 inactive at boot - error: found two disks with the index 1 for RAID md/0


My operating system is Ubuntu 12.04.2 LTS with this kernel: 3.2.0-49-generic #75-Ubuntu SMP Tue Jun 18 17:39:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

I have a RAID 5 array made up of 3 hard drives, and it has suddenly started coming up inactive at boot. Because the home directory is mounted on it, the system cannot finish booting and asks for manual intervention. I have found similar reports in forums, but most of them involve a faulty hard drive, which is not the case for me.

Stopping the array (mdadm --stop /dev/md0) and starting it again (mdadm --assemble --scan /dev/md0) shows no errors (no complaints and no array rebuild), and the array can then be mounted properly by hand. So why can't it be brought up at boot?

After checking smartctl for all the hard drives in the RAID array (sda, sdb, sdc), I could not see any errors (no Current_Pending_Sector, UDMA_CRC_Error_Count, or Offline_Uncorrectable). Short and long self-tests have already been run.

One thing I noticed, which appears to be the cause of the problem, is that grub-probe returns this error: "error: found two disks with the index 1 for RAID md/0."

Running the same command with -v (verbose output), I can identify two lines saying "grub-probe: info: Found array md/0 (mdraid1x)." right after probing hd0 and hd1, which map to sda and sdb respectively. So can't grub read the RAID metadata on sdc? People who faced this problem suggested upgrading the RAID metadata from 0.90 to 1.x, but my RAID already uses 1.2.

I tried manually failing the sdc hard drive twice (the first time I removed it and added it back, and the second time I used mdadm --zero-superblock /dev/sdc) and forced the RAID to rebuild, but the error would not go away, so now I'm stuck. Does anyone have a clue about the problem and how it can be fixed?

Below is a list of the commands I used to diagnose the problem, with their output:

/proc/mdstat after boot

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdc1[3](S) sda1[4](S) sdb[5](S)
      5860540617 blocks super 1.2

unused devices: <none>
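
When an array comes up inactive, it can help to list exactly which block devices the kernel attached to it. The sketch below is only an illustration: `inactive_members` is a hypothetical helper name, and it assumes the /proc/mdstat layout shown above. Fed the output above, it would list `sdb` (the whole disk) rather than `sdb1`, which may be worth comparing with the member names shown once the array is assembled by hand.

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and list the
# members of every inactive array, one "array member" pair per line.
inactive_members() {
    awk '
        $2 == ":" && $3 == "inactive" {
            for (i = 4; i <= NF; i++) {
                m = $i
                sub(/\[.*/, "", m)   # drop the role suffix such as [3](S)
                print $1, m
            }
        }
    '
}
```

Typical use would be `inactive_members < /proc/mdstat`.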

/etc/mdadm/mdadm.conf

# cat /etc/mdadm/mdadm.conf 
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=1b273efc:62f3bc36:4579f11d:15bbc75e name=ubuntu:0

# This file was auto-generated on Mon, 27 Aug 2012 17:33:16 +0300
# by mkconf $Id$

mdadm --examine --scan

# mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=1b273efc:62f3bc36:4579f11d:15bbc75e name=ubuntu:0

mdadm --detail

# mdadm --detail --scan
mdadm: cannot open /dev/md/0: No such file or directory

# mdadm --detail --scan /dev/md0 
mdadm: md device /dev/md0 does not appear to be active.

mdadm --stop /dev/md0 && mdadm --assemble --scan /dev/md0 && mdadm --detail /dev/md0

# mdadm --stop /dev/md0 
mdadm: stopped /dev/md0

# mdadm --assemble --scan /dev/md0
mdadm: /dev/md0 has been started with 3 drives.

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sda1[4] sdc1[3] sdb1[5]
      3907025920 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

# mdadm --detail /dev/md0 
/dev/md0:
        Version : 1.2
  Creation Time : Sat Mar 24 15:31:43 2012
     Raid Level : raid5
     Array Size : 3907025920 (3726.03 GiB 4000.79 GB)
  Used Dev Size : 1953512960 (1863.02 GiB 2000.40 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Sun Jul 21 22:53:21 2013
          State : clean 
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : ubuntu:0
           UUID : 1b273efc:62f3bc36:4579f11d:15bbc75e
         Events : 319386

    Number   Major   Minor   RaidDevice State
       4       8        1        0      active sync   /dev/sda1
       5       8       17        1      active sync   /dev/sdb1
       3       8       33        2      active sync   /dev/sdc1

mdadm --detail --scan && mdadm --examine --scan

# mdadm --detail --scan
ARRAY /dev/md0 metadata=1.2 name=ubuntu:0 UUID=1b273efc:62f3bc36:4579f11d:15bbc75e

# mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=1b273efc:62f3bc36:4579f11d:15bbc75e name=ubuntu:0

grub-probe -v /

# grub-probe -v /
grub-probe: info: cannot open '/boot/grub/device.map'.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: scanning hd0 for LVM.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: scanning hd1 for LVM.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: scanning hd2 for LVM.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: scanning hd3 for LVM.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Found array md/0 (mdraid1x).
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd0,msdos1.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd1,msdos1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd2,msdos1.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd3,msdos2.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd3,msdos1.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd0,msdos1.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: Found array md/0 (mdraid1x).
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd1,msdos1.
grub-probe: info: the size of hd1 is 3907029168.
error: found two disks with the index 1 for RAID md/0.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd2,msdos1.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd3,msdos2.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd3,msdos1.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: scanning md/0 for LVM.
grub-probe: info: no LVM signature found.
grub-probe: info: scanning hd0 for LVM.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: scanning hd0,msdos1 for LVM.
grub-probe: info: the size of hd0 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: scanning hd1 for LVM.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: scanning hd1,msdos1 for LVM.
grub-probe: info: the size of hd1 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: scanning hd2 for LVM.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: scanning hd2,msdos1 for LVM.
grub-probe: info: the size of hd2 is 3907029168.
grub-probe: info: no LVM signature found.
grub-probe: info: scanning hd3 for LVM.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: scanning hd3,msdos2 for LVM.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: no LVM signature found.
grub-probe: info: scanning hd3,msdos1 for LVM.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: no LVM signature found.
grub-probe: info: /dev/sdd1 starts from 2048.
grub-probe: info: opening the device hd3.
grub-probe: info: the size of hd3 is 250069680.
grub-probe: info: Partition 0 starts from 2048.
grub-probe: info: opening hd3,msdos1.
grub-probe: info: the size of hd3 is 250069680.
ext2
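
The verbose log above is long, so a small filter can make the relevant lines easier to spot. This is a minimal sketch, not part of grub: `mdraid_hits` is a hypothetical helper name, and it assumes the message wording shown in the log above. It remembers the device named in the most recent "Scanning for ... RAID devices on disk" line and prints that device whenever a "Found array" line or the duplicate-index error follows.

```shell
# Hypothetical helper: read `grub-probe -v` output on stdin and report which
# device each RAID hit or duplicate-index error was printed for.
mdraid_hits() {
    awk '
        /Scanning for .* RAID devices on disk/ {
            dev = $NF
            sub(/\.$/, "", dev)    # strip the trailing period
        }
        /Found array/          { print "found " dev }
        /found two disks with/ { print "duplicate " dev }
    '
}
```

Since grub-probe writes its info messages to stderr, a typical call would be `grub-probe -v / 2>&1 | mdraid_hits`. Applied to the log above, it reports hits for hd1 and hd0,msdos1 and the duplicate for hd1,msdos1, which hints (without proving it) that both the whole disk hd1 and its first partition are being recognised as members.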

/boot/grub/device.map

# cat /boot/grub/device.map
(hd0)   /dev/sda
(hd1)   /dev/sdb
(hd2)   /dev/sdc
(hd3)   /dev/sdd
    
by Vangelis Tasoulas 22.07.2013 / 09:09

1 answer


I solved the problem by removing each disk from the RAID (one at a time), zeroing its superblock and MBR, adding it back to the RAID, and waiting for the rebuild.

After I did this for /dev/sdb, the problem was solved, and grub-probe now shows only one line with "grub-probe: info: Found array md/0 (mdraid1x)." instead of two as before (see the question).

So it must be the opposite of what I first thought about the index error. My thinking was that this index should be present on every disk that is part of the RAID, which is why I was wiping sdc, for which grub-probe showed no "grub-probe: info: Found array md/0 (mdraid1x)." message.

In the end it seems that only one of them should have it, and if it is on more than one hard drive this "error: found two disks with the index 1 for RAID md/0" error is raised.
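
The remove, wipe, and re-add cycle described in this answer could be sketched roughly as follows. This is an illustration under assumptions, not the exact commands that were run: `readd_member` and the `APPLY` switch are hypothetical names, the member path is a placeholder, and the MBR wipe is deliberately left out because the answer does not say how it was done. By default the function only prints the mdadm commands so they can be reviewed first.

```shell
# Hypothetical dry-run helper: prints the mdadm commands for cycling one
# member out of an array and back in. Nothing is executed unless APPLY=1.
readd_member() {
    array=$1     # e.g. /dev/md0
    member=$2    # e.g. /dev/sdb1 (assumed; the answer only names /dev/sdb)
    run=echo
    [ "${APPLY:-0}" = 1 ] && run=
    $run mdadm "$array" --fail "$member"      # mark the member as faulty
    $run mdadm "$array" --remove "$member"    # detach it from the array
    $run mdadm --zero-superblock "$member"    # erase its md metadata
    $run mdadm "$array" --add "$member"       # re-add it, which starts a rebuild
    $run mdadm --wait "$array"                # block until the rebuild finishes
}
```

Calling `readd_member /dev/md0 /dev/sdb1` prints the five commands in order. Only one member should be cycled at a time, since a 3-disk RAID 5 has no redundancy left while it rebuilds.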

    
by Vangelis Tasoulas 24.07.2013 / 21:51