Why does my raid5 always resync? (are device names not persistent?)


I have an Intel RST software RAID on Ubuntu 14.04 with mdadm (4 x 6 TB drives in RAID 5, created with /dev device names):

sudo mdadm -C /dev/md/imsm /dev/sda /dev/sdb /dev/sdh /dev/sdi -n 4 -e imsm
sudo mdadm -C /dev/md/vol0 /dev/md/imsm -n 4 -l 5
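
As far as I understand, mdadm assembles IMSM arrays by the UUIDs stored in the metadata rather than by /dev names, so a minimal sketch of pinning the arrays for early-boot assembly would be (standard Ubuntu paths assumed):

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf   # record the ARRAY lines by UUID
sudo update-initramfs -u                                         # so early-boot assembly picks them up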

The output now (after several reboots):

sudo mdadm --query --detail /dev/md/vol0 
/dev/md/vol0:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid5
     Array Size : 17581557760 (16767.08 GiB 18003.52 GB)
  Used Dev Size : -1
   Raid Devices : 4
  Total Devices : 4

          State : clean, resyncing 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-asymmetric
     Chunk Size : 128K

  Resync Status : 54% complete


           UUID : 9adaf3f8:d899c72b:fdf41fd1:07ee0399
    Number   Major   Minor   RaidDevice State
       3       8      112        0      active sync   /dev/sdh
       2       8       16        1      active sync   /dev/sdb
       1       8        0        2      active sync   /dev/sda
       0       8      144        3      active sync   /dev/sdj

Could the constant resyncing be caused by the system naming the devices inconsistently after boot (/dev/sda suddenly becomes /dev/sdi, for example)?

sudo mdadm --detail --scan
ARRAY /dev/md/imsm0 metadata=imsm UUID=e409a30d:353a9b11:1f9a221a:7ed7cd21
ARRAY /dev/md/vol0 container=/dev/md/imsm0 member=0 UUID=9adaf3f8:d899c72b:fdf41fd1:07ee0399
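
To check whether the kernel is really shuffling the sdX names, the drives can be mapped back to their serial numbers (the WD-... values in the --examine dump below); the by-id symlinks stay stable across reboots even when the sdX nodes do not. A minimal sketch:

ls -l /dev/disk/by-id/ata-*                      # stable model+serial names -> current sdX nodes
sudo mdadm --examine /dev/sda | grep -i serial   # serials of the container members as mdadm sees them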

The output of the mdadm tool:

sudo mdadm --examine /dev/md/imsm0 
/dev/md/imsm0:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.3.00
    Orig Family : 68028309
         Family : 68028309
     Generation : 00002f29
     Attributes : All supported
           UUID : e409a30d:353a9b11:1f9a221a:7ed7cd21
       Checksum : 85b1b0cb correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk00 Serial : WD-WXL1H84E5WHF
          State : active
             Id : 00000002
    Usable Size : 11721038862 (5589.03 GiB 6001.17 GB)

[vol0]:
           UUID : 9adaf3f8:d899c72b:fdf41fd1:07ee0399
     RAID Level : 5 <-- 5
        Members : 4 <-- 4
          Slots : [UUUU] <-- [UUUU]
    Failed disk : none
      This Slot : 0
     Array Size : 35163115520 (16767.08 GiB 18003.52 GB)
   Per Dev Size : 11721038848 (5589.03 GiB 6001.17 GB)
  Sector Offset : 0
    Num Stripes : 45785308
     Chunk Size : 128 KiB <-- 128 KiB
       Reserved : 0
  Migrate State : repair
      Map State : normal <-- normal
     Checkpoint : 5191081 (1024)
    Dirty State : clean

  Disk01 Serial : WD-WX51DA476UL6
          State : active
             Id : 00000001
    Usable Size : 11721038862 (5589.03 GiB 6001.17 GB)

  Disk02 Serial : WD-WX51DA476P65
          State : active
             Id : 00000000
    Usable Size : 11721038862 (5589.03 GiB 6001.17 GB)

  Disk03 Serial : WD-WX51DA476HS5
          State : active
             Id : 00000003
    Usable Size : 11721038862 (5589.03 GiB 6001.17 GB)

So the dirty state says clean? Why does it resync then? Does anyone know where a potential problem could be?
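
For what it's worth, the --examine output above reports "Migrate State : repair" with a checkpoint even though "Dirty State : clean", so the metadata itself records an in-progress repair. A minimal sketch for watching what md is actually doing (device name md126 taken from the dmesg below):

cat /proc/mdstat                        # live resync progress for md126
cat /sys/block/md126/md/sync_action     # resync / repair / idle
sudo mdadm --wait /dev/md126            # block until the current resync completes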

And the tail of my dmesg output shows the following. I should mention that I don't have a SAS port ata7 (I think this is the Marvell SAS controller, which is switched off in the BIOS); I only have 6 SATA ports and 2 (disabled) SAS ports:

[ 4064.913017] sr 0:0:0:0: command ffff8802fc4ccc00 timed out
[ 4064.913043] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4064.913048] sas: ata7: end_device-0:0: cmd error handler
[ 4064.913092] sas: ata7: end_device-0:0: dev error handler
[ 4064.913529] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4064.913874] sr 0:0:0:0: command ffff8802fb703b00 timed out
[ 4064.913896] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4064.913900] sas: ata7: end_device-0:0: cmd error handler
[ 4064.913984] sas: ata7: end_device-0:0: dev error handler
[ 4064.914356] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4064.915269] sr 0:0:0:0: command ffff8802fc4ccc00 timed out
[ 4064.915297] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4064.915302] sas: ata7: end_device-0:0: cmd error handler
[ 4064.915382] sas: ata7: end_device-0:0: dev error handler
[ 4064.915777] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4064.923419] md: md127 stopped.
[ 4064.927256] md: bind<sdc>
[ 4064.927350] md: bind<sdb>
[ 4064.927427] md: bind<sda>
[ 4064.927505] md: bind<sdi>
[ 4065.497163] sr 0:0:0:0: command ffff880304de9700 timed out
[ 4065.497181] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4065.497184] sas: ata7: end_device-0:0: cmd error handler
[ 4065.497255] sas: ata7: end_device-0:0: dev error handler
[ 4065.497650] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4065.498026] sr 0:0:0:0: command ffff8802fb703e00 timed out
[ 4065.498041] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4065.498043] sas: ata7: end_device-0:0: cmd error handler
[ 4065.498106] sas: ata7: end_device-0:0: dev error handler
[ 4065.498503] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4065.499352] sr 0:0:0:0: command ffff880304de9700 timed out
[ 4065.499372] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 4065.499375] sas: ata7: end_device-0:0: cmd error handler
[ 4065.499483] sas: ata7: end_device-0:0: dev error handler
[ 4065.499803] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 4071.294317] md: md126 stopped.
[ 4071.294421] md: bind<sdi>
[ 4071.294481] md: bind<sda>
[ 4071.294533] md: bind<sdb>
[ 4071.294596] md: bind<sdc>
[ 4071.296579] md/raid:md126: not clean -- starting background reconstruction
[ 4071.296595] md/raid:md126: device sdc operational as raid disk 0
[ 4071.296596] md/raid:md126: device sdb operational as raid disk 1
[ 4071.296597] md/raid:md126: device sda operational as raid disk 2
[ 4071.296598] md/raid:md126: device sdi operational as raid disk 3
[ 4071.296900] md/raid:md126: allocated 0kB
[ 4071.296920] md/raid:md126: raid level 5 active with 4 out of 4 devices, algorithm 0
[ 4071.296922] RAID conf printout:
[ 4071.296923]  --- level:5 rd:4 wd:4
[ 4071.296925]  disk 0, o:1, dev:sdc
[ 4071.296926]  disk 1, o:1, dev:sdb
[ 4071.296927]  disk 2, o:1, dev:sda
[ 4071.296929]  disk 3, o:1, dev:sdi
[ 4071.296944] md126: detected capacity change from 0 to 18003515146240
[ 4071.297632]  md126: unknown partition table
[ 4072.773368] md: md126 switched to read-write mode.
[ 4072.773686] md: resync of RAID array md126
[ 4072.773690] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[ 4072.773692] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[ 4072.773698] md: using 128k window, over a total of 5860519424k.
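
Note that the timeouts above all come from sr 0:0:0:0, which is a SCSI CD/DVD device (sr0), not one of the RAID member disks. A minimal sketch for confirming which controller that address hangs off (assumes the lsscsi package is installed):

lsscsi                                       # maps [0:0:0:0] to driver, model and /dev node
udevadm info --query=path --name=/dev/sr0    # sysfs path shows the host adapter behind sr0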
    
asked by Gabriel 20.08.2015 / 16:12

0 answers