I have a CentOS 6.6 server with 13 disks in a RAID 6. A few weeks ago I upgraded it to 17 disks, two of them configured as spares. The reshape ran normally at first, but it stalled at 69%.
md2 : active raid6 sdj1[0] sdg1[18](S) sdh1[2] sdi1[5] sdm1[15] sds1[12] sdr1[14] sdk1[9] sdo1[6] sdn1[13] sdl1[8] sdd1[20] sdf1[19] sdq1[16] sdb1[10] sde1[17](S) sdc1[21]
19533803520 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
[=============>.......] reshape = 69.0% (1347861324/1953380352) finish=46103134.8min speed=0K/sec
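To check whether the reshape position moves at all between two looks at /proc/mdstat, the percentage can be recomputed from the two numbers in parentheses. This is only a rough sketch; `reshape_progress` is a helper name made up for illustration, and it assumes the reshape line has the same layout as the output above.

```shell
# Recompute reshape progress from mdstat-style text read on stdin.
# reshape_progress is a hypothetical helper, not part of mdadm.
reshape_progress() {
    awk '/reshape =/ {
        # Pull out the "(position/total)" field from the reshape line.
        if (match($0, /\([0-9]+\/[0-9]+\)/)) {
            split(substr($0, RSTART + 1, RLENGTH - 2), p, "/")
            # Progress is simply position divided by total.
            printf "%.1f\n", 100 * p[1] / p[2]
        }
    }'
}

# Typical use on a live system:
#   reshape_progress < /proc/mdstat
```

Running it twice a few minutes apart shows whether the position is advancing or really stuck.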
I have already tried stopping the RAID and starting it again. The reshape starts, but stops again after a few minutes. If I reboot the server, the reshape does not start at all:
md2 : active raid6 sdj1[0] sdg1[18](S) sdh1[2] sdi1[5] sdm1[15] sds1[12] sdr1[14] sdk1[9] sdo1[6] sdn1[13] sdl1[8] sdd1[20] sdf1[19] sdq1[16] sdb1[10] sde1[17](S) sdc1[21]
19533803520 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
resync=PENDING
Only when I stop and restart the RAID again does the reshape start, and then it stalls as shown above.
In dmesg and the messages log, I only found:
dmesg
md/raid:md2: reshape: not enough stripes. Needed 1024
messages
23:14:56 data kernel: md/raid:md2: not clean -- starting background reconstruction
23:14:56 data kernel: md/raid:md2: reshape will continue
23:14:56 data kernel: md/raid:md2: device sdj1 operational as raid disk 0
23:14:56 data kernel: md/raid:md2: device sdh1 operational as raid disk 2
23:14:56 data kernel: md/raid:md2: device sdi1 operational as raid disk 5
23:14:56 data kernel: md/raid:md2: device sdn1 operational as raid disk 11
23:14:56 data kernel: md/raid:md2: device sds1 operational as raid disk 3
23:14:56 data kernel: md/raid:md2: device sdm1 operational as raid disk 1
23:14:56 data kernel: md/raid:md2: device sdf1 operational as raid disk 14
23:14:56 data kernel: md/raid:md2: device sdd1 operational as raid disk 13
23:14:56 data kernel: md/raid:md2: device sdb1 operational as raid disk 10
23:14:56 data kernel: md/raid:md2: device sdq1 operational as raid disk 7
23:14:56 data kernel: md/raid:md2: device sdr1 operational as raid disk 4
23:14:56 data kernel: md/raid:md2: device sdl1 operational as raid disk 8
23:14:56 data kernel: md/raid:md2: device sdk1 operational as raid disk 9
23:14:56 data kernel: md/raid:md2: device sdc1 operational as raid disk 12
23:14:56 data kernel: md/raid:md2: device sdo1 operational as raid disk 6
23:14:56 data kernel: md/raid:md2: allocated 0kB
23:14:56 data kernel: md/raid:md2: raid level 6 active with 15 out of 15 devices, algorithm 2
23:14:56 data kernel: md2: Warning: Device sdi1 is misaligned
23:14:56 data kernel: md2: detected capacity change from 0 to 20002614804480
23:14:56 data kernel: md2: unknown partition table
23:14:56 data kernel: XFS (md2): Mounting Filesystem
23:14:56 data kernel: md/raid:md2: reshape: not enough stripes. Needed 1024
23:14:56 data kernel: XFS (md2): Ending clean mount
So I adjusted the stripe cache size:
cat /sys/block/md2/md/stripe_cache_size
16384
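For reference, this is roughly how the value can be written and read back through sysfs. It is a minimal sketch: `set_stripe_cache` is a made-up helper name, the directory is passed as an argument so it can be tried on a scratch directory first, and writing to the real sysfs file needs root.

```shell
# Write a stripe cache size and confirm it was accepted.
# set_stripe_cache is a hypothetical helper, not an mdadm command.
set_stripe_cache() {
    md_dir=$1    # on the real system: /sys/block/md2/md
    size=$2      # number of stripe cache entries, e.g. 16384
    echo "$size" > "$md_dir/stripe_cache_size" || return 1
    # Read the value back and compare it with what was written.
    [ "$(cat "$md_dir/stripe_cache_size")" = "$size" ]
}

# Example (as root):
#   set_stripe_cache /sys/block/md2/md 16384 && echo "stripe cache set"
```

The read-back matters here because the dmesg line complains about the number of stripes, so it is worth confirming that the value shown by `cat` is the one actually in effect after the array is restarted.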
But the reshape still does not run, and the same error still appears in the logs.
Does anyone have an idea?
Tags centos software-raid