I'm running 14.04 LTS with a 5-drive software RAID array of 2 TB drives: mdadm, LVM, and XFS. My main boot device is a 256 GB SSD.
There was a power failure, and when the power came back the system would not boot. While trying to boot, the following scrolls on the screen repeatedly, so I could not get through the boot process:
Incrementally starting RAID arrays...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
Incrementally started RAID arrays.
There is a Launchpad bug for this (link), but there does not appear to be a definitive solution, or at least no steps easily reproducible by me.
Booting into recovery mode lists the following information:
[ 2.482387] md: bind<sdb1>
[ 2.408390] md: bind<sda1>
[ 2.438005] md: bind<sdc1>
[ 2.986691] Switched to clocksource tsc
Incrementally starting RAID arrays...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
[ 31.755948] md/raid:md0: device sdc1 operational as raid disk 1
[ 31.756886] md/raid:md0: device sda1 operational as raid disk 0
[ 31.756861] md/raid:md0: device sdb1 operational as raid disk 2
[ 31.756115] md/raid:md0: device sdd1 operational as raid disk 3
[ 31.756531] md/raid:md0: allocated 0kB
[ 31.756647] md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2
[ 31.756735] md0: detected capacity change from 0 to 8001591181312
mdadm: started array /dev/md0
Incrementally started RAID arrays.
[ 31.757933] random: nonblocking pool is initialized
[ 31.758184] md0: unknown partition table
[ 31.781641] bio: create slab <bio-1> at 1
Incrementally starting RAID arrays...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
Incrementally started RAID arrays.
So, booting into a Live CD, all of the drives look fine via SMART data. If I try to run mdadm --assemble --scan, I get the following warning:
mdadm: WARNING /dev/sde1 and /dev/sde appear to have very similar superblocks.
If they are really different, please --zero the superblock on one
If they are the same or overlap, please remove one from the
DEVICE list in mdadm.conf.
The array does not get assembled.
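To see what state md was left in after the failed scan, the standard checks would be something like this (my own sanity-check sketch, not output I captured at the time):
# show any partially-assembled arrays and their member devices
cat /proc/mdstat
# if md0 exists but is inactive, stop it before retrying assembly
sudo mdadm --stop /dev/md0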
I captured all of the RAID device information here:
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Tue Aug 2 11:43:38 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1af33e59 - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 33 0 active sync /dev/sdc1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
/dev/sdb1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Tue Aug 2 11:43:38 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1af33e6d - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 49 2 active sync /dev/sdd1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
/dev/sdc1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Tue Aug 2 11:43:38 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1af33e7b - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 65 1 active sync /dev/sde1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
/dev/sdd1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Tue Aug 2 11:43:38 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1af33e8f - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
/dev/sde1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Tue Aug 2 11:43:38 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1af33e41 - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 1 4 active sync /dev/sda1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
Original /etc/mdadm/mdadm.conf (nothing crazy):
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
ARRAY /dev/md0 metadata=0.90 UUID=d5f6a94e:185828ec:b1902148:b8793263
# This file was auto-generated on Wed, 09 May 2012 23:34:51 -0400
# by mkconf $Id$
So, if I run sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 (adjusted for whichever devices are currently the RAID drives), the array assembles correctly and I can access the files.
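Since the stack is mdadm + LVM + XFS, the full manual bring-up from the Live CD looks roughly like this (the volume group and logical volume names below are placeholders, not my actual configuration):
# assemble the array from its current member partitions
sudo mdadm --assemble /dev/md0 /dev/sd[abcde]1
# scan for and activate any LVM volume groups sitting on the array
sudo vgscan
sudo vgchange -ay
# mount the XFS filesystem (replace vg0/data with the real VG/LV names)
sudo mount -t xfs /dev/vg0/data /mnt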
I tried pulling the power on all of the RAID drives; the system still does not boot (the same infinite loop).
I tried chrooting in, defining every device in the array explicitly in /etc/mdadm/mdadm.conf, and then updating the initramfs, which is where I am now, and the system still does not boot.
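For reference, the chroot-and-rebuild procedure I followed from the Live CD was along these lines (the SSD's root partition name below is a placeholder; adjust to the real device):
# mount the root filesystem from the boot SSD, plus what the chroot needs
sudo mount /dev/sdX2 /mnt          # /dev/sdX2 = root partition of the SSD (placeholder)
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
sudo chroot /mnt
# inside the chroot: edit /etc/mdadm/mdadm.conf, then rebuild all initramfs images
update-initramfs -u -k all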
Here is the new /etc/mdadm/mdadm.conf:
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
DEVICE /dev/sd[abcde]1
# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
#ARRAY /dev/md0 metadata=0.90 UUID=d5f6a94e:185828ec:b1902148:b8793263
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1
# This file was auto-generated on Wed, 09 May 2012 23:34:51 -0400
# by mkconf $Id$
What I don't understand is what is causing the array not to assemble at boot, when I can assemble it manually by specifying the devices.
One thing that seems strange: I recorded the boot process in slow motion, and I don't see /dev/sde or /dev/sde1 in the boot messages. I'm going to look into this, but I don't really know what to look for.
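One way I could dig into this (a sketch, assuming the stock Ubuntu initramfs-tools, whose break= kernel parameter drops you into a shell inside the initramfs) is to inspect things right where the boot hangs:
# add break=mount to the kernel command line in grub, then at the (initramfs) prompt:
cat /proc/mdstat            # what md has assembled so far
ls /dev/sd*                 # which drives/partitions the initramfs actually sees
mdadm --examine /dev/sde1   # check the superblock on the member that never appears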
Update - Sat Aug 13
I have been doing more investigating. Running sudo fdisk -l shows the following for the drives in the RAID 5:
ubuntu@ubuntu:~$ sudo fdisk -l
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0xca36f687
Device Boot Start End Blocks Id System
/dev/sda1 2048 3907029167 1953513560 fd Linux raid autodetect
WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util fdisk doesn't support GPT. Use GNU Parted.
Disk /dev/sdc: 2000.4 GB, 2000398933504 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029167 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33553920 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdc1 1 3907029166 1953514583 ee GPT
Partition 1 does not start on physical sector boundary.
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xf66042a2
Device Boot Start End Blocks Id System
/dev/sdd1 2048 3907029167 1953513560 fd Linux raid autodetect
Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x2006adb2
Device Boot Start End Blocks Id System
/dev/sde1 2048 3907029167 1953513560 fd Linux raid autodetect
Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0008b3d6
Device Boot Start End Blocks Id System
/dev/sdf1 2048 3907029167 1953513560 fd Linux raid autodetect
Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xd46f102b
Device Boot Start End Blocks Id System
/dev/sdg1 64 3907029167 1953514552 fd Linux raid autodetect
So obviously /dev/sdg1 here starts at a different sector location than the other RAID partitions. The next step was to examine the /dev/sdg drive. As you can see from the following 4 commands, mdadm does not detect the RAID on the bare /dev/sdg device the way it does on the other drives (/dev/sda is used as the example below). Is this a hint about what is actually wrong?
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sdg
/dev/sdg:
MBR Magic : aa55
Partition[0] : 3907029104 sectors at 64 (type fd)
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sdg1
/dev/sdg1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Sun Aug 14 03:04:59 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1b029700 - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda
/dev/sda:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Sun Aug 14 03:04:59 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1b0296b2 - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 1 4 active sync /dev/sda1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : d5f6a94e:185828ec:b1902148:b8793263
Creation Time : Tue Feb 15 18:47:10 2011
Raid Level : raid5
Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Sun Aug 14 03:04:59 2016
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1b0296b2 - correct
Events : 105212
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 1 4 active sync /dev/sda1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 65 1 active sync /dev/sde1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 1 4 active sync /dev/sda1
Lastly, I am confused by the output of sudo mdadm --assemble --scan -v (verbose mode), because it seems to warn that the drive (/dev/sdf) and its first (and only) partition (/dev/sdf1) look the same, and then it stops assembling. See here:
ubuntu@ubuntu:~$ sudo mdadm --assemble --scan -v
mdadm: looking for devices for /dev/md0
mdadm: Cannot assemble mbr metadata on /dev/sdh1
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: no recogniseable superblock on /dev/sdb5
mdadm: Cannot assemble mbr metadata on /dev/sdb2
mdadm: Cannot assemble mbr metadata on /dev/sdb1
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: no RAID superblock on /dev/sdg
mdadm: no RAID superblock on /dev/sdc1
mdadm: no RAID superblock on /dev/sdc
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no RAID superblock on /dev/loop0
mdadm: no RAID superblock on /dev/ram15
mdadm: no RAID superblock on /dev/ram14
mdadm: no RAID superblock on /dev/ram13
mdadm: no RAID superblock on /dev/ram12
mdadm: no RAID superblock on /dev/ram11
mdadm: no RAID superblock on /dev/ram10
mdadm: no RAID superblock on /dev/ram9
mdadm: no RAID superblock on /dev/ram8
mdadm: no RAID superblock on /dev/ram7
mdadm: no RAID superblock on /dev/ram6
mdadm: no RAID superblock on /dev/ram5
mdadm: no RAID superblock on /dev/ram4
mdadm: no RAID superblock on /dev/ram3
mdadm: no RAID superblock on /dev/ram2
mdadm: no RAID superblock on /dev/ram1
mdadm: no RAID superblock on /dev/ram0
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 1.
mdadm: WARNING /dev/sdf1 and /dev/sdf appear to have very similar superblocks.
If they are really different, please --zero the superblock on one
If they are the same or overlap, please remove one from the
DEVICE list in mdadm.conf.
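My working theory on these warnings, going by the md(4) description of 0.90 metadata (an inference on my part, not something confirmed in my logs): a 0.90 superblock lives near the end of its device, at the device size rounded down to a 64 KiB boundary minus 64 KiB. A partition like /dev/sdf1 that starts on a 64 KiB-aligned offset (sector 2048 = 1 MiB) and runs to the end of the disk therefore puts its superblock at the exact same absolute location that a whole-disk scan of /dev/sdf checks, so mdadm sees "both". /dev/sdg1 starts at sector 64 (32 KiB, not 64 KiB-aligned), which would explain why the bare /dev/sdg scan finds nothing. A quick check of where the superblock should sit:
# expected 0.90 superblock offset: size rounded down to 64 KiB, minus 64 KiB
SZ=$(sudo blockdev --getsize64 /dev/sdf1)
OFF=$(( (SZ & ~65535) - 65536 ))
# the magic a92b4efc (stored little-endian: fc 4e 2b a9) should be the first 4 bytes
sudo dd if=/dev/sdf1 bs=65536 skip=$((OFF / 65536)) count=1 2>/dev/null | od -A d -t x1 | head -n 1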
At this point, what should I do next?
Thanks in advance for your help!
Update - September 23, 2016
So I tried option 1 above on the drive that started at sector 64. I failed the drive, removed it from the array, and repartitioned the space. Then I added it back and let it rebuild. I also ran an offline SMART test on the drive. Everything passed, and the drive was added back into the array without issue.
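The sequence was roughly the following (a sketch rather than a transcript; /dev/sdg was the sector-64 drive in the listings above):
# fail and remove the oddly-partitioned member
sudo mdadm /dev/md0 --fail /dev/sdg1
sudo mdadm /dev/md0 --remove /dev/sdg1
# rewrite the partition table so the partition starts at sector 2048 like the others
sudo parted /dev/sdg mklabel msdos
sudo parted /dev/sdg mkpart primary 2048s 100%
sudo parted /dev/sdg set 1 raid on
# add it back and watch the rebuild
sudo mdadm /dev/md0 --add /dev/sdg1
watch cat /proc/mdstat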
I don't know what prompted this next step, but I tried selecting a different kernel revision from the grub menu. Through the advanced boot options, I CANNOT boot 3.13.0-92-generic, nor can I boot 3.13.0-86-generic. Both go into the infinite loop.
However, I CAN boot 3.13.0-63-generic, and it appears all the older kernels as well (although I have not tested every one). Obviously the system is not working 100%: although it takes me to the GUI, I cannot log in. I have to switch to a terminal and log in that way. The array is assembled and ready to go, though, and Samba is running fine.
So my next thought was to look at what was different between the initrd images. I unpacked the non-working image and the working image and compared all of the non-binary files, and despite being a novice, I did not see anything wrong.
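For anyone reproducing the comparison, this is roughly how I unpacked and diffed the two images (on 14.04 the initramfs is a gzipped cpio archive, assuming no prepended microcode blob; kernel versions as above):
mkdir working broken
# unpack each image into its own directory
(cd working && zcat /boot/initrd.img-3.13.0-63-generic | cpio -idm)
(cd broken  && zcat /boot/initrd.img-3.13.0-92-generic | cpio -idm)
# recursive diff, with binary mismatches filtered out
diff -r working broken | grep -v '^Binary files'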
At this point, it seems to be pointing me toward a difference between the kernel images, but I am well out of my depth here and not sure what I should do next.
Please help?