Problem setting up RAID on Ubuntu Server 16.04.3


I followed this guide to set up a RAID 1 array on my Ubuntu Server installation.

Here is a summary of the steps I took to set it up:

# NOTE: All of this was done with a fresh installation of the operating system.

# References:
#   The device files for my hard-drives are:
#     /dev/sdb
#     /dev/sdc
#   The device file for my main drive (where the operating system is installed) is:
#     /dev/sda
#   The directory that the RAID array will be mounted on is:
#     /mnt/raid1

# Reset any RAID array info on the drives.
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc

# Create the RAID array:
mdadm --create --verbose /dev/md/raid1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

# Create a file system on the array:
mkfs.ext4 -F /dev/md/raid1

# Create the directory to mount the RAID array on:
mkdir -p /mnt/raid1

# Manually mount the filesystem:
mount /dev/md/raid1 /mnt/raid1

# Add an entry to /etc/fstab so the filesystem is automatically mounted at boot time:
echo '/dev/md/raid1 /mnt/raid1 ext4 defaults,nofail,discard 0 0' | tee -a /etc/fstab

# Add an entry to /etc/mdadm/mdadm.conf so the RAID array is assembled automatically at boot time:
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf

# Update the initial RAM file system:
update-initramfs -u

# I then created a test file and wrote to it to make sure the RAID array is actually working:
touch /mnt/raid1/test.txt
echo "testing testing testing" | tee -a /mnt/raid1/test.txt
cat /mnt/raid1/test.txt
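
One thing I have not verified yet is whether the updated initramfs actually contains the mdadm configuration. I assume a check along these lines would show it (the initrd path is my guess at the usual Ubuntu naming):

# Hypothetical check, not from the guide: list the initramfs contents and look
# for the mdadm config and assembly scripts.
lsinitramfs /boot/initrd.img-$(uname -r) | grep -i mdadm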

When I rebooted the computer, everything seemed to work fine. The file was still there and its contents were intact.

But when I test the RAID array for redundancy, it doesn't work:

  1. I shut down the system.
  2. I disconnect one of the drives from the system.
  3. I power the system back on.
  4. The RAID array comes up inactive; the filesystem is not mounted.

For reference, here is what the /etc/fstab file looks like:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda3 during installation
UUID=e501436e-23c9-4994-97f3-e5c8e6165c2c /               ext4    errors=remount-ro 0       1
# /boot was on /dev/sda1 during installation
UUID=642669a6-3d69-46fd-af0f-0c5c00df384b /boot           ext4    defaults        0       2
# swap was on /dev/sda2 during installation
UUID=bb572957-4a2c-4ac0-9037-773b428ec0fd none            swap    sw              0       0
/dev/md/raid1 /mnt/raid1 ext4 defaults,nofail,discard 0 0
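
One thing I considered (but have not actually changed) is referencing the array's filesystem by UUID in /etc/fstab instead of by the /dev/md/raid1 name, in case the device name is part of the problem. The UUID below is a placeholder; blkid would print the real one:

# Idea only, not applied: mount the md filesystem by UUID instead of by name.
blkid /dev/md/raid1
# The fstab line would then look something like (UUID is a placeholder):
# UUID=<uuid-from-blkid> /mnt/raid1 ext4 defaults,nofail,discard 0 0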

For reference, here is what the /etc/mdadm/mdadm.conf file looks like:

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays

# This file was auto-generated on Sat, 05 May 2018 15:31:12 -0400
# by mkconf $Id$
ARRAY /dev/md/raid1 metadata=1.2 name=andromeda:raid1 UUID=a5585762:a311c411:157f85e5:4e4537fe
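
Just as a sanity check, I assume the ARRAY definition above can be compared directly against what mdadm reports for the currently running array:

# Hypothetical sanity check: the two outputs should agree on the name and UUID.
mdadm --detail --scan
grep '^ARRAY' /etc/mdadm/mdadm.conf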

Here are some other tests I ran to try to debug this:

# One drive connected to the system.
ls /mnt/raid1/
    (nothing)
ls /dev
    md127
    sdb
cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : inactive sdb[1](S)
      976631512 blocks super 1.2

    unused devices: <none>
mdadm --detail /dev/md127
    /dev/md127:
            Version : 1.2
        Raid Level : raid0
    Total Devices : 1
        Persistence : Superblock is persistent

            State : inactive

            Name : mycomputer:raid1  (local to host mycomputer)
            UUID : a5585762:a311c411:157f85e5:4e4537fe
            Events : 1557

        Number   Major   Minor   RaidDevice

        -       8       16        -        /dev/sdb
mdadm --detail /dev/sdb
    mdadm: /dev/sdb does not appear to be an md device
mdadm --detail /dev/sdc
    mdadm: cannot open /dev/sdc: No such file or directory


# One drive (other drive this time) connected to the system.
ls /mnt/raid1/
    (nothing)
ls /dev
    md127
    sdb    # Why is this not /dev/sdc?
cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : inactive sdb[0](S)
    976631512 blocks super 1.2

    unused devices: <none>
mdadm --detail /dev/md127
    /dev/md127:
            Version : 1.2
        Raid Level : raid0
    Total Devices : 1
        Persistence : Superblock is persistent

            State : inactive

            Name : mycomputer:raid1  (local to host mycomputer)
            UUID : a5585762:a311c411:157f85e5:4e4537fe
            Events : 1557

        Number   Major   Minor   RaidDevice

        -       8       16        -        /dev/sdb
mdadm --detail /dev/sdb
    mdadm: /dev/sdb does not appear to be an md device
mdadm --detail /dev/sdc
    mdadm: cannot open /dev/sdc: No such file or directory


# Both drives connected to the system
ls /mnt/raid1/
    lost+found
    testfile.txt
cat /mnt/raid1/testfile.txt
    testing testing testing
ls /dev
    md
    md127
    sdb
    sdc
ls /dev/md
    raid1
ls /dev/md -la
    total 0
    drwxr-xr-x  2 root root   60 May  6 13:40 .
    drwxr-xr-x 20 root root 4120 May  6 13:40 ..
    lrwxrwxrwx  1 root root    8 May  6 13:40 raid1 -> ../md127
cat /proc/mdstat
    Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
    md127 : active raid1 sdc[1] sdb[0]
        976631488 blocks super 1.2 [2/2] [UU]
        bitmap: 0/8 pages [0KB], 65536KB chunk

    unused devices: <none>
mdadm --detail /dev/md127
    /dev/md127:
            Version : 1.2
    Creation Time : Sat May  5 17:44:07 2018
        Raid Level : raid1
        Array Size : 976631488 (931.39 GiB 1000.07 GB)
    Used Dev Size : 976631488 (931.39 GiB 1000.07 GB)
    Raid Devices : 2
    Total Devices : 2
        Persistence : Superblock is persistent

    Intent Bitmap : Internal

        Update Time : Sun May  6 13:44:27 2018
            State : clean
    Active Devices : 2
    Working Devices : 2
    Failed Devices : 0
    Spare Devices : 0

            Name : mycomputer:raid1  (local to host mycomputer)
            UUID : a5585762:a311c411:157f85e5:4e4537fe
            Events : 1557

        Number   Major   Minor   RaidDevice State
        0       8       16        0      active sync   /dev/sdb
        1       8       32        1      active sync   /dev/sdc
mdadm --detail /dev/md/raid1
    /dev/md/raid1:
            Version : 1.2
    Creation Time : Sat May  5 17:44:07 2018
        Raid Level : raid1
        Array Size : 976631488 (931.39 GiB 1000.07 GB)
    Used Dev Size : 976631488 (931.39 GiB 1000.07 GB)
    Raid Devices : 2
    Total Devices : 2
        Persistence : Superblock is persistent

    Intent Bitmap : Internal

        Update Time : Sun May  6 13:44:27 2018
            State : clean
    Active Devices : 2
    Working Devices : 2
    Failed Devices : 0
    Spare Devices : 0

            Name : mycomputer:raid1  (local to host mycomputer)
            UUID : a5585762:a311c411:157f85e5:4e4537fe
            Events : 1557

        Number   Major   Minor   RaidDevice State
        0       8       16        0      active sync   /dev/sdb
        1       8       32        1      active sync   /dev/sdc
mdadm --detail /dev/sdb
    mdadm: /dev/sdb does not appear to be an md device
mdadm --detail /dev/sdc
    mdadm: /dev/sdc does not appear to be an md device


# No drives connected to the system.
ls /mnt/raid1/
    (nothing)
ls /dev
    (nothing)
cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    unused devices: <none>

I expected mdadm --detail to show the RAID array as degraded, not inactive. What is going on? Is the RAID array failing to reassemble correctly at boot? Why?
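
For what it's worth, I assume the inactive array could be kicked into a degraded-but-running state by hand with something like the commands below. I have not tried this yet, but it might at least show whether mdadm is willing to run the array with a single member:

# Hypothetical recovery attempt (untried): stop the half-assembled array and
# re-assemble it, allowing it to start degraded with only one member present.
mdadm --stop /dev/md127
mdadm --assemble --run /dev/md/raid1 /dev/sdb
# Or, without re-assembling, force the existing inactive array to run:
# mdadm --run /dev/md127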

Also, shouldn't mdadm --detail /dev/sdb and mdadm --detail /dev/sdc show that these are in fact md devices?
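
My understanding (which may well be wrong) is that --detail only works on assembled array devices, and that the per-disk superblocks are read with --examine instead, something like:

# Assumption on my part: --examine reads the md superblock stored on a member
# disk, whereas --detail expects an array device such as /dev/md127.
mdadm --examine /dev/sdb
mdadm --examine /dev/sdc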

Are my assumptions about how the RAID array should work correct? Should it still work when I disconnect one of the hard drives from the system? What is /dev/md127? How did it get there? Why is /dev/md/raid1 pointing (?) to it?
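
The closest I got to understanding the naming is the idea (my assumption) that mdadm/udev export the array's name as device properties and maintain the symlinks under /dev/md/, which I suppose could be inspected like this:

# Hypothetical inspection: key=value properties of the auto-numbered device,
# plus the symlinks maintained under /dev/md/.
mdadm --detail --export /dev/md127
ls -la /dev/md/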

Why is /dev/md/raid1 not present when at least one of the hard drives is disconnected?

No matter which hard drive I unplug, the drive that is still connected to the system seems to show up under the device file /dev/sdb. I would have expected it to alternate between /dev/sdb and /dev/sdc depending on which drive was left connected. Why does this happen? Is this what is causing problems with the RAID array?
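
To take the kernel's sdb/sdc enumeration out of the picture, I assume the physical drives could be told apart by their stable by-id names, e.g.:

# Assumption: /dev/disk/by-id gives persistent, serial-number-based names, so
# it should show which physical disk currently sits behind /dev/sdb.
ls -la /dev/disk/by-id/ | grep -v part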

I apologize for the flood of questions, but I am really lost trying to understand why this isn't working the way I expected.

    