HDD failures (many UDMA_CRC_Error_Count and Power-Off_Retract_Count errors)


Recently I have been having problems with the hard drive in my notebook, which sits in the optical bay (in place of the CD-ROM drive). When I copy large files and check the md5sum, I get two different values:

$ ls -lh gigant.7z 
-rw-r--r-- 1 borgo borgo 4.4G Dec  4 16:32 gigant.7z
$ md5sum gigant.7z 
a2afa04e0e91c0730ee8afd0c1f59944  gigant.7z
$ cp gigant.7z /media/external/
$ md5sum /media/external/gigant.7z 
56910c0c1d81c65afd5b652dc5d6f62f  /media/external/gigant.7z
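
A checksum mismatch only says that the copies differ, not where or by how much. A minimal sketch, assuming the standard `cmp` and `awk` tools are available, that reports the first and last differing byte offsets and the number of differing bytes; the function name `compare_copies` is hypothetical and not part of the original post:

```shell
# Hypothetical helper (not from the original post): show where two copies differ.
compare_copies() {
    if cmp -s "$1" "$2"; then
        echo "identical"
        return 0
    fi
    # cmp -l prints one line per differing byte: 1-based offset, then both byte values in octal.
    # If the files differ only in length, no lines are printed and differing_bytes is 0.
    cmp -l "$1" "$2" 2>/dev/null |
        awk 'NR == 1 { first = $1 }
             { last = $1 }
             END { printf "first_diff_byte=%s last_diff_byte=%s differing_bytes=%d\n", first, last, NR }'
}
```

It could be run as `compare_copies gigant.7z /media/external/gigant.7z`. Whether the differences turn out to be scattered or clustered might help narrow down the cause, though that interpretation is an assumption.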

In the journal, I have many of these errors:

(lots of "failed command: WRITE FPDMA QUEUED")
Dec 21 09:27:09 borgland kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Dec 21 09:27:09 borgland kernel: ata2.00: cmd 61/38:e8:e0:d4:6a/00:00:00:00:00/40 tag 29 ncq dma 28672 out
                                      res     50/00:18:40:d8:6a/00:01:00:00:00/40 Emask 0x10 (ATA bus error
Dec 21 09:27:09 borgland kernel: ata2.00: status: { DRDY }
Dec 21 09:27:09 borgland kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Dec 21 09:27:09 borgland kernel: ata2.00: cmd     61/18:f0:28:d5:6a/00:00:00:00:00/40 tag 30 ncq dma 12288 out
                                      res     50/00:18:40:d8:6a/00:01:00:00:00/40 Emask 0x10 (ATA bus error
Dec 21 09:27:09 borgland kernel: ata2.00: status: { DRDY }
Dec 21 09:27:09 borgland kernel: ata2: hard resetting link
Dec 21 09:27:10 borgland kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 21 09:27:10 borgland kernel: ata2.00: configured for UDMA/33
Dec 21 09:27:10 borgland kernel: ata2: EH complete
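
To see how often this happens per port, the journal lines can be tallied. A minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name `summarise_ata_errors` is hypothetical:

```shell
# Hypothetical helper: count failed commands and link resets per ATA port.
# Reads kernel log lines on stdin.
summarise_ata_errors() {
    awk '
        {
            # Skip lines that do not mention a port such as "ata2" or "ata2.00".
            if (!match($0, /ata[0-9]+/)) next
            port = substr($0, RSTART, RLENGTH)
            if ($0 ~ /failed command:/)     { failed[port]++; ports[port] = 1 }
            if ($0 ~ /hard resetting link/) { resets[port]++; ports[port] = 1 }
        }
        END {
            for (p in ports)
                printf "%s failed_commands=%d link_resets=%d\n", p, failed[p], resets[p]
        }
    ' | sort
}
```

For example, `journalctl -k | summarise_ata_errors` would summarise the kernel messages of the current journal.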

When I run smartctl -a, I get this:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
3 Spin_Up_Time            0x0007   233   233   033    Pre-fail  Always       -       1
4 Start_Stop_Count        0x0012   098   098   000    Old_age   Always       -       3731
5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
9 Power_On_Hours          0x0012   072   072   000    Old_age   Always       -       12436
10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       3013
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2069692476
193 Load_Cycle_Count        0x0012   082   082   000    Old_age   Always       -       187032
194 Temperature_Celsius     0x0002   193   193   000    Old_age   Always       -       31 (Min/Max 2/49)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       42467
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
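
A table like this can be filtered down to the attributes usually watched for media or link trouble. A minimal sketch, assuming the ten-column layout shown above with the raw value in the tenth column; the choice of IDs (5, 196, 197, 198, 199) and the function name `flag_smart_attributes` are assumptions made for illustration:

```shell
# Hypothetical helper: print selected SMART attributes and mark non-zero raw values.
# Reads the attribute table of `smartctl -a` on stdin.
flag_smart_attributes() {
    awk '$1 ~ /^(5|196|197|198|199)$/ {
        status = ($10 + 0 > 0) ? "NONZERO" : "ok"
        printf "%s %s raw=%s %s\n", $1, $2, $10, status
    }'
}
```

It could be used as `sudo smartctl -a /dev/sdb | flag_smart_attributes`.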

Two parameters look bad:

192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2069692476
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       42467
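
The Power-Off_Retract_Count raw value is implausibly large for a plain counter. One possibility, which is only a guess and not confirmed by any vendor documentation, is that the 48-bit raw field packs several smaller values. A minimal sketch that splits a raw value into three 16-bit words; the function name `split_raw_words` is hypothetical:

```shell
# Hypothetical helper: split a SMART raw value into three 16-bit words.
# That the field is packed this way is an assumption, not a documented fact.
split_raw_words() {
    raw=$1
    low=$(( raw & 0xFFFF ))
    mid=$(( (raw >> 16) & 0xFFFF ))
    high=$(( (raw >> 32) & 0xFFFF ))
    printf 'low=%d mid=%d high=%d hex=0x%012X\n' "$low" "$mid" "$high" "$raw"
}
```

For the value above, `split_raw_words 2069692476` prints `low=60 mid=31581 high=0 hex=0x00007B5D003C`, so the low word alone would be a far more modest count if the packing assumption holds.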

I also ran tests with mhdd:
img mhdd 1
img mhdd 2
In mhdd everything looks OK. I run mhdd under qemu, like this:

qemu-system-x86_64 -hda /dev/sdb -cdrom ~/kvm/mhdd32ver4.6.iso -boot d

fdisk output:

$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 298.1 GiB, 320072933376 bytes, 625142448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x41620d87

Device     Boot  Start       End   Sectors   Size Id Type
/dev/sdb1  *      2048    718847    716800   350M  7 HPFS/NTFS/exFAT
/dev/sdb2       718848 625139711 624420864 297.8G  7 HPFS/NTFS/exFAT

So, my question is:
What is wrong with my hard drive? Is the problem the HDD itself (but there are no bad blocks and the mhdd test looks good), or perhaps my drive bay (lots of UDMA_CRC_Error_Count errors)? I cannot interpret the Power-Off_Retract_Count value, but maybe it is correlated with the "hard resetting link" messages in the journal.

    
by Borgo 21.12.2017 / 10:36

0 answers