Recently I have been having problems with the hard disk in my notebook. The disk sits in the optical bay (in place of the CD-ROM drive). When I copy large files and check the md5sum, I get two different values:
$ ls -lh gigant.7z
-rw-r--r-- 1 borgo borgo 4.4G Dec 4 16:32 gigant.7z
$ md5sum gigant.7z
a2afa04e0e91c0730ee8afd0c1f59944 gigant.7z
$ cp gigant.7z /media/external/
$ md5sum /media/external/gigant.7z
56910c0c1d81c65afd5b652dc5d6f62f /media/external/gigant.7z
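The copy-and-compare check above can be wrapped in a small helper so it is easy to repeat with different files. This is only a sketch: the function name `copy_and_verify` is made up for illustration, and the destination must be a full file path, not a directory. Note that even after `sync`, the read-back may be served from the page cache rather than from the disk itself, so a match is weaker evidence than a mismatch.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): copy SRC to DST, flush
# pending writes, then compare the MD5 sums of both files.
# Prints "OK <sum>" and returns 0 on a match,
# prints "MISMATCH <src_sum> <dst_sum>" and returns 1 otherwise.
copy_and_verify() {
    src=$1
    dst=$2   # must be a file path, e.g. /media/external/gigant.7z
    cp -- "$src" "$dst" || return 2
    sync     # ask the kernel to flush pending writes to the device
    src_sum=$(md5sum < "$src" | cut -d' ' -f1)
    dst_sum=$(md5sum < "$dst" | cut -d' ' -f1)
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "OK $src_sum"
    else
        echo "MISMATCH $src_sum $dst_sum"
        return 1
    fi
}
```

For example: copy_and_verify gigant.7z /media/external/gigant.7z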
In the journal, I have many of these errors:
(lots of "failed command: WRITE FPDMA QUEUED")
Dec 21 09:27:09 borgland kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Dec 21 09:27:09 borgland kernel: ata2.00: cmd 61/38:e8:e0:d4:6a/00:00:00:00:00/40 tag 29 ncq dma 28672 out
res 50/00:18:40:d8:6a/00:01:00:00:00/40 Emask 0x10 (ATA bus error)
Dec 21 09:27:09 borgland kernel: ata2.00: status: { DRDY }
Dec 21 09:27:09 borgland kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Dec 21 09:27:09 borgland kernel: ata2.00: cmd 61/18:f0:28:d5:6a/00:00:00:00:00/40 tag 30 ncq dma 12288 out
res 50/00:18:40:d8:6a/00:01:00:00:00/40 Emask 0x10 (ATA bus error)
Dec 21 09:27:09 borgland kernel: ata2.00: status: { DRDY }
Dec 21 09:27:09 borgland kernel: ata2: hard resetting link
Dec 21 09:27:10 borgland kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 21 09:27:10 borgland kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 21 09:27:10 borgland kernel: ata2.00: configured for UDMA/33
Dec 21 09:27:10 borgland kernel: ata2: EH complete
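To put a number on "many", the failed commands can be counted per ATA device. The following is a rough sketch that assumes log lines in the same format as the excerpt above; the function name `count_failed_commands` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints one
# "<device> <count>" line per ATA device that logged "failed command:".
count_failed_commands() {
    awk '/failed command:/ {
             for (i = 1; i <= NF; i++) {
                 # The device field looks like "ata2.00:" in the excerpt above.
                 if ($i ~ /^ata[0-9]+(\.[0-9]+)?:$/) {
                     dev = $i
                     sub(/:$/, "", dev)   # drop the trailing colon
                     n[dev]++
                 }
             }
         }
         END { for (d in n) print d, n[d] }'
}
```

For example: journalctl -k | count_failed_commands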
When I run smartctl -a, I get this:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 040 Pre-fail Offline - 0
3 Spin_Up_Time 0x0007 233 233 033 Pre-fail Always - 1
4 Start_Stop_Count 0x0012 098 098 000 Old_age Always - 3731
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 040 Pre-fail Offline - 0
9 Power_On_Hours 0x0012 072 072 000 Old_age Always - 12436
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 3013
191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 2069692476
193 Load_Cycle_Count 0x0012 082 082 000 Old_age Always - 187032
194 Temperature_Celsius 0x0002 193 193 000 Old_age Always - 31 (Min/Max 2/49)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 42467
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
Two parameters look bad:
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 2069692476
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 42467
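One way to see whether a counter such as UDMA_CRC_Error_Count is still growing is to read its raw value before and after a large copy. The sketch below assumes the attribute table layout shown above, where the raw value is the tenth whitespace-separated field; the function name `smart_raw` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads a SMART attribute table on stdin and prints
# the raw value (10th column) of the attribute whose name is given as $1.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}
```

For example, run this before and after a copy and compare the two numbers: sudo smartctl -A /dev/sdb | smart_raw UDMA_CRC_Error_Count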
I also ran tests with mhdd:
img mhdd 1
img mhdd 2
In mhdd everything looks OK. I run mhdd under qemu, like this:
qemu-system-x86_64 -hda /dev/sdb -cdrom ~/kvm/mhdd32ver4.6.iso -boot d
Output of fdisk:
$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 298.1 GiB, 320072933376 bytes, 625142448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x41620d87
Device Boot Start End Sectors Size Id Type
/dev/sdb1 * 2048 718847 716800 350M 7 HPFS/NTFS/exFAT
/dev/sdb2 718848 625139711 624420864 297.8G 7 HPFS/NTFS/exFAT
So, my question is:
What is wrong with my hard disk? Is the problem with the HDD itself (but there are no bad blocks and the mhdd test looks good), or maybe with my drive bay (lots of UDMA_CRC_Error_Count)? I can't interpret the Power-Off_Retract_Count value, but maybe it is correlated with the "hard resetting link" messages in the journal.