I received several e-mails from the 'smartd' daemon
with Subject: 'SMART error (CurrentPendingSector)'
saying:
The following warning/error was logged by the smartd daemon:
Device: /dev/sda, 1 Currently unreadable (pending) sectors
It has sent me 80 of these e-mails over several months.
I ran 'e2fsck -cc', 'smartctl' and 'gsmartcontrol'.
--
ID ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
...
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 1179816
...
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 17
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 1
...
- these are highlighted in pink by 'gsmartcontrol', not in red.
That is, it reports 1,179,816 reallocated sectors (is that significant?) and 17 reallocation events.
Still, WORST is equal to VALUE.
-
/var/log/messages has occasional
Jul 24 03:12:46 turtle smartd[1443]: Device: /dev/sda,
1 Currently unreadable (pending) sectors
messages; 38 in total over the last few days (!)
-
# smartctl -l error /dev/sda
reports several errors (below).
How do I interpret them? Should I replace the hard drive?
Thank you.
The detailed 'smartctl' output is below.
# smartctl -H -A /dev/sda
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 105 100 040 Pre-fail Offline - 4572
3 Spin_Up_Time 0x0007 223 100 033 Pre-fail Always - 2
4 Start_Stop_Count 0x0012 098 098 000 Old_age Always - 3671
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 1179816
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 120 100 040 Pre-fail Offline - 40
9 Power_On_Hours 0x0012 030 030 000 Old_age Always - 30819
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 2205
191 G-Sense_Error_Rate 0x000a 100 095 000 Old_age Always - 1
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 97
193 Load_Cycle_Count 0x0012 001 001 000 Old_age Always - 1865772
194 Temperature_Celsius 0x0002 177 100 000 Old_age Always - 31 (Lifetime Min/Max 9/48)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 17
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 190 000 Old_age Always - 38
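For reading the table above programmatically, here is a minimal Python sketch. It assumes the ten-column layout shown in the header (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE) and uses the usual SMART convention that an attribute counts as failed once its normalized VALUE is less than or equal to THRESH. The function names are hypothetical, and the sample rows are copied from the output above.

```python
# Sketch only: parse 'smartctl -A' attribute rows and compare VALUE with THRESH.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 1179816
193 Load_Cycle_Count 0x0012 001 001 000 Old_age Always - 1865772
194 Temperature_Celsius 0x0002 177 100 000 Old_age Always - 31 (Lifetime Min/Max 9/48)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 17
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 190 000 Old_age Always - 38
"""


def parse_attributes(text):
    """Return one dict per attribute row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        # Split into at most 10 fields so a RAW_VALUE with spaces stays intact.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            # Keep only the leading number of RAW_VALUE (drops "(Lifetime ...)").
            "raw": int(parts[9].split()[0]),
        })
    return rows


def failed(rows):
    """Names of attributes whose normalized VALUE is at or below THRESH."""
    return [r["name"] for r in rows if r["value"] <= r["thresh"]]


def closest_to_threshold(rows):
    """The attribute with the smallest VALUE - THRESH margin."""
    return min(rows, key=lambda r: r["value"] - r["thresh"])


rows = parse_attributes(SAMPLE)
print("failed:", failed(rows))
print("closest to threshold:", closest_to_threshold(rows)["name"])
```

On these rows no attribute has reached its threshold, which is consistent with nothing being highlighted in red. The raw counters for IDs 5, 196, 197 and 199 are non-zero all the same, and the threshold comparison says nothing about how to interpret them.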
# sudo smartctl -i /dev/sda
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Travelstar 5K100 series
Device Model: HTS541060G9AT00
Serial Number: MPB3LAX5KUDB1M
Firmware Version: MB3OA60A
User Capacity: 60,011,642,880 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 3a
..
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
-
# smartctl -l error /dev/sda
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 80 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
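The 49.710-day wrap in the legend above matches what an unsigned 32-bit millisecond counter would give. That the drive stores the timestamp this way is an assumption and is not stated in the output. A minimal Python sketch of the arithmetic and of the DDd+hh:mm:SS.sss format (function names are hypothetical):

```python
# Assumption: Powered_Up_Time is an unsigned 32-bit count of milliseconds.
WRAP_MS = 2 ** 32
MS_PER_DAY = 24 * 60 * 60 * 1000


def wrap_days(wrap_ms=WRAP_MS):
    """Number of days after which a counter of wrap_ms milliseconds rolls over."""
    return wrap_ms / MS_PER_DAY


def format_powered_up(ms):
    """Format a millisecond count as DDd+hh:mm:SS.sss, wrapping at WRAP_MS.

    This sketch always prints the day field, even when it is zero.
    """
    ms %= WRAP_MS
    days, rem = divmod(ms, MS_PER_DAY)
    hours, rem = divmod(rem, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{days:02d}d+{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


print(round(wrap_days(), 3))
print(format_powered_up(4 * 3_600_000 + 33 * 60_000 + 56_000))
```

Because of the wrap, these timestamps only order commands within one power-on session. The "power-on lifetime" hours printed with each error are the figure to compare against Power_On_Hours.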
Error 80 occurred at disk power-on lifetime: 28086 hours (1170 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 51 3f 50 28 2c e1 Error: UNC 63 sectors at LBA = 0x012c2850 = 19671120
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
c8 ff 3f 50 28 2c e1 00 04:33:56.000 READ DMA
c8 ff 3f 00 00 00 e0 00 04:33:56.000 READ DMA
c6 ff 10 00 02 00 a0 00 04:33:56.000 SET MULTIPLE MODE
10 ff 3f 01 00 00 ae 00 04:33:56.000 RECALIBRATE [OBS-4]
91 ff 3f 01 00 00 ae 00 04:33:56.000 INITIALIZE DEVICE PARAMETERS [OBS-6]
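The LBA that smartctl prints can be reproduced from the register dump. A minimal Python sketch, assuming 28-bit LBA addressing in which the low nibble of DH holds LBA bits 24-27, CH bits 16-23, CL bits 8-15 and SN bits 0-7 (register names as in the legend above; the helper names are hypothetical). It also re-derives the "days + hours" split of the power-on lifetime.

```python
def lba28(sn, cl, ch, dh):
    """Combine the task-file registers into a 28-bit logical block address."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def split_hours(hours):
    """Split a power-on hour count into (days, remaining hours)."""
    return divmod(hours, 24)


# Error 80: SN=50 CL=28 CH=2c DH=e1
lba = lba28(0x50, 0x28, 0x2C, 0xE1)
print(hex(lba), lba)

# Error 80 was logged at 28086 power-on hours.
print(split_hours(28086))
```

The same function applied to the registers of error 79 (SN=ae CL=3e CH=2f DH=e4) gives the LBA reported for that error, so the two errors refer to different areas of the disk.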
Error 79 occurred at disk power-on lifetime: 15200 hours (633 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
84 51 00 ae 3e 2f e4 Error: ICRC, ABRT at LBA = 0x042f3eae = 70205102
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
c8 00 08 a7 3e 2f e4 00 00:00:30.600 READ DMA
c8 00 00 af 62 2c e4 00 00:00:30.600 READ DMA
c8 00 00 af 61 2c e4 00 00:00:30.600 READ DMA
c8 00 00 af 60 2c e4 00 00:00:30.600 READ DMA
c8 00 00 af 5f 2c e4 00 00:00:30.600 READ DMA
Error 78 occurred ...