Hard disk DRDY error: is it a crash?

5

I am using an IBM ThinkPad (1.7 GHz, 512 MB RAM) with Linux Mint 9 installed. I have two partitions besides root.

One of the partitions became read-only yesterday, after which I rebooted the system. It is now extremely slow and reports a DRDY error. Has my hard disk crashed? Error log at boot:

Differences between boot sector and its backup.
failed command : READ DMA
BMDMA : stat 0X25
ata 1.00 : status : { DRDY ERR }
ata 1.00 : status :{ UNC }
Buffer I/O error on logical device, logical block 65467

smartctl output for the partition:

mint mint # smartctl -a /dev/sda1
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MK4026GAX RoHS
Serial Number:    X5LY1623T
Firmware Version: PA107E
User Capacity:    40,007,761,920 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Feb 17 06:48:25 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:          ( 153) seconds.
Offline data collection
capabilities:              (0x1b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    No General Purpose Logging support.
Short self-test routine 
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  30) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       310
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3968
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       7257
 10 Spin_Retry_Count        0x0033   179   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3484
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       489
193 Load_Cycle_Count        0x0032   064   064   000    Old_age   Always       -       367150
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       36 (Lifetime Min/Max 14/57)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       33
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       101
222 Loaded_Hours            0x0032   085   085   000    Old_age   Always       -       6146
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       227
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 2371 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2371 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:03:10.061  READ DMA
  f8 00 00 00 00 00 e0 00      00:03:10.061  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:03:10.053  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:03:10.053  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:03:10.053  READ NATIVE MAX ADDRESS

Error 2370 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:03:03.328  READ DMA
  f8 00 00 00 00 00 e0 00      00:03:03.327  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:03:03.320  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:03:03.319  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:03:03.319  READ NATIVE MAX ADDRESS

Error 2369 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:56.582  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:56.582  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:56.574  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:56.574  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:56.574  READ NATIVE MAX ADDRESS

Error 2368 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:49.809  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:49.809  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:49.801  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:49.801  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:49.801  READ NATIVE MAX ADDRESS

Error 2367 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:43.056  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:43.056  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:43.048  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:43.048  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:43.047  READ NATIVE MAX ADDRESS

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


Device does not support Selective Self Tests/Logging

Do I need a new hard disk for my PC?

    
by pranjal 17.02.2011 / 07:58

2 answers

2

The SMART error log contains useful information:

Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

UNC means an uncorrectable error. The last command was a READ DMA, so it is a read error. It looks like the five sectors starting at LBA 6938 (6938 to 6942) are not readable.
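
The LBA in that log line can be rebuilt from the register dump printed with each error (SN, CL, CH, DH). A minimal Python sketch, assuming the 28-bit LBA layout used by READ DMA (the low nibble of DH holds bits 24 to 27) and 512-byte sectors; the function name is only illustrative:

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the ATA task-file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# Register values copied from the SMART error log: SC SN CL CH DH = 05 1a 1b 00 e0
sector_count = 0x05
first_lba = lba28_from_registers(sn=0x1A, cl=0x1B, ch=0x00, dh=0xE0)
last_lba = first_lba + sector_count - 1

# Byte offset of the first sector, assuming 512-byte sectors (an assumption,
# the smartctl output above does not state the sector size).
byte_offset = first_lba * 512

print(f"LBA range: {first_lba} to {last_lba}")   # LBA range: 6938 to 6942
print(f"Byte offset on the device: {byte_offset}")
```

The same LBA shows up in all five logged errors, which suggests the system keeps retrying one unreadable spot rather than hitting new ones each time.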

In addition, the SMART attributes show 40 sectors that were successfully reallocated, 82 sectors waiting to be reallocated, and 1 uncorrectable error (probably the one from the log):

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
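
These three counters can be pulled out of the attribute table automatically. The following is a rough sketch that parses the lines quoted above; it assumes the ten-column layout printed by smartctl 5.38 in the question, and the function name and the set of watched IDs are choices made for this example:

```python
SMART_LINES = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
"""

# Attribute IDs that count bad or suspect sectors (selection is an assumption).
WATCHED_IDS = {5, 197, 198}

def nonzero_sector_counters(text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found

counters = nonzero_sector_counters(SMART_LINES)
for name, raw in counters.items():
    print(f"{name}: {raw}")
```

Note that the normalized VALUE column is still 100 for all three attributes, above each THRESH, which is consistent with the overall self-assessment reporting PASSED even though the raw counts are nonzero.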

Everything indicates that the drive is failing, so back up your data immediately. If you cannot copy the data because of the errors, use ddrescue to create an image of the partition while skipping the bad blocks; this tutorial is very useful.
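
A typical two-pass ddrescue run could look like the sketch below, which only builds and prints the command lines rather than executing them. It assumes GNU ddrescue, where `-n` skips the slow scraping phase, `-d` uses direct disc access and `-r3` retries bad areas three times; check `ddrescue --help` on your system. The image and mapfile paths are hypothetical and must be on a different, healthy disk:

```python
import shlex

def ddrescue_commands(device, image, mapfile, retries=3):
    """Build argv lists for a quick first pass and a slower retry pass."""
    # Pass 1: copy everything that reads cleanly, skip the scraping phase.
    first_pass = ["ddrescue", "-n", device, image, mapfile]
    # Pass 2: go back to the bad areas with direct access and a few retries.
    # Reusing the same mapfile lets ddrescue resume where pass 1 stopped.
    retry_pass = ["ddrescue", "-d", f"-r{retries}", device, image, mapfile]
    return [first_pass, retry_pass]

# /dev/sda1 is the partition from the question; the other paths are examples.
commands = ddrescue_commands(
    "/dev/sda1", "/mnt/backup/sda1.img", "/mnt/backup/sda1.map"
)
for argv in commands:
    print(shlex.join(argv))
```

Running the printed commands from a live CD with the failing partition unmounted keeps additional wear on the drive to a minimum.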

    
by 16.10.2013 / 12:11
-2

You should treat DRDY errors as a fatal hardware failure.
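
For context, DRDY and ERR in the kernel message are names of bits in the ATA status register, and UNC is a bit in the error register. The sketch below decodes the ST (0x51) and ER (0x40) bytes from the SMART error log in the question. The bit names follow older ATA revisions and differ between versions of the standard, so treat the tables as an assumption:

```python
# Bit masks and names, highest bit first (naming from older ATA revisions).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF",
}

def decode(value, names):
    """Return the names of all bits that are set in value."""
    return [name for mask, name in names.items() if value & mask]

status_flags = decode(0x51, STATUS_BITS)  # ST column of the error log
error_flags = decode(0x40, ERROR_BITS)    # ER column of the error log

print("status:", status_flags)  # status: ['DRDY', 'DSC', 'ERR']
print("error: ", error_flags)   # error:  ['UNC']
```

DRDY on its own only says the device is ready to accept commands; it is the ERR bit together with UNC that reports the unreadable data.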

    
by 15.07.2013 / 22:25
