I have an Asus EeeBox PC-B204 (original specifications here: link ) running Ubuntu 14.04.5 LTS. The only thing I changed was replacing the original hard drive with a Kingston V300 SSD.
So far so good. But lately I started getting some errors: after a while the system would switch to read-only mode, and not even an SSH connection was possible.
So I removed the SSD, connected it to another computer, and ran some detection/repair tests using Hiren's Boot CD. Then, from Windows, I updated the SSD's firmware using Kingston SSD Manager.
I mounted the SSD back in my Linux machine. In 2 days it has not frozen or switched to read-only mode, but I get a lot of these (every 1-2 minutes):
[Wed Mar 1 20:41:03 2017] ata1: lost interrupt (Status 0x50)
[Wed Mar 1 20:41:03 2017] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[Wed Mar 1 20:41:03 2017] ata1.00: failed command: READ DMA
[Wed Mar 1 20:41:03 2017] ata1.00: cmd c8/00:08:e0:d0:cc/00:00:00:00:00/ec tag 0 dma 4096 in
[Wed Mar 1 20:41:03 2017] res 40/00:01:a0:20:cf/00:00:00:00:00/e0 Emask 0x4 (timeout)
[Wed Mar 1 20:41:03 2017] ata1.00: status: { DRDY }
[Wed Mar 1 20:41:03 2017] ata1: soft resetting link
[Wed Mar 1 20:41:03 2017] ata1.00: configured for UDMA/33
[Wed Mar 1 20:41:03 2017] ata1: EH complete
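To put a number on "every 1-2 minutes", here is a minimal sketch that tallies the "failed command" entries per ATA port. It assumes the kernel log has one entry per line, as `dmesg -T` prints it. The function name `count_failed_commands` is just a name picked for this illustration, not part of any tool.

```python
import re
from collections import Counter

# Matches entries such as:
#   [Wed Mar 1 20:41:03 2017] ata1.00: failed command: READ DMA
# Assumption: one kernel log entry per input line.
FAILED_CMD = re.compile(r"(ata\d+(?:\.\d+)?): failed command: (.+)")


def count_failed_commands(lines):
    """Return a Counter keyed by (port, command) for each failed ATA command."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            port, command = match.group(1), match.group(2).strip()
            counts[(port, command)] += 1
    return counts
```

For example, after saving the log with `dmesg -T > dmesg.txt` (the file name is arbitrary), `count_failed_commands(open("dmesg.txt"))` gives the totals per port and command. This only counts the symptoms; it does not say whether the drive, the cable, or the controller is at fault.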
Any idea how to identify the problem? Is my SSD broken? Or is it my SATA controller?
smartctl -a /dev/sda
returns
smartctl 6.2 2013-07-26 r3841 [i686-linux-4.4.0-64-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

START OF INFORMATION SECTION
Model Family:     SandForce Driven SSDs
Device Model:     KINGSTON SV300S37A120G
Serial Number:    50026B785201EA6B
LU WWN Device Id: 5 0026b7 85201ea6b
Firmware Version: 60AABBF0
User Capacity:    120,033,041,920 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Wed Mar 1 21:35:08 2017 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

START OF READ SMART DATA SECTION
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x05) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 33)  The self-test routine was interrupted
                                        by the host with a hard or soft reset.
Total time to complete Offline
data collection:                 ( 0)   seconds.
Offline data collection
capabilities:                    (0x7d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        ( 1)   minutes.
Extended self-test routine
recommended polling time:        ( 48)  minutes.
Conveyance self-test routine
recommended polling time:        ( 2)   minutes.
SCT capabilities:              (0x0025) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   095   095   050    Old_age   Always       -       2/22219071
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   095   095   000    Old_age   Always       -       4408h+22m+17.710s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       659
171 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       69
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       99
181 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0012   100   100   000    Old_age   Always       -       0
189 Airflow_Temperature_Cel 0x0000   041   048   000    Old_age   Offline      -       41 (Min/Max 19/48)
194 Temperature_Celsius     0x0022   041   048   000    Old_age   Always       -       41 (Min/Max 19/48)
195 ECC_Uncorr_Error_Count  0x001c   105   105   000    Old_age   Offline      -       2/22219071
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   105   105   000    Old_age   Offline      -       2/22219071
204 Soft_ECC_Correct_Rate   0x001c   105   105   000    Old_age   Offline      -       2/22219071
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   099   099   010    Pre-fail  Always       -       1
233 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       2924
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       2142
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       2142
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       4201

SMART Error Log not supported

SMART Self-test log structure revision number 1
Num  Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)  10%        4327             -
# 2  Extended offline    Interrupted (host reset)  90%        4314             -
# 3  Extended offline    Interrupted (host reset)  10%        4312             -
# 4  Short offline       Completed without error   00%        10               -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Thank you.