I am getting a lot of messages like the following in dmesg:
[26159.277230] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497188: comm mv: bad extended attribute block 22794025699896
[26219.914802] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497188: comm rm: bad extended attribute block 22794025699896
[26219.979362] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497188: comm rm: bad extended attribute block 22794025699896
[26310.378878] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497185: comm rm: bad extended attribute block 57759718172350
[26310.444128] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497186: comm rm: bad extended attribute block 162206451263262
[26310.509166] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497187: comm rm: bad extended attribute block 219797061798141
[26310.574269] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497188: comm rm: bad extended attribute block 22794025699896
[26310.639499] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497189: comm rm: bad extended attribute block 276433459229836
[26310.704830] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497190: comm rm: bad extended attribute block 122783965770376
[26310.770272] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497191: comm rm: bad extended attribute block 275806918528226
[26310.836024] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497192: comm rm: bad extended attribute block 14876582473268
[26310.901775] EXT4-fs error (device sdb1): ext4_iget:3875: inode #394497193: comm rm: bad extended attribute block 266647592419907
[26310.967680] EXT4-fs error (device sdb1): ext4_lookup:1044: inode #394472149: comm rm: deleted inode referenced: 394497194
....
followed by
[432047.072017] EXT4-fs (sdb1): error count: 127466
[432047.072020] EXT4-fs (sdb1): initial error at 1430213419: ext4_mb_generate_buddy:739
[432047.072024] EXT4-fs (sdb1): last error at 1434663035: ext4_iget:3875: inode 394497616
[434896.488164] EXT4-fs (sdb1): re-mounted. Opts: (null)
[484610.455623] EXT4-fs error (device sdb1): htree_dirblock_to_tree:587: inode #404544002: block 6422531607: comm updatedb.mlocat: bad entry in directory: directory entry across blocks - offset=0(0), inode=540226868, rec_len=11824, name_len=54
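For reference, the `initial error at` and `last error at` values in the summary above are Unix timestamps. A minimal sketch for decoding them, assuming they are seconds since the epoch and that UTC is good enough (the helper name `decode_error_time` is made up for illustration):

```python
from datetime import datetime, timezone


def decode_error_time(epoch_seconds):
    """Convert an ext4 error timestamp (seconds since the epoch) to ISO 8601 in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    # Values copied from the "initial error at" / "last error at" lines.
    print("initial error:", decode_error_time(1430213419))
    print("last error:   ", decode_error_time(1434663035))
```

So the first recorded error dates from late April 2015 and the most recent one from mid June 2015, which means the filesystem has been logging errors for roughly seven weeks.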
fsck gives me the following output:
Pass 1: Checking inodes, blocks, and sizes
Inode 7697183 has an invalid extent node (blk 122161590, lblk 0)
Clear? yes
Inode 7697183, i_blocks is 704, should be 0. Fix? yes
Inode 11592466 has an invalid extent node (blk 184029687, lblk 0)
Clear? yes
Inode 11592466, i_blocks is 265176, should be 0. Fix? yes
Inode 11592467 has an invalid extent node (blk 184029688, lblk 0)
Clear? yes
Inode 11592467, i_blocks is 265176, should be 0. Fix? yes
Inode 11592468 has an invalid extent node (blk 184029689, lblk 0)
Clear? yes
[...]
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 2: 7167
Multiply-claimed block(s) in inode 8: 7172 7169 7170 7173 7179 7180 7184 86548858 7186 7194 7197 7200 86562277 86563835 7203 86569985 7207 86571589 7209
Multiply-claimed block(s) in inode 11: 7210 6279 6280
Multiply-claimed block(s) in inode 99118: 1740354
Multiply-claimed block(s) in inode 99351: 1610277
Multiply-claimed block(s) in inode 99665: 1610924
Multiply-claimed block(s) in inode 99811: 1611233
Multiply-claimed block(s) in inode 99876: 7213
Multiply-claimed block(s) in inode 99887: 1611380
Multiply-claimed block(s) in inode 99913: 1611444
Multiply-claimed block(s) in inode 99959: 1611562
Multiply-claimed block(s) in inode 99976: 1611598
Multiply-claimed block(s) in inode 99981: 1611584
Multiply-claimed block(s) in inode 99997: 1611609
[...]
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
File /project/iwslt2014c/euronews2014_IT_TR/TR/05.mary.nn.EN.2/system/Logs/log.samples.44.i13hpc27.44.86445 (inode #6902170, mod time Tue Jan 20 14:27:44 2015)
has 1 multiply-claimed block(s), shared with 1 file(s):
/project/mtqt/project/wmt15/ende/temp/4262.log (inode #394497188, mod time Mon Jul 27 23:06:33 2105)
Clone multiply-claimed blocks? yes
File /project/iwslt2014c/euronews2014_IT_TR/TR/05.mary.nn.EN.2/system/Logs/log.samples.52.i13hpc27.52.86445 (inode #6902187, mod time Tue Jan 20 14:27:46 2015)
has 1 multiply-claimed block(s), shared with 1 file(s):
/project/mtqt/project/wmt15/ende/temp/4262.log (inode #394497188, mod time Mon Jul 27 23:06:33 2105)
Clone multiply-claimed blocks? yes
File /project/iwslt2014c/euronews2014_IT_TR/TR/05.mary.nn.EN.2/system/Logs/log.samples.54.i13hpc27.54.86445 (inode #6902191, mod time Tue Jan 20 14:27:43 2015)
has 1 multiply-claimed block(s), shared with 1 file(s):
/project/mtqt/project/wmt15/ende/temp/4262.log (inode #394497188, mod time Mon Jul 27 23:06:33 2105)
Clone multiply-claimed blocks? yes
[...]
The fsck output is huge. Pass 1B in particular, which lists every multiply-claimed block, fills several GB. As a result fsck runs for a very long time, so I have not been able to let it finish. From what I have seen so far, though, the problem seems to center on a single temporary log file (/project/mtqt/project/wmt15/ende/temp/4262.log), which unfortunately I cannot delete (I/O error).
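To check whether the conflicts really do center on that one file, the Pass 1D output could be tallied rather than read by eye. Below is a rough sketch, assuming the fsck output was saved to a text file and that each Pass 1D entry has the three-line shape shown above (a `File ...` line, a `shared with N file(s):` line, then one line per sharing file). The function name and the `fsck.log` path are hypothetical. It streams line by line, since the log is several GB.

```python
import re
from collections import Counter

# Matches "<path> (inode #<n>, mod time ..." as printed in Pass 1D.
INODE_RE = re.compile(r"^\s*(?P<path>.+?) \(inode #(?P<inode>\d+), mod time ")


def count_shared_partners(lines):
    """Count how often each (inode, path) appears in a 'shared with' list."""
    counts = Counter()
    in_shared_list = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("File "):
            # Start of a new Pass 1D entry; this is the owning file, not a partner.
            in_shared_list = False
        elif "shared with" in stripped and stripped.endswith(":"):
            in_shared_list = True
        else:
            match = INODE_RE.match(line)
            if in_shared_list and match:
                counts[(int(match.group("inode")), match.group("path"))] += 1
            else:
                # Any other line (e.g. the "Clone ...?" prompt) ends the list.
                in_shared_list = False
    return counts


if __name__ == "__main__":
    # "fsck.log" is a placeholder for wherever the fsck output was captured.
    with open("fsck.log", errors="replace") as handle:
        for (inode, path), n in count_shared_partners(handle).most_common(10):
            print(n, inode, path)
```

If one inode dominates the top of that list, that would support the suspicion that a single corrupted inode is claiming blocks belonging to many unrelated files.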
I get a similar error on a similar server that uses the same RAID controller (LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 05)). So I suspect there may be a configuration problem with the RAID controller.
What can I do to get rid of this problem? Does it make sense to simply back everything up, then reformat and restore? Or is there a more elegant solution? Can I somehow get rid of that log file, which seems to be causing the problem? Any advice is appreciated!
Tags fsck raid linux filesystems