Kernel NULL pointer dereference on Ubuntu 8.04 LTS

3

I was running rsync on my Dell XPS Core 2 Duo tower when it froze. The machine runs Ubuntu 8.04 LTS with 3 GB of RAM and software RAID 5 (mdadm) across 3 disks. The system itself is on a 4th disk. After rebooting I found this lovely gem in /var/log/kern.log:

Oct 31 02:38:33 myhostname kernel: [617414.584615] Unable to handle kernel NULL pointer dereference at 0000000000000070 RIP:

Then this morning it happened again, but this time there was more information in the log (see below). I'm wondering if anyone can offer some insight into what this means. Unfortunately the machine is in a data center 3,000 miles away from me right now, so swapping the memory will be tricky.

Thanks in advance for any suggestions!

Nov  1 01:24:55 myhostname kernel: [34780.996038] Unable to handle kernel NULL pointer dereference at 0000000000000070 RIP:
Nov  1 01:24:55 myhostname kernel: [34780.996050]  [<ffffffff80470a60>] _spin_lock+0x0/0x10
Nov  1 01:24:55 myhostname kernel: [34780.996099] PGD bb0b5067 PUD bbc91067 PMD 0
Nov  1 01:24:55 myhostname kernel: [34780.996121] Oops: 0002 [1] SMP
Nov  1 01:24:55 myhostname kernel: [34780.996140] CPU 1
Nov  1 01:24:55 myhostname kernel: [34780.996156] Modules linked in: nfs lockd nfs_acl sunrpc autofs4 iptable_filter ip_tables x_tables ipv6 parport_pc lp parport loop af_packet serio_raw psmouse button dcdbas intel_agp snd_hda_intel shpchp pci_hotplug iTCO_wdt iTCO_vendor_support evdev snd_pcm snd_timer snd_page_alloc snd_hwdep snd soundcore pcspkr ext3 jbd mbcache sg sr_mod cdrom sd_mod 8139too ata_generic pata_acpi usbhid hid ata_piix 8139cp mii libata scsi_mod ehci_hcd uhci_hcd e1000 usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
Nov  1 01:24:55 myhostname kernel: [34780.996422] Pid: 171, comm: kswapd0 Not tainted 2.6.24-16-server #1
Nov  1 01:24:55 myhostname kernel: [34780.996442] RIP: 0010:[<ffffffff80470a60>]  [<ffffffff80470a60>] _spin_lock+0x0/0x10
Nov  1 01:24:55 myhostname kernel: [34780.996474] RSP: 0018:ffff8100b904fd48  EFLAGS: 00010202
Nov  1 01:24:55 myhostname kernel: [34780.996492] RAX: 0000000000000001 RBX: ffff8100167d23c8 RCX: 0000000000000000
Nov  1 01:24:55 myhostname kernel: [34780.996514] RDX: 0000000000000001 RSI: 00000000000000d0 RDI: 0000000000000070
Nov  1 01:24:55 myhostname kernel: [34780.996535] RBP: ffff8100167d2550 R08: 0000000000000000 R09: 0000000000000000
Nov  1 01:24:55 myhostname kernel: [34780.996555] R10: 0000000000000000 R11: ffffffff88232010 R12: 0000000000000028
Nov  1 01:24:55 myhostname kernel: [34780.996576] R13: ffff8100167d24d8 R14: 0000000000000000 R15: 0000000000000000
Nov  1 01:24:55 myhostname kernel: [34780.996597] FS:  0000000000000000(0000) GS:ffff8100bd001700(0000) knlGS:0000000000000000
Nov  1 01:24:55 myhostname kernel: [34780.996628] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Nov  1 01:24:55 myhostname kernel: [34780.996647] CR2: 0000000000000070 CR3: 00000000bbd44000 CR4: 00000000000006e0
Nov  1 01:24:55 myhostname kernel: [34780.996668] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov  1 01:24:55 myhostname kernel: [34780.996688] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov  1 01:24:55 myhostname kernel: [34780.996710] Process kswapd0 (pid: 171, threadinfo ffff8100b904e000, task ffff8100b90487e0)
Nov  1 01:24:55 myhostname kernel: [34780.996741] Stack:  ffffffff802dc5b2 ffff8100167d23c8 0000000000000080 0000000000000028
Nov  1 01:24:55 myhostname kernel: [34780.996779]  ffff8100b904fd80 0000000000000028 ffffffff802cb244 ffff8100167d20d8
Nov  1 01:24:55 myhostname kernel: [34780.996815]  ffff810092da43d8 00000000001c4cec 0000000000067714 000000000000009b
Nov  1 01:24:55 myhostname kernel: [34780.996839] Call Trace:
Nov  1 01:24:55 myhostname kernel: [34780.996868]  [remove_inode_buffers+0x42/0x100] remove_inode_buffers+0x42/0x100
Nov  1 01:24:55 myhostname kernel: [34780.996891]  [shrink_icache_memory+0x1f4/0x2a0] shrink_icache_memory+0x1f4/0x2a0
Nov  1 01:24:55 myhostname kernel: [34780.996916]  [shrink_slab+0x124/0x180] shrink_slab+0x124/0x180
Nov  1 01:24:55 myhostname kernel: [34780.996939]  [kswapd+0x391/0x560] kswapd+0x391/0x560
Nov  1 01:24:55 myhostname kernel: [34780.996965]  [<ffffffff80254200>] autoremove_wake_function+0x0/0x30
Nov  1 01:24:55 myhostname kernel: [34780.996989]  [kswapd+0x0/0x560] kswapd+0x0/0x560
Nov  1 01:24:55 myhostname kernel: [34780.997009]  [kthread+0x4b/0x80] kthread+0x4b/0x80
Nov  1 01:24:55 myhostname kernel: [34780.997029]  [child_rip+0xa/0x12] child_rip+0xa/0x12
Nov  1 01:24:55 myhostname kernel: [34780.997053]  [kthread+0x0/0x80] kthread+0x0/0x80
Nov  1 01:24:55 myhostname kernel: [34780.997072]  [child_rip+0x0/0x12] child_rip+0x0/0x12
Nov  1 01:24:55 myhostname kernel: [34780.997091]
Nov  1 01:24:55 myhostname kernel: [34780.997104]
Nov  1 01:24:55 myhostname kernel: [34780.997105] Code: f0 ff 0f 79 09 f3 90 83 3f 00 7e f9 eb f2 c3 90 f0 81 2f 00
Nov  1 01:24:55 myhostname kernel: [34780.997184] RIP  [<ffffffff80470a60>] _spin_lock+0x0/0x10
Nov  1 01:24:55 myhostname kernel: [34780.997205]  RSP <ffff8100b904fd48>
Nov  1 01:24:55 myhostname kernel: [34780.997221] CR2: 0000000000000070
Nov  1 01:24:55 myhostname kernel: [34780.997458] ---[ end trace 26a2b00c44abedb6 ]---
    
by Jason Plank 01.11.2009 / 22:07

1 answer

2

OK, so this is a fairly ordinary kernel oops. It was probably caused by kswapd0 (the "Process kswapd0" line in the log) doing something undesirable to the disk.
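To unpack the header of the log in the question: on x86, the number printed after "Oops:" is the page-fault error code in hex, and its low bits describe the access that faulted. Here is a minimal Python sketch of that decoding. The function name `decode_oops_code` and the table layout are made up for illustration; the bit meanings are the standard x86 page-fault error code bits.

```python
# Low bits of the x86 page-fault error code printed after "Oops:".
# Each entry: (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_hex):
    """Return human-readable flags for an Oops error code such as '0002'."""
    code = int(code_hex, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


# The value from the "Oops: 0002 [1] SMP" line in the question.
print(decode_oops_code("0002"))
```

For `0002` this reports a write, in kernel mode, to a page that is not present. The faulting address in CR2 is 0x70, which is also the value in RDI, the first argument register on x86-64, and RIP points at the very first instruction of `_spin_lock`. That suggests the kernel was asked to take a lock sitting at offset 0x70 inside a structure whose pointer was NULL.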

Things to check:

1) Run smartctl on all of the disks and check that they are operating within recommended tolerances.

2) Poke through dmesg and /var/log/messages and see whether anything nasty happened around the same time.

3) Search Launchpad and the Ubuntu forums for clues about what might have caused this, or ask in #ubuntu on Freenode IRC for some pointers. You will probably be asked for more information, such as the output of lspci, lsmod and so on.

Chances are someone has already run into a similar problem.

4) Run memtest86 overnight and see whether any glaring memory errors show up.
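For step 2, since the machine is remote, it can help to filter the logs rather than read them end to end. The following is a rough Python sketch, not a definitive tool: the function name `suspicious_lines` and the keyword pattern are assumptions chosen for illustration, so adjust the pattern to whatever your drivers actually log. The sample lines are copied from the log in the question.

```python
import re

# Assumed starting set of keywords; extend it for your own hardware.
SUSPICIOUS = re.compile(r"I/O error|Oops|NULL pointer dereference|BUG:")


def suspicious_lines(lines):
    """Yield (line_number, text) for each log line matching the pattern."""
    for number, text in enumerate(lines, start=1):
        if SUSPICIOUS.search(text):
            yield number, text.rstrip("\n")


# Three lines taken from the kern.log excerpt in the question.
sample = [
    "Nov  1 01:24:55 myhostname kernel: [34780.996038] Unable to handle kernel NULL pointer dereference at 0000000000000070 RIP:",
    "Nov  1 01:24:55 myhostname kernel: [34780.996121] Oops: 0002 [1] SMP",
    "Nov  1 01:24:55 myhostname kernel: [34780.996422] Pid: 171, comm: kswapd0 Not tainted 2.6.24-16-server #1",
]

for number, text in suspicious_lines(sample):
    print(f"{number}: {text}")
```

To scan a real file, pass an open file object instead of the sample list, for example `suspicious_lines(open("/var/log/kern.log", errors="replace"))`, and then look at what else was logged just before each hit.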

    
by 01.11.2009 / 22:19