Processos relacionados a LVMs pendem completamente

4

Esse problema pertence a um nó KVM no qual as VMs obtêm seu armazenamento via LVM. Portanto, cada VM tem seu próprio volume lógico. Todas as noites, algumas máquinas virtuais são copiadas para backup (instantâneo - dd [..] | ssh [..] - nada especial). No entanto, ontem à noite isso de alguma forma fodeu o sistema LVM. 2-3 minutos após o início do segundo backup, o kernel começou a registrar "tarefas interrompidas" - resumindo, relatou três processos qemu-kvm como pendentes e o processo dd. Pelo menos uma das VMs (que é um servidor gerenciado, monitorado por nós) caiu - para ser mais preciso: ainda estava em execução, mas os serviços não respondiam mais. O VNC mostrou tarefas interrompidas dentro da VM. Após uma reinicialização a frio (e uma migração - veja abaixo) a VM estava bem, mas o processo dd nunca terminou ( kill -9 não faz nada) e comandos como lvdisplay não funcionam mais - eles simplesmente não distribuem nada. O lvmetad também não pode ser reiniciado e todo processo que pertence ao LVM não pode ser eliminado. Eles apenas ficam em estado de disco para sempre, enquanto o nó geralmente funciona bem. A VM que foi desativada precisou ser migrada para outro nó, pois virsh shutdown também não funcionava mais - "dispositivo ou recurso ocupado". Mas as outras VMs também continuam fazendo seu trabalho.

Tivemos isso em outro nó algumas semanas atrás, onde a VM "snapshotted" também caiu, executou uma atualização do kernel de 4.4 para 4.9 (como tivemos que reinicializar a máquina de qualquer maneira) e não viu um problema como isso de novo. Mas como o nó que mostrou o problema hoje tem um tempo de atividade de dois meses, isso não diz que isso é realmente fixo. Então, alguém pode ver mais nesses registros do que nós? Seria muito apreciado. Obrigado pela leitura!

Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:37:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:37:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:39:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:39:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
    
por smartenbergen 28.04.2017 / 19:01

1 resposta

0

Suponho que você descartou problemas reais de disco físico.

Suponho também que você esteja certificando-se de que o host e nenhuma das VMs tenham nomes VG sobrepostos. Isso poderia causar loucura como você está descrevendo.

O que você está vendo soa como "sono ininterrupto", onde a caixa pensa que está esperando por IO, e nada pode mudar isso. Matar -9 nem vai fazer isso. Eu costumava ver isso com backups em fita. Eu vi isso mais recentemente ao fazer coisas estúpidas, como montar um LVM de VMs no host e esquecer de desmontá-lo durante a execução da VM. Isso é sempre divertido.

A ferramenta mais útil que encontrei para uma situação como a que você está descrevendo é dmsetup . Ele permite que você descompacte manualmente o LVM. Eu não sei se isso vai te tirar de uma situação de sono ininterrupta.

Outra possibilidade é que você está usando discos lentos e algo realmente leva mais de 120 segundos.

Eu uso arquivos de disco do tipo ala qemu-img em vez de LVM. Eu costumava usar o LVM como você está descrevendo no Xen, mas nunca tive problemas que eu obviamente não causasse a mim mesmo.

-Dylan

    
por 28.04.2017 / 19:32