100.0% interrupts on one core

Why aren't interrupts spread across all cores?

Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,100.0%si,  0.0%st
Cpu1  : 25.2%us, 32.6%sy,  0.0%ni, 12.6%id, 26.2%wa,  0.0%hi,  3.3%si,  0.0%st
Cpu2  : 29.0%us, 15.0%sy,  0.0%ni, 29.3%id, 26.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 16.0%us, 21.7%sy,  0.0%ni, 34.3%id, 27.7%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 26.0%us, 14.3%sy,  0.0%ni, 33.7%id, 25.7%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu5  : 15.0%us, 15.0%sy,  0.0%ni, 44.2%id, 25.2%wa,  0.0%hi,  0.7%si,  0.0%st
Cpu6  : 13.0%us, 13.3%sy,  0.0%ni, 42.2%id, 31.2%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu7  :  9.7%us, 11.0%sy,  0.0%ni, 56.3%id, 23.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  : 13.0%us, 12.6%sy,  0.0%ni, 49.2%id, 25.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  9.6%us,  7.3%sy,  0.0%ni, 69.1%id, 13.6%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu10 :  8.9%us,  7.9%sy,  0.0%ni, 54.8%id, 28.1%wa,  0.0%hi,  0.3%si,  0.0%st

For no apparent reason my server started performing poorly. After checking top, I noticed that a single core (Cpu0) spends 100% of its time handling software interrupts (the si column).

cat /proc/interrupts

            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      
   0:        213          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-edge      timer
   8:          1          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
   9:          1          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
  16:        557          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6, uhci_hcd:usb7, uhci_hcd:usb8
  17:    4373632      89953          0          0          0   10737111          0          0          0          0          0   22943776   IO-APIC-fasteoi   firewire_ohci
  19:         48          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, uhci_hcd:usb5
  24:        378          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   nouveau
  34:        232          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   hda_intel
  64:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  65:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  66:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  67:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  68:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  69:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  70:   27356052          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      mpt2sas0
  71:     360910          0          0          0      10388     366203          0     660341          0          0          0    1011704   PCI-MSI-edge      ahci
  72:          7          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth0
  73:    3223115          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth0-TxRx-0
  74:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth1
  75:    3573711          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth1-TxRx-0
  76:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth2
  77:    3548069          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth2-TxRx-0
  78:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth3
  79:    3290681          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth3-TxRx-0
  80:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth4
  81:    3319709          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth4-TxRx-0
  82:          7          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth5
  83:    3294914          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth5-TxRx-0
  84:        223          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      hda_intel
  85:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  86:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  87:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  88:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  89:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  90:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  91:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  92:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
 NMI:      20083      11292       9555      10288       8470       9085       7319       7726       6190       6286       5305       5966   Non-maskable interrupts
 LOC:   12625312   12863741   12757467   12819307   12735818   12636631   12594014   12340042   12351248   11896407   11976946   11309230   Local timer interrupts
 SPU:          0          0          0          0          0          0          0          0          0          0          0          0   Spurious interrupts
 PMI:      20083      11292       9555      10288       8470       9085       7319       7726       6190       6286       5305       5966   Performance monitoring interrupts
 IWI:          0          0          0          0          0          0          0          0          0          0          0          0   IRQ work interrupts
 RES:    2102300   11881309   11859706   12689803   11274676   10461216    9626798    8188722    7976358    6329291    6344685    4528014   Rescheduling interrupts
 CAL:     732819   20016455      15519      15361      17958      23935      23377      43079      40287     108860      70814     257653   Function call interrupts
 TLB:       7589      72270      46673      99284      46373     121129      43286     101506      34109      78720      28570      70600   TLB shootdowns
 TRM:          0          0          0          0          0          0          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0          0          0          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0          0          0          0          0          0          0          0          0   Machine check exceptions
 MCP:         44         44         44         44         44         44         44         44         44         44         44         44   Machine check polls
 ERR:          0
 MIS:          0

cat /proc/interrupts shows that the interrupts of all network interfaces are handled by CPU0; I think this is where the problem lies.

The network is configured with BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=layer3+4"

What I have tried:

  1. running irqbalance
  2. rebooting
by user2783132 28.11.2013 / 20:18

1 answer

Check the CPU affinity of the interrupt with:

cat /proc/irq/70/smp_affinity 

I chose 70 because it is bound to mpt2sas0, the driver for the storage card. You may want to repeat the check for all the other drivers too, especially the network cards, if you are serving a lot of traffic.
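To repeat the check for several devices at once, something like the following could work. It is only a sketch, assuming a POSIX shell and awk; the helper names irqs_for and show_affinity are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# irqs_for PATTERN [FILE]
# Print the numeric IRQs whose line in /proc/interrupts (or FILE)
# matches the awk regular expression PATTERN, one per line.
irqs_for() {
    awk -v pat="$1" '
        # Only numbered IRQ lines ("70:"), not NMI:, LOC: and so on.
        $1 ~ /^[0-9]+:$/ && $0 ~ pat {
            sub(":", "", $1)
            print $1
        }
    ' "${2:-/proc/interrupts}"
}

# show_affinity PATTERN
# Print the smp_affinity mask of every IRQ that matches PATTERN.
show_affinity() {
    for irq in $(irqs_for "$1"); do
        if mask=$(cat "/proc/irq/$irq/smp_affinity" 2>/dev/null); then
            printf 'IRQ %s: %s\n' "$irq" "$mask"
        else
            printf 'IRQ %s: unreadable\n' "$irq"
        fi
    done
}

# Example call (left commented out so the file can simply be sourced):
# show_affinity 'eth|mpt2sas'
```

With the listing from the question, the pattern 'eth|mpt2sas' would cover the network card IRQs and the storage controller.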

You want this setting to report the value ffffffff (all bits set), because that means any CPU can service that interrupt.
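The value is a hexadecimal bitmask with one bit per CPU: bit 0 stands for CPU0, bit 1 for CPU1, and so on, so a mask of 1 would pin the interrupt to CPU0 alone. A small sketch for reading such a mask, assuming a POSIX shell and fewer than 63 CPUs; the function names mask_to_cpus and all_cpus_mask are hypothetical helpers, not existing commands.

```shell
#!/bin/sh
# mask_to_cpus HEXMASK
# Print the CPU numbers whose bits are set in HEXMASK, space separated.
# The body runs in a subshell so its variables stay local.
mask_to_cpus() (
    # The kernel may group long masks into comma-separated words.
    mask=$(printf '%s' "$1" | tr -d ',')
    value=$(printf '%d' "0x$mask")
    cpu=0
    out=""
    while [ "$value" -gt 0 ]; do
        if [ $(( value % 2 )) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        value=$(( value / 2 ))
        cpu=$(( cpu + 1 ))
    done
    printf '%s\n' "$out"
)

# all_cpus_mask N
# Print the hex mask that has the lowest N bits set.
all_cpus_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Example calls:
# mask_to_cpus "$(cat /proc/irq/70/smp_affinity)"
# all_cpus_mask 12
```

On a machine with 12 CPUs like the one in the question, the low 12 bits are the ones that matter, which corresponds to fff.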

You can use this Red Hat document as a reference.

    
by 28.11.2013 / 21:49
