Eu tenho Server com 32 core CPU e tem NC364T
1G ethernet e notei quando a taxa de pacotes atingiu 250kpps
(200.000 pacotes por segundo) eu comecei a ver quedas intermediárias em ifconfig
output e no meu sistema ksoftirqd/0
processo começa a consumir CPU porque fica com apenas CPU0
em vez de spread load em toda a CPU.
[root@server1 ~]# ifconfig | grep drop
RX errors 0 dropped 3019497 overruns 0 frame 0
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Isso é o que eu tentei resolver.
aumenta o tamanho do buffer do anel RX para max 4096
ethtool -G enp9s0f0 rx 4096
Tente ajustar a afinidade, mas ainda sem sorte
cat "ff" > /proc/irq/86/smp_affinity
Mas ainda estou vendo gotas, aqui está a saída
[root@server1 ~]# ethtool -S enp9s0f0 | grep rx
rx_packets: 21351639073
rx_bytes: 2150885430317
rx_broadcast: 3520806
rx_multicast: 1054
rx_errors: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 49483446
rx_missed_errors: 2997479
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
rx_csum_offload_good: 21089322623
rx_csum_offload_errors: 107538
rx_header_split: 0
alloc_rx_buff_failed: 0
rx_smbus: 0
rx_dma_failed: 0
rx_hwtstamp_cleared: 0
configuração de detalhes
[root@server1 ~]# ethtool -k enp9s0f0
Features for enp9s0f0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
tx-sctp-segmentation: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
detalhes Nic
[root@server1 ~]# lspci | grep -i eth
09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
09:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
0a:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
0a:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
Ele suporta MSI
também lspci -vv
, mas não tem MSI-X
09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
Subsystem: Hewlett-Packard Company NC364T PCI Express Quad Port Gigabit Server Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 86
NUMA node: 0
Region 0: Memory at f7de0000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at f7d00000 (32-bit, non-prefetchable) [size=512K]
Region 2: I/O ports at 6000 [size=32]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00c18 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset- SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #2, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 00-26-55-ff-ff-d2-1a-6c
Kernel driver in use: e1000e
Kernel modules: e1000e
Tags performance ethernet linux