I have been trying to troubleshoot this for a whole day without success.
I have two servers, server1 and server2, both running Ubuntu 14.04.5 LTS and connected to a Cisco SG200-08 switch via LAG trunks with LACP. The switch IP is 172.128.1.254/24, and the interfaces on the servers are shown below, including the route and the ARP table entry for the relevant IPs:
On server1:
root@server1:~# ip addr show bond0
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 00:11:0a:10:03:29 brd ff:ff:ff:ff:ff:ff
inet 172.128.1.129/24 brd 172.128.1.255 scope global bond0
valid_lft forever preferred_lft forever
root@server1:~# ip addr show bond0.53
13: bond0.53@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 00:11:0a:10:03:29 brd ff:ff:ff:ff:ff:ff
inet 192.168.53.1/24 brd 192.168.53.255 scope global bond0.53
valid_lft forever preferred_lft forever
root@server1:~# ip route get 192.168.53.2
192.168.53.2 dev bond0.53 src 192.168.53.1
cache
root@server1:~# arp -n | grep '192.168.53.2'
192.168.53.2 (incomplete) bond0.53
On server2:
root@server2:~# ip addr show bond0
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 00:15:17:2e:ab:b4 brd ff:ff:ff:ff:ff:ff
inet 172.128.1.130/24 brd 172.128.1.255 scope global bond0
valid_lft forever preferred_lft forever
root@server2:~# ip addr show bond0.53
22: bond0.53@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 00:15:17:2e:ab:b4 brd ff:ff:ff:ff:ff:ff
inet 192.168.53.2/24 brd 192.168.53.255 scope global bond0.53
valid_lft forever preferred_lft forever
root@server2:~# ip route get 192.168.53.1
192.168.53.1 dev bond0.53 src 192.168.53.2
cache
root@server2:~# arp -n | grep '192.168.53.1'
192.168.53.1 ether 00:11:0a:10:03:29 C bond0.53
When I ping server2 from server1, I do not see any ARP reply coming back to server1:
root@server1:~# tcpdump -ennqt -i bond0 \( arp or icmp \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 28
but on the server2 side I can see the ARP requests from server1 AND the replies being sent back over VLAN 53:
root@server2:~# tcpdump -ennqt -i bond0 \( arp or icmp \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
00:11:0a:10:03:29 > ff:ff:ff:ff:ff:ff, 802.1Q, length 64: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.2 tell 192.168.53.1, length 46
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Reply 192.168.53.2 is-at 00:15:17:2e:ab:b4, length 28
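The comparison between the two captures can be made mechanical. The following is a minimal Python sketch, not something taken from my setup, that tallies ARP requests and replies in `tcpdump -e` text output and lists the targets that were asked for but never answered within that capture. The function names `summarize_arp` and `unanswered_targets` are hypothetical, and the regular expression assumes the exact line format shown above.

```python
import re
from collections import Counter

# Matches the ARP part of a `tcpdump -e` line as printed above (assumed format).
ARP_RE = re.compile(
    r"ethertype ARP, (?:Request who-has (?P<target>\S+) tell (?P<asker>[\d.]+)"
    r"|Reply (?P<owner>\S+) is-at (?P<mac>[0-9a-f:]+))"
)


def summarize_arp(capture_text):
    """Count ARP requests and replies, keyed by (kind, ip)."""
    counts = Counter()
    for line in capture_text.splitlines():
        match = ARP_RE.search(line)
        if match is None:
            continue  # not an ARP line (for example ICMP or a tcpdump banner)
        if match.group("target"):
            counts[("request", match.group("target"))] += 1
        else:
            counts[("reply", match.group("owner"))] += 1
    return counts


def unanswered_targets(counts):
    """Return IPs that were requested but have no reply in the same capture."""
    return sorted(
        ip
        for (kind, ip) in counts
        if kind == "request" and ("reply", ip) not in counts
    )
```

Fed the server1 capture, `unanswered_targets` would report 192.168.53.2, while for the server2 capture it would return an empty list, which is the asymmetry described above.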
For a ping in the opposite direction, this is all I see on server2:
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 102: vlan 53, p 0, ethertype IPv4, 192.168.53.2 > 192.168.53.1: ICMP echo request, id 6506, seq 1, length 64
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 102: vlan 53, p 0, ethertype IPv4, 192.168.53.2 > 192.168.53.1: ICMP echo request, id 6506, seq 2, length 64
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 102: vlan 53, p 0, ethertype IPv4, 192.168.53.2 > 192.168.53.1: ICMP echo request, id 6506, seq 3, length 64
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 102: vlan 53, p 0, ethertype IPv4, 192.168.53.2 > 192.168.53.1: ICMP echo request, id 6506, seq 4, length 64
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 102: vlan 53, p 0, ethertype IPv4, 192.168.53.2 > 192.168.53.1: ICMP echo request, id 6506, seq 5, length 64
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.1 tell 192.168.53.2, length 28
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.1 tell 192.168.53.2, length 28
00:15:17:2e:ab:b4 > 00:11:0a:10:03:29, 802.1Q, length 46: vlan 53, p 0, ethertype ARP, Request who-has 192.168.53.1 tell 192.168.53.2, length 28
No firewall, arptables or ebtables rules are configured on either side.
The kernel sysctl settings are not blocking ICMP traffic.
The bonds are up and healthy.
The switch has 2 ports in each LAG, one LAG per server, configured as a trunk that carries VLAN 1 (native/default, untagged) and VLANs 51, 52, 53 and 54 tagged.
I can ping both bond0 IPs, 172.128.1.129 and 172.128.1.130, from the switch. I can ping 172.128.1.129 (server1) from another Linux PC connected to the switch (IP 172.128.1.5) but not 172.128.1.130 (server2).
Thanks in advance for any suggestions, ideas and hints.
CORRECTION: I can ping both servers from the third host on the network:
igorc@client:~$ ip -f inet addr show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 172.128.1.5/24 brd 172.128.1.255 scope global dynamic eth1
valid_lft 22497sec preferred_lft 22497sec
igorc@client:~$ ping -c 2 172.128.1.129
PING 172.128.1.129 (172.128.1.129) 56(84) bytes of data.
64 bytes from 172.128.1.129: icmp_seq=1 ttl=64 time=0.618 ms
64 bytes from 172.128.1.129: icmp_seq=2 ttl=64 time=0.541 ms
--- 172.128.1.129 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.541/0.579/0.618/0.045 ms
igorc@client:~$ ping -c 2 172.128.1.130
PING 172.128.1.130 (172.128.1.130) 56(84) bytes of data.
64 bytes from 172.128.1.130: icmp_seq=1 ttl=64 time=0.645 ms
64 bytes from 172.128.1.130: icmp_seq=2 ttl=64 time=0.693 ms
--- 172.128.1.130 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.645/0.669/0.693/0.024 ms
UPDATE: the bond status on both servers:
root@server1:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 100
Down Delay (ms): 100
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 17
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: eth2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:11:0a:10:03:29
Aggregator ID: 1
Slave queue ID: 0
Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:11:0a:10:03:28
Aggregator ID: 2
Slave queue ID: 0
root@server2:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 100
Down Delay (ms): 100
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 1
Actor Key: 17
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: p1p1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:15:17:2e:ab:b4
Aggregator ID: 1
Slave queue ID: 0
Slave Interface: p1p2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:15:17:2e:ab:b5
Aggregator ID: 2
Slave queue ID: 0
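Two details in the bond status above may be worth checking programmatically: the partner MAC address is all zeros, and the two slaves of each bond report different aggregator IDs. As far as I understand the Linux bonding driver, an all-zero partner MAC usually means no LACPDUs have been received from the switch, and slaves in different aggregators are not actually bundled together. The following is a minimal Python sketch under those assumptions. The function names `parse_bond_status` and `lacp_warnings` are hypothetical, and the parser only expects the field names that appear in the output above.

```python
ZERO_MAC = "00:00:00:00:00:00"


def parse_bond_status(text):
    """Extract the bond-level partner MAC and each slave's aggregator ID."""
    partner_mac = None
    slave_aggregators = {}
    current_slave = None
    for raw_line in text.splitlines():
        key, sep, value = raw_line.strip().partition(":")
        if not sep:
            continue  # lines without "key: value" structure, e.g. "802.3ad info"
        key, value = key.strip(), value.strip()
        if key == "Partner Mac Address" and current_slave is None:
            partner_mac = value
        elif key == "Slave Interface":
            current_slave = value
        elif key == "Aggregator ID" and current_slave is not None:
            slave_aggregators[current_slave] = int(value)
    return partner_mac, slave_aggregators


def lacp_warnings(text):
    """Return human-readable warnings for the two suspicious conditions."""
    partner_mac, slave_aggregators = parse_bond_status(text)
    warnings = []
    if partner_mac == ZERO_MAC:
        warnings.append("partner MAC is all zeros: no LACP partner seen")
    if len(set(slave_aggregators.values())) > 1:
        warnings.append(
            "slaves are split across aggregators: %s" % sorted(slave_aggregators.items())
        )
    return warnings
```

A typical use would be `lacp_warnings(open("/proc/net/bonding/bond0").read())` on each server. With the output shown above, both checks would fire on server1 and server2.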