Using lxd to do vlan test (by quqi99)

2023-05-16

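Author: Zhang Hua. Published: 2022-08-15
Copyright notice: this article may be freely reproduced, provided the reproduction includes a hyperlink to the original source, the author information, and this copyright notice.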

Problem

A customer reported that their SR-IOV VM was not receiving ARP replies. Inside the VM, two SR-IOV NICs are combined into pkt0 (a bond?), made up of an active NIC (pkt0_p) and a standby NIC (pkt0_s).

<no-ip>/fa:16:3e:d8:3f:b9(pkt0)
<no-ip>/fa:16:3e:d8:3f:b9(pkt0_p)
<no-ip>/fa:16:3e:70:be:ba(pkt0_s)
151.2.143.1/151.2.143.2/fa:16:3e:d8:3f:b9(pkt0.610@pkt0)
10.139.99.1/10.139.99.2/fa:16:3e:d8:3f:b9(pkt0.510@pkt0)
10.139.160.10/10.139.160.11/10.139.160.12/fa:16:3e:d8:3f:b9(pkt0.700@pkt0)

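The customer said the ICMP heartbeat check on the active NIC worked, but the ARP heartbeat check to the gateway on the standby NIC never received an ARP reply (although the capture data below seems to show the reply arriving?).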

1, arp for active port(fa:16:3e:d8:3f:b9)

$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.src==fa:16:3e:d8:3f:b9 and arp |tail -n1
357602 8141.824956 fa:16:3e:d8:3f:b9 → IETF-VRRP-VRID_64 ARP 60 Who has 10.139.160.254? Tell 10.139.160.10
$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.dst==fa:16:3e:d8:3f:b9 and arp |tail -n1
357603 8141.825416 IETF-VRRP-VRID_64 → fa:16:3e:d8:3f:b9 ARP 60 10.139.160.254 is at 00:00:5e:00:01:64

2, icmp for active port(fa:16:3e:d8:3f:b9)

$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.dst==fa:16:3e:d8:3f:b9 and icmp |tail -n1
358835 8169.867056 10.139.160.254 → 10.139.160.10 ICMP 102 Echo (ping) reply    id=0x000a, seq=15233/33083, ttl=64 (request in 358834)
$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.src==fa:16:3e:d8:3f:b9 and icmp |tail -n1
358834 8169.863263 10.139.160.10 → 10.139.160.254 ICMP 102 Echo (ping) request  id=0x000a, seq=15233/33083, ttl=64

3, arp for standby port(fa:16:3e:70:be:ba)

$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.src==fa:16:3e:70:be:ba and arp |tail -n1
358848 8170.244743 fa:16:3e:70:be:ba → Broadcast    ARP 60 Who has 10.139.160.254? (ARP Probe)
$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.dst==fa:16:3e:70:be:ba and arp |tail -n1
358849 8170.245117 IETF-VRRP-VRID_64 → fa:16:3e:70:be:ba ARP 60 10.139.160.254 is at 00:00:5e:00:01:64

4, icmp for standby port(fa:16:3e:70:be:ba)

$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.src==fa:16:3e:70:be:ba and icmp |tail -n1
<empty>
$ tshark -r ./EXT_TMP-700.pcap-1.act.pcap eth.dst==fa:16:3e:70:be:ba and icmp |tail -n1
<empty>
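
The four checks above repeat one tshark call with a different MAC, direction and protocol each time. A minimal sketch of how that could be scripted follows; the function names (frame_filter, last_frame, summarize) are hypothetical, and only the tshark usage already shown above (-r plus a display filter) is assumed.

```shell
# Hypothetical helpers, not part of the original investigation.

# frame_filter MAC DIRECTION PROTO -> e.g. "eth.src==<mac> and arp"
frame_filter() {
    printf 'eth.%s==%s and %s\n' "$2" "$1" "$3"
}

# last_frame PCAP MAC DIRECTION PROTO -> last matching frame, or "<empty>"
last_frame() {
    out=$(tshark -r "$1" "$(frame_filter "$2" "$3" "$4")" 2>/dev/null | tail -n1)
    printf '%s\n' "${out:-<empty>}"
}

# summarize PCAP MAC... -> one line per MAC/protocol/direction combination
summarize() {
    pcap=$1; shift
    for mac in "$@"; do
        for proto in arp icmp; do
            for dir in src dst; do
                printf '%s %s %s: ' "$mac" "$dir" "$proto"
                last_frame "$pcap" "$mac" "$dir" "$proto"
            done
        done
    done
}
```

For the capture above this would be called as: summarize ./EXT_TMP-700.pcap-1.act.pcap fa:16:3e:d8:3f:b9 fa:16:3e:70:be:ba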

The following analysis has already been done:

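  • Confirmed that br-data, which the SR-IOV OVN configuration below uses for the external network, does not contain an SR-IOV NIC. If it did, and the SR-IOV NIC were attached through macvtap rather than passthrough, there could be a hairpin-mode problem: a VM on a host cannot reach the network on its own chassis but can reach networks on other chassis.
$ juju config ovn-chassis-sriov-hugepages ovn-bridge-mappings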
dcfabric:br-data sriovfabric1:br-data sriovfabric2:br-data
$ juju config ovn-chassis-sriov-hugepages bridge-interface-mappings
br-data:bond1
$ juju config ovn-chassis-sriov-hugepages sriov-device-mappings
sriovfabric1:ens3f0 sriovfabric1:ens6f0 sriovfabric2:ens3f1 sriovfabric2:ens6f1
$ juju config ovn-chassis-sriov-hugepages sriov-numvfs
ens3f0:32 ens3f1:32 ens6f0:32 ens6f1:32
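  • Ruled out LP bug 1875852: the customer does not use VLAN for the tenant network.
  • Seeing only ARP requests with tcpdump on the PF is normal. ARP requests are broadcast, so they are visible on the PF. ARP replies are unicast, so if the PF is not in promiscuous mode (some Intel SR-IOV NICs have a hardware bug that prevents promiscuous mode), it is normal not to see them with tcpdump on the PF. In addition, tcpdump cannot be used on a VF.
  • DHCP is disabled. Generally DHCP should be enabled on the SR-IOV subnet when using SR-IOV with OVN, but having it disabled should be fine here because the customer assigns IPs statically.
  • The statically assigned IP (set by heat) differs from the IP allocated by nova. This should not matter either, because SR-IOV bypasses the host, so the security group (SG) on the host (mainly the software anti-spoofing rules for IP/MAC) does not affect it.
  • Since the actual IP differs from the nova-allocated IP and the OpenStack-level SG does not affect it, what about the SG at the SR-IOV hardware level? Confirmed that spoof checking is off as well.
$ grep -E 'fa:16:3e:f8:42:fe|fa:16:3e:70:be:ba|fa:16:3e:8f:56:5a|fa:16:3e:d8:3f:b9' sos_commands/networking/ip_-s_-d_link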
vf 30 MAC fa:16:3e:70:be:ba, spoof checking off, link-state auto, trust on
vf 31 MAC fa:16:3e:f8:42:fe, spoof checking off, link-state auto, trust on
vf 29 MAC fa:16:3e:8f:56:5a, spoof checking off, link-state auto, trust on
vf 30 MAC fa:16:3e:d8:3f:b9, spoof checking off, link-state auto, trust on
  • MAC filtering is ruled out (see the spoof checking above). What about VLAN filtering? The tcpdump data suggests the customer defined a VLAN inside the VM (pkt0.700@pkt0).
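
The spoof-checking and trust check from the list above can be scripted when many VFs are involved. The sketch below is an illustration only: vf_flags is a hypothetical name, and it assumes the two "vf ..." line layouts that appear in this article ("MAC <addr>" and "link/ether <addr>").

```shell
# Hypothetical helper: read `ip -s -d link` text on stdin and print, per VF,
# its number, MAC, spoof-checking and trust state. A VF is marked "ok" only
# when spoof checking is off and trust is on.
vf_flags() {
    awk '
    $1 == "vf" {
        mac = ""; spoof = "?"; trust = "?"
        for (i = 2; i <= NF; i++) {
            if ($i == "MAC" || $i == "link/ether") { mac = $(i + 1); sub(/,$/, "", mac) }
            if ($i == "checking") { spoof = $(i + 1); sub(/,$/, "", spoof) }
            if ($i == "trust")    { trust = $(i + 1); sub(/,$/, "", trust) }
        }
        status = (spoof == "off" && trust == "on") ? "ok" : "CHECK"
        print $2, mac, "spoofchk=" spoof, "trust=" trust, status
    }'
}
```

With the sosreport above it could be fed as: vf_flags < sos_commands/networking/ip_-s_-d_link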

The test in this article mainly simulates this VLAN setup; no SR-IOV hardware is involved here.

Setting up the VLAN lab environment

lxc remote add faster https://mirrors.tuna.tsinghua.edu.cn/lxc-images/ --protocol=simplestreams --public
lxc image list faster:
lxc remote list
#Failed creating instance record: Failed detecting root disk device: No root device could be found
#lxc profile device add default root disk path=/ pool=default
#lxc profile show default
#lxc launch ubuntu:focal master -p juju-default --config=user.network-config="$(cat network.yml)"
lxc launch faster:ubuntu/jammy test1
lxc launch faster:ubuntu/jammy test2

#add two NICs from NET1 for two containers
lxc network create NET1 ipv6.address=none ipv4.address=10.139.160.1/24
lxc network attach NET1 test1 eth1
lxc network attach NET1 test1 eth2
lxc network attach NET1 test2 eth1
lxc network attach NET1 test2 eth2

#https://developers.redhat.com/blog/2018/10/22/introduction-to-linux-interfaces-for-virtual-networking#vlan
#ip link add ptk0 type bond miimon 100 mode active-backup
#ip link set eth2 master ptk0
#ip link set eth1 master ptk0
lxc exec test1 -- /bin/bash
cat << EOF |tee /etc/netplan/11-test.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth1:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 00:16:3e:15:bd:58
    eth2:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 00:16:3e:68:72:0f
  bonds:
    ptk0:
      addresses: []
      dhcp4: false
      dhcp6: false
      interfaces:
        - eth1
        - eth2
      parameters:
        mode: active-backup
        primary: eth1
  vlans:
    ptk0.700:
      id: 700
      link: ptk0
      dhcp4: no
      addresses: [ 10.139.160.10/24 ]
      nameservers:
        search: [ domain.local ]
        addresses: [ 8.8.8.8 ]
EOF
netplan apply

lxc exec test2 -- /bin/bash
cat << EOF |tee /etc/netplan/11-test.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth1:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 00:16:3e:1e:19:25
    eth2:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 00:16:3e:f7:9e:22
  bonds:
    ptk0:
      addresses: []
      dhcp4: false
      dhcp6: false
      interfaces:
        - eth1
        - eth2
      parameters:
        mode: active-backup
        primary: eth1
  vlans:
    ptk0.700:
      id: 700
      link: ptk0
      dhcp4: no
      addresses: [ 10.139.160.11/24 ]
      nameservers:
        search: [ domain.local ]
        addresses: [ 8.8.8.8 ]
EOF
netplan apply

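The steps above created two LXD containers, built an active/standby bond (ptk0) in each, and then created a VLAN (ptk0.700) on top of it. For this network to pass traffic, a trunk also has to be configured on the host; after that the VLAN network works.
Note: macaddress must be used above to set a MAC for each of the two NICs. If it is not set, all NICs end up with the same MAC after the bond and VLAN are created.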
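
The two netplan files above differ only in the MAC addresses and the VLAN address, so they could be generated from one template. This is a sketch under assumptions: render_bond_vlan is a hypothetical helper, the nameservers block and the empty addresses lists are omitted for brevity, and piping into `lxc exec ... -- tee` is assumed to pass stdin through.

```shell
# Hypothetical helper: print the bond + VLAN netplan for one container.
# Arguments: MAC for eth1, MAC for eth2, address/prefix for ptk0.700.
render_bond_vlan() {
    cat <<EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    eth1:
      dhcp4: false
      dhcp6: false
      macaddress: $1
    eth2:
      dhcp4: false
      dhcp6: false
      macaddress: $2
  bonds:
    ptk0:
      dhcp4: false
      dhcp6: false
      interfaces: [eth1, eth2]
      parameters:
        mode: active-backup
        primary: eth1
  vlans:
    ptk0.700:
      id: 700
      link: ptk0
      dhcp4: false
      addresses: [ $3 ]
EOF
}
```

For example: render_bond_vlan 00:16:3e:15:bd:58 00:16:3e:68:72:0f 10.139.160.10/24 | lxc exec test1 -- tee /etc/netplan/11-test.yaml, followed by lxc exec test1 -- netplan apply.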

$ sudo brctl show |grep NET1 -A3
NET1		8000.00163eeb79c4	no		veth2af34c1d
							veth3a5b458e
							veth82c292b2
							veth9b8e8cb6
#sudo bridge vlan add vid 2-4094 dev NET1 self
sudo bridge vlan add vid 700 dev NET1 self
sudo bridge vlan add vid 700 dev veth2af34c1d
sudo bridge vlan add vid 700 dev veth3a5b458e
sudo bridge vlan add vid 700 dev veth82c292b2
sudo bridge vlan add vid 700 dev veth9b8e8cb6
sudo bridge vlan show
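
The veth names change every time a container restarts, so the per-port commands above can be derived from the brctl output instead of being typed by hand. The helper below is a hypothetical sketch that only prints the commands. Note also that the bridge VLAN table is only enforced when VLAN filtering is enabled on the bridge (ip link set NET1 type bridge vlan_filtering 1); the article does not show whether it is on for NET1, so that is an assumption to verify.

```shell
# Hypothetical helper: read `brctl show BRIDGE` output on stdin and print one
# "bridge vlan add" command per veth port for the given VLAN id.
# Nothing is executed; pipe the result into `sudo sh -x` to apply it.
vlan_trunk_cmds() {
    awk -v vid="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^veth/)
                print "bridge vlan add vid " vid " dev " $i
    }'
}
```

Usage: brctl show NET1 | vlan_trunk_cmds 700 | sudo sh -x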

At this point test1 can ping test2 over VLAN 700:

root@test1:~# ping 10.139.160.11 -c1
PING 10.139.160.11 (10.139.160.11) 56(84) bytes of data.
64 bytes from 10.139.160.11: icmp_seq=1 ttl=64 time=0.133 ms
root@test2:~# tcpdump -i eth1 -nn -e -l
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
05:54:36.128602 00:16:3e:15:bd:58 > 00:16:3e:1e:19:25, ethertype 802.1Q (0x8100), length 102: vlan 700, p 0, ethertype IPv4 (0x0800), 10.139.160.10 > 10.139.160.11: ICMP echo request, id 37135, seq 1, length 64
05:54:36.128643 00:16:3e:1e:19:25 > 00:16:3e:15:bd:58, ethertype 802.1Q (0x8100), length 102: vlan 700, p 0, ethertype IPv4 (0x0800), 10.139.160.11 > 10.139.160.10: ICMP echo reply, id 37135, seq 1, length 64

But the gateway still cannot be pinged:

root@test1:~# ping 10.139.160.1 -c1
PING 10.139.160.1 (10.139.160.1) 56(84) bytes of data.
From 10.139.160.10 icmp_seq=1 Destination Host Unreachable
$ sudo tcpdump -i NET1 -nn -e -l
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on NET1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:25:24.761131 00:16:3e:15:bd:58 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 700, p 0, ethertype ARP (0x0806), Request who-has 10.139.160.1 tell 10.139.160.10, length 28

Neither creating eth0.700 nor creating tap0 with vlan=700 makes the ping work:

#use eth0.700
sudo ip link add link eth0 name eth0.700 type vlan id 700
sudo brctl addif NET1 eth0.700
sudo ifconfig eth0.700 up
sudo ip addr add 10.139.160.254/24 dev eth0.700
sudo bridge vlan add vid 700 dev eth0.700

#use a tap
sudo ip tuntap add mode tap tap0
sudo ip link set tap0 master NET1
sudo bridge vlan add dev tap0 vid 700 pvid untagged master
sudo ip addr add 10.139.160.254/24 dev tap0
sudo bridge vlan show
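
One possible reason the gateway never answers: the capture on NET1 shows the ARP request still tagged with VLAN 700, while the bridge's own address sits on the untagged interface. An approach that was not tried in this lab, and is therefore only an untested sketch, is to put the gateway address on a VLAN sub-interface of the bridge itself. The helper name vlan_gateway and the RUN switch are hypothetical; it relies on the existing "bridge vlan add vid 700 dev NET1 self" entry when VLAN filtering is enabled.

```shell
# RUN=echo (the default) only prints the commands; RUN=sudo would execute them.
RUN=${RUN:-echo}

# vlan_gateway BRIDGE VID CIDR: create BRIDGE.VID and give it the gateway address.
vlan_gateway() {
    $RUN ip link add link "$1" name "$1.$2" type vlan id "$2"
    $RUN ip addr add "$3" dev "$1.$2"
    $RUN ip link set "$1.$2" up
}
```

Dry run: vlan_gateway NET1 700 10.139.160.254/24. If the output looks right, repeat it with RUN=sudo set before sourcing the snippet.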

Test 1

So let us treat test2 as the gateway, ping it from test1, and capture packets.
Using ICMP from the active port only:

root@test1:~# ping -I eth1 10.139.160.1 -c1
ping: Warning: source address might be selected on device other than: eth1
PING 10.139.160.1 (10.139.160.1) from 192.168.121.88 eth1: 56(84) bytes of data.
^C
--- 10.139.160.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

$ sudo tcpdump -i NET1 -nn -e -l
14:32:04.483156 00:16:3e:15:bd:58 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 10.139.160.1 tell 192.168.121.88, length 28
14:32:04.483185 00:16:3e:eb:79:c4 > 00:16:3e:15:bd:58, ethertype ARP (0x0806), length 42: Reply 10.139.160.1 is-at 00:16:3e:eb:79:c4, length 28

Running 'ping -I eth1 10.139.160.11 -c1' and 'ping -I eth2 10.139.160.11 -c1' both produce no output.

Test 2

arping needs a source IP when it sends an ARP request, but the standby port has no IP, so one is specified with '-S'.

root@test1:~# arping -I ptk0.700 10.139.160.11 -S 10.139.160.2 -C1
ARPING 10.139.160.11
42 bytes from 00:16:3e:1e:19:25 (10.139.160.11): index=0 time=8.119 usec
root@test2:~# sudo tcpdump -i ptk0.700 -nn -e -l
09:08:16.814374 00:16:3e:15:bd:58 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.139.160.11 tell 10.139.160.2, length 44
09:08:16.814410 00:16:3e:1e:19:25 > 00:16:3e:15:bd:58, ethertype ARP (0x0806), length 42: Reply 10.139.160.11 is-at 00:16:3e:1e:19:25, length 28

Running 'arping -I eth1 10.139.160.11 -S 10.139.160.2 -C1' and 'arping -I eth2 10.139.160.11 -S 10.139.160.2 -C1' both get no reply:

root@test1:~# arping -I eth2 10.139.160.11 -S 10.139.160.2 -C1
ARPING 10.139.160.11
Timeout

Is that because eth1 and eth2 are not in vlan=700?

Some Outputs

root@test1:~# cat /proc/net/bonding/ptk0 
Ethernet Channel Bonding Driver: v5.15.0-43-generic

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth1 (primary_reselect always)
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:16:3e:15:bd:58
Slave queue ID: 0

Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:16:3e:68:72:0f
Slave queue ID: 0
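
When checking which port is active during failover tests, the output above can be condensed to one line per slave. The following is a minimal sketch; bond_summary is a hypothetical name and it assumes the /proc/net/bonding layout shown above.

```shell
# Hypothetical helper: read /proc/net/bonding/<bond> text on stdin and print
# "<slave> <active|standby> <MII status> <permanent MAC>" for every slave.
bond_summary() {
    awk -F': ' '
    /^Currently Active Slave/        { active = $2 }
    /^Slave Interface/               { slave = $2 }
    /^MII Status/ && slave != ""     { mii[slave] = $2 }
    /^Permanent HW addr/             { mac[slave] = $2; order[++n] = slave }
    END {
        for (i = 1; i <= n; i++) {
            s = order[i]
            role = (s == active) ? "active" : "standby"
            print s, role, mii[s], mac[s]
        }
    }'
}
```

Usage inside the container: bond_summary < /proc/net/bonding/ptk0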

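Another approach: pure CLI

This variant does not use netplan to configure the network; it creates the bond directly with CLI commands and does not use the vlan-filtering method - https://developers.redhat.com/blog/2017/09/14/vlan-filter-support-on-bridge#bridge_and_vlan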

lxc launch faster:ubuntu/jammy test1
lxc launch faster:ubuntu/jammy test2
#add two NICs from NET1 for two containers
lxc network create NET1 ipv6.address=none ipv4.address=10.139.160.1/24
lxc network attach NET1 test1 eth1
lxc network attach NET1 test1 eth2
lxc network attach NET1 test2 eth1
lxc network attach NET1 test2 eth2

#inside test1
lxc exec test1 -- /bin/bash
sudo ip link add ptk0 type bond miimon 100 mode active-backup
sudo ip link set eth1 down
sudo ip link set eth1 master ptk0
sudo ip link set eth2 down
sudo ip link set eth2 master ptk0
sudo ip link set dev ptk0 address 00:16:3e:15:bd:58
sudo ip link set dev eth1 address 00:16:3e:15:bd:58
sudo ip link set dev eth2 address 00:16:3e:68:72:0f
sudo ip link set ptk0 up
sudo ip link add link ptk0 name ptk0.700 type vlan id 700
sudo ip addr add 10.139.160.10/24 dev ptk0.700
sudo ip link set ptk0.700 up

#inside test2
lxc exec test2 -- /bin/bash
sudo ip link add ptk0 type bond miimon 100 mode active-backup
sudo ip link set eth1 down
sudo ip link set eth1 master ptk0
sudo ip link set eth2 down
sudo ip link set eth2 master ptk0
sudo ip link set dev ptk0 address 00:16:3e:1e:19:25
sudo ip link set dev eth1 address 00:16:3e:1e:19:25
sudo ip link set dev eth2 address 00:16:3e:f7:9e:22
sudo ip link set ptk0 up
sudo ip link add link ptk0 name ptk0.700 type vlan id 700
sudo ip addr add 10.139.160.11/24 dev ptk0.700
sudo ip link set ptk0.700 up

#on host
sudo bridge vlan add vid 700 dev NET1 self
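#print only the veth port names (the first output line also carries the bridge name and id)
brctl show NET1 |grep -o 'veth[0-9a-z]*' |xargs -I{} sudo bridge vlan add vid 700 dev {}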
sudo bridge vlan show

20220818 - sriov on lxd and kvm

I tried to use two SR-IOV VFs through LXD but failed. The record is as follows:

set up sriov env in my desktop - https://blog.csdn.net/quqi99/article/details/53488243
#Failed growing available VFs from 3 to 7 on device "enp6s0f0": write /sys/class/net/enp6s0f0/device/sriov_numvfs
sudo modprobe -r igb && sudo modprobe igb max_vfs=7
lspci -nn |grep 82576
ip link show enp6s0f0
$ cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-5.15.0-39-generic root=UUID=20355b12-b4b2-4a30-b9e1-59fafe2d7633 ro transparent_hugepage=never hugepagesz=2M hugepages=128 default_hugepagesz=2M intel_iommu=pt intel_iommu=on pci=assign-busses mitigations=off nohpet nokaslr crashkernel=512M-:192M

#lxc launch faster:ubuntu/jammy i1
#lxc init faster:ubuntu/jammy i1
#lxc config device add i1 eth0 nic nictype=sriov parent=enp6s0f0
#lxc config device add i1 eth1 nic nictype=sriov parent=enp6s0f0
#lxc config device add i1 eth0 nic network="sriov0" name=eth0 hwaddr="da:da:9d:42:e5:f0"
lxc remote add faster https://mirrors.tuna.tsinghua.edu.cn/lxc-images/ --protocol=simplestreams --public
lxc init faster:ubuntu/jammy test1

#refer https://blog.csdn.net/quqi99/article/details/125004749
#lxd supports sriov now - https://github.com/lxc/lxd/pull/7678
lxc network create sriov0 --type=sriov parent=enp6s0f0
lxc launch faster:ubuntu/jammy test1
#Failed growing available VFs from 3 to 7 on device "enp6s0f0": write /sys/class/net/enp6s0f0/device/sriov_numvfs
lxc network attach sriov0 test1 eth1
lxc network attach sriov0 test1 eth2
lxc config device override i1 eth1 network=sriov0
lxc config device override i1 eth2 network=sriov0

lxc init faster:ubuntu/jammy i1
lxc config device add i1 eth0 nic nictype=sriov parent=enp6s0f0
lxc start i1
$ lxc start i1
Error: Failed to start device "eth0": All virtual functions on parent device "enp6s0f0" are already in use
Try `lxc info --show-log i1` for more info

sudo snap set lxd daemon.debug=true; sudo systemctl reload snap.lxd.daemon
sudo tail -f /var/snap/lxd/common/lxd/logs/lxd.log

That failed, so next I switched to KVM and created two KVM VMs, each using two SR-IOV VFs.

lspci | grep net
sudo virt-install --name=i1 --ram=4096 --vcpus=1 --hvm --virt-type=kvm \
    --connect=qemu:///system --os-variant=ubuntu20.04 --accelerate \
    --disk=/images/i1.qcow2,bus=virtio,format=qcow2,cache=none,sparse=true,size=8 \
    --network=bridge=virbr1,model=rtl8139 --nographics -v \
    --hostdev=07:10.0 --hostdev=07:10.1 \
    --location 'https://mirrors.cloud.tencent.com/ubuntu/dists/focal/main/installer-amd64/' --extra-args='console=ttyS0,115200n8 serial'
sudo virsh --connect qemu:///system console i1
arp -a && ssh hua@192.168.101.237 -v

sudo mkdir /mnt/rootfs && sudo chown -R $USER /mnt/rootfs/
sudo mount -o ro /images/iso/ubuntu-20.04-legacy-server-amd64.iso /mnt/rootfs
cd /mnt/rootfs
sudo virt-install --name=i2 --ram=4096 --vcpus=1 --hvm --virt-type=kvm \
    --connect=qemu:///system --os-variant=ubuntu20.04 --accelerate \
    --disk=/images/i2.qcow2,bus=virtio,format=qcow2,cache=none,sparse=true,size=8 \
    --network=bridge=virbr1,model=rtl8139 --nographics -v \
    --hostdev=07:10.2 --hostdev=07:10.3 \
    --cdrom /images/iso/ubuntu-20.04-legacy-server-amd64.iso \
    --boot kernel=./install/netboot/ubuntu-installer/amd64/linux,initrd=./install/netboot/ubuntu-installer/amd64/initrd.gz,kernel_args="console=ttyS0"

I then repeated the netplan bond + VLAN experiment from above, but even with macaddress set, the MACs all end up identical, as shown below:

3: enp6s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9e:fb:03:a7:84:3e brd ff:ff:ff:ff:ff:ff
4: enp7s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 5e:55:d1:be:15:7f brd ff:ff:ff:ff:ff:ff

cat << EOF |tee /etc/netplan/11-test.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp6s0:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 9e:fb:03:a7:84:3e
    enp7s0:
      addresses: []
      dhcp4: false
      dhcp6: false
      macaddress: 5e:55:d1:be:15:7f
  bonds:
    ptk0:
      addresses: []
      dhcp4: false
      dhcp6: false
      interfaces:
        - enp6s0
        - enp7s0
      parameters:
        mode: active-backup
        primary: enp6s0
  vlans:
    ptk0.700:
      id: 700
      link: ptk0
      dhcp4: no
      addresses: [ 10.139.160.10/24 ]
      nameservers:
        search: [ domain.local ]
        addresses: [ 8.8.8.8 ]
EOF
netplan apply


3: enp6s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ptk0 state UP group default qlen 1000
    link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff
4: enp7s0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc fq_codel master ptk0 state DOWN group default qlen 1000
    link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff
5: ptk0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff
6: ptk0.700@ptk0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff
    inet 10.139.160.10/24 brd 10.139.160.255 scope global ptk0.700
       valid_lft forever preferred_lft forever

$ ip link show enp6s0f0
3: enp6s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 2c:53:4a:02:20:3c brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
$ ip link show enp6s0f1
4: enp6s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 2c:53:4a:02:20:3d brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 3a:a3:8b:e2:4b:89 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
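
Identical MACs like the ones above can be spotted quickly with a small filter. The sketch below is illustrative: dup_macs is a hypothetical name and it expects the one-line-per-interface format of `ip -o link show`. Keep in mind that in active-backup mode with the default fail_over_mac setting (none), the bonding driver sets every slave to the bond's MAC, so the identical addresses seen here may simply be that default rather than a fault.

```shell
# Hypothetical helper: read `ip -o link show` output on stdin and print every
# MAC address that is used by more than one interface, followed by their names.
dup_macs() {
    awk '{
        name = $2; sub(/:$/, "", name)
        for (i = 3; i < NF; i++)
            if ($i == "link/ether") {
                mac = $(i + 1)
                users[mac] = users[mac] " " name
                count[mac]++
            }
    }
    END {
        for (m in count)
            if (count[m] > 1) print m ":" users[m]
    }'
}
```

Usage inside the VM: ip -o link show | dup_macs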

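With SR-IOV there is no bridge or tap on the host, so there is no need to set up a trunk with 'bridge vlan add vid 700 dev'. i1 (10.139.160.10) on vlan=700 can therefore ping i2 (10.139.160.11) directly.

Make the PF on the host a gateway on vlan=700 as well, so that i1 (10.139.160.10) can ping the gateway (10.139.160.254).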

sudo ip link add link enp6s0f0 name enp6s0f0.700 type vlan id 700
sudo ip addr add 10.139.160.254/24 dev enp6s0f0.700
sudo ifconfig enp6s0f0.700 up

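About trusting a specific VF (ip link set dev enp6s0f0 vf 0 trust on): only after the VF is trusted can the VM using it change its MAC address. As an experiment, I changed the MAC of ptk0.700 inside the VM from 3a:a3:8b:e2:4b:89 to aa:bb:cc:dd:ee:00 (ip link set dev ptk0.700 address aa:bb:cc:dd:ee:00). The MAC of the VF on the host was still 3a:a3:8b:e2:4b:89, so with trust off (sudo ip link set dev enp6s0f0 vf 0 trust off) the ping should in theory have stopped working. In practice it still worked, even after the MACs of ptk0 and enp6s0f0 were changed too.
About setting spoof checking off on a specific VF (ip link set dev enp6s0f0 vf 0 spoofchk off): this disables the SG anti-spoofing check.

In theory, if the VM has an active/passive bond, trust on and spoof checking off should be set. (In one test with these settings, after the MAC was changed to aa:bb:cc:dd:ee:00 inside the VM, the MAC of the VF on the host was automatically changed to aa:bb:cc:dd:ee:00 as well.) The original statement is quoted below: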

(1). SR-IOV bonding configurations inside guests with VLANs on the interfaces. For example, when the bond shifts from active slave to standby slave, the bond interface carries the MAC of the original active. This MAC needs to be configured down on the VF else all tx packets will be dropped due MAC spoof checking. This can be also achieved if we set fail_over_mac as active which changes the bond MAC on port switchover. But with VLANs on top of bond there will be issues if the bond MAC changes as the MAC of the VLAN interfaces will still have the old MACs.
(2). Currently only a list of 30 multicast addresses can be supported per VF. This restricts the number of IPv6 IPs which can be used/interfaces, as for each IP there will be a different multicast MAC allocated by the kernel. This in turn also restricts the number of VLAN than can created while using IPv6.

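The VLAN experiment and the related tcpdump output are as follows:

1, Because enp6s0f0.700 was created on enp6s0f0 (comparable in effect to running ip link set enp6s0f0 vf 0 vlan 700 qos 2, although that command tags VF 0's traffic on the host side instead of creating a sub-interface), enp6s0f0 is automatically set up as a trunk. When (ping -I ptk0.700 10.139.160.254 -c1) is run in i1, the capture on enp6s0f0 carries VLAN information: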
$ sudo tcpdump -i enp6s0f0 -nn -e -l "(arp or icmp)"
11:19:15.133438 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype 802.1Q (0x8100), length 102: vlan 700, p 0, ethertype IPv4 (0x0800), 10.139.160.10 > 10.139.160.254: ICMP echo request, id 15, seq 1, length 64
11:19:15.133518 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype 802.1Q (0x8100), length 102: vlan 700, p 0, ethertype IPv4 (0x0800), 10.139.160.254 > 10.139.160.10: ICMP echo reply, id 15, seq 1, length 64
11:19:20.319191 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype 802.1Q (0x8100), length 46: vlan 700, p 0, ethertype ARP (0x0806), Request who-has 10.139.160.10 tell 10.139.160.254, length 28
11:19:20.319312 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype 802.1Q (0x8100), length 64: vlan 700, p 0, ethertype ARP (0x0806), Reply 10.139.160.10 is-at aa:bb:cc:dd:ee:00, length 46
11:19:20.387683 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype 802.1Q (0x8100), length 64: vlan 700, p 0, ethertype ARP (0x0806), Request who-has 10.139.160.254 tell 10.139.160.10, length 46
11:19:20.387692 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype 802.1Q (0x8100), length 46: vlan 700, p 0, ethertype ARP (0x0806), Reply 10.139.160.254 is-at 2c:53:4a:02:20:3c, length 28

2, On enp6s0f0.700, however, no VLAN information is visible (again running ping 10.139.160.254 -c1 in i1):
$ sudo tcpdump -i enp6s0f0.700 -nn -e -l "(arp or icmp)"
11:23:44.639393 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype IPv4 (0x0800), length 98: 10.139.160.10 > 10.139.160.254: ICMP echo request, id 16, seq 1, length 64
11:23:44.639448 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype IPv4 (0x0800), length 98: 10.139.160.254 > 10.139.160.10: ICMP echo reply, id 16, seq 1, length 64
11:23:49.701228 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype ARP (0x0806), length 60: Request who-has 10.139.160.254 tell 10.139.160.10, length 46
11:23:49.701245 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype ARP (0x0806), length 42: Reply 10.139.160.254 is-at 2c:53:4a:02:20:3c, length 28
11:23:49.887183 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype ARP (0x0806), length 42: Request who-has 10.139.160.10 tell 10.139.160.254, length 28
11:23:49.887297 aa:bb:cc:dd:ee:00 > 2c:53:4a:02:20:3c, ethertype ARP (0x0806), length 60: Reply 10.139.160.10 is-at aa:bb:cc:dd:ee:00, length 46

3, In i1, ping cannot be run on the bond (ptk0), the active NIC (enp6s0), or the standby NIC (enp7s0) (e.g. ping -I enp6s0 10.139.160.254 -c1).

4, Likewise, arping works on ptk0.700 in i1:

root@i1:/home/hua# arping -I ptk0.700 10.139.160.254 -S 10.139.160.10 -C1
ARPING 10.139.160.254
42 bytes from 2c:53:4a:02:20:3c (10.139.160.254): index=0 time=129.264 usec

11:28:52.927114 aa:bb:cc:dd:ee:00 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.139.160.254 tell 10.139.160.10, length 46
11:28:52.927127 2c:53:4a:02:20:3c > aa:bb:cc:dd:ee:00, ethertype ARP (0x0806), length 42: Reply 10.139.160.254 is-at 2c:53:4a:02:20:3c, length 28

5, In i1, arping can be run on the bond (ptk0) and the active NIC (enp6s0) (e.g. arping -I enp6s0 10.139.160.254 -S 10.139.160.10 -C1), although tcpdump shows nothing on enp6s0f0.700; it cannot be run on the standby NIC (arping -I enp7s0 10.139.160.254 -S 10.139.160.10 -C1).

#In this case (tcpdump -i enp6s0f0.700 -nn -e -l "(arp or icmp)") shows no output
root@i1:/home/hua# arping -I enp6s0 10.139.160.254 -S 10.139.160.10 -C1
ARPING 10.139.160.254
60 bytes from da:0d:a3:9d:5e:dd (10.139.160.254): index=0 time=266.707 usec
60 bytes from 28:d2:44:52:31:1d (10.139.160.254): index=1 time=286.628 usec
#But tcpdump -i enp6s0f0 -nn -e -l "(arp or icmp)" shows the following (only the ARP request, no ARP reply)
13:50:10.436783 aa:bb:cc:dd:ee:00 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.139.160.254 tell 10.139.160.10, length 46

#In this case arping gets no reply, and (tcpdump -i enp6s0f0.700 -nn -e -l "(arp or icmp)") shows no output either
root@i1:/home/hua# arping -I enp7s0 10.139.160.254 -S 10.139.160.10 -C1
ARPING 10.139.160.254
Timeout
Timeout

reference

[1] LACP bond configuration - https://blog.csdn.net/quqi99/article/details/51251210
[2] Three ways to use VLAN - https://blog.csdn.net/quqi99/article/details/51218884
[3] creating vlan over openstack - https://blog.csdn.net/quqi99/article/details/118341936
[4] VLAN filter support on bridge - https://developers.redhat.com/blog/2017/09/14/vlan-filter-support-on-bridge#
