nova VirtualInterfaceCreateException (by quqi99)

2023-05-16

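Author: Zhang Hua (张华)  Published: 2022-09-01
Copyright notice: This article may be reproduced freely, provided that the original source, the author information, and this copyright notice are indicated with a hyperlink.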

Problem

VMs sometimes report the following error:

nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
...
2022-07-19 13:50:32.084 147039 WARNING nova.virt.libvirt.driver [req-7b6da117-d40c-4bac-9f82-50c4266e1617 66d3188e9f24466f8d9c3905f178d12a ca6332100f1d42e4aa94aa2c37f243e4 - d476d9f579154d49961c29942a76d1c0 d476d9f579154d49961c29942a76d1c0] [instance: 6fb0fc5d-4fa3-435c-9f60-b9953d11adb7] Timeout waiting for [('network-vif-plugged', '864b25d9-276b-4644-916a-7637762b93e2')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds

Initial analysis

A VM without the problem has the following log entries:

Provisioning for port 2e96ade5-3a66-4045-b631-597995c07d5b completed by entity DHCP. provisioning_complete /usr/lib/python3/dist-packages/neutron/db/provisioning_blocks.py:133
Provisioning for port 2e96ade5-3a66-4045-b631-597995c07d5b completed by entity L2. provisioning_complete /usr/lib/python3/dist-packages/neutron/db/provisioning_blocks.py:133

A VM with the problem has only the following log entry; compared with the above, the 'completed by entity L2. provisioning_complete' line is missing.

For every failed VM (due to the vif-plugged event timeout), the neutron-server log contains only the following:
Provisioning for port 864b25d9-276b-4644-916a-7637762b93e2 completed by entity DHCP. provisioning_complete /usr/lib/python3/dist-packages/neutron/db/provisioning_blocks.py:133

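Because the L2 entity never completes, the port is not set to ACTIVE, as the log line below shows (see https://docs.openstack.org/neutron/latest/contributor/internals/provisioning_blocks.html), and this leads to the timeout reported on the nova side.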

Transition to ACTIVE for port object 864b25d9-276b-4644-916a-7637762b93e2 will not be triggered until provisioned by entity L2. add_provisioning_component /usr/lib/python3/dist-packages/neutron/db/provisioning_blocks.py:73
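
The gating behaviour visible in these log lines can be illustrated with a toy model. This is only a sketch, not neutron's implementation: the class `ToyProvisioningBlocks` and its simplified signatures are assumptions made for illustration; only the method names `add_provisioning_component` and `provisioning_complete` are taken from the log lines above.

```python
class ToyProvisioningBlocks:
    """Toy model (not neutron code): a port becomes ACTIVE only after every
    entity that registered a provisioning block has reported completion."""

    def __init__(self):
        self.pending = {}  # port_id -> set of entities still blocking the port
        self.status = {}   # port_id -> "DOWN" or "ACTIVE"

    def add_provisioning_component(self, port_id, entity):
        # Register an entity (e.g. "DHCP" or "L2") that must finish first.
        self.pending.setdefault(port_id, set()).add(entity)
        self.status[port_id] = "DOWN"

    def provisioning_complete(self, port_id, entity):
        # Drop the block; flip to ACTIVE only when no blocks remain.
        if port_id not in self.pending:
            return
        self.pending[port_id].discard(entity)
        if not self.pending[port_id]:
            self.status[port_id] = "ACTIVE"


blocks = ToyProvisioningBlocks()
port = "864b25d9-276b-4644-916a-7637762b93e2"
blocks.add_provisioning_component(port, "DHCP")
blocks.add_provisioning_component(port, "L2")
blocks.provisioning_complete(port, "DHCP")
print(blocks.status[port])  # still DOWN: the L2 entity has not reported yet
```

In this model, as in the failing case, the DHCP completion alone is not enough; the port stays DOWN until the L2 agent reports, and nova keeps waiting for network-vif-plugged in the meantime.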

neutron-l2-agent sometimes takes about 47 minutes to complete a single loop iteration.

2022-08-04 19:24:53.758 95691 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-5184b270-3ac4-4777-a974-8d629599cd88 - - - - -] Agent rpc_loop - iteration:2935733 started
2022-08-04 20:11:40.496 95691 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-5184b270-3ac4-4777-a974-8d629599cd88 - - - - -] Agent rpc_loop - iteration:2935733 completed. Processed ports statistics: {'regular': {'added': 0, 'updated': 2, 'removed': 2}}. Elapsed:2806.738

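The following script lists the loop times of neutron-l2-agent from all rotated logs (only part of the output is pasted below; it shows that an iteration sometimes takes a very long time):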

$ for i in `seq 2 22`; do echo "neutron-openvswitch-agent.log.$i.gz: " ; zgrep "iteration.* completed" neutron-openvswitch-agent.log.$i.gz | awk '{ print $(NF) }' | cut -d ":" -f2 | sort -gr | head -n 6; done
neutron-openvswitch-agent.log.2.gz:
...
neutron-openvswitch-agent.log.5.gz:
2806.738
2803.673
1410.152
1410.006
925.405
925.316
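
The same extraction can be sketched in Python, which also keeps the iteration number next to each duration. This is a minimal sketch: the function name `slow_iterations` and the regular expression are assumptions written for this article, and the 300-second threshold is simply the nova timeout seen in the log above.

```python
import re

# Matches the "iteration:<n> completed ... Elapsed:<seconds>" agent log lines.
LOOP_RE = re.compile(r"iteration:(\d+) completed\..*Elapsed:([0-9]+(?:\.[0-9]+)?)")

NOVA_VIF_TIMEOUT = 300.0  # seconds, from "eventlet.timeout.Timeout: 300 seconds"


def slow_iterations(lines, threshold=NOVA_VIF_TIMEOUT):
    """Return (iteration, elapsed_seconds) pairs at or above threshold, slowest first."""
    found = []
    for line in lines:
        match = LOOP_RE.search(line)
        if match and float(match.group(2)) >= threshold:
            found.append((int(match.group(1)), float(match.group(2))))
    return sorted(found, key=lambda item: item[1], reverse=True)


sample = [
    "2022-08-04 20:11:40.496 95691 INFO "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent "
    "[req-5184b270-3ac4-4777-a974-8d629599cd88 - - - - -] "
    "Agent rpc_loop - iteration:2935733 completed. Processed ports statistics: "
    "{'regular': {'added': 0, 'updated': 2, 'removed': 2}}. Elapsed:2806.738",
]
print(slow_iterations(sample))
```

Any iteration that runs longer than the nova timeout is a candidate for causing VirtualInterfaceCreateException, because the agent cannot report the port as provisioned while it is stuck in the loop.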

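The reported error also appears throughout the window from 2022-08-04 19:24:53 to 2022-08-04 20:11:40 shown above, which supports the conclusion that the error is caused by this slow loop.

$ zgrep "Timeout waiting for \[('network-vif-plugged'" nova-compute.log.5.gz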
2022-08-04 19:30:54.076 147039 WARNING nova.virt.libvirt.driver [req-acfb3bd2-565f-48e7-91e8-f109446895c1 f7ca0fcc30cf47c0b60b260bab9dd7bd e14e421176884fb492a71e1c9be13fa3 - 6113214bdbed491c880c23149c25b7cb 6113214bdbed491c880c23149c25b7cb] [instance: 94d4ab46-799d-44fc-9991-9a57ec18e950] Timeout waiting for [('network-vif-plugged', '43b7f269-0eae-44c4-990f-53c98743e12b')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds
...
2022-08-04 20:10:06.211 147039 WARNING nova.virt.libvirt.driver [req-5b035412-a797-453a-b3e8-098dead294dd f7ca0fcc30cf47c0b60b260bab9dd7bd e14e421176884fb492a71e1c9be13fa3 - 6113214bdbed491c880c23149c25b7cb 6113214bdbed491c880c23149c25b7cb] [instance: 19da6ffc-dcc1-4620-b3b2-2018a3355b8b] Timeout waiting for [('network-vif-plugged', '71190281-53ce-4558-82cb-0240e71b9245')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds
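
The correlation can also be checked mechanically. The sketch below uses only timestamps that appear in the logs above; the helper `wait_started_inside_loop` is a hypothetical name, and it assumes that the wait for network-vif-plugged began exactly 300 seconds before the timeout warning was logged.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"
VIF_TIMEOUT = timedelta(seconds=300)  # from "eventlet.timeout.Timeout: 300 seconds"


def wait_started_inside_loop(timeout_ts, loop_start, loop_end):
    """True if the vif-plugged wait began while the agent loop was running."""
    # Assumption: the wait started exactly VIF_TIMEOUT before the warning.
    started = datetime.strptime(timeout_ts, FMT) - VIF_TIMEOUT
    return datetime.strptime(loop_start, FMT) <= started <= datetime.strptime(loop_end, FMT)


# Start and end of iteration 2935733, from the agent log.
loop = ("2022-08-04 19:24:53.758", "2022-08-04 20:11:40.496")
# First and last timeout warnings, from nova-compute.log.5.gz.
timeouts = ["2022-08-04 19:30:54.076", "2022-08-04 20:10:06.211"]
print([wait_started_inside_loop(ts, *loop) for ts in timeouts])
```

Both waits began after the slow iteration had started and before it finished, which is consistent with the agent being unable to process the new ports during that iteration.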

Further analysis - focusing on the neutron-l2-agent side

What was neutron-l2-agent doing during this time? It was updating SGs.

2022-07-28 21:10:34.988 95691 INFO neutron.agent.securitygroups_rpc [req-0be98a3b-1674-486c-aeb4-67d49e413757 f7ca0fcc30cf47c0b60b260bab9dd7bd e14e421176884fb492a71e1c9be13fa3 - - -] Security group member updated {'1f4da925-7e78-4e7c-ac07-8eae09fc2fda', '4de2b2bf-5302-4cf4-96de-949ea78ca8eb'}

The neutron DB was obtained as follows.

pass=`juju run --unit mysql/0 'leader-get root-password'`
juju run --unit mysql/0 "mysqldump -u root --password=$pass --single-transaction --skip-lock-tables --set-gtid-purged=OFF --databases neutron --quick --result-file=/tmp/neutron.sql"

The analysis shows that, with 979 active LBs, one SG has as many as 1994 active ports (this is security group 4d15a8fa-3adc-4750-87da-c3d26902dbca on network 4a57f171-1e6b-4ae4-9ec5-2203144817b7 (lb-mgmt-net)):

select sgb.security_group_id, p.network_id, n.name, count(sgb.port_id) from securitygroupportbindings sgb join ports p on p.id=sgb.port_id join networks n on n.id=p.network_id group by 1,2 order by 4 desc limit 10;
+--------------------------------------+--------------------------------------+-------------------+--------------------+
| security_group_id                    | network_id                           | name              | count(sgb.port_id) |
+--------------------------------------+--------------------------------------+-------------------+--------------------+
| 4d15a8fa-3adc-4750-87da-c3d26902dbca | 4a57f171-1e6b-4ae4-9ec5-2203144817b7 | lb-mgmt-net       |               1994 |
| 7f9e022e-f12d-4526-8e11-b09f5ed2e0e5 | 4a8c719c-04e7-490c-b7d2-b99e79e7e79f | network_lb_az3    |                241 |

select count(id) from load_balancer where provisioning_status='ACTIVE';
+-----------+
| count(id) |
+-----------+
|       979 |
+-----------+
...
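
To see what the first query computes without access to a real neutron DB, it can be run against a tiny in-memory SQLite database. This is a hedged sketch: the three tables are reduced to the columns the query touches, all IDs (`sg-1`, `net-a`, `p1`, ...) are made up, and the GROUP BY is spelled out with column names instead of positions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy schema: only the columns used by the query, with made-up IDs.
CREATE TABLE networks (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE ports (id TEXT PRIMARY KEY, network_id TEXT);
CREATE TABLE securitygroupportbindings (port_id TEXT, security_group_id TEXT);
INSERT INTO networks VALUES ('net-a', 'lb-mgmt-net'), ('net-b', 'private');
INSERT INTO ports VALUES ('p1', 'net-a'), ('p2', 'net-a'), ('p3', 'net-b');
INSERT INTO securitygroupportbindings VALUES
    ('p1', 'sg-1'), ('p2', 'sg-1'), ('p3', 'sg-2');
""")

# Count the ports bound to each (security group, network) pair, largest first.
rows = conn.execute("""
SELECT sgb.security_group_id, p.network_id, n.name,
       COUNT(sgb.port_id) AS port_count
FROM securitygroupportbindings sgb
JOIN ports p ON p.id = sgb.port_id
JOIN networks n ON n.id = p.network_id
GROUP BY sgb.security_group_id, p.network_id, n.name
ORDER BY port_count DESC
LIMIT 10
""").fetchall()
print(rows)
```

The top row is the security group shared by the most ports on one network, which in the customer's data is the lb-mgmt-net group with 1994 ports.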

The relevant neutron agent code is as follows:

#https://github.com/openstack/neutron/blob/stable/ussuri/neutron/agent/securitygroups_rpc.py#L214
    def security_groups_member_updated(self, security_groups):
        LOG.info("Security group "
                 "member updated %r", security_groups)
        self._security_group_updated(
            security_groups,
            'security_group_source_groups',
            'sg_member')
            
#https://github.com/openstack/neutron/blob/stable/ussuri/neutron/agent/securitygroups_rpc.py#L206
    def security_groups_rule_updated(self, security_groups):
        LOG.info("Security group "
                 "rule updated %r", security_groups)
        self._security_group_updated(
            security_groups,
            'security_groups',
            'sg_rule')

This is how the FW RPC mechanism works:
https://github.com/openstack/neutron/blob/stable/ussuri/doc/source/contributor/internals/openvswitch_firewall.rst#firewall-api-calls
https://github.com/openstack/neutron/blob/stable/ussuri/doc/source/contributor/internals/rpc_callbacks.rst

There are two main calls performed by the firewall driver in order to either create or update a port with security groups - prepare_port_filter and update_port_filter. Both methods rely on the security group objects that are already defined in the driver and work similarly to their iptables counterparts. The definition of the objects will be described later in this document. prepare_port_filter must be called only once during port creation, and it defines the initial rules for the port. When the port is updated, all filtering rules are removed, and new rules are generated based on the available information about security groups in the driver.

Security group rules can be defined in the firewall driver by calling update_security_group_rules, which rewrites all the rules for a given security group. If a remote security group is changed, then update_security_group_members is called to determine the set of IP addresses that should be allowed for this remote security group. Calling this method will not have any effect on existing instance ports. In other words, if the port is using security groups and its rules are changed by calling one of the above methods, then no new rules are generated for this port. update_port_filter must be called for the changes to take effect.

All the machinery above is controlled by security group RPC methods, which mean the firewall driver doesn't have any logic of which port should be updated based on the provided changes, it only accomplishes actions when called from the controller.

SG background review

The customer uses openvswitch rather than iptables as the FW driver, so whether ipset is used is irrelevant here. Earlier blog posts related to SGs and flows:

  • OpenStack Security - https://zhhuabj.blog.csdn.net/article/details/78435072
  • Debug OpenvSwitch - https://blog.csdn.net/quqi99/article/details/111831695
    That post mentions lp bug 1907491 (https://bugs.launchpad.net/neutron/+bug/1907491). It describes the following problem: when an SG rule with a remote SG is created for a VM, the conjunctive flows that match the remote-group's member IPs are created, but when a fixed-ip port is deleted, these conjunctive flows are not deleted (the patch (https://review.opendev.org/c/openstack/neutron/+/766775/1/neutron/agent/linux/openvswitch_firewall/firewall.py) matches the flows with ovs_lib.COOKIE_ANY so that they can be found and deleted).
# Run the following commands as demo user
# Create a server so get an active port associated with the default security group
$ openstack server list
+--------------------------------------+---------+--------+----------------------------------------------------------------------+--------------------------+----------+
| ID                                   | Name    | Status | Networks                                                             | Image                    | Flavor   |
+--------------------------------------+---------+--------+----------------------------------------------------------------------+--------------------------+----------+
| a07caed7-6cff-4e7c-bf6b-eef572934e55 | test-vm | ACTIVE | private=10.0.0.8, fdfd:3244:7253:0:f816:3eff:fe5c:4498, 192.168.0.83 | cirros-0.5.1-x86_64-disk | m1.small |
+--------------------------------------+---------+--------+----------------------------------------------------------------------+--------------------------+----------+

# Create ingress rules using remote-ip and default group as the remote-group
$ openstack security group rule create --remote-ip 8.8.8.8/32 --proto tcp --dst-port 80 default
$ openstack security group rule create --remote-group default --proto tcp --dst-port 443 default

$ openstack security group rule list
+--------------------------------------+-------------+-----------+------------+------------+-----------+--------------------------------------+----------------------+--------------------------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group                | Remote Address Group | Security Group                       |
+--------------------------------------+-------------+-----------+------------+------------+-----------+--------------------------------------+----------------------+--------------------------------------+
| 349fc8a0-ee67-4ab3-98c9-87091682fca2 | tcp         | IPv4      | 8.8.8.8/32 | 80:80      | ingress   | None                                 | None                 | 9508b291-181b-43ea-9635-dc293c0a2399 |
| 66bba154-a035-40a7-86f6-ddfbf772526b | tcp         | IPv4      | 0.0.0.0/0  | 443:443    | ingress   | 9508b291-181b-43ea-9635-dc293c0a2399 | None                 | 9508b291-181b-43ea-9635-dc293c0a2399 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+--------------------------------------+----------------------+--------------------------------------+

# Create another port with fixed ip and associates with the default security group
$ openstack port create --network private --fixed-ip subnet=9d50b062-8699-4fc3-a250-1c0b4147357a test-port
$ openstack port show 7271f67c-cfac-42bc-abc1-388e8e4db3ab -c fixed_ips -c security_group_ids
+--------------------+-----------------------------------------------------------------------------------------------------+
| Field              | Value                                                                                               |
+--------------------+-----------------------------------------------------------------------------------------------------+
| fixed_ips          | ip_address='10.0.0.40', subnet_id='9d50b062-8699-4fc3-a250-1c0b4147357a'                            |
|                    | ip_address='fdfd:3244:7253:0:f816:3eff:fef3:ee1e', subnet_id='abfcc07a-6a61-4e09-8232-76d162f2b341' |
| security_group_ids | 9508b291-181b-43ea-9635-dc293c0a2399                                                                |
+--------------------+-----------------------------------------------------------------------------------------------------+

# verify the flows are generated for the ips:
$ sudo ovs-ofctl dump-flows br-int | grep "8.8.8.8"
 cookie=0x2677522778c1e14b, duration=324.328s, table=82, n_packets=0, n_bytes=0, idle_age=12295, priority=77,ct_state=+est-rel-rpl,tcp,reg5=0x16,nw_src=8.8.8.8,tp_dst=80 actions=output:22
 cookie=0x2677522778c1e14b, duration=324.328s, table=82, n_packets=0, n_bytes=0, idle_age=12295, priority=77,ct_state=+new-est,tcp,reg5=0x16,nw_src=8.8.8.8,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),output:22,resubmit(,92)

$ sudo ovs-ofctl dump-flows br-int | grep "10.0.0.40"
 cookie=0xde9c1cb2e3da8a74, duration=65.219s, table=82, n_packets=0, n_bytes=0, idle_age=65, priority=73,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.40 actions=conjunction(38,1/2)
 cookie=0xde9c1cb2e3da8a74, duration=65.219s, table=82, n_packets=0, n_bytes=0, idle_age=65, priority=73,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.40 actions=conjunction(39,1/2)

# Delete the remote-ip rule and the flows are gone:
$ openstack security group rule delete 349fc8a0-ee67-4ab3-98c9-87091682fca2
$ sudo ovs-ofctl dump-flows br-int | grep "8.8.8.8"
(Empty)

# Unset the fixed-ip from the separate port and flows are not deleted:
$ openstack port unset --fixed-ip ip-address='10.0.0.40',subnet=9d50b062-8699-4fc3-a250-1c0b4147357a 7271f67c-cfac-42bc-abc1-388e8e4db3ab
$ sudo ovs-ofctl dump-flows br-int | grep "10.0.0.40"
 cookie=0xde9c1cb2e3da8a74, duration=65.219s, table=82, n_packets=0, n_bytes=0, idle_age=65, priority=73,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.40 actions=conjunction(38,1/2)
 cookie=0xde9c1cb2e3da8a74, duration=65.219s, table=82, n_packets=0, n_bytes=0, idle_age=65, priority=73,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.40 actions=conjunction(39,1/2)

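Bugs that may have been triggered

The fix for lp bug 1907491 above uses execve to call ovs-ofctl when deleting conjunctive flows. If there are many ports, there are many fixed IPs and therefore many rules for the remote SG, so the deletion will probably take a long time. Hence lp bug 1975674 (https://bugs.launchpad.net/neutron/+bug/1975674), which deletes the conjunctive flows in batches, i.e.: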

self._delete_flows(deferred=False, **flow)

has to be changed to:

self._delete_flows(**flow)
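
Why dropping deferred=False matters can be illustrated with a toy counter. This is not neutron code: `ToyBridge` is a hypothetical stand-in, and it assumes that a non-deferred delete costs one external ovs-ofctl invocation per flow while deferred deletes are flushed together in a single invocation. The figure of 2000 members follows the example in the bug description.

```python
class ToyBridge:
    """Hypothetical stand-in for an OVS bridge wrapper (not neutron code).
    It only counts how many external ovs-ofctl invocations would be made."""

    def __init__(self):
        self.ofctl_calls = 0
        self._queued = []

    def delete_flows(self, deferred=True, **match):
        if deferred:
            self._queued.append(match)  # queue now, flush later in one batch
        else:
            self.ofctl_calls += 1       # assumed: one process spawn per flow

    def apply_flows(self):
        if self._queued:
            self.ofctl_calls += 1       # assumed: one spawn for the whole batch
            self._queued.clear()


# 2000 made-up member IPs of a remote security group.
member_ips = ["10.0.%d.%d" % (i // 250, i % 250 + 1) for i in range(2000)]

immediate = ToyBridge()
for ip in member_ips:
    immediate.delete_flows(deferred=False, nw_src=ip)

batched = ToyBridge()
for ip in member_ips:
    batched.delete_flows(nw_src=ip)
batched.apply_flows()

print(immediate.ofctl_calls, batched.ofctl_calls)
```

Under these assumptions the number of process spawns grows linearly with the number of remote-group members in the non-deferred case, which would explain an rpc_loop iteration that takes minutes instead of seconds.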

This bug looks like a very likely cause. The customer is running focal (neutron-common=2:16.4.2-0ubuntu1), i.e. ussuri; the patch is in stable/ussuri but not in 16.4.2 (no tag contains the commit):

hua@t440p:/bak/openstack/neutron$ git tag --contains 30ef996f8aa0b0bc57a280690871f1081946ffee
hua@t440p:/bak/openstack/neutron$ git branch -r --contains 30ef996f8aa0b0bc57a280690871f1081946ffee
  origin/stable/ussuri

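Another very likely lp bug is https://bugs.launchpad.net/neutron/+bug/1813703, which deals with a more general problem such as timeouts between the RPC client and server.

Reproducing the problem

The problem was reproduced as follows. First, quickly set up an OpenStack environment: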

./generate-bundle.sh --name focal --series focal --num-compute 1  --use-stable-charms --run
./tools/vault-unseal-and-authorise.sh
./configure
source novarc

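Then, in the same way that octavia's o-hm0 is faked, no VM is created: a port is only created with 'neutron port-create' and then plugged by hand with 'ovs-vsctl add-port', so that flows are also created for these ports according to the server-side data.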

1, create 300 ports binding to one host

#openstack security group rule create --remote-ip 8.8.8.8/32 --proto tcp --dst-port 80 default
#openstack security group rule create --remote-group default --proto tcp --dst-port 443 default
#openstack security group rule list
HOST='juju-43efce-focal-9.cloud.sts'
NETWORK_ID=$(openstack network show private -cid -fvalue)
PROJECT_ID=$(openstack project show --domain admin_domain admin -f value -c id)
SECGRP_ID=$(openstack security group list --project ${PROJECT_ID} | awk '/default/ {print $2}')
openstack quota set $PROJECT_ID --ram 262144 --cores 200 --instances 100 --secgroup-rules 2000 --secgroups 200  --ports 500 --gigabytes 2000 --volumes 60
for i in {1..300}
do
	neutron port-create --name test-large-scale-port-$i \
	  --security-group $SECGRP_ID \
	  --device-owner testing:scale \
	  --binding:host_id=$HOST $NETWORK_ID
done

2, create 500 - 1000 security-group rules to group <security_group_id>

#https://specs.openstack.org/openstack/neutron-specs/specs/victoria/address-groups-support-in-security-group-rule.html
NEUTRON_IP=$(juju status neutron-api/0 |awk '/ACTIVE/ {print $3}')
SCHEMA="http"
export AUTH_TOKEN="$(openstack token issue -c id -f value)"
for i in {3000..4000}
do
cat << EOF |tee tmp.json
{
  "security_group_rule": {
    "direction": "ingress",
    "protocol": "tcp",
    "ethertype": "IPv4",
    "port_range_min": "3000",
    "port_range_max": "$i",
    "security_group_id": "$SECGRP_ID"
  }
}
EOF
curl -g -i -X POST ${SCHEMA}://${NEUTRON_IP}:9696/v2.0/security-group-rules \
  -H "User-Agent: python-neutronclient" -H "Content-Type: application/json" \
  -H "Accept: application/json" -H "X-Auth-Token: ${AUTH_TOKEN}" \
  -d @tmp.json
done

3, Run the following commands on compute host to setup the port to the host <compute_node_host_name>

#openstack port list -fvalue |grep test-large-scale-port > port_list && juju scp ./port_list nova-compute/0:~
for p in `cat port_list |awk '{print $1}'`
do
    mac=`grep $p port_list |awk '{print $3}'`
    ip_addr=`grep $p port_list |awk '{print $7}' |awk -F\' '{print $2}'`
    dev_id=`echo $p |cut -b 1-11`
    dev_name="tp-$dev_id"
    echo "========"$mac"======"$ip_addr"======"$dev_id"======="$dev_name
    ovs-vsctl  --may-exist add-port br-int ${dev_name} -- set Interface ${dev_name} type=internal \
    -- set Interface ${dev_name} external-ids:attached-mac="${mac}" \
    -- set Interface ${dev_name} external-ids:iface-id="${p}" \
    -- set Interface ${dev_name} external-ids:iface-status=active
    sleep 0.2
    ip link set dev ${dev_name} address ${mac}
    ip addr add ${ip_addr} dev ${dev_name}
    ip link set ${dev_name} up
done

4, flow verify

# ovs-ofctl -O OpenFlow13 dump-flows br-int |grep conjunction |grep 192 |wc -l
504
# ovs-ofctl -O OpenFlow13 dump-flows br-int |grep conjunction |grep 192 |head -n1
 cookie=0x556a0a63a181157b, duration=54.344s, table=82, n_packets=0, n_bytes=0, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=192.168.21.177 actions=conjunction(8,1/2)
# ovs-ofctl -O OpenFlow13 dump-flows br-int |grep '192.168.21.177'
 cookie=0x556a0a63a181157b, duration=80.843s, table=82, n_packets=0, n_bytes=0, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=192.168.21.177 actions=conjunction(8,1/2)
 cookie=0x556a0a63a181157b, duration=80.844s, table=82, n_packets=0, n_bytes=0, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=192.168.21.177 actions=conjunction(9,1/2)
# ovs-ofctl -O OpenFlow13 dump-flows br-int |grep 'conjunction(8,2/2)' |head -n1
 cookie=0x556a0a63a181157b, duration=220.995s, table=82, n_packets=0, n_bytes=0, priority=70,ct_state=+est-rel-rpl,ip,reg5=0x8 actions=conjunction(8,2/2)

5, agent verify, the nova-compute node is down

$ juju status nova-compute/0 |grep down
9        down   10.5.1.45  8b0ba05a-0438-46d0-a523-4000c30054ac  focal   nova  ACTIVE

#underlying
$ source ~/novarc
$ nova list |grep juju-43efce-focal-9
| 8b0ba05a-0438-46d0-a523-4000c30054ac | juju-43efce-focal-9      | ACTIVE  | -          | Running     | zhhuabj_admin_net=10.5.1.45              |

$ openstack network agent list
+--------------------------------------+--------------------+-------------------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host                          | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+-------------------------------+-------------------+-------+-------+---------------------------+
| 11f1ed0c-1bfc-4155-ad06-af96e78f53fe | L3 agent           | juju-43efce-focal-7           | nova              | :-)   | UP    | neutron-l3-agent          |
| 282206fe-2ee0-4946-9120-5792d65b5802 | DHCP agent         | juju-43efce-focal-7           | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 34bdf0f7-fd63-4ef9-b499-3c9a4e427f7b | DHCP agent         | juju-43efce-focal-9.cloud.sts | nova              | XXX   | UP    | neutron-dhcp-agent        |
| 632c482d-94ff-4edd-a318-cefb43be12b0 | Metadata agent     | juju-43efce-focal-9.cloud.sts | None              | XXX   | UP    | neutron-metadata-agent    |
| 6c7fcd2b-9be5-4316-ae29-87767e0f348a | Open vSwitch agent | juju-43efce-focal-7           | None              | :-)   | UP    | neutron-openvswitch-agent |
| 96e0d8ba-f371-4b98-ba0f-0da0218531ee | Metadata agent     | juju-43efce-focal-7           | None              | :-)   | UP    | neutron-metadata-agent    |
| cff16c66-0ebf-488d-ad8b-a6bb9035e582 | Open vSwitch agent | juju-43efce-focal-9.cloud.sts | None              | XXX   | UP    | neutron-openvswitch-agent |
| f08b55d1-e177-4598-aaaf-0426ee5f1368 | Metering agent     | juju-43efce-focal-7           | None              | :-)   | UP    | neutron-metering-agent    |
+--------------------------------------+--------------------+-------------------------------+-------------------+-------+-------+---------------------------+


6, only after hard-rebooting the compute node (nova reboot --hard juju-43efce-focal-9) was it possible to log in again, and the following was seen:

root@juju-43efce-focal-9:/home/ubuntu# zgrep "iteration.* completed" /var/log/neutron/neutron-openvswitch-agent.log* | awk '{ print $(NF) }' | cut -d ":" -f2 | sort -gr | head -n 6
191.011
2.566
1.528
0.856
0.166
0.156

Testing the patch

Modify the source code directly:

#https://opendev.org/openstack/neutron/commit/30ef996f8aa0b0bc57a280690871f1081946ffee
#vim /usr/lib/python3/dist-packages/neutron/agent/linux/openvswitch_firewall/firewall.py
systemctl restart neutron-openvswitch-agent

Then re-run:

for p in `cat port_list |awk '{print $1}'`
do
    mac=`grep $p port_list |awk '{print $3}'`
    ip_addr=`grep $p port_list |awk '{print $7}' |awk -F\' '{print $2}'`
    dev_id=`echo $p |cut -b 1-11`
    dev_name="tp-$dev_id"
    echo "========"$mac"======"$ip_addr"======"$dev_id"======="$dev_name
    ovs-vsctl  --may-exist add-port br-int ${dev_name} -- set Interface ${dev_name} type=internal \
    -- set Interface ${dev_name} external-ids:attached-mac="${mac}" \
    -- set Interface ${dev_name} external-ids:iface-id="${p}" \
    -- set Interface ${dev_name} external-ids:iface-status=active
    sleep 0.2
    ip link set dev ${dev_name} address ${mac}
    ip addr add ${ip_addr} dev ${dev_name}
    ip link set ${dev_name} up
done

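This time the compute node did not go down, but a VM could not be created normally with "./tools/instance_launch.sh 1 focal" (no IPs were left, so a port had to be deleted first before a VM could be created: openstack port delete test-large-scale-port-103). At this point, neutron-server.log again showed that the dhcp-agent was DOWN.

Trying another way to reproduce

The method above did not reproduce the problem, so the following steps were used next to try to reproduce lp bug: https://bugs.launchpad.net/neutron/+bug/1975674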

  - Create a VM with security group A
  - Add a rule to security group A allowing access from a remote security group B
  - Add a large number of ports to security group B (e.g. 2000)
    - The respective ovs flows will be added
  - Delete the VM
    - The ovs flows will be removed

./generate-bundle.sh --name focal --series focal --num-compute 1 --use-stable-charms --run
neutron security-group-create ssh
neutron security-group-create web
neutron security-group-rule-create --direction ingress --protocol tcp  --port-range-min 22 --port-range-max 22 ssh
#Allow TCP port 80 access from IP addresses, specified as IP subnet 0.0.0.0/0 in CIDR notation.
neutron security-group-rule-create --direction ingress --protocol tcp  --port-range-min 80 --port-range-max 80 web
#Allow TCP port 22 and 23 addresses from other security groups (ssh) to access the specified port
neutron security-group-rule-create --direction ingress --protocol tcp  --port-range-min 22 --port-range-max 22 --remote-group-id ssh web
neutron security-group-rule-create --direction ingress --protocol tcp  --port-range-min 23 --port-range-max 23 --remote-group-id ssh web

#Add a large number of ports to security group ssh (e.g. 2000)
NEUTRON_IP=$(juju status neutron-api/0 |awk '/ACTIVE/ {print $3}')
SCHEMA="http"
export AUTH_TOKEN="$(openstack token issue -c id -f value)"
NETWORK_ID=$(openstack network show private -cid -fvalue)
PROJECT_ID=$(openstack project show --domain admin_domain admin -f value -c id)
SECGRP_ID=$(openstack security group list --project ${PROJECT_ID} | awk '/ssh/ {print $2}')
openstack quota set $PROJECT_ID --ram 262144 --cores 200 --instances 100 --secgroup-rules 2000 --secgroups 200  --ports 500 --gigabytes 2000 --volumes 60
for i in {2000..4000}
do
cat << EOF |tee tmp.json
{
  "security_group_rule": {
    "direction": "ingress",
    "protocol": "tcp",
    "ethertype": "IPv4",
    "port_range_min": "2000",
    "port_range_max": "$i",
    "security_group_id": "$SECGRP_ID"
  }
}
EOF
curl -g -i -X POST ${SCHEMA}://${NEUTRON_IP}:9696/v2.0/security-group-rules \
  -H "User-Agent: python-neutronclient" -H "Content-Type: application/json" \
  -H "Accept: application/json" -H "X-Auth-Token: ${AUTH_TOKEN}" \
  -d @tmp.json
done

#create a test VM with security group 'web'
openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey
openstack server create --wait --image focal --flavor m1.small --key-name mykey --nic net-id=$NETWORK_ID --security-group web i1

Unfortunately the test did not succeed: when the VM was deleted with 'openstack server delete i1', the following flows were also deleted successfully.

# ovs-ofctl -O OpenFlow13 dump-flows br-int |grep '192.168.21'
 cookie=0x1a6d745e5627d342, duration=90.519s, table=71, n_packets=6, n_bytes=252, priority=95,arp,reg5=0x4,in_port=4,dl_src=fa:16:3e:88:8c:d5,arp_spa=192.168.21.58 actions=resubmit(,94)
 cookie=0x1a6d745e5627d342, duration=90.519s, table=71, n_packets=328, n_bytes=38542, priority=65,ip,reg5=0x4,in_port=4,dl_src=fa:16:3e:88:8c:d5,nw_src=192.168.21.58 actions=ct(table=72,zone=NXM_NX_REG6[0..15])
 cookie=0x1a6d745e5627d342, duration=90.518s, table=71, n_packets=0, n_bytes=0, priority=80,udp,reg5=0x4,in_port=4,dl_src=fa:16:3e:88:8c:d5,nw_src=192.168.21.58,tp_src=68,tp_dst=67 actions=resubmit(,73)