ovn-central raft HA (by quqi99)

2023-05-16

Author: Zhang Hua. First published: 2022-10-12
Copyright: this article may be reproduced freely, provided any reproduction credits the original source and author via hyperlink and retains this copyright notice.

What’s raft

RAFT (https://raft.github.io/) is a consensus algorithm. A raft node is in one of three states:

  • Follower - the initial state when a node starts.
  • Candidate - if a follower hears from no leader, it starts an election and becomes a candidate.
  • Leader - if a candidate receives enough votes it becomes the leader, and all other nodes become followers.

Using RAFT for ovn-central HA serves the same purpose as haproxy or pacemaker would (see https://bugs.launchpad.net/neutron/+bug/1969354/comments/3).
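On a running ovn-central node the three states can be observed directly. A minimal check, assuming the default ctl socket path of the Ubuntu packaging used throughout this article (the full cluster/status output appears later):

# which of the three raft states is the local NB DB server in?
ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | grep -E 'Role|Term|Leader'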

Set up ovn-central raft HA env

Quickly set up an ovn-central raft HA environment on 3 LXD containers, or refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1929690#c9

cd ~ && lxc launch faster:ubuntu/focal v1
lxc launch faster:ubuntu/focal v2
lxc launch faster:ubuntu/focal v3
#the subnet is 192.168.121.0/24
lxc config device override v1 eth0 ipv4.address=192.168.121.2
lxc config device override v2 eth0 ipv4.address=192.168.121.3
lxc config device override v3 eth0 ipv4.address=192.168.121.4
lxc stop v1 && lxc start v1 && lxc stop v2 && lxc start v2 && lxc stop v3 && lxc start v3

#on v1
lxc exec `lxc list |grep v1 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS= \
  --db-nb-addr=192.168.121.2 \
  --db-sb-addr=192.168.121.2 \
  --db-nb-cluster-local-addr=192.168.121.2 \
  --db-sb-cluster-local-addr=192.168.121.2 \
  --db-nb-create-insecure-remote=yes \
  --db-sb-create-insecure-remote=yes \
  --ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \
  --ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v1:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
6943
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 6943 (69432f05-2d37-44fd-8869-2ec365bb0b4c)
Address: tcp:192.168.121.2:6643
Status: cluster member
Role: leader
Term: 2
Leader: self
Vote: self
Election timer: 1000
Log: [2, 5]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-0000 <-0000
Servers:
    6943 (6943 at tcp:192.168.121.2:6643) (self) next_index=4 match_index=4
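'Election timer: 1000' above is the raft election timeout in milliseconds. If elections fire too often on a loaded cluster, it can be raised at runtime on the leader; a hedged example (the raft implementation only accepts an increase of up to 2x the current value per call, so repeat the command to go higher):

# double the NB election timer from 1000 ms to 2000 ms on the leader
ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/change-election-timer OVN_Northbound 2000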

#on v2
lxc exec `lxc list |grep v2 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS= \
  --db-nb-addr=192.168.121.3 \
  --db-sb-addr=192.168.121.3 \
  --db-nb-cluster-local-addr=192.168.121.3 \
  --db-sb-cluster-local-addr=192.168.121.3 \
  --db-nb-create-insecure-remote=yes \
  --db-sb-create-insecure-remote=yes \
  --ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \
  --ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 \
  --db-nb-cluster-remote-addr=192.168.121.2 \
  --db-sb-cluster-remote-addr=192.168.121.2
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v2:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
158b
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 158b (158b0aea-ba5d-42e0-b69b-2fc05204f622)
Address: tcp:192.168.121.3:6643
Status: cluster member
Role: follower
Term: 2
Leader: 6943
Vote: unknown
Election timer: 1000
Log: [2, 7]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 <-6943
Servers:
    6943 (6943 at tcp:192.168.121.2:6643)
    158b (158b at tcp:192.168.121.3:6643) (self)


#on v3
lxc exec `lxc list |grep v3 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS= \
  --db-nb-addr=192.168.121.4 \
  --db-sb-addr=192.168.121.4 \
  --db-nb-cluster-local-addr=192.168.121.4 \
  --db-sb-cluster-local-addr=192.168.121.4 \
  --db-nb-create-insecure-remote=yes \
  --db-sb-create-insecure-remote=yes \
  --ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \
  --ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 \
  --db-nb-cluster-remote-addr=192.168.121.2 \
  --db-sb-cluster-remote-addr=192.168.121.2
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v3:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
298d
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 298d (298de33b-1b92-47c4-95aa-ebcf8e80f567)
Address: tcp:192.168.121.4:6643
Status: cluster member
Role: follower
Term: 2
Leader: 6943
Vote: unknown
Election timer: 1000
Log: [2, 8]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->158b <-6943 <-158b
Servers:
    6943 (6943 at tcp:192.168.121.2:6643)
    158b (158b at tcp:192.168.121.3:6643)
    298d (298d at tcp:192.168.121.4:6643) (self)
OVN_NB_DB=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 ovn-nbctl show
OVN_SB_DB=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 ovn-sbctl show
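A quick sanity check of the new cluster, assuming the three members above: write through any member and confirm the object is visible cluster-wide.

export OVN_NB_DB=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641
ovn-nbctl ls-add test-sw   # create a test logical switch
ovn-nbctl show             # test-sw should appear no matter which member answers
ovn-nbctl ls-del test-sw   # clean up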

Cluster Failover Testing

Do a failover test: stop container v1 (lxc stop v1). The following logs appear on v2 and v3; v2 has become the leader, and the Term changed from 2 to 3.

root@v2:~# tail -f /var/log/ovn/ovsdb-server-nb.log
2022-10-12T03:38:18.092Z|00085|raft|INFO|received leadership transfer from 6943 in term 2
2022-10-12T03:38:18.092Z|00086|raft|INFO|term 3: starting election
2022-10-12T03:38:18.095Z|00088|raft|INFO|term 3: elected leader by 2+ of 3 servers

root@v3:~# tail -f /var/log/ovn/ovsdb-server-nb.log
2022-10-12T03:38:18.095Z|00021|raft|INFO|server 158b is leader for term 3

root@v2:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
158b
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 158b (158b0aea-ba5d-42e0-b69b-2fc05204f622)
Address: tcp:192.168.121.3:6643
Status: cluster member
Role: leader
Term: 3
Leader: self
Vote: self
Election timer: 1000
Log: [2, 9]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: (->0000) <-298d ->298d
Servers:
    6943 (6943 at tcp:192.168.121.2:6643) next_index=9 match_index=0
    158b (158b at tcp:192.168.121.3:6643) (self) next_index=8 match_index=8
    298d (298d at tcp:192.168.121.4:6643) next_index=9 match_index=8
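To finish the test, bring v1 back and confirm it rejoins as a follower of the new leader 158b (a hedged follow-up using the same commands as above):

lxc start v1
lxc exec v1 -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
# expected: Status: cluster member, Role: follower, Leader: 158b, Term: 3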

set inactivity-probe for raft port (6644)

ovn-sbctl list connection
#ovn-sbctl --no-leader-only list connection
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6642 pssl:6644 pssl:16642
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6642 pssl:6644 pssl:16642 punix:/var/run/ovn/ovnsb_db.sock
ovn-sbctl --inactivity-probe=30000 set-connection read-write role="ovn-controller" pssl:6642 read-write role="ovn-controller" pssl:6644 pssl:16642

Or use the following (it is equivalent to: ovn-sbctl --inactivity-probe=57 set-connection pssl:6642 pssl:6644)

#https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049743.html
#https://opendev.org/x/charm-ovn-central/commit/9dcd53bb75805ff733c8f10b99724ea16a2b5f25
ovn-sbctl -- --id=@connection create Connection target="pssl\:6644" inactivity_probe=55 -- set SB_Global . connections=@connection
ovn-sbctl set connection . inactivity_probe=56

#the 'set SB_Global' above deletes all existing connections and creates one new one; 'add SB_Global' here only appends a new one
ovn-sbctl -- --id=@connection create Connection target="pssl\:6648" -- add SB_Global . connections @connection
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6648

NOTE: updated 20221019 - the 'ovn-sbctl --inactivity-probe=30000 set-connection pssl:6648' above overwrites the configuration of every port other than 6648, which ended up causing a customer to open an L1 case - painful. The correct way to set it is one of the following two:

ovn-sbctl --inactivity-probe=60001 set-connection read-write role="ovn-controller" pssl:6644 pssl:6641 pssl:6642 pssl:16642

ovn-sbctl -- --id=@connection create Connection role=ovn-controller target="pssl\:6644" inactivity_probe=6000 -- add SB_Global . connections @connection
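Whichever of the two forms is used, read the Connection table back from any member before restarting anything, so a mis-scoped set-connection like the 6648 one above is caught early:

ovn-sbctl --no-leader-only list connection
ovn-sbctl get-connection   # compact view of the configured targets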

use ovsdb-tool to set up cluster

https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049743.html
Creating the cluster with ovsdb-tool using the commands below did not work at first ('ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' on v2 showed 'Remotes for joining: tcp:192.168.121.4:6643 tcp:192.168.121.3:6643' and it could not join the cluster). That was because when running 'join-cluster' on v2, v2's own address (192.168.121.3:6644) must be listed first, so for v2 it should be "tcp:192.168.121.3:6644 tcp:192.168.121.2:6644 tcp:192.168.121.4:6644" rather than "tcp:192.168.121.2:6644 tcp:192.168.121.3:6644 tcp:192.168.121.4:6644".

#reset env in all nodes(v1, v2, v3)
systemctl stop ovn-central
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
rm -rf /etc/default/ovn-central

# on v1
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool create-cluster /var/lib/openvswitch/ovnsb_db.db /usr/share/ovn/ovn-sb.ovsschema tcp:192.168.121.2:6644
ovsdb-tool create-cluster /var/lib/openvswitch/ovnnb_db.db /usr/share/ovn/ovn-nb.ovsschema tcp:192.168.121.2:6643

# on v2
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool join-cluster /var/lib/openvswitch/ovnsb_db.db OVN_Southbound tcp:192.168.121.3:6644 tcp:192.168.121.2:6644 tcp:192.168.121.4:6644
ovsdb-tool join-cluster /var/lib/openvswitch/ovnnb_db.db OVN_Northbound tcp:192.168.121.3:6643 tcp:192.168.121.2:6643 tcp:192.168.121.4:6643

# on v3
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool join-cluster /var/lib/openvswitch/ovnsb_db.db OVN_Southbound tcp:192.168.121.4:6644 tcp:192.168.121.2:6644 tcp:192.168.121.3:6644
ovsdb-tool join-cluster /var/lib/openvswitch/ovnnb_db.db OVN_Northbound tcp:192.168.121.4:6643 tcp:192.168.121.2:6643 tcp:192.168.121.3:6643

# then append the following content in /etc/default/ovn-central, finally restart ovn-central
--db-nb-file=/var/lib/openvswitch/ovnnb_db.db --db-sb-file=/var/lib/openvswitch/ovnsb_db.db
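Before restarting ovn-central, the freshly created files can be verified offline with ovsdb-tool (a hedged check; db-cid prints the cluster ID, which must match on every member, and db-sid the per-server ID):

ovsdb-tool check-cluster /var/lib/openvswitch/ovnnb_db.db
ovsdb-tool db-cid /var/lib/openvswitch/ovnnb_db.db
ovsdb-tool db-sid /var/lib/openvswitch/ovnnb_db.db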

Another example: the procedure for restoring ovn-central described in this LP bug - https://bugs.launchpad.net/charm-ovn-central/+bug/1948680

1. Stop all units:
$ juju run-action ovn-central/0 pause --wait
$ juju run-action ovn-central/1 pause --wait
$ juju run-action ovn-central/2 pause --wait

2. Create standalone on ovn-central/0
# ovsdb-tool cluster-to-standalone /tmp/standalone_ovnsb_db.db /var/lib/ovn/ovnsb_db.db
# ovsdb-tool cluster-to-standalone /tmp/standalone_ovnnb_db.db /var/lib/ovn/ovnnb_db.db

3. Create the cluster
# mv /var/lib/ovn/ovnsb_db.db /var/lib/ovn/ovnsb_db.db.old -v
# mv /var/lib/ovn/ovnnb_db.db /var/lib/ovn/ovnnb_db.db.old -v
# ovsdb-tool create-cluster /var/lib/ovn/ovnsb_db.db /tmp/standalone_ovnsb_db.db ssl:<ovn-central-0-ip>:6644
# ovsdb-tool create-cluster /var/lib/ovn/ovnnb_db.db /tmp/standalone_ovnnb_db.db ssl:<ovn-central-0-ip>:6643

4. Resume ovn-central/0
$ juju run-action ovn-central/0 resume --wait


5. Get New Cluster UUIDs
$ juju ssh ovn-central/0 'sudo ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound'
$ juju ssh ovn-central/0 'sudo ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound'


6. Join cluster from ovn-central/1
# mv /var/lib/ovn/ovnsb_db.db /var/lib/ovn/ovnsb_db.db.old -v
# mv /var/lib/ovn/ovnnb_db.db /var/lib/ovn/ovnnb_db.db.old -v
# ovsdb-tool --cid=SB-UUID join-cluster /var/lib/ovn/ovnsb_db.db OVN_Southbound ssl:<ovn-central-1-ip>:6644 ssl:<ovn-central-0-ip>:6644
# ovsdb-tool --cid=NB-UUID join-cluster /var/lib/ovn/ovnnb_db.db OVN_Northbound ssl:<ovn-central-1-ip>:6643 ssl:<ovn-central-0-ip>:6643

7. Resume ovn-central/1
$ juju run-action ovn-central/1 resume --wait

8. Join cluster from ovn-central/2
# mv /var/lib/ovn/ovnsb_db.db /var/lib/ovn/ovnsb_db.db.old -v
# mv /var/lib/ovn/ovnnb_db.db /var/lib/ovn/ovnnb_db.db.old -v

# ovsdb-tool --cid=SB-UUID join-cluster /var/lib/ovn/ovnsb_db.db OVN_Southbound ssl:<ovn-central-2-ip>:6644 ssl:<ovn-central-1-ip>:6644 ssl:<ovn-central-0-ip>:6644
# ovsdb-tool --cid=NB-UUID join-cluster /var/lib/ovn/ovnnb_db.db OVN_Northbound ssl:<ovn-central-2-ip>:6643 ssl:<ovn-central-1-ip>:6643 ssl:<ovn-central-0-ip>:6643

9. Resume ovn-central/2
$ juju run-action ovn-central/2 resume --wait
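The SB-UUID/NB-UUID placeholders in steps 6 and 8 are the Cluster IDs reported in step 5; a hedged alternative is to read them straight from the database files on ovn-central/0:

juju ssh ovn-central/0 'sudo ovsdb-tool db-cid /var/lib/ovn/ovnsb_db.db'
juju ssh ovn-central/0 'sudo ovsdb-tool db-cid /var/lib/ovn/ovnnb_db.db'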

A big problem

Setting an inactivity-probe on raft port 6644 as above causes a big problem: setting the value creates a Connection row, so when ovn-ovsdb-server-sb.service is restarted you see the error '6644:10.5.3.254: bind: Address already in use'. The SB DB then stops working, and both 'neutron list' and 'nova list' hang.
It was finally recovered with the commands below: a) convert the SB DB from the raft format to a standalone one, b) start it from the command line (starting from systemd would still start it in cluster mode), and c) delete the 6644 connection.

ovsdb-tool cluster-to-standalone /var/lib/ovn/ovnsb_db.db_standalone /var/lib/ovn/ovnsb_db.db
cp /var/lib/ovn/ovnsb_db.db /var/lib/ovn/ovnsb_db.db_bk2
cp /var/lib/ovn/ovnsb_db.db_standalone /var/lib/ovn/ovnsb_db.db
ovsdb-server --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Southbound,SB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers /var/lib/ovn/ovnsb_db.db > temp 2>&1
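Step c) has no command above; a hedged sketch, assuming the standalone server just started is serving the unix socket:

# del-connection drops every configured connection (including the 6644 one);
# then re-create only the wanted listeners
ovn-sbctl --db=unix:/var/run/ovn/ovnsb_db.sock del-connection
ovn-sbctl --db=unix:/var/run/ovn/ovnsb_db.sock --inactivity-probe=30000 set-connection pssl:6642 pssl:16642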

The heartbeat on 6644 is somewhat different from 6641 and 6642: it is only the raft heartbeat between the 3 ovn-central nodes, and 5 seconds is normally enough (for raft ports the only things transmitted are consensus traffic and the DB diffs, which should not take more than 5 seconds). If it is raised, it takes longer to notice that a member has issues, and you could end up with split brain / DB corruption, so changing the inactivity probe for raft ports is discouraged.

https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1990978
https://bugs.launchpad.net/openvswitch/+bug/1985062

raft heartbeats are one-way: when the inactivity probe is disabled, the leader keeps sending heartbeats to the followers, and a follower that receives no heartbeat starts an election; but if the leader receives no replies it does nothing (until it loses quorum), so the leader keeps blindly sending packets and wasting CPU. To avoid this, the inactivity probe was reintroduced, so that a leader that has not heard from a follower for a long time drops the TCP connection to it.
Running the iptables rules below on all ovn-central nodes makes the following connections one-way.

   iptables -A INPUT -i eth0 -p tcp --match multiport --dports 6641:6644 -j DROP
   # POSTROUTING only exists in the nat/mangle tables, so name one explicitly
   iptables -t mangle -A POSTROUTING -p tcp --match multiport --dports 6641:6644 -j DROP
   
     ovn1:<random port> <------ ovn3:6641
     ovn1:6641          ------> ovn3:<random port>
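The resulting half-open sessions can then be watched with ss (an illustrative check, not from the original):

ss -tn '( sport = :6641 or dport = :6641 or sport = :6644 or dport = :6644 )'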

20221122 - Some analysis

1, worker.BaseWorker#start in the neutron_lib project publishes a message:

registry.publish(resources.PROCESS, events.AFTER_INIT, self.start)

2, ./neutron/worker.py#NeutronBaseWorker in the neutron project inherits from neutron_lib's worker.BaseWorker; wsgi#WorkerService inherits from NeutronBaseWorker, and UnixDomainWSGIServer inherits from wsgi#WorkerService. UnixDomainMetadataProxy uses UnixDomainWSGIServer, and the metadata-agent uses UnixDomainMetadataProxy.

3, ./neutron/agent/ovn/metadata/server.py in the neutron project consumes the message:

class MetadataProxyHandler(object):
    ...
    def subscribe(self):
        registry.subscribe(self.post_fork_initialize,
                           resources.PROCESS,
                           events.AFTER_INIT)

4, In post_fork_initialize, an IDL connection to the SB DB is created for each worker:

    def post_fork_initialize(self, resource, event, trigger, payload=None):
        # We need to open a connection to OVN SouthBound database for
        # each worker so that we can process the metadata requests.
        self._post_fork_event.clear()
        self.sb_idl = ovsdb.MetadataAgentOvnSbIdl(
            tables=('Port_Binding', 'Datapath_Binding', 'Chassis'),
            chassis=self.chassis).start()

        # Now IDL connections can be safely used.
        self._post_fork_event.set()


5, ovsdb.MetadataAgentOvnSbIdl retries creating the connection via ovsdbapp.schema.open_vswitch.impl_idl according to ovsdb_retry_max_interval=180 seconds:

import logging

from ovs.db import idl
from ovsdbapp.backend.ovs_idl import connection
from ovsdbapp.backend.ovs_idl import idlutils
from ovsdbapp.schema.open_vswitch import impl_idl as idl_ovs

TIMEOUT_SEC = 3600

class NotifyingIdl(idl.Idl):
    # ovs.db.idl.Idl.notify() is a no-op hook meant to be overridden by
    # subclasses; the Idl constructor takes no notification-backend argument.
    def notify(self, event, row, updates=None):
        print('New event "%s", table "%s"' % (event, row._table.name))


def connect():
    connection_string = "tcp:127.0.0.1:6642"
    helper = idlutils.get_schema_helper(connection_string, 'Open_vSwitch')
    tables = ('Open_vSwitch', 'Bridge', 'Port', 'Interface')
    for table in tables:
        helper.register_table(table)
    ovs_idl = NotifyingIdl(connection_string, helper)
    conn = connection.Connection(ovs_idl, timeout=TIMEOUT_SEC)
    return idl_ovs.OvsdbIdl(conn)

def get_port_name_list(ovs_idl):
    return ovs_idl.list_ports('br-int').execute(check_error=False)

def main():
    ovs_idl = connect()
    port_names = get_port_name_list(ovs_idl)
    ovs_idl.db_list('Interface', port_names,
                    columns=['name', 'external_ids', 'ofport'], if_exists=False).execute(check_error=False)

if __name__ == "__main__":
    logging.basicConfig()
    main()

6, ./neutron/agent/ovn/metadata/agent.py uses ovsdb.MetadataAgentOvnSbIdl and calls self.sb_idl.db_add to do its work.
   When fetching self.sb_idl it waits on self._post_fork_event.wait():
   
    @property
    def sb_idl(self):
        if not self._sb_idl:
            self._post_fork_event.wait()
        return self._sb_idl

  After running self._post_fork_event.set() it also calls self.sync(), which in turn calls self.ensure_all_networks_provisioned to create a metadata tap per namespace. During this it calls ports = self.sb_idl.get_ports_on_chassis(self.chassis); if the SB DB is in a snapshot leadership transfer at that moment, the connection fails, the Port_Binding list fetched from the SB DB is empty, and no metadata tap can be created for the VM.

7, So after an OVN leader switchover, should the neutron side set resync=true and rerun self.agent.sync()? (Currently only PortBindingChassisEvent, PortBindingChassisCreatedEvent, PortBindingChassisDeletedEvent and ChassisCreateEventBase rerun sync().) After reconnecting, neutron does indeed do a sync; the logs look like:

2022-08-18 09:15:49.887 75556 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:100.94.0.158:6642: clustered database server is disconnected from cluster; trying another server
2022-08-18 09:15:49.888 75556 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:100.94.0.158:6642: connection closed by client
2022-08-18 09:15:49.892 75556 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:100.94.0.99:6642: connecting...
2022-08-18 09:15:49.910 75556 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:100.94.0.99:6642: connected
2022-08-18 09:15:50.726 75556 INFO neutron.agent.ovn.metadata.agent [-] Connection to OVSDB established, doing a full sync

8, Writes that add records to OVN tables (e.g. Chassis_Private) are retried - e.g.: https://review.opendev.org/c/openstack/neutron/+/764318/3/neutron/agent/ovn/metadata/agent.py#260

9, Later, at 2022-08-18 12:30, 100.94.0.204:6644 became the leader due to a snapshot leadership transfer:

5-lxd-23: f801 100.94.0.99:6644 follower term=297
6-lxd-24: 0f3c 100.94.0.158:6644 follower term=297
7-lxd-27: 9b15 100.94.0.204:6644 leader term=297 leader

$ find sosreport-juju-2752e1-*/var/log/ovn/* |xargs zgrep -i -E 'received leadership transfer from'
sosreport-juju-2752e1-6-lxd-24-00341190-2022-08-18-entowko/var/log/ovn/ovsdb-server-sb.log:2022-08-18T12:30:35.610Z|80967|raft|INFO|received leadership transfer from 9b15 in term 295
sosreport-juju-2752e1-7-lxd-27-00341190-2022-08-18-hhxxqci/var/log/ovn/ovsdb-server-sb.log:2022-08-18T12:30:35.322Z|92695|raft|INFO|received leadership transfer from 0f3c in term 294
sosreport-juju-2752e1-7-lxd-27-00341190-2022-08-18-hhxxqci/var/log/ovn/ovsdb-server-sb.log:2022-08-18T17:52:53.025Z|92893|raft|INFO|received leadership transfer from 0f3c in term 296
# reelection was caused by snapshot factor according to timestamp
$ find sosreport-juju-2752e1-*/var/log/ovn/* |xargs zgrep -i -E 'Transferring leadership'
sosreport-juju-2752e1-6-lxd-24-00341190-2022-08-18-entowko/var/log/ovn/ovsdb-server-sb.log:2022-08-18T12:30:35.322Z|80962|raft|INFO|Transferring leadership to write a snapshot.
sosreport-juju-2752e1-6-lxd-24-00341190-2022-08-18-entowko/var/log/ovn/ovsdb-server-sb.log:2022-08-18T17:52:53.024Z|82382|raft|INFO|Transferring leadership to write a snapshot.
sosreport-juju-2752e1-7-lxd-27-00341190-2022-08-18-hhxxqci/var/log/ovn/ovsdb-server-sb.log:2022-08-18T12:30:35.330Z|92698|raft|INFO|Transferring leadership to write a snapshot.

10, An error occurred at 2022-08-18 12:30:36:

$ find sosreport-srv1ibm00*/var/log/neutron/* |xargs zgrep -i -E 'OVSDB Error'
sosreport-srv1ibm002d-00341190-2022-08-18-cuvkufw/var/log/neutron/neutron-ovn-metadata-agent.log:2022-08-18 12:30:36.103 75556 ERROR ovsdbapp.backend.ovs_idl.transaction [-] OVSDB Error: no error details available

When metadata is healthy, the logs look like this:
2022-08-18 00:01:37.607 75810 INFO eventlet.wsgi.server [-] 100.94.99.243,<local> "GET /latest/meta-data/ HTTP/1.1" status: 200  len: 360 time: 0.0124910

After the OVSDB Error occurred, only the following log kept appearing, until the metadata-agent was restarted:
2022-08-18 13:13:55.508 75810 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 63e2c276-60dd-40e3-baa1-c16342eacce2 with IP address 100.94.99.107

This is neutron/agent/ovn/metadata/server.py#_get_instance_and_project_id failing to reach the SB DB through self.sb_idl when calling self.sb_idl.get_network_port_bindings_by_ip.
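The recovery observed above was simply a restart of the agent, so that it re-creates its SB IDL session (service name as on Ubuntu/charm deployments):

systemctl restart neutron-ovn-metadata-agent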


[1] reference: https://medoc.readthedocs.io/en/latest/docs/ovs/ovn/idl/1_ovn_driver_2_ovn_client.html

Test code

import sys
import hashlib
import hmac
import threading
import urllib

from neutron._i18n import _
from neutron.agent.linux import utils as agent_utils
from neutron.agent.ovn.metadata import ovsdb
from neutron.common import ipv6_utils
from neutron.common.ovn import constants as ovn_const
from neutron.common import config
from neutron.common import utils
from neutron_lib.callbacks import events
from neutron_lib.callbacks import registry
from neutron_lib.callbacks import resources
from neutron.conf.agent.ovn.metadata import config as ovn_meta
from neutron.conf.agent.metadata import config as meta
from oslo_config import cfg
from oslo_log import log as logging
from oslo_utils import encodeutils
import requests
import webob


LOG = logging.getLogger(__name__)


class MetadataProxyHandler(object):

    def __init__(self, conf, chassis):
        self.conf = conf
        self.chassis = chassis
        self._sb_idl = None
        self._post_fork_event = threading.Event()

    @property
    def sb_idl(self):
        if not self._sb_idl:
            self._post_fork_event.wait()
        return self._sb_idl

    @sb_idl.setter
    def sb_idl(self, val):
        self._sb_idl = val

    def post_fork_initialize(self):
        self._post_fork_event.clear()
        self.sb_idl = ovsdb.MetadataAgentOvnSbIdl(
            tables=('Port_Binding', 'Datapath_Binding', 'Chassis'),
            chassis=self.chassis).start()
        # Now IDL connections can be safely used.
        self._post_fork_event.set()

    def get_instance_and_project_id(self):
        #ports = self.sb_idl.get_network_port_bindings_by_ip(network_id, remote_address)
        ports = self.sb_idl.get_ports_on_chassis(self.chassis)
        num_ports = len(ports)
        if num_ports == 0:
            LOG.error("No port found")
            return None
        return ports


if __name__ == '__main__':
    ovn_meta.register_meta_conf_opts(meta.SHARED_OPTS)
    ovn_meta.register_meta_conf_opts(meta.UNIX_DOMAIN_METADATA_PROXY_OPTS)
    ovn_meta.register_meta_conf_opts(meta.METADATA_PROXY_HANDLER_OPTS)
    ovn_meta.register_meta_conf_opts(ovn_meta.OVS_OPTS, group='ovs')
    config.init(['--config-file=/etc/neutron/neutron.conf', '--config-file=/etc/neutron/neutron_ovn_metadata_agent.ini'])
    config.setup_logging()
    # ovn-sbctl list chassis
    #import rpdb;rpdb.set_trace()
    proxy = MetadataProxyHandler(cfg.CONF, chassis='juju-402d04-xena-8.cloud.sts')
    proxy.post_fork_initialize()
    ports = proxy.get_instance_and_project_id()
    print(ports)

Trying to reproduce 1996594

Tried to reproduce https://bugs.launchpad.net/neutron/+bug/1996594 , but failed.

#!/bin/bash
for i in {1..5000}
do
    ovn-nbctl ls-add sw$i
    if [[ $? -ne 0 ]] ; then
        echo "Failed on ls-add i: $i"
        exit 1
    fi
    for j in {1..2000}
    do
        echo "Iteration i: $i and j:$j"
        ovn-nbctl lsp-add sw$i sw$i$j
        if [[ $? -ne 0 ]] ; then
            echo "Failed on lsp-add i: $i and j: $j"
            exit 1
        fi
    done
done
for i in {1..5000}
do
    echo "Delete iteration i: $i"
    ovn-nbctl ls-del sw$i
    if [[ $? -ne 0 ]] ; then
        echo "Failed on ls-del i: $i"
        exit 1
    fi
done

Stressing the DB with the script above (from lp bug 1990978) eventually pushed the NB DB into writing a snapshot, but not the SB DB:

Iteration i: 3 and j:822
Iteration i: 3 and j:823
2022-11-22T01:24:24Z|00002|timeval|WARN|Unreasonably long 1030ms poll interval (895ms user, 19ms system)
2022-11-22T01:24:24Z|00003|timeval|WARN|faults: 1634 minor, 0 major
2022-11-22T01:24:24Z|00004|timeval|WARN|context switches: 0 voluntary, 638 involuntary
Iteration i: 3 and j:824
Iteration i: 3 and j:825
Iteration i: 3 and j:826
Iteration i: 3 and j:827
2022-11-22T01:24:28Z|00002|timeval|WARN|Unreasonably long 1010ms poll interval (966ms user, 10ms system)
2022-11-22T01:24:28Z|00003|timeval|WARN|faults: 1394 minor, 0 major
2022-11-22T01:24:28Z|00004|timeval|WARN|context switches: 0 voluntary, 723 involuntary
Iteration i: 3 and j:828
Iteration i: 3 and j:829
2022-11-22T01:24:30Z|00002|timeval|WARN|Unreasonably long 1019ms poll interval (970ms user, 9ms system)
2022-11-22T01:24:30Z|00003|timeval|WARN|faults: 1284 minor, 0 major
2022-11-22T01:24:30Z|00004|timeval|WARN|context switches: 0 voluntary, 746 involuntary
Iteration i: 3 and j:830
Iteration i: 3 and j:831
Iteration i: 3 and j:832
2022-11-22T01:24:33Z|00002|timeval|WARN|Unreasonably long 1015ms poll interval (969ms user, 9ms system)
2022-11-22T01:24:33Z|00003|timeval|WARN|faults: 1283 minor, 0 major
2022-11-22T01:24:33Z|00004|timeval|WARN|context switches: 0 voluntary, 728 involuntary
Iteration i: 3 and j:833
Iteration i: 3 and j:834
Iteration i: 3 and j:835
2022-11-22T01:24:36Z|00002|timeval|WARN|Unreasonably long 1029ms poll interval (908ms user, 19ms system)
2022-11-22T01:24:36Z|00003|timeval|WARN|faults: 1257 minor, 0 major
2022-11-22T01:24:36Z|00004|timeval|WARN|context switches: 0 voluntary, 675 involuntary
Iteration i: 3 and j:836
Iteration i: 3 and j:837
Iteration i: 3 and j:838
Iteration i: 3 and j:839
2022-11-22T01:24:41Z|00004|timeval|WARN|Unreasonably long 1125ms poll interval (907ms user, 37ms system)
2022-11-22T01:24:41Z|00005|timeval|WARN|faults: 1359 minor, 0 major
2022-11-22T01:24:41Z|00006|timeval|WARN|context switches: 0 voluntary, 667 involuntary
ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed ()
Failed on lsp-add i: 3 and j: 839

$ juju run -a ovn-central -- grep -r -i 'leadership transfer' /var/log/ovn/
- ReturnCode: 1
Stdout: ""
UnitId: ovn-central/0
- ReturnCode: 1
Stdout: ""
UnitId: ovn-central/1
- Stdout: |
/var/log/ovn/ovsdb-server-nb.log:2022-11-22T01:24:40.535Z|00890|raft|INFO|received leadership transfer from 42bc in term 1
UnitId: ovn-central/2

A while later, the SB DB started a snapshot of its own:

$ juju run -a ovn-central -- grep -r -i 'Transferring leadership to write a snapshot' /var/log/ovn/
- Stdout: |
/var/log/ovn/ovsdb-server-nb.log:2022-11-22T01:43:16.695Z|00908|raft|INFO|Transferring leadership to write a snapshot.
UnitId: ovn-central/0
- Stdout: |
/var/log/ovn/ovsdb-server-nb.log:2022-11-22T01:24:40.534Z|00901|raft|INFO|Transferring leadership to write a snapshot.
/var/log/ovn/ovsdb-server-sb.log:2022-11-22T02:11:02.547Z|01586|raft|INFO|Transferring leadership to write a snapshot.
UnitId: ovn-central/1
- Stdout: |
/var/log/ovn/ovsdb-server-nb.log:2022-11-22T01:43:11.186Z|00905|raft|INFO|Transferring leadership to write a snapshot.
UnitId: ovn-central/2

But creating VM metadata worked fine at this point, and no connect-related messages were found in /var/log/neutron/neutron-ovn-metadata-agent.log.

2022-11-22 04:23:34.386 45409 INFO neutron.agent.ovn.metadata.agent [-] Port fa4d9821-6ed3-4c7e-95a5-a68b08c95b9f in datapath 946aa920-137f-42f1-a4a8-8d772f2af3cd bound to our chassis

The patches from the LP bug https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1975594 are all already present in this environment.
The neutron side also supports connecting to a non-leader - https://review.opendev.org/c/openstack/neutron/+/829486
monitor_cond_since should also be needed - https://openii.cn/openvswitch/ovs/commit/9167cb52fa8708ff524be9fc41a8f5ccdfa0a15d
This one should also be needed - https://patchwork.ozlabs.org/project/openvswitch/patch/20211213194603.32487-1-dceara@redhat.com/

