2023-03-11 11:28:52,012 ERROR -139673393420032- send ZMQ message to [tcp://:9765] failed: [{"action":"get_pool_status","need_lock":false,"resource":"usbp1r02n01","req_type":"DelegateAction"}] (/usr/lib/python2.7/threading.py:801)
2023-03-11 11:28:52,196 ERROR -139673938683648- send ZMQ message to [tcp://:9765] failed: [{"req_type":"GetNodeStatus"}] (/usr/lib/python2.7/threading.py:801)
2023-03-11 11:28:57,057 WARNING -139673393420032- cmd [echo 'this is a test.' > /pitrix/.pitrix_test && (grep /dev/drbd0 /proc/mounts | grep ' /pitrix/data/container ' && echo 'this is a test.' > /pitrix/data/container/.drbd_test)] on [usbp1r02n01] time out. exec_time is [5.004]s (/usr/lib/python2.7/threading.py:801)
2023-03-11 11:29:53,672 WARNING -139673393420032- check node [usbp1r02n01] disk failed! (/usr/lib/python2.7/threading.py:801)
Disaster migration is triggered. The storage disk has a problem, and disk I/O performance drops.
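The command that timed out in the log above is a small write probe against the data directory. A minimal Python sketch of such a probe follows. The function name `probe_write` and the 5-second default are assumptions for illustration (the limit is inferred from the logged exec_time of 5.004 s); this is not the monitoring service's actual code.

```python
import subprocess


def probe_write(path, timeout_s=5.0):
    """Return True if a small write to `path` finishes within `timeout_s`.

    Hypothetical helper: it mirrors the `echo ... > file` test seen in the log.
    """
    # Pass the path as a positional argument so the shell never parses it.
    cmd = ["sh", "-c", "echo 'this is a test.' > \"$1\"", "sh", path]
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a task stuck in uninterruptible I/O ignores
        # SIGKILL until the I/O completes, so we do not wait for it here.
        proc.kill()
        return False
```

A False result only says that the write did not complete in time or failed outright; it does not distinguish a slow disk from a missing mount.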
Mar 11 11:48:38 usbp1r02n01 kernel: drbd usbp1r02n01: meta connection shut down by peer.
At this point the network is already down: packets are lost and the peer is unreachable, so subsequent requests pile up.
Mar 11 11:48:38 usbp1r02n01 kernel: drbd usbp1r02n01: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
DRBD split-brain: the peer node cannot be reached.
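The state change in the kernel message above can be extracted mechanically when scanning large logs. The following is a small sketch; the names `parse_drbd_transitions` and `peer_lost` are hypothetical, and the regular expression assumes only the `name( old -> new )` layout visible in the log line.

```python
import re

# Matches fragments such as "conn( Connected -> NetworkFailure )".
TRANSITION = re.compile(r"(\w+)\(\s*(\S+)\s*->\s*(\S+)\s*\)")


def parse_drbd_transitions(line):
    """Return {field: (old_state, new_state)} for one DRBD kernel log line."""
    return {name: (old, new) for name, old, new in TRANSITION.findall(line)}


def peer_lost(transitions):
    """True if the connection moved into the NetworkFailure state."""
    return transitions.get("conn", (None, None))[1] == "NetworkFailure"
```

For the line logged at 11:48:38, this yields the `peer`, `conn`, and `pdsk` transitions, and `peer_lost` reports that the replication link dropped.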
Mar 11 11:48:38 usbp1r02n01 kernel: block drbd0: new current UUID DD8905A6A6A052F1:1B08E7A9B5422769:608C80F76032E909:608B80F76032E909
INFO: task jbd2/sda1-8:1430 blocked for more than 120 seconds.
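By default, Linux lets dirty (not yet written) page-cache data grow to a fixed share of available memory, controlled by vm.dirty_ratio. The figure is commonly cited as 40%, although the default varies by kernel and distribution and is 20% on many systems. Once that threshold is exceeded, the kernel forces the cached data out to disk, and subsequent I/O requests become effectively synchronous.
The 120 seconds in the message above is the kernel's default hung-task threshold (kernel.hung_task_timeout_secs): a task that stays blocked in uninterruptible sleep for longer than that is reported. The message appears here because the I/O subsystem is too slow to write the cached data to disk within 120 seconds.
Because the I/O system responds slowly, more and more requests pile up until memory is exhausted and the system stops responding.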
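To see how close a node is to this condition, the amount of dirty data can be read from /proc/meminfo. The sketch below is illustrative only: the function names are hypothetical, and dividing by MemTotal is a rough proxy, since the kernel measures its dirty threshold against dirtyable memory rather than total memory.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def dirty_percent(meminfo):
    """Dirty plus Writeback pages as a percentage of MemTotal (approximate)."""
    pending = meminfo.get("Dirty", 0) + meminfo.get("Writeback", 0)
    return 100.0 * pending / meminfo["MemTotal"]
```

On a live system, pass in the contents of /proc/meminfo and compare the result with the value in /proc/sys/vm/dirty_ratio. A percentage that stays near the limit while the hung-task message keeps appearing is consistent with the slow-disk explanation above.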