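Working conclusion: NetworkManager is the trigger; the root cause is still to be investigated.
Today I installed a test RAC on VMware virtual machines and ran into another baffling problem.
After CRS was installed, the node rebooted as soon as I logged in to the graphical desktop. The coincidence was that exact.
Without a desktop login, I watched the node for an hour with no problems; as soon as I logged in, it rebooted immediately.
In the OS log, the entries written after a desktop login and just before the reboot are: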
Sep 5 19:29:18 dm01db01 nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc. To report bugs please use the NetworkManager mailing list.
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth1'
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth0'
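The clusterware wrote no log entries at all. It was as if someone had simply reset the machine, and I could find no cause.
Pinging the heartbeat (interconnect) occasionally showed latencies above 200 ms, but just before the reboots the ping times were all within a few ms.
Monitoring with vmstat showed no problem with CPU utilization either.
I tested the following adjustments:
1. Increased misscount: no effect.
2. Adjusted diagwait: still no log entries.
3. Disabled unneeded services: no effect.
4. Moved to a different network segment: no effect.
I kept suspecting the network, so I searched for the keyword ifcfg-rh and found the article OEL: Error: Missing Or Invalid IP4 Prefix '0' On Linux Server (Doc ID 1522095.1).
Its symptoms were unrelated to my problem, but as a last resort I followed the document and disabled NetworkManager:
1. Add NM_CONTROLLED="no" to /etc/sysconfig/network-scripts/ifcfg-eth*
2. chkconfig NetworkManager off
3. reboot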
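The three steps above could be scripted roughly as follows. This is only a sketch: the function name set_nm_uncontrolled is hypothetical (it is not part of any Oracle or Red Hat tool), and it assumes GNU sed and that editing the ifcfg-eth* files in place is acceptable on your system. The chkconfig and reboot calls are left commented out so that nothing runs by accident.

```shell
# Hypothetical helper (name is an assumption): set NM_CONTROLLED="no"
# in every ifcfg-eth* file of the given directory.
set_nm_uncontrolled() {
    local dir="${1:-/etc/sysconfig/network-scripts}"
    local f
    for f in "$dir"/ifcfg-eth*; do
        [ -f "$f" ] || continue
        if grep -q '^NM_CONTROLLED=' "$f"; then
            # The key already exists: overwrite its value.
            sed -i 's/^NM_CONTROLLED=.*/NM_CONTROLLED="no"/' "$f"
        else
            # Make sure the file ends with a newline before appending.
            if [ -n "$(tail -c1 "$f")" ]; then
                echo >> "$f"
            fi
            echo 'NM_CONTROLLED="no"' >> "$f"
        fi
    done
}

# On the real host, run as root (steps 1 to 3):
# set_nm_uncontrolled
# chkconfig NetworkManager off
# reboot
```

Pointing the function at a copy of the directory first (for example set_nm_uncontrolled /tmp/network-scripts) is a cheap way to review the result before touching the live configuration.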
After the reboot the host ran normally. The OS log showed:
Sep 5 19:41:06 dm01db01 nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc. To report bugs please use the NetworkManager mailing list.
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth1'
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: Ignoring connection 'System eth1' and its device because NM_CONTROLLED was false.
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth0'
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: Ignoring connection 'System eth0' and its device because NM_CONTROLLED was false.
As the log shows, NetworkManager now ignores these connections.
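To check this without reading the log by eye, the "Ignoring connection" messages can be extracted with sed. A minimal sketch, assuming the messages land in /var/log/messages and keep the exact wording shown in the log above; the function name nm_ignored_connections is hypothetical.

```shell
# Hypothetical helper (name is an assumption): print the names of the
# connections that the ifcfg-rh plugin reported as ignored.
nm_ignored_connections() {
    local log="${1:-/var/log/messages}"
    sed -n "s/.*ifcfg-rh: Ignoring connection '\([^']*\)' and its device because NM_CONTROLLED was false.*/\1/p" "$log" | sort -u
}

# Usage on the host (reading the log usually requires root):
# nm_ignored_connections
```

Every interface edited in step 1 should appear in the output; a missing name suggests its ifcfg file was not changed or not re-read.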
I am recording this for now and will look into it further later.
Version information
[root@dm01db01 network-scripts]# cat /etc/issue
Oracle Linux Server release 5.9
Kernel \r on an \m
[root@dm01db01 network-scripts]# uname -a
Linux dm01db01 2.6.39-300.26.1.el5uek #1 SMP Thu Jan 3 18:31:38 PST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@dm01db01 network-scripts]# /u01/app/oracle/product/crs/bin/crsctl query crs activeversion
CRS active version on the cluster is [10.2.0.5.0]
Source: ITPUB blog, http://blog.itpub.net/8242091/viewspace-772247/. Please credit the source when reposting.