k8s Troubleshooting

2023-11-15

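This article covers k8s troubleshooting at an introductory level, since I have only recently started with k8s. It covers three things: how to view the current runtime information of k8s objects, how to diagnose problems with services and containers, and how to investigate more complex problems such as pod scheduling.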

1. Viewing system Events

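Whenever a resource object (pod, service, RC, node, namespace, deployment, and so on) misbehaves, for example a pod that does not run successfully after being created, start by looking at the object's current runtime information, and in particular at the Events associated with it. These events record the subject involved, when the event first occurred, when it last occurred, how many times it occurred, and the reason for the event.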

k8s provides the following commands for viewing an object's runtime state:

kubectl describe pod xxxx
kubectl describe node xxxx

The output looks like this:
 

[root@centos ~]# kubectl  get pod
NAME                    READY   STATUS    RESTARTS   AGE
curl-5f8bff6547-rb4qk   1/1     Running   2          3d14h
redis-master-7j8cm      1/1     Running   2          3d14h
webapp-j7gd2            1/1     Running   3          3d21h
webapp-kzrn7            1/1     Running   3          3d14h
[root@centos ~]# kubectl describe pod webapp-j7gd2 
Name:               webapp-j7gd2
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node3/192.168.195.138
Start Time:         Mon, 08 Apr 2019 13:19:25 +0800
Labels:             app=webapp
Annotations:        <none>
Status:             Running
IP:                 10.244.1.35
Controlled By:      ReplicationController/webapp
Containers:
  webapp:
    Container ID:   docker://e4dd5ec51e4d05456bd1605459a252085ad092c6be26e2becd5301114a470a33
    Image:          tomcat:9-jre8-alpine
    Image ID:       docker-pullable://tomcat@sha256:67fc2a0a54f9dfa7abda85a2900d721a55115dcae8ca7da560e65d15ca4c8aa7
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 11 Apr 2019 09:26:42 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Mon, 08 Apr 2019 21:52:27 +0800
      Finished:     Thu, 11 Apr 2019 09:25:55 +0800
    Ready:          True
    Restart Count:  3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nx72w (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-nx72w:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nx72w
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

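The Events field on the last line is the important part. This pod is healthy, so there is nothing there; if your pod is in an abnormal state, error messages will appear here. The messages usually make the problem obvious: typically the image cannot be pulled, no node is available, and so on. If your pod lives in a namespace other than default, specify the namespace with the following command: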

kubectl describe pod xxx -n <your-namespace>
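
When there are many pods, it can help to narrow down which ones deserve a kubectl describe first. The sketch below is one possible way to do that and is not part of the original workflow: not_ready_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pod shown earlier (NAME, READY, STATUS, ...). Pods that finished normally (STATUS Completed) would also be listed, so treat the result as a list of candidates rather than a list of failures.

```shell
# Print the names of pods that are not fully ready, reading
# `kubectl get pod` output (including its header line) on stdin.
# Assumes column 2 is READY ("ready/total") and column 3 is STATUS.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, r, "/")                        # READY column, e.g. "1/2"
    if ($3 != "Running" || r[1] != r[2]) print $1
  }'
}

# Usage sketch (needs a reachable cluster; "default" is just an example namespace):
#   kubectl get pod -n default | not_ready_pods | while read -r pod; do
#     kubectl describe pod "$pod" -n default
#   done
```

Reading from stdin keeps the filter independent of the cluster, so it can be tried on saved kubectl output as well.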

2. Viewing container logs

To inspect the logs produced by the application inside a container, use the kubectl logs <pod-name> command, for example:

[root@centos ~]# kubectl  get pod
NAME                    READY   STATUS    RESTARTS   AGE
curl-5f8bff6547-rb4qk   1/1     Running   2          3d14h
redis-master-7j8cm      1/1     Running   2          3d14h
webapp-j7gd2            1/1     Running   3          3d21h
webapp-kzrn7            1/1     Running   3          3d14h
[root@centos ~]# kubectl logs webapp-j7gd2 
11-Apr-2019 01:26:45.108 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name:   Apache Tomcat/9.0.17
11-Apr-2019 01:26:45.145 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built:          Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name:               Linux
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version:            3.10.0-957.el7.x86_64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture:          amd64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home:             /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version:           1.8.0_201-b08
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor:            Oracle Corporation
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE:         /usr/local/tomcat
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME:         /usr/local/tomcat
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:45.160 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b  26 Feb 2019]
11-Apr-2019 01:26:45.606 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:45.678 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:45.689 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [2,071] milliseconds
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:45.777 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:46.985 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,202] ms
11-Apr-2019 01:26:46.986 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:47.071 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [86] ms
11-Apr-2019 01:26:47.080 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:48.100 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,020] ms
11-Apr-2019 01:26:48.104 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [65] ms
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:48.227 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [58] ms
11-Apr-2019 01:26:48.235 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:48.302 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:48.323 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [2,633] milliseconds

If a pod contains more than one container, use the -c flag to name the container whose logs you want, for example:

kubectl logs <pod_name> -c <container_name>
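
The webapp log shown above is all INFO lines, which is what a healthy start looks like. For a container that has restarted, the interesting output is often in the previous instance, which kubectl logs can print with --previous. The sketch below pairs that with a small filter; tomcat_problems is a hypothetical helper name, and it assumes the Tomcat log line layout seen above (date, time, level, thread), keeping only WARNING and SEVERE lines.

```shell
# Keep only Tomcat (java.util.logging) lines at WARNING or SEVERE level,
# reading log text on stdin. Assumes lines start with "DD-Mon-YYYY HH:MM:SS.mmm LEVEL".
tomcat_problems() {
  grep -E '^[0-9]{2}-[A-Za-z]{3}-[0-9]{4} [0-9:.]+ (WARNING|SEVERE) '
}

# Usage sketch (needs a reachable cluster; names in <> are placeholders):
#   kubectl logs <pod_name> -c <container_name> --previous | tomcat_problems
#   kubectl logs <pod_name> --tail=100 -f     # follow the last 100 lines live
```

Multi-line stack traces would be cut down to their first line by this filter, so once a suspicious line turns up it is worth reading the unfiltered log around it.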

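When the node uses Docker as its container runtime, as in this cluster, you can also run docker logs <container_id> directly on the node: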

[root@node2 ~]# docker ps | grep web
6041a63c30ea        6097ab3c4283           "catalina.sh run"        25 hours ago        Up 25 hours                             k8s_webapp_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_3
974390ced06b        k8s.gcr.io/pause:3.1   "/pause"                 25 hours ago        Up 25 hours                             k8s_POD_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_7
[root@node2 ~]# docker logs 6041a63c30ea
11-Apr-2019 01:26:33.432 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name:   Apache Tomcat/9.0.17
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built:          Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name:               Linux
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version:            3.10.0-957.el7.x86_64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture:          amd64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home:             /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version:           1.8.0_201-b08
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor:            Oracle Corporation
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE:         /usr/local/tomcat
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME:         /usr/local/tomcat
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:33.531 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:33.539 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:33.540 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:33.565 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b  26 Feb 2019]
11-Apr-2019 01:26:34.291 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:34.374 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:34.378 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [3,215] milliseconds
11-Apr-2019 01:26:34.467 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:34.468 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:34.507 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:36.293 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,786] ms
11-Apr-2019 01:26:36.294 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:36.368 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [73] ms
11-Apr-2019 01:26:36.377 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:37.797 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,420] ms
11-Apr-2019 01:26:37.802 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:38.031 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [228] ms
11-Apr-2019 01:26:38.032 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:38.161 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [128] ms
11-Apr-2019 01:26:38.183 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:38.244 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:38.290 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [3,911] milliseconds

3. Viewing k8s service logs

If k8s is installed on Linux and its services are managed by systemd, the systemd journal captures the log output of those services. You can view the service logs with systemctl status or journalctl:

[root@node2 ~]# systemctl status kubelet.service 
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Thu 2019-04-11 09:25:36 CST; 1 day 1h ago
     Docs: https://kubernetes.io/docs/
 Main PID: 7793 (kubelet)
    Tasks: 19
   Memory: 112.4M
   CGroup: /system.slice/kubelet.service
           └─7793 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/v...

Apr 12 09:56:44 node2 kubelet[7793]: W0412 09:56:44.886746    7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (562273)
Apr 12 09:57:46 node2 kubelet[7793]: W0412 09:57:46.933029    7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (562359)
Apr 12 10:04:45 node2 kubelet[7793]: W0412 10:04:45.828641    7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (562964)
Apr 12 10:11:04 node2 kubelet[7793]: W0412 10:11:04.635497    7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (563510)
Apr 12 10:12:23 node2 kubelet[7793]: W0412 10:12:23.593624    7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (563619)
Apr 12 10:24:09 node2 kubelet[7793]: W0412 10:24:09.875061    7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (564637)
Apr 12 10:26:55 node2 kubelet[7793]: W0412 10:26:55.642788    7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (564886)
Apr 12 10:28:14 node2 kubelet[7793]: W0412 10:28:14.693489    7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (564992)
Apr 12 10:43:12 node2 kubelet[7793]: W0412 10:43:12.893306    7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (566287)
Apr 12 10:43:37 node2 kubelet[7793]: W0412 10:43:37.662130    7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (566320)
Hint: Some lines were ellipsized, use -l to show in full.

Or:

[root@centos ~]# journalctl -xeu kubelet
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.510165    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.610691    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.711008    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.811468    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.883382    9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.912065    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.914043    9787 kubelet_node_status.go:72] Attempting to register node centos.master
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.916659    9787 kubelet_node_status.go:94] Unable to register node "centos.master" with API se
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.012363    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.113003    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: I0412 10:46:54.147210    9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.213291    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.313616    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.413970    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.514292    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.615167    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.715863    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.816154    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.916432    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.017040    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.117863    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.218694    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.319663    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.420254    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.521053    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.621575    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.722435    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.823464    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.924273    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.024392    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.125129    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: I0412 10:46:56.146767    9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.225839    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.326354    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.427552    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.528289    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.628843    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.729056    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.829340    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.929690    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.030373    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.131158    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.232373    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.333084    9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.433269    9787 kubelet.go:2266] node "centos.master" not found

The kubelet log above tells me that the node centos.master cannot be found.
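
A kubelet log like the one above repeats the same message many times, which makes the less frequent lines (such as the failed node registration) easy to miss. The sketch below is one way to collapse the repetition; summarize_klog is a hypothetical helper name, and the sed pattern assumes the journal line layout shown above ("... kubelet[PID]: E0412 HH:MM:SS.micros PID file.go:line] message").

```shell
# Collapse repeated kubelet log lines: read journal lines on stdin and print
# each distinct "<level> <file.go:line>] <message>" with its count,
# most frequent first. Lines that do not match the assumed layout are dropped.
summarize_klog() {
  sed -n 's/^.*kubelet\[[0-9]*\]: \([EWIF]\)[0-9]\{4\} [0-9:.]* *[0-9]* \(.*\)$/\1 \2/p' |
    sort | uniq -c | sort -rn
}

# Usage sketch (run on the node whose kubelet you are inspecting):
#   journalctl -u kubelet --since "1 hour ago" --no-pager | summarize_klog
```

The first letter of each summarized line is the severity (E for error, W for warning, I for info), so the E lines are the ones to read first.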

That covers the three basic moves. They are simple and only suited to basic troubleshooting.

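If a k8s object has a problem and you need to look at the system service logs, search the logs using the object's name as the keyword. Most problems we run into day to day involve pods: a pod cannot be created, a pod stops right after starting, the number of pod replicas cannot be increased, and so on. In these cases, first determine which node the pod is on, log in to that node, and search the kubelet log for all entries about that pod. For problems related to pod scale-out or to an RC, the key clue is more likely to be in the kube-controller-manager and kube-scheduler logs.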
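
The "find the node, then search its kubelet log by pod name" routine can be sketched as follows. This is an illustration under assumptions rather than a fixed procedure: lines_about is a hypothetical helper name, the pod name is taken from the earlier example, and reaching the node over ssh is assumed to be possible in your environment.

```shell
# Keep only the log lines that mention a given object name.
# Uses a fixed-string match so characters in the name are not treated as a regex.
lines_about() {
  grep -F -- "$1"
}

# Usage sketch (needs a reachable cluster and ssh access to the node):
#   node=$(kubectl get pod webapp-j7gd2 -o jsonpath='{.spec.nodeName}')
#   ssh "$node" journalctl -u kubelet --no-pager | lines_about webapp-j7gd2
```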

Also, kube-proxy is often overlooked: even when it has stopped, pod status still looks normal, but access to some services will fail.

 
