RabbitMQ cluster with rabbitmq-peer-discovery-k8s fails on K8s

2024-01-11

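I am trying to bring up a RabbitMQ cluster on Kubernetes using the rabbitmq-peer-discovery-k8s plugin. Only one pod ever ends up running and ready; the next one always fails.

I have tried several changes to the configuration, and this is what gets at least one pod running: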

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rabbitmq 
  namespace: namespace-dev
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: endpoint-reader
  namespace: namespace-dev
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: endpoint-reader
  namespace: namespace-dev
subjects:
- kind: ServiceAccount
  name: rabbitmq
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: endpoint-reader
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "rabbitmq-data"
  labels:
    name: "rabbitmq-data"
    release: "rabbitmq-data"
    namespace: "namespace-dev"
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - "ReadWriteMany"
  nfs:
    path: "/path/to/nfs"
    server: "xx.xx.xx.xx"
  persistentVolumeReclaimPolicy: Retain

---  
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "rabbitmq-data-claim"
  namespace: "namespace-dev"
spec:
  accessModes:
    - ReadWriteMany
  resources:  
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      release: rabbitmq-data
---
# headless service Used to access pods using hostname
kind: Service
apiVersion: v1
metadata:
  name: rabbitmq-headless
  namespace: namespace-dev
spec:
  clusterIP: None
  # publishNotReadyAddresses, when set to true, indicates that DNS implementations must publish the notReadyAddresses of subsets for the Endpoints associated with the Service.     The default value is false. The primary use case for setting this field is to use a StatefulSet's Headless Service to propagate SRV records for its Pods without respect to     their readiness for purpose of peer discovery. This field will replace the service.alpha.kubernetes.io/tolerate-unready-endpoints when that annotation is deprecated and all clients have been converted to use this field.
  # Since access to the Pod using DNS requires Pod and Headless service to be started before launch, publishNotReadyAddresses is set to true to prevent readinessProbe from finding DNS when the service is not started.
  publishNotReadyAddresses: true 
  ports: 
   - name: amqp
     port: 5672
   - name: http
     port: 15672
  selector:
    app: rabbitmq
---
# Used to expose the dashboard to the external network
kind: Service
apiVersion: v1
metadata:
  namespace: namespace-dev
  name: rabbitmq-service
spec:
  type: NodePort
  ports:
   - name: http
     protocol: TCP
     port: 15672
     targetPort: 15672
     nodePort: 31672
   - name: amqp
     protocol: TCP
     port: 5672
     targetPort: 5672
     nodePort: 30672
  selector:
    app: rabbitmq
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-config
  namespace: namespace-dev
data:
  enabled_plugins: |
      [rabbitmq_management,rabbitmq_peer_discovery_k8s].
  rabbitmq.conf: |
      cluster_formation.peer_discovery_backend  = rabbit_peer_discovery_k8s
      cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
      cluster_formation.k8s.address_type = hostname
      cluster_formation.node_cleanup.interval = 10
      cluster_formation.node_cleanup.only_log_warning = true
      cluster_partition_handling = autoheal
      queue_master_locator=min-masters
      loopback_users.guest = false

      cluster_formation.randomized_startup_delay_range.min = 0
      cluster_formation.randomized_startup_delay_range.max = 2
      cluster_formation.k8s.service_name = rabbitmq-headless
      cluster_formation.k8s.hostname_suffix = .rabbitmq-headless.namespace-dev.svc.cluster.local
      vm_memory_high_watermark.absolute = 1.6GB
      disk_free_limit.absolute = 2GB

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbitmq
spec:
  serviceName: rabbitmq-headless   # Must be the same as the name of the headless service, used for hostname propagation access pod
  selector:
    matchLabels:
      app: rabbitmq # In apps/v1, it needs to be the same as .spec.template.metadata.label for hostname propagation access pods, but not in apps/v1beta
  replicas: 3
  template:
    metadata:
      labels:
        app: rabbitmq  # In apps/v1, the same as .spec.selector.matchLabels
      # setting podAntiAffinity
      annotations:
        scheduler.alpha.kubernetes.io/affinity: >
            {
              "podAntiAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": [{
                  "labelSelector": {
                    "matchExpressions": [{
                      "key": "app",
                      "operator": "In",
                      "values": ["rabbitmq"]
                    }]
                  },
                  "topologyKey": "kubernetes.io/hostname"
                }]
              }
            }
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10
      containers:        
      - name: rabbitmq
        image: rabbitmq:3.7.10
        resources:
          limits:
            cpu: "0.5"
            memory: 2Gi
          requests:
            cpu: "0.3"
            memory: 2Gi
        volumeMounts:
          - name: config-volume
            mountPath: /etc/rabbitmq
          - name: rabbitmq-data
            mountPath: /var/lib/rabbitmq/mnesia
        ports:
          - name: http
            protocol: TCP
            containerPort: 15672
          - name: amqp
            protocol: TCP
            containerPort: 5672
        livenessProbe:
          exec:
            command: ["rabbitmqctl", "status"]
          initialDelaySeconds: 60
          periodSeconds: 60
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command: ["rabbitmqctl", "status"]
          initialDelaySeconds: 20
          periodSeconds: 60
          timeoutSeconds: 5
        imagePullPolicy: IfNotPresent
        env:
          - name: HOSTNAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: RABBITMQ_USE_LONGNAME
            value: "true"
          - name: RABBITMQ_NODENAME
            value: "rabbit@$(HOSTNAME).rabbitmq-headless.namespace-dev.svc.cluster.local"
          # If service_name is set in ConfigMap, there is no need to set it again here.
          # - name: K8S_SERVICE_NAME
          #   value: "rabbitmq-headless"
          - name: RABBITMQ_ERLANG_COOKIE
            value: "mycookie" 
      volumes:
        - name: config-volume
          configMap:
            name: rabbitmq-config
            items:
            - key: rabbitmq.conf
              path: rabbitmq.conf
            - key: enabled_plugins
              path: enabled_plugins
        - name: rabbitmq-data
          persistentVolumeClaim:
            claimName: rabbitmq-data-claim

Only 1 pod is running and ready, instead of the 3 replicas:

[admin@devsvr3 yaml]$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
rabbitmq-0                    1/1     Running   0          2m2s
rabbitmq-1                    0/1     Running   1          43s

Checking the logs of the failing pod, I get this:

[admin@devsvr3 yaml]$ kubectl logs rabbitmq-1

  ##  ##
  ##  ##      RabbitMQ 3.7.10. Copyright (C) 2007-2018 Pivotal Software, Inc.
  ##########  Licensed under the MPL.  See http://www.rabbitmq.com/
  ######  ##
  ##########  Logs: <stdout>

              Starting broker...
2019-02-06 21:09:03.303 [info] <0.211.0> 
 Starting RabbitMQ 3.7.10 on Erlang 21.2.3
 Copyright (C) 2007-2018 Pivotal Software, Inc.
 Licensed under the MPL.  See http://www.rabbitmq.com/
2019-02-06 21:09:03.315 [info] <0.211.0> 
 node           : rabbit@rabbitmq-1.rabbitmq-headless.namespace-dev.svc.cluster.local
 home dir       : /var/lib/rabbitmq
 config file(s) : /etc/rabbitmq/rabbitmq.conf
 cookie hash    : XhdCf8zpVJeJ0EHyaxszPg==
 log(s)         : <stdout>
 database dir   : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-1.rabbitmq-headless.namespace-dev.svc.cluster.local
2019-02-06 21:09:10.617 [error] <0.219.0> Unable to parse vm_memory_high_watermark value "1.6GB"
2019-02-06 21:09:10.617 [info] <0.219.0> Memory high watermark set to 103098 MiB (108106919116 bytes) of 257746 MiB (270267297792 bytes) total
2019-02-06 21:09:10.690 [info] <0.221.0> Enabling free disk space monitoring
2019-02-06 21:09:10.690 [info] <0.221.0> Disk free limit set to 2000MB
2019-02-06 21:09:10.698 [info] <0.224.0> Limiting to approx 1048476 file handles (943626 sockets)
2019-02-06 21:09:10.698 [info] <0.225.0> FHC read buffering:  OFF
2019-02-06 21:09:10.699 [info] <0.225.0> FHC write buffering: ON
2019-02-06 21:09:10.702 [info] <0.211.0> Node database directory at /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-1.rabbitmq-headless.namespace-dev.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-02-06 21:09:10.702 [info] <0.211.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-02-06 21:09:10.702 [info] <0.211.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-02-06 21:09:10.702 [info] <0.211.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-02-06 21:09:10.702 [info] <0.211.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-02-06 21:09:10.710 [info] <0.211.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2019-02-06 21:09:10.711 [error] <0.210.0> CRASH REPORT Process <0.210.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 138
2019-02-06 21:09:10.711 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.default.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,164}]},{rabbit_mnesia,init_with_lock,3,[{file,\"src/rabbit_mnesia.erl\"},{line,144}]},{rabbit_mnesia,init,0,[{file,\"src/rabbit_mnesia.erl\"},{line,111}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit_boot_steps,run_boot_steps,1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit,start,2,[{file,\"src/rabbit.erl\"},{line,815}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,"{failed_connect,[{to_address,{\"kubernetes.defau

Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
[admin@devsvr3 yaml]$ 

What am I doing wrong here?


In the end I fixed it by adding the following search domain to /etc/resolv.conf in my pods:

[my-rabbit-svc].[my-rabbitmq-namespace].svc.[cluster-name]

To add it to my pods, I used this setting in the StatefulSet:

dnsConfig:
    searches:
      - [my-rabbit-svc].[my-rabbitmq-namespace].svc.[cluster-name]
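
As a rough sketch of where this goes: dnsConfig belongs to the pod spec, that is, under spec.template.spec of the StatefulSet, at the same level as containers. The search domain below is an assumption built from the names in the question (service rabbitmq-headless, namespace namespace-dev) and a cluster domain of cluster.local; substitute your own values.

```yaml
# Fragment of the StatefulSet from the question; unchanged fields are omitted.
spec:
  template:
    spec:
      serviceAccountName: rabbitmq
      # dnsConfig is a pod-level field, a sibling of "containers".
      dnsConfig:
        searches:
          # Assumed value: <service>.<namespace>.svc.<cluster-domain>
          - rabbitmq-headless.namespace-dev.svc.cluster.local
      containers:
      - name: rabbitmq
        image: rabbitmq:3.7.10
```

With the default dnsPolicy, entries under searches are appended to the search list that Kubernetes generates for the pod's /etc/resolv.conf.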

Full documentation: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config
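
Separately from the DNS failure, the log in the question also reports `Unable to parse vm_memory_high_watermark value "1.6GB"`. The node then falls back to a watermark of 103098 MiB, far above the container's 2Gi memory limit. The sketch below assumes that this RabbitMQ version only parses whole numbers with a unit suffix, which would match the log, where `disk_free_limit.absolute = 2GB` is accepted. The value 1600MB is an illustrative stand-in for the intended 1.6GB.

```yaml
# Fragment of the rabbitmq-config ConfigMap; other keys are omitted.
data:
  rabbitmq.conf: |
      # Assumption: a whole number plus unit parses where "1.6GB" did not.
      vm_memory_high_watermark.absolute = 1600MB
      disk_free_limit.absolute = 2GB
```

If the change takes effect, the "Memory high watermark set to" line in the startup log should show the new limit instead of the fallback.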
