"CrashLoopBackOff" when deploying MySQL on a multi-node cluster

2024-6-2

This is my configmap.yaml file:


apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  labels:
    app: mysql
    app.kubernetes.io/name: mysql
data:
  primary.cnf: |
    # Apply this config only on the primary.
    [mysqld]
    log-bin    
  replica.cnf: |
    # Apply this config only on replicas.
    [mysqld]
    super-read-only

And this is the content of my mysql-depl.yaml file:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
      app.kubernetes.io/name: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
        app.kubernetes.io/name: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate mysql server-id from pod ordinal index.
          [[ $HOSTNAME =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Copy appropriate conf.d files from config-map to emptyDir.
          if [[ $ordinal -eq 0 ]]; then
            cp /mnt/config-map/primary.cnf /mnt/conf.d/
          else
            cp /mnt/config-map/replica.cnf /mnt/conf.d/
          fi          
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql
        image: gcr.io/google-samples/xtrabackup:1.0
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Skip the clone if data already exists.
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # Skip the clone on primary (ordinal index 0).
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal -eq 0 ]] && exit 0
          # Clone data from previous peer.
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
          # Prepare the backup.
          xtrabackup --prepare --target-dir=/var/lib/mysql          
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "1"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe:
          exec:
            command: ["mysqladmin", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          exec:
            # Check we can execute queries over TCP (skip-networking is off).
            command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup
        image: gcr.io/google-samples/xtrabackup:1.0
        ports:
        - name: xtrabackup
          containerPort: 3307
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql

          # Determine binlog position of cloned data, if any.
          if [[ -f xtrabackup_slave_info && "x$(<xtrabackup_slave_info)" != "x" ]]; then
            # XtraBackup already generated a partial "CHANGE MASTER TO" query
            # because we're cloning from an existing replica. (Need to remove the tailing semicolon!)
            cat xtrabackup_slave_info | sed -E 's/;$//g' > change_master_to.sql.in
            # Ignore xtrabackup_binlog_info in this case (it's useless).
            rm -f xtrabackup_slave_info xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # We're cloning directly from primary. Parse binlog position.
            [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm -f xtrabackup_binlog_info xtrabackup_slave_info
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
          fi

          # Check if we need to complete a clone by starting replication.
          if [[ -f change_master_to.sql.in ]]; then
            echo "Waiting for mysqld to be ready (accepting connections)"
            until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done

            echo "Initializing replication from clone position"
            mysql -h 127.0.0.1 \
                  -e "$(<change_master_to.sql.in), \
                          MASTER_HOST='mysql-0.mysql', \
                          MASTER_USER='root', \
                          MASTER_PASSWORD='', \
                          MASTER_CONNECT_RETRY=10; \
                        START SLAVE;" || exit 1
            # In case of container restart, attempt this at-most-once.
            mv change_master_to.sql.in change_master_to.sql.orig
          fi

          # Start a server to send backups when requested by peers.
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"          
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
---
# Headless service for stable DNS entries of StatefulSet members.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
    app.kubernetes.io/name: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
---
# Client service for connecting to any MySQL instance for reads.
# For writes, you must instead connect to the primary: mysql-0.mysql.
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
    app.kubernetes.io/name: mysql
    readonly: "true"
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql

Both files apply successfully, but when I check the pods with kubectl get pods, I see the following:

NAME      READY   STATUS             RESTARTS      AGE
mysql-0   1/2     CrashLoopBackOff   9 (52s ago)   20m

I am using a Kind cluster on a local Ubuntu 22.04 machine with the following configuration:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

Output of kubectl describe pod mysql-0:

Name:             mysql-0
Namespace:        default
Priority:         0
Service Account:  default
Node:             kind-worker/172.18.0.3
Start Time:       Fri, 30 Dec 2022 11:34:11 -0800
Labels:           app=mysql
                  app.kubernetes.io/name=mysql
                  controller-revision-hash=mysql-7d8d6f4696
                  statefulset.kubernetes.io/pod-name=mysql-0
Annotations:      <none>
Status:           Running
IP:               10.244.1.3
IPs:
  IP:           10.244.1.3
Controlled By:  StatefulSet/mysql
Init Containers:
  init-mysql:
    Container ID:  containerd://0cb850e677789c328b394905d40eee8cab539847fab784162b25f05dbbce0d31
    Image:         mysql
    Image ID:      docker.io/library/mysql@sha256:3d7ae561cf6095f6aca8eb7830e1d14734227b1fb4748092f2be2cfbccf7d614
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      set -ex
      # Generate mysql server-id from pod ordinal index.
      [[ $HOSTNAME =~ -([0-9]+)$ ]] || exit 1
      ordinal=${BASH_REMATCH[1]}
      echo [mysqld] > /mnt/conf.d/server-id.cnf
      # Add an offset to avoid reserved server-id=0 value.
      echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
      # Copy appropriate conf.d files from config-map to emptyDir.
      if [[ $ordinal -eq 0 ]]; then
        cp /mnt/config-map/primary.cnf /mnt/conf.d/
      else
        cp /mnt/config-map/replica.cnf /mnt/conf.d/
      fi          
      
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 30 Dec 2022 11:34:45 -0800
      Finished:     Fri, 30 Dec 2022 11:34:45 -0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/conf.d from conf (rw)
      /mnt/config-map from config-map (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4j87 (ro)
  clone-mysql:
    Container ID:  containerd://0b0ffb57fca6234b09b033963850a7cc7acdc4150b85652f3eb195aef970ca7f
    Image:         gcr.io/google-samples/xtrabackup:1.0
    Image ID:      sha256:cee14c121daad572f219c0128ea1fdff4b81e7faf714c78b853427af3bda1cf7
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      set -ex
      # Skip the clone if data already exists.
      [[ -d /var/lib/mysql/mysql ]] && exit 0
      # Skip the clone on primary (ordinal index 0).
      [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
      ordinal=${BASH_REMATCH[1]}
      [[ $ordinal -eq 0 ]] && exit 0
      # Clone data from previous peer.
      ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
      # Prepare the backup.
      xtrabackup --prepare --target-dir=/var/lib/mysql          
      
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 30 Dec 2022 11:35:13 -0800
      Finished:     Fri, 30 Dec 2022 11:35:13 -0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/mysql/conf.d from conf (rw)
      /var/lib/mysql from data (rw,path="mysql")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4j87 (ro)
Containers:
  mysql:
    Container ID:   containerd://f228587f532100ac6a1685b935877dcfdeb304c30d89542261976f243cbd936b
    Image:          mysql
    Image ID:       docker.io/library/mysql@sha256:3d7ae561cf6095f6aca8eb7830e1d14734227b1fb4748092f2be2cfbccf7d614
    Port:           3306/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 30 Dec 2022 15:18:57 -0800
      Finished:     Fri, 30 Dec 2022 15:18:58 -0800
    Ready:          False
    Restart Count:  47
    Requests:
      cpu:      500m
      memory:   1Gi
    Liveness:   exec [mysqladmin ping] delay=30s timeout=5s period=10s #success=1 #failure=3
    Readiness:  exec [mysql -h 127.0.0.1 -e SELECT 1] delay=5s timeout=1s period=2s #success=1 #failure=3
    Environment:
      MYSQL_ALLOW_EMPTY_PASSWORD:  1
    Mounts:
      /etc/mysql/conf.d from conf (rw)
      /var/lib/mysql from data (rw,path="mysql")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4j87 (ro)
  xtrabackup:
    Container ID:  containerd://9c4b58c109e500e1cc2c68d3f1830cc065e850713b01a724d5a293a21fc21bb5
    Image:         gcr.io/google-samples/xtrabackup:1.0
    Image ID:      sha256:cee14c121daad572f219c0128ea1fdff4b81e7faf714c78b853427af3bda1cf7
    Port:          3307/TCP
    Host Port:     0/TCP
    Command:
      bash
      -c
      set -ex
      cd /var/lib/mysql
      
      # Determine binlog position of cloned data, if any.
      if [[ -f xtrabackup_slave_info && "x$(<xtrabackup_slave_info)" != "x" ]]; then
        # XtraBackup already generated a partial "CHANGE MASTER TO" query
        # because we're cloning from an existing replica. (Need to remove the tailing semicolon!)
        cat xtrabackup_slave_info | sed -E 's/;$//g' > change_master_to.sql.in
        # Ignore xtrabackup_binlog_info in this case (it's useless).
        rm -f xtrabackup_slave_info xtrabackup_binlog_info
      elif [[ -f xtrabackup_binlog_info ]]; then
        # We're cloning directly from primary. Parse binlog position.
        [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
        rm -f xtrabackup_binlog_info xtrabackup_slave_info
        echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
              MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
      fi
      
      # Check if we need to complete a clone by starting replication.
      if [[ -f change_master_to.sql.in ]]; then
        echo "Waiting for mysqld to be ready (accepting connections)"
        until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done
      
        echo "Initializing replication from clone position"
        mysql -h 127.0.0.1 \
              -e "$(<change_master_to.sql.in), \
                      MASTER_HOST='mysql-0.mysql', \
                      MASTER_USER='root', \
                      MASTER_PASSWORD='', \
                      MASTER_CONNECT_RETRY=10; \
                    START SLAVE;" || exit 1
        # In case of container restart, attempt this at-most-once.
        mv change_master_to.sql.in change_master_to.sql.orig
      fi
      
      # Start a server to send backups when requested by peers.
      exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
        "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"          
      
    State:          Running
      Started:      Fri, 30 Dec 2022 11:35:46 -0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /etc/mysql/conf.d from conf (rw)
      /var/lib/mysql from data (rw,path="mysql")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4j87 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-mysql-0
    ReadOnly:   false
  conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  config-map:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mysql
    Optional:  false
  kube-api-access-r4j87:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  BackOff  65s (x987 over 3h43m)  kubelet  Back-off restarting failed container

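P.S. I would also like to know why it shows READY 1/2. I previously deployed Cassandra the same way and it showed READY 1/1, and so did mysql on a 1-node machine. What has been added here?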

Edit: this is the output of kubectl logs mysql-0 -c mysql:

2022-12-31 00:48:16+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.31-1.el8 started.
2022-12-31 00:48:16+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2022-12-31 00:48:16+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.31-1.el8 started.
2022-12-31T00:48:17.443527Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.31) starting as process 1
2022-12-31T00:48:17.443610Z 0 [ERROR] [MY-010338] [Server] Can't find error-message file '/usr/share/mysql-8.0/errmsg.sys'. Check error-message file location and 'lc-messages-dir' configuration directive.
2022-12-31T00:48:17.507830Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2022-12-31T00:48:17.637835Z 1 [ERROR] [MY-012960] [InnoDB] Cannot create redo log files because data files are corrupt or the database was not shut down cleanly after creating the data files.
2022-12-31T00:48:18.122850Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine
2022-12-31T00:48:18.123114Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2022-12-31T00:48:18.123131Z 0 [ERROR] [MY-010119] [Server] Aborting
2022-12-31T00:48:18.123592Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.31)  MySQL Community Server - GPL.

Answer 1

Here is half an answer for you:

P.S. I would also like to know why it shows READY 1/2?

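The READY x/y column tells you how many of the pod's containers are in the "ready" state (x) out of the total number of containers in the pod (y). In your Cassandra deployment there is only one container. In your MySQL deployment there are two: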

the container named mysql, based on the image mysql:5.7
the container named xtrabackup, based on the image gcr.io/google-samples/xtrabackup:1.0

The xtrabackup container is running, but the mysql container is failing, so you see READY 1/2.
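To see which container is holding the pod back, the per-container ready flags can be listed directly. What follows is only a minimal sketch: ready_ratio is a hypothetical helper name (not part of kubectl) that recomputes the x/y value from lines of the form name=true or name=false, and the commented kubectl call assumes kubectl is configured for the same cluster as above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a kubectl feature): reads lines of the form
# "<container>=<true|false>" on stdin and prints "<ready>/<total>",
# which is the same ratio the READY column reports.
ready_ratio() {
  local ready=0 total=0 line
  while IFS= read -r line; do
    # Skip blank lines so a trailing newline does not count as a container.
    if [[ -z $line ]]; then
      continue
    fi
    total=$((total + 1))
    # Everything after the last "=" is the ready flag.
    if [[ ${line##*=} == true ]]; then
      ready=$((ready + 1))
    fi
  done
  echo "${ready}/${total}"
}

# Against a live cluster (assumption: kubectl points at the Kind cluster):
#   kubectl get pod mysql-0 \
#     -o jsonpath='{range .status.containerStatuses[*]}{.name}={.ready}{"\n"}{end}' \
#     | ready_ratio
```

Running the kubectl command without the final pipe prints one name=flag line per container, which shows at a glance which of the two is not ready.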
