MountVolume.MountDevice failed: an operation with the given Volume ID already exists

Environment:

Kubernetes cluster with 1 master and 3 nodes, Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-66-generic x86_64) (VMware VMs)

[Dashboard screenshot]

I cannot mount a Pod (a simple nginx image) to the specified volume in a Kubernetes cluster with rook-ceph and the csi-cephfs storage class. It shows the error:

MountVolume.MountDevice failed for volume "pvc-9aad698e-ef82-495b-a1c5-e09d07d0e072" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000001-89d24230-0571-11ea-a584-ce38896d0bb2 already exists

The PVC and PV are both green (Bound). The PVC is ReadWriteMany, but it also fails with ReadWriteOnce.

The Ceph cluster is HEALTH_OK, everything green.

What am I missing?


More logs:

  Normal   Scheduled               <unknown>            default-scheduler        Successfully assigned rook-ceph/csicephfs-demo-pod to <myhost>

  Normal   SuccessfulAttachVolume  2m37s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902"

  Warning  FailedMount             2m17s                kubelet, <myhost>        Unable to attach or mount volumes: unmounted volumes=[mypvc], unattached volumes=[mypvc default-token-wfjxl]: timed out waiting for the condition

  Warning  FailedMount             2m4s                 kubelet, <myhost>        MountVolume.MountDevice failed for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902" : rpc error: code = DeadlineExceeded desc = context deadline exceeded

  Warning  FailedMount             108s (x5 over 2m4s)  kubelet, <myhost>        MountVolume.MountDevice failed for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000001-0bc5ddfc-05f2-11ea-9f0a-bee51ab2829b already exists
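When debugging errors like the ones above, a useful next step is to check the CephFS CSI plugin logs on the affected node. The commands below are a sketch assuming a default Rook deployment in the rook-ceph namespace; the label selector and container name come from Rook's standard manifests and may differ in your setup, and `<csi-cephfsplugin-pod-on-that-node>` is a placeholder you must fill in.

```shell
# List the CephFS CSI nodeplugin pods and see which one runs on the affected node
# (label assumed from Rook's default manifests):
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin -o wide

# Tail that pod's plugin container logs around the stuck MountDevice operation:
kubectl -n rook-ceph logs <csi-cephfsplugin-pod-on-that-node> -c csi-cephfsplugin --tail=100
```

The "operation with the given Volume ID already exists" message typically means a previous mount attempt for the same volume is still in flight (or hung) inside the plugin, so its logs usually show the original failure that started the loop.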

kubectl -n rook-ceph get pv,pvc -o wide
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                        STORAGECLASS   REASON   AGE     VOLUMEMODE
persistentvolume/pvc-c1ad8144-15ae-49f6-a012-d866b74ff902   1Gi        RWX            Delete           Bound    rook-ceph/cephfs-pvc-many2   csi-cephfs              114m    Filesystem
persistentvolume/pvc-d678dd06-7197-4342-934d-33e60edc564a   1Gi        RWO            Delete           Bound    rook-ceph/cephfs-pvc         csi-cephfs              6d19h   Filesystem

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE    VOLUMEMODE
persistentvolumeclaim/cephfs-pvc         Bound    pvc-d678dd06-7197-4342-934d-33e60edc564a   1Gi        RWO            csi-cephfs     11d    Filesystem
persistentvolumeclaim/cephfs-pvc-many2   Bound    pvc-c1ad8144-15ae-49f6-a012-d866b74ff902   1Gi        RWX            csi-cephfs     118m   Filesystem

Original PVC YAML:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-many2
  namespace: rook-ceph
spec:
  accessModes:
  - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-cephfs

Pod:

---
apiVersion: v1
kind: Pod
metadata:
  name: csicephfs-demo-pod
  namespace: rook-ceph
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - name: mypvc
         mountPath: /var/lib/www/html
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: cephfs-pvc-many2
       readOnly: false

Answer 1

I ran into this error, and the fix was to delete the csi-cephfsplugin-provisioner and csi-rbdplugin-provisioner pods and let their ReplicaSets recreate them. Once I did that, all of my PVCs created PVs and bound as expected. It may be enough to kill only the csi-rbdplugin-provisioner pod, so try that first.
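The steps above can be sketched as follows. This assumes the provisioner Deployments carry Rook's default labels (`app=csi-rbdplugin-provisioner`, `app=csi-cephfsplugin-provisioner`); verify the labels in your cluster before deleting by selector.

```shell
# Try the rbd provisioner alone first; its Deployment recreates the pods:
kubectl -n rook-ceph delete pod -l app=csi-rbdplugin-provisioner

# If PVCs still fail to mount, restart the cephfs provisioner as well:
kubectl -n rook-ceph delete pod -l app=csi-cephfsplugin-provisioner

# Watch the replacement pods come back to Running:
kubectl -n rook-ceph get pods -w
```

Deleting the pods is safe because they are managed by Deployments; the controller recreates them immediately, which clears any stuck in-flight CSI operations.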

Answer 2

After a node restart, the rook-ceph external cluster successfully creates PVCs and PVs via the public IP. However, mounting on the node fails because it tries to connect through an unreachable cluster IP. How can I force rook-ceph to use the public IP?

I discovered this by logging into the node (via ssh) and checking the output of sudo dmesg ... but I don't know how to make it use the public IP instead of the private one, since the external OSDs' cluster IP is unreachable from the Kubernetes cluster nodes! Any suggestions would be appreciated! Thanks!
