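Environment:
Kubernetes cluster with 1 master and 3 nodes, Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-66-generic x86_64), running as VMware VMs.
A pod (a simple nginx image) cannot mount its volume in a Kubernetes cluster that uses rook-ceph with the csi-cephfs storage class. It fails with this error: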
MountVolume.MountDevice failed for volume "pvc-9aad698e-ef82-495b-a1c5-e09d07d0e072" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000001-89d24230-0571-11ea-a584-ce38896d0bb2 already exists
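The PVC and PV are both green (Bound). The PVC is ReadWriteMany, but the mount also fails with ReadWriteOnce.
The Ceph cluster reports HEALTH_OK, and everything is green.
What am I missing?
More logs: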
Normal Scheduled <unknown> default-scheduler Successfully assigned rook-ceph/csicephfs-demo-pod to <myhost>
Normal SuccessfulAttachVolume 2m37s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902"
Warning FailedMount 2m17s kubelet, <myhost> Unable to attach or mount volumes: unmounted volumes=[mypvc], unattached volumes=[mypvc default-token-wfjxl]: timed out waiting for the condition
Warning FailedMount 2m4s kubelet, <myhost> MountVolume.MountDevice failed for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning FailedMount 108s (x5 over 2m4s) kubelet, <myhost> MountVolume.MountDevice failed for volume "pvc-c1ad8144-15ae-49f6-a012-d866b74ff902" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000001-0bc5ddfc-05f2-11ea-9f0a-bee51ab2829b already exists
kubectl -n rook-ceph get pv,pvc -o wide
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                        STORAGECLASS   REASON   AGE     VOLUMEMODE
persistentvolume/pvc-c1ad8144-15ae-49f6-a012-d866b74ff902   1Gi        RWX            Delete           Bound    rook-ceph/cephfs-pvc-many2   csi-cephfs              114m    Filesystem
persistentvolume/pvc-d678dd06-7197-4342-934d-33e60edc564a   1Gi        RWO            Delete           Bound    rook-ceph/cephfs-pvc         csi-cephfs              6d19h   Filesystem

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE    VOLUMEMODE
persistentvolumeclaim/cephfs-pvc         Bound    pvc-d678dd06-7197-4342-934d-33e60edc564a   1Gi        RWO            csi-cephfs     11d    Filesystem
persistentvolumeclaim/cephfs-pvc-many2   Bound    pvc-c1ad8144-15ae-49f6-a012-d866b74ff902   1Gi        RWX            csi-cephfs     118m   Filesystem
Original PVC YAML:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-many2
  namespace: rook-ceph
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-cephfs
Pod:
---
apiVersion: v1
kind: Pod
metadata:
  name: csicephfs-demo-pod
  namespace: rook-ceph
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: cephfs-pvc-many2
        readOnly: false
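Answer 1
I ran into this error, and the fix was to delete the csi-cephfsplugin-provisioner and csi-rbdplugin-provisioner pods and let the ReplicaSet recreate them. Once I did that, all my PVCs created PVs and bound as expected. I may only have needed to kill the csi-rbdplugin-provisioner pod, so try that one first.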
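The restart described in this answer could be sketched as a small shell helper. This is only an illustration, not a confirmed procedure: the function name restart_csi_provisioner is made up for this example, and it assumes the provisioner pods carry the labels app=csi-rbdplugin-provisioner and app=csi-cephfsplugin-provisioner in the rook-ceph namespace. Check the actual labels first with `kubectl -n rook-ceph get pods --show-labels`.

```shell
# Sketch: delete the Rook CSI provisioner pods so that their controller
# recreates them. The helper name and the "app=<name>" label are assumptions.
restart_csi_provisioner() {
  # $1: value of the app label, e.g. csi-rbdplugin-provisioner
  # $2: namespace (defaults to rook-ceph, as used in the question)
  kubectl -n "${2:-rook-ceph}" delete pod -l "app=$1"
}

# Usage (RBD provisioner first, as the answer suggests):
#   restart_csi_provisioner csi-rbdplugin-provisioner
#   restart_csi_provisioner csi-cephfsplugin-provisioner
```

Deleting by label rather than by pod name avoids having to look up the generated pod suffixes, and the replacement pods are created automatically.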
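Answer 2
After a node restart, my rook ceph external cluster creates the PVC and PV successfully via the public IP. Attaching the volume on the node then fails, because it tries to go through the cluster IP, which is not reachable. How can I force rook ceph to use the public IP?
I found this out by logging in to the node (over ssh) and checking the output of sudo dmesg. I do not know how to make it use the public IP instead of the private IP, though, since the cluster IP of the external OSDs is not reachable from the Kubernetes cluster nodes. Any suggestions would be much appreciated. Thanks!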
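The dmesg check mentioned here can be narrowed down with a simple filter. The sketch below assumes the volume is mounted by the Ceph kernel client, whose kernel log lines contain "ceph" or "libceph"; the helper name ceph_kernel_messages is hypothetical.

```shell
# Sketch: keep only kernel log lines that mention ceph (this also matches
# libceph). Reads the log text from standard input.
ceph_kernel_messages() {
  grep -i 'ceph'
}

# Usage, on the node where the pod is scheduled:
#   sudo dmesg | ceph_kernel_messages
```

The addresses shown in the remaining lines indicate which IPs the node is trying to reach, which makes it possible to tell whether it is using the public or the cluster network.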