How to fix terraform destroy failing when using the Kubernetes provider with PVCs in AWS EKS?

We deployed our Kubernetes workloads with the Terraform Kubernetes provider, in the same configuration that creates the EKS cluster itself.

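Afterwards we tried to destroy everything. We had not used the product yet and were only testing the destroy. Running terraform destroy produced the following errors.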

kubernetes_persistent_volume_claim.prometheus-pvc: Still destroying... [id=default/prometheus-pvc, 19m30s elapsed]
kubernetes_persistent_volume_claim.register-pvc[0]: Still destroying... [id=default/register-pvc, 19m30s elapsed]
kubernetes_persistent_volume_claim.register-pvc[0]: Still destroying... [id=default/register-pvc, 19m40s elapsed]
kubernetes_persistent_volume_claim.prometheus-pvc: Still destroying... [id=default/prometheus-pvc, 19m40s elapsed]
kubernetes_persistent_volume_claim.prometheus-pvc: Still destroying... [id=default/prometheus-pvc, 19m50s elapsed]
kubernetes_persistent_volume_claim.register-pvc[0]: Still destroying... [id=default/register-pvc, 19m50s elapsed]
│ Error: Persistent volume claim prometheus-pvc still exists with finalizers: [kubernetes.io/pvc-protection]
│ Error: Persistent volume claim register-pvc still exists with finalizers: [kubernetes.io/pvc-protection]
time=2022-06-17T19:38:38Z level=error msg=1 error occurred:
    * exit status 1
Error destroying Terraform 
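
As far as I understand, the `kubernetes.io/pvc-protection` finalizer is only removed once no pod is using the claim, so one useful check is which pods still reference each stuck claim. Below is a minimal shell sketch of that check. `pods_using_pvc` is a hypothetical helper name, and the `kubectl` jsonpath query in the comment is an assumption that was not run against this cluster.

```shell
# Print the pods whose volumes reference the given claim.
# Input (stdin): one line per pod, "<pod-name><TAB><claim> <claim> ...".
pods_using_pvc() {
  awk -F '\t' -v claim="$1" '
    {
      # Split the claim list on whitespace and look for an exact match.
      n = split($2, claims, " ")
      for (i = 1; i <= n; i++)
        if (claims[i] == claim) { print $1; break }
    }'
}

# Against a live cluster the input could come from something like this
# (assumed query, not run here):
#   kubectl get pods -n default -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{" "}{end}{"\n"}{end}' \
#     | pods_using_pvc prometheus-pvc
```

If a pod shows up for a claim that is stuck in Terminating, that pod probably has to be fully gone before the claim can be deleted.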

Persistent volumes:

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                STORAGECLASS   REASON   AGE
pvc-51256bfd-4e32-4a4f-a24b-c0f47f9e1d63   100Gi      RWO            Delete           Bound    default/db-persistent-storage-db-0   ssd                     171m
pvc-9453236c-ffc3-4161-a205-e057c3e1ba77   20Gi       RWO            Delete           Bound    default/prometheus-pvc               hdd                     171m
pvc-ddfef2b9-9723-4651-916b-2cb75baf0f22   20Gi       RWO            Delete           Bound    default/register-pvc                 ssd                     171m

Persistent volume claims:

kubectl get pvc
NAME                         STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
db-persistent-storage-db-0   Bound         pvc-51256bfd-4e32-4a4f-a24b-c0f47f9e1d63   100Gi      RWO            ssd            173m
prometheus-pvc               Terminating   pvc-9453236c-ffc3-4161-a205-e057c3e1ba77   20Gi       RWO            hdd            173m
register-pvc                 Terminating   pvc-ddfef2b9-9723-4651-916b-2cb75baf0f22   20Gi       RWO            ssd            173m
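
To pick out the stuck claims from output like the above, a small filter can help. This is only a sketch: `terminating_pvcs` is a hypothetical helper, and it assumes the default `kubectl get pvc` column layout with STATUS in the second column.

```shell
# Print the names of claims whose STATUS column is "Terminating".
# Input (stdin): the table printed by `kubectl get pvc` (header on line 1).
terminating_pvcs() {
  awk 'NR > 1 && $2 == "Terminating" { print $1 }'
}

# Possible usage against a live cluster (not run here):
#   kubectl get pvc -n default | terminating_pvcs
```

Each name it prints can then be passed to `kubectl describe pvc <name>` to look at its finalizers and events.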

Here are some of the events:

45m         Normal    VolumeDelete             persistentvolume/pvc-0e5c621a-529c-4458-b224-39ea22a783fc   error deleting EBS volume "vol-0e36ca327609ae963" since volume is currently attached to "i-0bff735f4c0871705"
46m         Warning   NodeNotReady             pod/quicksilver-pg2wb                                       Node is not ready
46m         Normal    Killing                  pod/reducer-0                                               Stopping container reducer
46m         Normal    Killing                  pod/reducer-0                                               Stopping container checkup-buddy
45m         Warning   Unhealthy                pod/reducer-0                                               Readiness probe failed: Get "http://10.0.130.242:9001/": dial tcp 10.0.130.242:9001: connect: connection refused
46m         Warning   NodeNotReady             pod/register-0                                              Node is not ready
44m         Normal    TaintManagerEviction     pod/register-0                                              Cancelling deletion of Pod default/register-0

The instance appears to be part of the Kubernetes Auto Scaling group.

[ec2-user@ip-172-31-16-242 software]$ kubectl get po
NAME                          READY   STATUS        RESTARTS   AGE
auto-updater-27601140-4gqtw   0/1     Error         0          50m
auto-updater-27601140-hzrnl   0/1     Error         0          49m
auto-updater-27601140-kmspn   0/1     Error         0          50m
auto-updater-27601140-m4ws6   0/1     Error         0          49m
auto-updater-27601140-wsdpm   0/1     Error         0          45m
auto-updater-27601140-z2m7r   0/1     Error         0          48m
estimator-0                   3/3     Terminating   0          51m
reducer-0                     1/2     Terminating   0          51m
[ec2-user@ip-172-31-16-242 software]$ kubectl get pvc
NAME                         STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
db-persistent-storage-db-0   Bound         pvc-ca829a02-9bf5-4540-9900-b6e5ab4624a2   100Gi      RWO            ssd            52m
estimator                    Terminating   pvc-e028acd5-eeb1-4028-89c2-a42c1d28091e   200Gi      RWO            hdd            52m
[ec2-user@ip-172-31-16-242 software]$ kubectl get logs estimator-0
error: the server doesn't have a resource type "logs"
[ec2-user@ip-172-31-16-242 software]$ kubectl logs estimator-0
error: a container name must be specified for pod estimator-0, choose one of: [postgres estimator resize-buddy]
[ec2-user@ip-172-31-16-242 software]$ kubectl logs estimator-0 -c estimator
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
[ec2-user@ip-172-31-16-242 software]$ kubectl logs estimator-0 -c postgres
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
[ec2-user@ip-172-31-16-242 software]$ kubectl logs estimator-0 -c resize-buddy
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
[ec2-user@ip-172-31-16-242 software]$

And the details for the reducer pod:

[ec2-user@ip-172-31-16-242 software]$ kubectl logs reducer-0
Default container name "data-cruncher" not found in pod reducer-0
error: a container name must be specified for pod reducer-0, choose one of: [reducer checkup-buddy] or one of the init containers: [set-resource-owner]
[ec2-user@ip-172-31-16-242 software]$ kubectl logs reducer-0 -c reducer
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
[ec2-user@ip-172-31-16-242 software]$ kubectl logs reducer-0 -c checkup-buddy
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)

I even checked the auto-updater pod, and it shows a similar authorization error.

kubectl logs auto-updater-27601140-4gqtw
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)

I inspected the PVC with kubectl edit and found the following annotation.

volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs

[ec2-user@ip-172-31-16-242 ~]$ kubectl describe pv pvc-ca829a02-9bf5-4540-9900-b6e5ab4624a2
Name:              pvc-ca829a02-9bf5-4540-9900-b6e5ab4624a2
Labels:            topology.kubernetes.io/region=us-west-2
                   topology.kubernetes.io/zone=us-west-2b
Annotations:       kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                   pv.kubernetes.io/bound-by-controller: yes
                   pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      ssd
Status:            Terminating (lasts 89m)
Claim:             default/db-persistent-storage-db-0
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          100Gi
Node Affinity:
  Required Terms:
    Term 0:        topology.kubernetes.io/zone in [us-west-2b]
                   topology.kubernetes.io/region in [us-west-2]
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-west-2b/vol-02bc902640cdb406c
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>

There are no volume attachments:

[ec2-user@ip-172-31-16-242 ~]$ kubectl get volumeattachment
No resources found

Nodes:

kubectl get node
NAME                                         STATUS     ROLES    AGE    VERSION
ip-10-0-134-174.us-west-2.compute.internal   NotReady   <none>   3h8m   v1.21.12-eks-5308cf7
ip-10-0-142-12.us-west-2.compute.internal    NotReady   <none>   3h5m   v1.21.12-eks-5308cf7

And there are no nodes listed in the Compute section.

Please suggest how to fix this.
