helm delete release stuck in DELETING status

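It looks like I can't delete a Helm release. Its status is stuck at DELETING, and the Kubernetes cleanup job that belongs to it has also failed, with no indication of what caused the failure.

Has anyone run into this before, and how did you resolve it? I also ran the kubectl commands used in the cleanup container on their own, but still nothing happened.

Thanks!

The command output is attached here: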

helm ls --all prometheus-operator --debug

NAME                    REVISION        UPDATED                         STATUS          CHART                           APP VERSION     NAMESPACE
prometheus-operator     1               Mon Aug  5 17:22:14 2019        DELETING        prometheus-operator-6.3.1       0.31.1          monitoring
prometheus-operator-v2  1               Mon Aug  5 19:26:20 2019        DEPLOYED        prometheus-operator-6.4.0       0.31.1          monitoring

kubectl get job prometheus-operator-operator-cleanup -n monitoring

NAME                                   COMPLETIONS   DURATION   AGE
prometheus-operator-operator-cleanup   0/1           19h        19h

kubectl describe jobs/prometheus-operator-operator-cleanup -n monitoring

Name:           prometheus-operator-operator-cleanup
Namespace:      monitoring
Selector:       controller-uid=c6bfd107-b79a-11e9-a527-42010aa80121
Labels:         app=prometheus-operator-operator
                chart=prometheus-operator-6.3.1
                heritage=Tiller
                release=prometheus-operator
Annotations:    helm.sh/hook: pre-delete
                helm.sh/hook-delete-policy: hook-succeeded
                helm.sh/hook-weight: 3
Parallelism:    1
Completions:    1
Start Time:     Mon, 05 Aug 2019 19:04:59 +0300
Pods Statuses:  0 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=prometheus-operator-operator
                    chart=prometheus-operator-6.3.1
                    controller-uid=c6bfd107-b79a-11e9-a527-42010aa80121
                    heritage=Tiller
                    job-name=prometheus-operator-operator-cleanup
                    release=prometheus-operator
  Service Account:  prometheus-operator-operator
  Containers:
   kubectl:
    Image:      k8s.gcr.io/hyperkube:v1.12.1
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
      kubectl delete alertmanager   --all; kubectl delete prometheus     --all; kubectl delete prometheusrule --all; kubectl delete servicemonitor --all; sleep 10; kubectl delete crd alertmanagers.monitoring.coreos.com; kubectl delete crd prometheuses.monitoring.coreos.com; kubectl delete crd prometheusrules.monitoring.coreos.com; kubectl delete crd servicemonitors.monitoring.coreos.com; kubectl delete crd podmonitors.monitoring.coreos.com;

    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:           <none>

Answer 1

You probably just need to run:

kubectl edit jobs/prometheus-operator-operator-cleanup -n monitoring

and remove the finalizers block from the resource.
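If editing interactively is inconvenient, the same change can probably be made with a patch. This is a minimal sketch, assuming the Job actually carries a metadata.finalizers list (the describe output in the question does not show one either way). The function name clear_job_finalizers is made up for illustration.

```shell
# Hypothetical helper: clears metadata.finalizers on a Job without opening an editor.
# Assumption: the Job has finalizers set; if it has none, the patch changes nothing.
clear_job_finalizers() {
  job="$1"
  ns="$2"
  # Print the current finalizers first (empty output means there are none).
  kubectl get job "$job" -n "$ns" -o jsonpath='{.metadata.finalizers}'
  # In a JSON merge patch, setting a field to null removes it.
  kubectl patch job "$job" -n "$ns" --type=merge -p '{"metadata":{"finalizers":null}}'
}
```

Usage for the job in the question would be: clear_job_finalizers prometheus-operator-operator-cleanup monitoring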

Answer 2

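Found the problem. I'm not sure why the job description showed no events, but I ran the delete again and was able to check the logs of the cleanup pod it spawned: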

Error from server (Forbidden): prometheuses.monitoring.coreos.com is forbidden: User "system:serviceaccount:monitoring:prometheus-operator-operator" cannot list resource "prometheuses" in API group "monitoring.coreos.com" in the namespace "monitoring"
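For anyone wanting to repeat this check, the following sketch shows one way to read the hook pod's logs and to ask the API server directly whether the service account may list the resource. The function name diagnose_cleanup_job is hypothetical. The job-name label is the one visible in the pod template of the describe output above.

```shell
# Hypothetical helper: inspects why the pre-delete cleanup hook fails.
diagnose_cleanup_job() {
  ns="$1"    # namespace, e.g. monitoring
  job="$2"   # Job name, e.g. prometheus-operator-operator-cleanup
  sa="$3"    # service account used by the Job's pod
  # Logs of the pod(s) created by the Job, selected via the job-name label.
  kubectl logs -n "$ns" -l "job-name=$job" --tail=50
  # Ask whether the service account is allowed to list the custom resource.
  kubectl auth can-i list prometheuses.monitoring.coreos.com \
    -n "$ns" --as="system:serviceaccount:$ns:$sa"
}
```

With the names from the question, the call would be: diagnose_cleanup_job monitoring prometheus-operator-operator-cleanup prometheus-operator-operator. An answer of "no" from the second command matches the Forbidden error above.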

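The problem was caused by an incomplete (interrupted) deletion of the Helm release. During that first deletion, the Prometheus operator's service account and its associated ClusterRoleBinding and ClusterRole were removed, so on the second helm delete attempt the cleanup job lacked the permissions needed to delete everything the first attempt had left behind.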
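The answer does not say which commands were finally used to get rid of the release. One possible way out, sketched here as an assumption rather than the author's actual fix, is to remove the stuck hook Job and purge the release with hooks disabled, using the Helm 2 flags --purge and --no-hooks. The function name purge_without_hooks is hypothetical. The sketch deliberately leaves the custom resources and CRDs alone, because the prometheus-operator-v2 release in the same namespace presumably still relies on them.

```shell
# Hypothetical recovery sketch for Helm 2 (Tiller); run with sufficient cluster permissions.
purge_without_hooks() {
  release="$1"  # release name, e.g. prometheus-operator
  job="$2"      # stuck hook Job, e.g. prometheus-operator-operator-cleanup
  ns="$3"       # namespace, e.g. monitoring
  # Remove the hook Job that can never complete because its RBAC is gone.
  kubectl delete job "$job" -n "$ns" --ignore-not-found
  # Purge the release without running the pre-delete hook again.
  helm delete --purge "$release" --no-hooks
}
```

Usage with the names from the question: purge_without_hooks prometheus-operator prometheus-operator-operator-cleanup monitoring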
