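It seems I can't delete a Helm release. Its status is stuck at DELETING, and the Kubernetes cleanup job that belongs to it has also failed, with no indication of what caused the failure.
Has anyone run into this before, and how did you resolve it? I also ran the kubectl commands used in the cleanup container by hand, but still nothing happened.
Thanks!
The command output is attached here: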
helm ls --all prometheus-operator --debug
NAME                    REVISION  UPDATED                  STATUS    CHART                      APP VERSION  NAMESPACE
prometheus-operator     1         Mon Aug 5 17:22:14 2019  DELETING  prometheus-operator-6.3.1  0.31.1       monitoring
prometheus-operator-v2  1         Mon Aug 5 19:26:20 2019  DEPLOYED  prometheus-operator-6.4.0  0.31.1       monitoring
kubectl get job prometheus-operator-operator-cleanup -n monitoring
NAME                                  COMPLETIONS  DURATION  AGE
prometheus-operator-operator-cleanup  0/1          19h       19h
kubectl describe jobs/prometheus-operator-operator-cleanup -n monitoring
Name:           prometheus-operator-operator-cleanup
Namespace:      monitoring
Selector:       controller-uid=c6bfd107-b79a-11e9-a527-42010aa80121
Labels:         app=prometheus-operator-operator
                chart=prometheus-operator-6.3.1
                heritage=Tiller
                release=prometheus-operator
Annotations:    helm.sh/hook: pre-delete
                helm.sh/hook-delete-policy: hook-succeeded
                helm.sh/hook-weight: 3
Parallelism:    1
Completions:    1
Start Time:     Mon, 05 Aug 2019 19:04:59 +0300
Pods Statuses:  0 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=prometheus-operator-operator
                    chart=prometheus-operator-6.3.1
                    controller-uid=c6bfd107-b79a-11e9-a527-42010aa80121
                    heritage=Tiller
                    job-name=prometheus-operator-operator-cleanup
                    release=prometheus-operator
  Service Account:  prometheus-operator-operator
  Containers:
   kubectl:
    Image:      k8s.gcr.io/hyperkube:v1.12.1
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
      kubectl delete alertmanager --all; kubectl delete prometheus --all; kubectl delete prometheusrule --all; kubectl delete servicemonitor --all; sleep 10; kubectl delete crd alertmanagers.monitoring.coreos.com; kubectl delete crd prometheuses.monitoring.coreos.com; kubectl delete crd prometheusrules.monitoring.coreos.com; kubectl delete crd servicemonitors.monitoring.coreos.com; kubectl delete crd podmonitors.monitoring.coreos.com;
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:           <none>
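Answer
Found the problem. I'm not sure why the job description showed no events, but I ran the delete again and was able to check the logs of the cleanup pod it spawned: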
Error from server (Forbidden): prometheuses.monitoring.coreos.com is forbidden: User "system:serviceaccount:monitoring:prometheus-operator-operator" cannot list resource "prometheuses" in API group "monitoring.coreos.com" in the namespace "monitoring"
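The problem was caused by an incomplete (interrupted) helm release deletion. During that first deletion, the prometheus operator's service account and its associated clusterrolebinding and clusterrole were removed. On the second helm delete attempt, the cleanup job therefore lacked the permissions it needed to delete everything the first attempt had left behind.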
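If you want to confirm this kind of diagnosis without waiting for the job to run again, one option is to ask the API server what the service account is still allowed to do. Below is a minimal sketch, assuming the service account name and namespace from the error above. The helper name can_i_cmd is made up for illustration, and the script only prints the kubectl auth can-i commands rather than running them, so nothing touches the cluster until you decide to.

```shell
#!/bin/sh
# Service account and namespace taken from the Forbidden error above.
SA="system:serviceaccount:monitoring:prometheus-operator-operator"
NS="monitoring"

# Hypothetical helper: print one "kubectl auth can-i" check.
# $1 = verb (e.g. list, delete), $2 = resource in TYPE.GROUP form.
can_i_cmd() {
  printf 'kubectl auth can-i %s %s --as=%s -n %s\n' "$1" "$2" "$SA" "$NS"
}

# Namespaced custom resources that the cleanup job tries to delete.
for res in alertmanagers prometheuses prometheusrules servicemonitors; do
  can_i_cmd list "$res.monitoring.coreos.com"
  can_i_cmd delete "$res.monitoring.coreos.com"
done
```

Piping the output to sh runs the checks against your current cluster context; each one answers yes or no. A "no" for any of them would match the Forbidden error in the pod logs. Note that impersonating with --as requires your own user to have impersonation rights.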