I had 3 Cassandra nodes/pods running. I deleted them and tried to recreate them on the same kind cluster with the same YAML file below, but they are stuck in the Pending state:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  # do not use these in production until ssd GCEPersistentDisk or other ssd pd
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
After searching around online, I believe the problem is caused by insufficient resources, and my guess is that the resources previously allocated to the deleted nodes/pods are still being held. But I don't know how to release them?
I tried kubectl top nodes:
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
kind-control-plane   205m         2%     1046Mi          6%
kind-worker          171m         2%     2612Mi          16%
Everything looks fine there?
Maybe the problem is with the disk allocation instead, but I don't know how to check that?
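Would something like the following show it, assuming everything from the manifest above lives in the default namespace and the claims follow the usual <template-name>-<pod-name> naming?

# List the claims and volumes that may be left over from the previous run
kubectl get pvc
kubectl get pv

# Inspect one claim in detail (name assumed, e.g. cassandra-data-cassandra-0)
kubectl describe pvc cassandra-data-cassandra-0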
Answer 1
If a Pod is stuck in the Pending state, it is usually due to a lack of resources. First, check the events to find out why the pod is Pending. To do that, use the following command:
kubectl describe pod <pod-name>
The events will give insight into why the Pod is Pending. A common reason for a Pod to stay Pending is a lack of memory or storage; you may have exhausted the resources available on your nodes. One way to recover exhausted resources is to clean up the nodes by deleting pods and deployments that are no longer needed.
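As a minimal sketch for this particular setup (pod and node names taken from the question, default namespace assumed):

# The Events section at the bottom of the output usually names the missing
# resource (insufficient CPU/memory, or an unbound PersistentVolumeClaim).
kubectl describe pod cassandra-0

# Unlike "kubectl top nodes", the "Allocated resources" section here counts
# requests/limits, which is what the scheduler actually checks against.
kubectl describe node kind-worker

# A cluster-wide view of recent events can also help.
kubectl get events --sort-by=.lastTimestamp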
The official documentation has information on debugging pods in Kubernetes.
This documentation helps with debugging StatefulSets in k8s.
If you need an example of deploying Cassandra in k8s, this official k8s documentation will help.
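One StatefulSet-specific detail that matters here: deleting the pods (or even the StatefulSet itself) does not delete the PersistentVolumeClaims created from volumeClaimTemplates, so the 1Gi claims from the previous run keep holding their storage until they are removed explicitly. A sketch of the cleanup, assuming the default namespace and the claim names produced by the manifest above:

# PVCs created by the StatefulSet are named <template-name>-<pod-name>.
kubectl get pvc

# Delete them only if the old Cassandra data is no longer needed - this is destructive.
kubectl delete pvc cassandra-data-cassandra-0 cassandra-data-cassandra-1 cassandra-data-cassandra-2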