Kubernetes - 尽管标签选择器匹配，但服务没有端点，相同的无头服务获取端点

2024-6-19 • tag-icon

Kubernetes - 尽管标签选择器匹配，但服务没有端点，相同的无头服务获取端点

我正在部署以下 statefulset，包含两个不同的服务。一个服务用于集群访问 pod（crdb-service.yaml），另一个服务用于 pod 的内部通信（crdb.yaml）。

crdb-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: crdb-service
  labels:
    app: crdb
spec:
  ports:
  - port: 26257
    targetPort: 26257
    name: grpc
  - port: 80
    targetPort: 8080
    name: http
  selector:
    app: crdb

crdb.yaml

apiVersion: v1
kind: Service
metadata:
  name: crdb
  labels:
    app: crdb
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    prometheus.io/scrape: "true"
    prometheus.io/path: "_status/vars"
    prometheus.io/port: "8080"
spec:
  ports:
  - port: 26257
    targetPort: 26257
    name: grpc
  - port: 8080
    targetPort: 8080
    name: http
  publishNotReadyAddresses: true
  clusterIP: None
  selector:
    app: crdb

statefulset.yaml

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: crdb
  labels:
    app: crdb
spec:
  serviceName: "crdb"
  replicas: 5
  template:
    metadata:
      labels:
        app: crdb
    spec:
      serviceAccountName: crdb
      containers:
      - name: crdb
        image: cockroachdb/cockroach:v19.1.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 26257
          name: grpc
        - containerPort: 8080
          name: http
        livenessProbe:
          httpGet:
            path: "/health"
            port: http
            scheme: HTTPS
          initialDelaySeconds: 30
          periodSeconds: 5
        readinessProbe:
          httpGet:
            path: "/health?ready=1"
            port: http
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 2
        volumeMounts:
        - name: datadir
          mountPath: /cockroach/cockroach-data
        - name: certs
          mountPath: /cockroach/cockroach-certs
        env:
        - name: STATEFULSET_NAME
          value: "crdb"
        - name: STATEFULSET_FQDN
          value: "crdb.default.svc.cluster.local"
        - name: COCKROACH_CHANNEL
          value: kubernetes-secure
        command:
          - "/bin/bash"
          - "-ecx"
          - "exec /cockroach/cockroach start --logtostderr --certs-dir /cockroach/cockroach-certs --advertise-host $(hostname -f) --http-host 0.0.0.0 --join crdb-0.crdb,crdb-1.crdb,crdb-2.crdb,crdb-3.crdb,crdb-4.crdb --cache 25% --max-sql-memory 25%"
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        persistentVolumeClaim:
          claimName: datadir
      - name: certs
        emptyDir: {}
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes:
        - "ReadWriteOnce"
      storageClassName: local-crdb-space
      resources:
        requests:
          storage: 1800Gi

现在我检查已部署的服务：

$ kubectl describe service crdb
Name:              crdb
Namespace:         default
Labels:            app=crdb
Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"prometheus.io/path":"_status/vars","prometheus.io/port":"8080","prometheus.io/scrape":"...
                   prometheus.io/path=_status/vars
                   prometheus.io/port=8080
                   prometheus.io/scrape=true
                   service.alpha.kubernetes.io/tolerate-unready-endpoints=true
Selector:          app=crdb
Type:              ClusterIP
IP:                None
Port:              grpc  26257/TCP
TargetPort:        26257/TCP
Endpoints:         10.244.10.24:26257,10.244.2.23:26257,10.244.3.18:26257 + 2 more...
Port:              http  8080/TCP
TargetPort:        8080/TCP
Endpoints:         10.244.10.24:8080,10.244.2.23:8080,10.244.3.18:8080 + 2 more...
Session Affinity:  None
Events:            <none>

$ kubectl describe service crdb-service
Name:              crdb-service
Namespace:         default
Labels:            app=crdb
Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"crdb"},"name":"crdb-service","namespace":"default"},"spec":{"ports":[...
Selector:          app=crdb
Type:              ClusterIP
IP:                10.100.71.172
Port:              grpc  26257/TCP
TargetPort:        26257/TCP
Endpoints:         
Port:              http  80/TCP
TargetPort:        8080/TCP
Endpoints:         
Session Affinity:  None
Events:            <none>

尽管具有完全相同的标签选择器，但集群服务的端点字段为空。检查https://github.com/kubernetes/kubernetes/issues/11795 https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service没有发现原因。

一些可能与当前问题相关的其他信息。我已将集群从 1.13 更新到 1.14 再到 1.15。Pod 在最近添加到集群的节点上运行。之前在新节点上部署的 Pod 存在网络问题（由于 DNS 故障而无法访问，通过net.ipv4.ip_forward = 1在新节点上进行设置解决了此问题）

我怎样才能让服务识别这些吊舱？

答案1

publishNotReadyAddresses: trueNVM，浪费了 2 个小时。它只是需要添加到服务中的字段，用于在启动时发布其 IP 的 Pod。

答案1

相关内容