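I have a running k3s cluster made up of one server node and three worker nodes. I want to add a second server node so that the cluster stays highly available (HA) if a server node fails. I am using k3sup and kube-vip. I have already added the VIP to the master node and can ping it, but the script below fails when I run it. When I log in to the failed node and check the logs, I see the error shown after the script: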
domain='domain.local'
hostname='k3s-master02'
master='k3s-cluster01'
k3sup join --host "$hostname"."$domain" \
--server --server-host "$master"."$domain" \
--user my_user --k3s-channel stable \
--k3s-extra-args '--disable servicelb --disable traefik'
Dec 23 13:22:57 k3s-master02.domain.local sh[18740]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Dec 23 13:22:57 k3s-master02.domain.local systemctl[18741]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Dec 23 13:22:57 k3s-master02.domain.local k3s[18744]: time="2023-12-23T13:22:57-05:00" level=info msg="Starting k3s v1.28.4+k3s2 (6ba6c1b6)"
Dec 23 13:22:57 k3s-master02.domain.local k3s[18744]: time="2023-12-23T13:22:57-05:00" level=fatal msg="starting kubernetes: preparing server: CA cert validation failed: Get \"https://k3s-cluster01.domain.>
Dec 23 13:22:57 k3s-master02.domain.local systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Dec 23 13:22:57 k3s-master02.domain.local systemd[1]: k3s.service: Failed with result 'exit-code'.
Dec 23 13:22:57 k3s-master02.domain.local systemd[1]: Failed to start k3s.service - Lightweight Kubernetes.