Keepalived does not regain quorum when the tracked process comes back up


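I am running keepalived v2.0.19 on CentOS 7 with a VRRP instance that tracks whether the haproxy process exists. Unfortunately, after the haproxy process is restarted, the VRRP instance never leaves the FAULT state.

Here is my configuration: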

vrrp_track_process chk_service {
    process haproxy
    weight 0
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100 dev eth0 label eth0:shared
    }
    track_process {
        chk_service
    }
}

The syslog shows that quorum is lost when the haproxy process stops, but quorum is never regained when the haproxy process comes back a few seconds later.

systemd: Stopping HAProxy Load Balancer...
haproxy: [WARNING] 330/081104 (72258) : Exiting Master process...
haproxy: [ALERT] 330/081104 (72258) : Current program 'dataplane-api' (72260) exited with code 0 (Exit)
haproxy: [ALERT] 330/081104 (72258) : Current worker #1 (72261) exited with code 143 (Terminated)
haproxy: [WARNING] 330/081104 (72258) : All workers exited. Exiting... (0)
systemd: Stopped HAProxy Load Balancer.
Keepalived_vrrp[72335]: Quorum lost for tracked process chk_service
Keepalived_vrrp[72335]: (VI_1) Entering FAULT STATE
Keepalived_vrrp[72335]: (VI_1) sent 0 priority
Keepalived_vrrp[72335]: (VI_1) removing VIPs.
systemd: Starting HAProxy Load Balancer...
haproxy[113178]: Proxy stats started.
haproxy[113178]: Proxy main started.
haproxy[113178]: Proxy app started.
haproxy: [NOTICE] 330/081112 (113178) : New program 'dataplane-api' (113179) forked
haproxy: [NOTICE] 330/081112 (113178) : New worker #1 (113180) forked
systemd: Started HAProxy Load Balancer.

Note that when I start the keepalived process itself, it correctly detects that the haproxy process is present.

Here is the output of keepalived -v:

Keepalived v2.0.19 (unknown)

Copyright(C) 2001-2019 Alexandre Cassen, <[email protected]>

Built with kernel headers for Linux 3.10.0
Running on Linux 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019

configure options: --prefix=/opt/keepalived

Config options:  LIBIPTC LIBIPSET_DYNAMIC LVS VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING

System options:  PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV6_ADVANCED_API LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK FRA_OIFNAME IFA_FLAGS IP_MULTICAST_ALL LIBIPTC NET_LINUX_IF_H_COLLISION LIBIPVS_NETLINK VRRP_VMAC IFLA_LINK_NETNSID CN_PROC SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE SO_MARK SCHED_RT SCHED_RESET_ON_FORK

I tried setting the minimum and maximum quorum values, without success.

Has anyone run into the same problem?

Answer 1

We hit the same problem with keepalived 2.0.19.

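In our case, the problem was that for processes with a PID greater than 32767, keepalived tried to open the file /proc/xxxxx/comm, where xxxxx was a negative number. So if the machine has been up for a long time and PIDs have grown large, you can run into this behavior.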
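To illustrate how a PID above 32767 can end up as a negative number, here is a minimal Python sketch. It assumes the PID is being narrowed to a signed 16-bit integer somewhere, which matches the 32767 threshold but is only my guess at the mechanism, not something taken from the keepalived source. The helper names `as_int16` and `comm_path` are made up for this example.

```python
import ctypes


def as_int16(pid: int) -> int:
    """Return the value a PID takes if narrowed to a signed 16-bit integer.

    Assumption: this narrowing is what produces the negative number;
    it is not confirmed against the keepalived code.
    """
    return ctypes.c_int16(pid).value


def comm_path(pid: int) -> str:
    """Build the /proc path that would be opened for the narrowed PID."""
    return f"/proc/{as_int16(pid)}/comm"


# Boundary values, plus the PID of the restarted haproxy master
# from the syslog excerpt in the question (113178).
for pid in (32767, 32768, 113178):
    print(pid, "->", comm_path(pid))
```

Under that assumption, the restarted haproxy master (PID 113178 in the question's log) would be looked up under a negative path, which does not exist, so the process would never be seen as running again.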

Fortunately, keepalived 2.0.20 fixes this bug, as noted in the changelog:

  • Fix track_process with PIDs > 32767

https://www.keepalived.org/changelog.html (release 2.0.20)
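If you want to check quickly whether a host is likely exposed, something along these lines may help. It is a rough sketch: it parses the first line of the `keepalived -v` output, compares it with 2.0.20 (the release the changelog entry above belongs to), and reads the kernel's PID ceiling from /proc/sys/kernel/pid_max. The function names are hypothetical, and I have not checked which releases before 2.0.19 are affected.

```python
import re

# Release whose changelog lists the track_process PID fix.
FIXED_IN = (2, 0, 20)


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from a line such as 'Keepalived v2.0.19 (unknown)'."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_pid_fix(version_line: str) -> bool:
    """True if the reported version is 2.0.20 or later."""
    return parse_version(version_line) >= FIXED_IN


def pid_max(path: str = "/proc/sys/kernel/pid_max"):
    """Return the kernel's PID ceiling, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    line = "Keepalived v2.0.19 (unknown)"  # first line of `keepalived -v` from the question
    print("has fix:", has_pid_fix(line))
    print("pid_max:", pid_max())
```

If the version predates the fix and pid_max is above 32767, tracked processes can eventually get PIDs in the problematic range.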
