PostgreSQL keepalived gluster | FAULT state

I am trying to implement a simple failover using Keepalived, Postgres and Gluster.

I am using CentOS 7.

I have a replicated Gluster volume mounted at '/var/lib/pgsql' on both nodes.

Shared IP (Keepalived): 192.168.1.20
node01: 192.168.1.11
node02: 192.168.1.12

Contents of the pgsql-check script:

#!/usr/bin/python

import subprocess
import sys

try:
    subprocess.check_call(['/usr/bin/systemctl', 'status', 'postgresql.service'])
    sys.exit(0)
except subprocess.CalledProcessError:
    sys.exit(3)

Contents of the notify script:

#!/usr/bin/python

import sys
import subprocess

if sys.argv[3] == "MASTER":
    try:
        subprocess.check_call(['/usr/bin/systemctl', 'start', 'postgresql.service'])
    except subprocess.CalledProcessError:
        pass
    sys.exit(0)

if sys.argv[3] == "BACKUP":
    try:
        subprocess.check_call(['/usr/bin/systemctl', 'stop', 'postgresql.service'])
    except subprocess.CalledProcessError:
        pass
    sys.exit(0)

if sys.argv[3] == "FAULT":
    try:
        subprocess.check_call(['/usr/bin/systemctl', 'stop', 'postgresql.service'])
    except subprocess.CalledProcessError:
        pass
    sys.exit(0)

sys.exit(1)

keepalived.conf:

vrrp_script chk_pgsql {
  script       "/etc/keepalived/pgsql-check"
  interval 2   # check every 2 seconds
  fall 2       # require 2 failures for KO
  rise 2       # require 2 successes for OK
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.20
    }
    track_script {
        chk_pgsql
    }
    notify "/etc/keepalived/notify"
}

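On boot, the machines enter the FAULT state, but the master needs to enter the MASTER state. When I start postgres manually and restart keepalived on the master, everything works. When I try a failover, both machines end up in the FAULT state and do not recover.

Can anyone help with the configuration/scripts? Am I misunderstanding the notify or check mechanism?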

Answer 1

When I give the script a weight of 1, like this:

vrrp_script chk_pgsql {
  script       "/etc/keepalived/check-pgsql"
  interval 1  
  fall 3      
  rise 1      
  weight 1
}

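then suddenly everything works as expected. The default weight is 0, and with a weight of 0 a failing check script puts the VRRP instance into the FAULT state instead of only adjusting its priority.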
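
To illustrate why the weight matters, here is a minimal Python sketch of how keepalived treats a tracked script, as I understand it from the keepalived.conf documentation. It is a simplified model, not keepalived code; the function name `effective_state` and the "ELIGIBLE" label are made up for illustration.

```python
def effective_state(base_priority, weight, check_ok):
    """Return (state, priority) for one node in a simplified model
    of keepalived's track_script handling.

    state is "FAULT" or "ELIGIBLE" (able to take part in the
    MASTER/BACKUP election); priority is None while in FAULT.
    """
    if weight == 0:
        # Default weight: a failing check removes the node from the election.
        if not check_ok:
            return ("FAULT", None)
        return ("ELIGIBLE", base_priority)
    if weight > 0:
        # Positive weight: added to the priority while the check succeeds.
        bonus = weight if check_ok else 0
        return ("ELIGIBLE", base_priority + bonus)
    # Negative weight: applied to the priority while the check fails.
    penalty = 0 if check_ok else weight
    return ("ELIGIBLE", base_priority + penalty)


# PostgreSQL is stopped on a node until it becomes MASTER,
# so the check fails there at first.
print(effective_state(100, 0, check_ok=False))  # ('FAULT', None)
print(effective_state(100, 1, check_ok=False))  # ('ELIGIBLE', 100)
print(effective_state(100, 1, check_ok=True))   # ('ELIGIBLE', 101)
```

With weight 0, neither node can leave FAULT in the setup from the question: postgres is only started by the MASTER notification, but the check already requires it to be running.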

I found this out after reading this link: http://comments.gmane.org/gmane.linux.keepalived.devel/2586

It is not an answer, but it pointed me in the right direction.

Current configuration:

vrrp_script chk_pgsql {
  script       "/etc/keepalived/check-pgsql"
  interval 1  
  fall 3    
  rise 1     
  weight 1
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_pgsql
    }
    virtual_ipaddress {
        192.168.1.20
    }
    notify_master "/etc/keepalived/start-pgsql"
    notify_backup "/etc/keepalived/stop-pgsql"
    notify_fault "/etc/keepalived/stop-pgsql"
    notify_stop "/etc/keepalived/stop-pgsql"
}
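
The start-pgsql and stop-pgsql scripts are not shown above. A minimal sketch of what they might share, assuming they simply wrap systemctl the same way the notify script from the question does (the helper names `pgsql_command` and `run_action` are hypothetical):

```python
import subprocess

SYSTEMCTL = '/usr/bin/systemctl'
UNIT = 'postgresql.service'


def pgsql_command(action):
    """Build the argument list for 'systemctl <action> postgresql.service'."""
    if action not in ('start', 'stop'):
        raise ValueError('unsupported action: %s' % action)
    # Each argument is its own list item; a single combined string would
    # be looked up as the name of the executable and fail.
    return [SYSTEMCTL, action, UNIT]


def run_action(action, run=subprocess.call):
    """Run systemctl for the given action and return its exit status.

    'run' is injectable so the helper can be exercised without systemd.
    """
    return run(pgsql_command(action))
```

Under this assumption, start-pgsql would end with `run_action('start')` and stop-pgsql with `run_action('stop')`.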
