A Keepalived script that monitors the PostgreSQL process and switches the IP automatically

admin

2021-03-03 09:52:29

keepalived uses the VRRP protocol to move a virtual IP between hosts, in either preemptive or non-preemptive mode.

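Within a single network segment, several hosts share one service address, the virtual IP (VIP). At any moment the VIP is bound to exactly one of them, so the hosts form one master and one or more backups.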

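If the current master fails, keepalived automatically rebinds the VIP to one of the backups. Clients keep reaching the service at the same address and the failure is invisible from outside, which is exactly what host-level failover requires. Take web servers as an example: every host runs nginx, and when nginx on the master fails, keepalived (provided it is configured to check nginx) sends the traffic to nginx on a backup.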
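To see which host currently holds the VIP, the address list of each node can be inspected. The following is a minimal sketch, not part of the original scripts: has_vip is a hypothetical helper that reads the one-line-per-address output of ip -o -4 addr show from stdin and compares the address field exactly instead of grepping for it.

```shell
#!/bin/sh
# has_vip VIP: succeed if VIP appears as an address in the listing on stdin.
# Hypothetical helper; expects the format printed by `ip -o -4 addr show`.
has_vip() {
    vip=$1
    # Field 4 is ADDRESS/PREFIX, for example 192.168.56.204/24.
    # Strip the prefix length and compare the address as a plain string.
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

On a node, it could be used as: ip -o -4 addr show | has_vip 192.168.56.204 && echo "this node holds the VIP".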

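In a high-availability setup every service should have a standby so that the business stays reliable. PostgreSQL supports this natively with one primary and multiple standbys. When the primary fails and can no longer serve requests, a standby has to be promoted immediately to take over the primary role.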

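This article uses scripts to monitor the state of PostgreSQL. Once PostgreSQL on the primary stops, the repmgrd daemon deployed with it promotes one of the standbys to primary. The monitoring script then notices that the keepalived VIP and the new primary are on different nodes and adjusts the keepalived priorities so that the VIP moves to the new primary and the service continues. Two scripts are involved: a small watchdog that keeps the monitor running, and the monitor itself.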
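The primary is identified from the "Database cluster state" line of pg_controldata, which reads "in production" on a primary. As a minimal sketch (cluster_state and is_primary are hypothetical helper names, and the input is assumed to be the English, untranslated output), that check can be written as a filter on stdin:

```shell
#!/bin/sh
# cluster_state: print the value of the "Database cluster state" line
# from pg_controldata output supplied on stdin.
cluster_state() {
    sed -n 's/^Database cluster state:[[:space:]]*//p'
}

# is_primary: succeed only if the cluster state is "in production".
is_primary() {
    [ "$(cluster_state)" = "in production" ]
}
```

A possible invocation on a node is LC_ALL=C /PostgreSQL/bin/pg_controldata -D /PostgreSQL/data | is_primary, where LC_ALL=C is there to request untranslated output.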

#!/bin/bash
#############################################################################
#   Author: MaXiang
#  Created: 2020/09/01
#  ToDo:
#    Daemon for monitoring keepalived failover scripts
#############################################################################
SCR_DIR='/home/postgres'
# Count running instances of the monitor; "grep -v grep" drops the grep itself.
KEEP_MGR=$(ps -ef | grep 'keepalived_repmgr.sh' | grep -v grep | wc -l)
# Start the monitor in the background if it is not running yet.
if [[ ${KEEP_MGR} -lt 1 ]]; then
    nohup /bin/bash "${SCR_DIR}/keepalived_repmgr.sh" >> "${SCR_DIR}/keepalived_repmgr.log" 2>&1 &
fi

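Before using the monitor, set up mutual SSH trust between the PostgreSQL hosts, so that each node can log in to the others with ssh and the node name or IP address without a password prompt. The monitoring script logs in to every node in NODES over SSH.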
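Trust of this kind is usually created with ssh-keygen and ssh-copy-id. Whether it actually works can be verified before the monitor is started. The sketch below is an assumption-laden illustration: check_ssh_trust is a hypothetical helper, NODES repeats the node names from the monitor, and the SSH variable exists only so that the command can be swapped out.

```shell
#!/bin/sh
# Node names as used by the monitoring script.
NODES="${NODES:-pg1 pg2 pg3}"
# Command used to connect; overridable (hypothetical knob for dry runs).
SSH="${SSH:-ssh}"

# check_ssh_trust: report per node whether a non-interactive login works.
# Returns 0 if every node is reachable, 1 otherwise.
check_ssh_trust() {
    failed=0
    for node in $NODES; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if $SSH -o BatchMode=yes -o ConnectTimeout=3 "$node" true 2>/dev/null; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return $failed
}
```

Running check_ssh_trust as the postgres user on each node should print "ok" for all three names before the monitor is enabled.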

#!/bin/bash
#############################################################################
#   Author: MaXiang
#  Created: 2020/09/01
#  ToDo:
#    Created for monitoring keepalived for PostgreSQL repmgr cluster
#    When the production node shuts down with errors, failover takes about 10s.
#############################################################################
NODES='pg1 pg2 pg3'
VIP='192.168.56.204'
PGDATA=/PostgreSQL/data
PGHOME=/PostgreSQL
REPMGR_CONF='/etc/repmgr.conf'
KEEP_CONF='/etc/keepalived/keepalived.conf'
# Print a timestamp used as the prefix of every log line
function log_time(){
    date '+%Y-%m-%d %H:%M:%S '
}
echo "$(log_time) Repmgr and KeepAlived auto-failover monitoring start..."
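while true
do
    # Forget the results of the previous pass so stale values are never compared.
    KEEP_MASTER=''
    MGR_PRIMARY=''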
    for NODE in ${NODES}
    do
        KEEP_STA=`ssh $NODE "/usr/sbin/ip a|grep -w ${VIP}|wc -l"`
        if [[ ${KEEP_STA} -eq 1 ]]; then
            KEEP_MASTER=$NODE
            echo "$(log_time) KeepAlived Master is ${KEEP_MASTER}."
        fi
    done
    for NODE in ${NODES}
    do
        MGR_STA=`ssh $NODE "$PGHOME/bin/pg_controldata -D ${PGDATA}|grep 'in production'|wc -l"`
        if [[ ${MGR_STA} -ne 1 ]]; then
            continue
        else
            MGR_PRIMARY=$NODE
            echo "$(log_time) Repmgr Cluster Primary is ${MGR_PRIMARY}."
            break
        fi
    done
    sleep 1s
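    if [[ -z "${KEEP_MASTER}" || -z "${MGR_PRIMARY}" ]]; then
        # Without both values there is nothing safe to compare or change.
        echo "$(log_time) Cannot determine KeepAlived Master or Repmgr Primary, skipping this pass."
    elif [[ "${KEEP_MASTER}" == "${MGR_PRIMARY}" ]]; then
        echo "$(log_time) High Availability Architecture running normally."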
    else
        echo "$(log_time) KeepAlived Master is ${KEEP_MASTER}, but Repmgr Primary is ${MGR_PRIMARY}."
        for NODE in ${NODES}
        do
            case $NODE in
                ${MGR_PRIMARY} )
                    ssh ${NODE} "sed -i 's/priority .*/priority 100/g' ${KEEP_CONF}"
                    ssh ${NODE} "sudo systemctl restart keepalived"
                    echo "$(log_time) Modify KeepAlived priority for ${NODE} to 100."
                    ;;
                ${KEEP_MASTER} )
                    ssh ${NODE} "sed -i 's/priority .*/priority 60/g' ${KEEP_CONF}"
                    ssh ${NODE} "sudo systemctl restart keepalived"
                    echo "$(log_time) Modify KeepAlived priority for ${NODE} to 60."
                    ;;
                * )
                    ssh ${NODE} "sed -i 's/priority .*/priority 80/g' ${KEEP_CONF}"
                    ssh ${NODE} "sudo systemctl restart keepalived"
                    echo "$(log_time) Modify KeepAlived priority for ${NODE} to 80."
                    ;;
            esac
        done
        SECS=10
        while [[ ${SECS} -ge 0 ]]
        do
            echo "$(log_time) Waiting for failover to target: ${SECS}s"
            let SECS--
            if [[ ${SECS} -eq 0 ]]; then
                break
            fi
            sleep 1s
        done
        echo "$(log_time) Failover KeepAlived to ${MGR_PRIMARY} completed."
    fi
    sleep 2s
done
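The priority reshuffle in the else branch of the monitor can be isolated into two small helpers. This is a sketch rather than the author's code: priority_for and set_priority are hypothetical names, the sed expression is a tighter variant that only rewrites lines starting with the priority keyword, and sed -i is assumed to be GNU sed.

```shell
#!/bin/sh
# priority_for NODE PRIMARY MASTER: print the keepalived priority for NODE.
# The new PostgreSQL primary gets 100, the node that still holds the VIP
# gets 60, and every other node gets 80, as in the monitoring script.
priority_for() {
    if [ "$1" = "$2" ]; then
        echo 100
    elif [ "$1" = "$3" ]; then
        echo 60
    else
        echo 80
    fi
}

# set_priority FILE VALUE: rewrite each "priority N" line in FILE,
# keeping its indentation. Requires GNU sed for in-place editing.
set_priority() {
    sed -i "s/^\([[:space:]]*\)priority[[:space:]].*/\1priority $2/" "$1"
}
```

For example, set_priority /etc/keepalived/keepalived.conf "$(priority_for pg2 pg2 pg1)" would set the priority to 100 on pg2 when pg2 is the new primary and pg1 still holds the VIP. keepalived has to be restarted or reloaded afterwards for the change to take effect.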

That covers the main functionality of the scripts. I hope it helps.
