MongoDB secondary replica is permanently in the not reachable/healthy state

I have a standard 3-node MongoDB replica set:

  • 10.0.2.35 - primary
  • 10.0.3.169 - secondary
  • 10.0.1.48 - secondary

I currently cannot connect to them as a replica set, only through the primary. When I run rs.status() on the primary, I repeatedly get:

{
        "set" : "ecReplica",
        "date" : ISODate("2018-04-23T19:12:10.014Z"),
        "myState" : 1,
        "term" : NumberLong(-1),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "ip-10-0-3-169:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 10717677,
                        "optime" : Timestamp(1524510722, 15),
                        "optimeDate" : ISODate("2018-04-23T19:12:02Z"),
                        "lastHeartbeat" : ISODate("2018-04-23T19:12:08.186Z"),
                        "lastHeartbeatRecv" : ISODate("2018-04-23T19:12:09.656Z"),
                        "pingMs" : NumberLong(1),
                        "syncingTo" : "ip-10-0-2-35:27017",
                        "configVersion" : 405240
                },
                {
                        "_id" : 1,
                        "name" : "ip-10-0-1-48:27017",
                        "health" : 0,
                        "state" : 6,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(0, 0),
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2018-04-23T19:12:09.116Z"),
                        "lastHeartbeatRecv" : ISODate("2018-04-23T19:12:08.404Z"),
                        "pingMs" : NumberLong(0),
                        "authenticated" : false,
                        "configVersion" : -1
                },
                {
                        "_id" : 2,
                        "name" : "ip-10-0-2-35:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 10717680,
                        "optime" : Timestamp(1524510722, 15),
                        "optimeDate" : ISODate("2018-04-23T19:12:02Z"),
                        "electionTime" : Timestamp(1524486537, 1),
                        "electionDate" : ISODate("2018-04-23T12:28:57Z"),
                        "configVersion" : 405240,
                        "self" : true
                }
        ],
        "ok" : 1
}
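
When reading output like this, the fields that matter most are health, stateStr and authenticated. As a minimal sketch (the helper name unreachableMembers is hypothetical, and the sample document is a trimmed copy of the output above), the unhealthy members can be pulled out like this:

```javascript
// Return the members that the reporting node considers unhealthy.
// `status` is a document shaped like the rs.status() output above.
function unreachableMembers(status) {
  return status.members
    .filter(function (m) { return m.health !== 1; })
    .map(function (m) {
      return { name: m.name, stateStr: m.stateStr, authenticated: m.authenticated };
    });
}

// Trimmed copy of the output above (only the fields the helper reads).
var status = {
  set: "ecReplica",
  members: [
    { _id: 0, name: "ip-10-0-3-169:27017", health: 1, stateStr: "SECONDARY" },
    { _id: 1, name: "ip-10-0-1-48:27017", health: 0,
      stateStr: "(not reachable/healthy)", authenticated: false },
    { _id: 2, name: "ip-10-0-2-35:27017", health: 1, stateStr: "PRIMARY", self: true }
  ]
};

console.log(unreachableMembers(status));
```

In the mongo shell the same helper could be called as unreachableMembers(rs.status()), with print or printjson in place of console.log.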

When I ssh into the primary, I see the following error in /var/log/mongodb/mongod.log:

2018-04-23T19:23:54.326+0000 I REPL [ReplicationExecutor] Error in heartbeat request to ip-10-0-1-48:27017; Unauthorized not authorized on admin to execute command { replSetHeartbeat: "ecReplica", pv: 1, v: 405240, from: "ip-10-0-2-35:27017", fromId: 2, checkEmpty: false }

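Additional information

Connectivity

Using the Mongo shell and Robo3T over an SSH tunnel, I can connect to each of the 3 nodes individually, but I cannot connect to the 3 nodes as a replica set.

The production servers apparently connect to the replica set successfully.

Telnet

telnet 10.0.1.48 27017 from 10.0.2.35 works.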

/etc/mongod.conf

The configuration files are almost identical; the only difference is in the net section:

Node 10.0.1.48

# network interfaces
net:
  port: 27017
  bindIp: [127.0.0.1,10.0.3.169,10.0.2.35]

Node 10.0.3.169

# network interfaces
net:
  port: 27017
  bindIp: [10.0.1.48,10.0.2.35,127.0.0.1]

Node 10.0.2.35

# network interfaces
net:
  port: 27017
  bindIp: [127.0.0.1,10.0.3.169,10.0.1.48]

Note: the security section is empty, so this is not a keyFile issue.

db.version()

3.2.0

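Infrastructure

All nodes run in the same AWS VPC. Although they are in different availability zones, they belong to the same security group and use the same network ACL and route table.

This is an inherited setup that has been running for more than 2 years.

What am I missing?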

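Answer 1

It looks like auth is still disabled in your replica set setup.

To enable it, add security.keyFile to your configuration file, or use --keyFile in the command-line options.

Here is an example showing how to generate such a file: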

openssl rand -base64 756 > <path-to-keyfile>
chmod 400 <path-to-keyfile>

Then add the path of the generated keyfile to mongod.conf:

security:
   authorization: enabled
   keyFile: /path/to/keyfile

Restart the mongod service, and your mongo should now have auth enabled.

For more information on keyFile, see Enforce Keyfile Access Control in a Replica Set.

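Answer 2

Thanks to @Stennie's input, I found the root of the problem.

I changed the replica set configuration to use the actual IP addresses instead of the AWS internal DNS names.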
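
A minimal sketch of that host rewrite, assuming the member names follow the ip-A-B-C-D:port pattern seen in the question's rs.status() output. The helper names awsHostToIp and useIpAddresses are hypothetical, and the sample config is a trimmed stand-in for what rs.conf() returns:

```javascript
// Convert a host such as "ip-10-0-3-169:27017" into "10.0.3.169:27017".
// Hosts that do not match the pattern are returned unchanged.
function awsHostToIp(host) {
  var match = /^ip-(\d+)-(\d+)-(\d+)-(\d+)(:\d+)?$/.exec(host);
  if (!match) {
    return host;
  }
  return match.slice(1, 5).join(".") + (match[5] || "");
}

// Rewrite every member host in a replica set config document in place.
function useIpAddresses(config) {
  config.members.forEach(function (member) {
    member.host = awsHostToIp(member.host);
  });
  return config;
}

// Trimmed, illustrative config; a real one comes from rs.conf().
var cfg = {
  _id: "ecReplica",
  members: [
    { _id: 0, host: "ip-10-0-3-169:27017" },
    { _id: 1, host: "ip-10-0-1-48:27017" },
    { _id: 2, host: "ip-10-0-2-35:27017" }
  ]
};

console.log(useIpAddresses(cfg).members.map(function (m) { return m.host; }));
```

In the mongo shell on the primary, the usual pattern would be var cfg = rs.conf(); useIpAddresses(cfg); rs.reconfig(cfg);. Reviewing the rewritten hosts before calling rs.reconfig() is advisable, since a wrong host name makes that member unreachable.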

Also, I fixed the bindIp option on the hosts:

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: 27017
  bindIp: [127.0.0.1,**INSTANCE PRIVATE IP**]


#processManagement:

#security:
#  authorization: enabled

#operationProfiling:

replication:
  replSetName: ecReplica
#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:
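
The reason for the bindIp change is that bindIp names the addresses mongod listens on, so it should contain addresses of the host itself, not those of its peers. The following is a rough sketch of that check; the helper name nonLocalBindIps is hypothetical, and the list of local addresses is passed in by hand here (on a real host it would come from the network interfaces):

```javascript
// Return the bindIp entries that are not addresses of this host.
function nonLocalBindIps(bindIp, localAddresses) {
  return bindIp.filter(function (ip) {
    return localAddresses.indexOf(ip) === -1;
  });
}

// Node 10.0.1.48, assuming its only addresses are loopback and its private IP.
var local = ["127.0.0.1", "10.0.1.48"];

// bindIp from the question: lists the two peers instead of the node itself.
var before = nonLocalBindIps(["127.0.0.1", "10.0.3.169", "10.0.2.35"], local);

// bindIp after the fix: loopback plus the instance's private IP.
var after = nonLocalBindIps(["127.0.0.1", "10.0.1.48"], local);

console.log(before);
console.log(after);
```

An empty result means every bindIp entry belongs to the host, which is what the corrected configuration aims for.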

Still, running rs.status() gave me:

{
    "set" : "ecReplica",
    "date" : ISODate("2018-04-25T16:03:27.277Z"),
    "myState" : 1,
    "term" : NumberLong(-1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "10.0.3.169:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 6153,
            "optime" : Timestamp(1524671340, 1),
            "optimeDate" : ISODate("2018-04-25T15:49:00Z"),
            "electionTime" : Timestamp(1524672170, 1),
            "electionDate" : ISODate("2018-04-25T16:02:50Z"),
            "configVersion" : 405245,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "10.0.1.48:27017",
            "health" : 0,
            "state" : 6,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2018-04-25T16:03:26.532Z"),
            "lastHeartbeatRecv" : ISODate("2018-04-25T16:03:25.651Z"),
            "pingMs" : NumberLong(0),
            "authenticated" : false,
            "configVersion" : -1
        },
        {
            "_id" : 2,
            "name" : "10.0.2.35:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 5916,
            "optime" : Timestamp(1524671340, 1),
            "optimeDate" : ISODate("2018-04-25T15:49:00Z"),
            "lastHeartbeat" : ISODate("2018-04-25T16:03:26.531Z"),
            "lastHeartbeatRecv" : ISODate("2018-04-25T16:03:25.760Z"),
            "pingMs" : NumberLong(1),
            "lastHeartbeatMessage" : "could not find member to sync from",
            "configVersion" : 405245
        }
    ],
    "ok" : 1
}

Also, I noticed that the following lines were uncommented in /etc/mongod.conf on 10.0.1.48:

security:
  authorization: enabled

In the other two configuration files, these lines were commented out.
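
This mismatch fits the Unauthorized heartbeat error in the question: the one node with authorization enabled rejects heartbeats from peers that send no credentials. A sketch of how such a difference could be spotted, assuming the parsed field of db.adminCommand({ getCmdLineOpts: 1 }) has been collected from each node. The helper names and the sample documents are illustrative and contain only the relevant keys:

```javascript
// Map each node to its effective security.authorization setting.
// `parsedByNode` maps a node name to the `parsed` document reported by
// db.adminCommand({ getCmdLineOpts: 1 }) on that node.
function authorizationByNode(parsedByNode) {
  var result = {};
  Object.keys(parsedByNode).forEach(function (node) {
    var security = parsedByNode[node].security || {};
    result[node] = security.authorization || "disabled";
  });
  return result;
}

// True when every node reports the same setting.
function isConsistent(settings) {
  var values = Object.keys(settings).map(function (node) { return settings[node]; });
  return values.every(function (value) { return value === values[0]; });
}

// Illustrative input matching the situation described in this answer.
var parsedByNode = {
  "10.0.3.169": { replication: { replSetName: "ecReplica" } },
  "10.0.2.35": { replication: { replSetName: "ecReplica" } },
  "10.0.1.48": {
    replication: { replSetName: "ecReplica" },
    security: { authorization: "enabled" }
  }
};

var settings = authorizationByNode(parsedByNode);
console.log(settings);
console.log("consistent:", isConsistent(settings));
```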

When I commented out the authorization line and restarted the service, the rogue node was finally able to sync:

{
    "set" : "ecReplica",
    "date" : ISODate("2018-04-25T16:14:18.489Z"),
    "myState" : 1,
    "term" : NumberLong(-1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "10.0.3.169:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 70,
            "optime" : Timestamp(1524672835, 2),
            "optimeDate" : ISODate("2018-04-25T16:13:55Z"),
            "lastHeartbeat" : ISODate("2018-04-25T16:14:18.248Z"),
            "lastHeartbeatRecv" : ISODate("2018-04-25T16:14:16.605Z"),
            "pingMs" : NumberLong(1),
            "lastHeartbeatMessage" : "syncing from: 10.0.2.35:27017",
            "syncingTo" : "10.0.2.35:27017",
            "configVersion" : 405245
        },
        {
            "_id" : 1,
            "name" : "10.0.1.48:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 186,
            "optime" : Timestamp(1524672835, 2),
            "optimeDate" : ISODate("2018-04-25T16:13:55Z"),
            "lastHeartbeat" : ISODate("2018-04-25T16:14:18.313Z"),
            "lastHeartbeatRecv" : ISODate("2018-04-25T16:14:17.533Z"),
            "pingMs" : NumberLong(3),
            "lastHeartbeatMessage" : "syncing from: 10.0.3.169:27017",
            "syncingTo" : "10.0.3.169:27017",
            "configVersion" : 405245
        },
        {
            "_id" : 2,
            "name" : "10.0.2.35:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 186,
            "optime" : Timestamp(1524672835, 2),
            "optimeDate" : ISODate("2018-04-25T16:13:55Z"),
            "infoMessage" : "could not find member to sync from",
            "electionTime" : Timestamp(1524672748, 1),
            "electionDate" : ISODate("2018-04-25T16:12:28Z"),
            "configVersion" : 405245,
            "self" : true
        }
    ],
    "ok" : 1
}