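I'm trying to set up highly available NFS storage using DRBD and Pacemaker on two Fedora 38 VMs (this is my first time doing it).
I've managed to bring up the Pacemaker cluster and mount the NFS share on my host, but when I try to write anything in that folder I get a permission denied error.
Changing the mount point permissions to 666 or 777 didn't help.
Any idea what the problem could be?
My DRBD configuration is as follows: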
#> sudo vi /etc/drbd.d/global_common.conf
global {
    usage-count yes;
}
common {
    disk {
        no-disk-flushes;
        no-disk-barrier;
        c-fill-target 24M;
        c-max-rate 720M;
        c-plan-ahead 15;
        c-min-rate 4M;
    }
    net {
        protocol C;
        max-buffers 36k;
        sndbuf-size 1024k;
        rcvbuf-size 2048k;
    }
}
#> sudo vi /etc/drbd.d/ha_nfs.res
resource ha_nfs {
    device "/dev/drbd1003";
    disk "/dev/nfs/share";
    meta-disk internal;
    on server1.test {
        address 192.168.1.116:7789;
    }
    on server2.test {
        address 192.168.1.167:7789;
    }
}
The Pacemaker configuration is as follows:
crm> configure edit
node 1: server1.test
node 2: server2.test
primitive p_drbd_attr ocf:linbit:drbd-attr
primitive p_drbd_ha_nfs ocf:linbit:drbd \
    params drbd_resource=ha_nfs \
    op monitor timeout=20s interval=21s role=Slave start-delay=12s \
    op monitor timeout=20s interval=20s role=Master start-delay=8s
primitive p_expfs_nfsshare_exports_HA exportfs \
    params clientspec="192.168.1.0/24" directory="/nfsshare/exports/HA" fsid=1003 unlock_on_stop=1 options="rw,mountpoint" \
    op monitor interval=15s timeout=40s start-delay=15s \
    op_params OCF_CHECK_LEVEL=0 \
    op start interval=0s timeout=40s \
    op stop interval=0s timeout=120s
primitive p_fs_nfsshare_exports_HA Filesystem \
    params device="/dev/drbd1003" directory="/nfsshare/exports/HA" fstype=ext4 run_fsck=no \
    op monitor interval=15s timeout=40s start-delay=15s \
    op_params OCF_CHECK_LEVEL=0 \
    op start interval=0s timeout=60s \
    op stop interval=0s timeout=60s
primitive p_nfsserver nfsserver
primitive p_pb_block portblock \
    params action=block ip=192.168.1.101 portno=2049 protocol=tcp
primitive p_pb_unblock portblock \
    params action=unblock ip=192.168.1.101 portno=2049 tickle_dir="/srv/drbd-nfs/nfstest/.tickle" reset_local_on_unblock_stop=1 protocol=tcp \
    op monitor interval=10s timeout=20s start-delay=15s
primitive p_virtip IPaddr2 \
    params ip=192.168.1.101 cidr_netmask=32 \
    op monitor interval=1s timeout=40s start-delay=0s \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s
ms ms_drbd_ha_nfs p_drbd_ha_nfs \
    meta master-max=1 master-node-max=1 clone-node-max=1 clone-max=2 notify=true
clone c_drbd_attr p_drbd_attr
colocation co_ha_nfs inf: p_pb_block p_virtip ms_drbd_ha_nfs:Master p_fs_nfsshare_exports_HA p_expfs_nfsshare_exports_HA p_nfsserver p_pb_unblock
property cib-bootstrap-options: \
    have-watchdog=false \
    cluster-infrastructure=corosync \
    cluster-name=nfsCluster \
    stonith-enabled=false \
    no-quorum-policy=ignore
PCS status output:
[bebe@server2 share]$ sudo pcs status
[sudo] password for bebe:
Cluster name: nfsCluster
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: server1.test (version 2.1.6-4.fc38-6fdc9deea29) - partition with quorum
* Last updated: Thu Jul 13 08:50:34 2023 on server2.test
* Last change: Thu Jul 13 08:27:46 2023 by hacluster via crmd on server1.test
* 2 nodes configured
* 10 resource instances configured
Node List:
* Online: [ server1.test server2.test ]
Full List of Resources:
* p_virtip (ocf::heartbeat:IPaddr2): Started server2.test
* p_expfs_nfsshare_exports_HA (ocf::heartbeat:exportfs): Started server2.test
* p_fs_nfsshare_exports_HA (ocf::heartbeat:Filesystem): Started server2.test
* p_nfsserver (ocf::heartbeat:nfsserver): Started server2.test
* p_pb_block (ocf::heartbeat:portblock): Started server2.test
* p_pb_unblock (ocf::heartbeat:portblock): Started server2.test
* Clone Set: ms_drbd_ha_nfs [p_drbd_ha_nfs] (promotable):
* Masters: [ server2.test ]
* Slaves: [ server1.test ]
* Clone Set: c_drbd_attr [p_drbd_attr]:
* Started: [ server1.test server2.test ]
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
DRBD status output:
[bebe@server2 share]$ sudo drbdadm status ha_nfs
ha_nfs role:Primary
disk:UpToDate
peer role:Secondary
replication:Established peer-disk:UpToDate
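Answer 1
This sounds like misconfigured permissions. To troubleshoot, tear down your setup and try recreating the failover NFS mount point from scratch.
PS: Overall, this is a fragile setup. Active-passive DRBD replication is prone to failover mount/unmount problems and similar misconfiguration issues. Use active-active block-level replication combined with a cluster-aware filesystem instead.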
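Answer 2
My guess is that the permissions are still incorrect, or that you set them on the mount point before the server mounted the filesystem.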
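While the filesystem is mounted, I would try a recursive chown and chmod on the mount point on the DRBD Primary. Also, I typically chown the root of the NFS export to nobody:nobody, which may help if you are trying to write to the share as root from the client system (since root_squash is the default NFS export option). You could also try adding no_root_squash to the options parameter of the exportfs resource to see whether that is the problem you are facing, but for security reasons you generally don't want to leave it enabled.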
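The order matters here: ownership and mode set on the bare mount point are hidden once the DRBD-backed filesystem is mounted over it. As a rough sketch of that check, the helper below only changes anything when the directory is an active mount point. The name fix_export_perms and the mode u=rwX,g=rwX,o=rX are assumptions for illustration, not something taken from the question.

```shell
#!/bin/sh
# Sketch: apply ownership and mode to an NFS export only while its
# filesystem is actually mounted. fix_export_perms is a hypothetical helper.
fix_export_perms() {
    dir=$1      # export directory, e.g. /nfsshare/exports/HA
    owner=$2    # e.g. nobody:nobody (assumed to exist on the server)
    mode=$3     # e.g. u=rwX,g=rwX,o=rX (an assumption, adjust as needed)

    # Refuse to touch the bare mount point: anything set on it is hidden
    # as soon as the DRBD-backed filesystem is mounted on top of it.
    if ! mountpoint -q "$dir"; then
        echo "refusing: $dir is not a mounted filesystem" >&2
        return 1
    fi

    chown -R "$owner" "$dir" && chmod -R "$mode" "$dir"
}

# Example (run as root on the current DRBD Primary):
#   fix_export_perms /nfsshare/exports/HA nobody:nobody u=rwX,g=rwX,o=rX
```

Mode 666 carries no execute bit, and a directory without it cannot be entered, which is why the example uses the symbolic X (execute for directories only) rather than a plain numeric mode.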
Also, I usually set options=rw on the exportfs resource, but that may be the default.
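To see which options the exportfs resource sets explicitly, a small check over its comma-separated options value can help. This is only a sketch: has_export_option is a hypothetical helper, and the option string is copied from the exportfs primitive in the question.

```shell
#!/bin/sh
# has_export_option is a hypothetical helper: it succeeds if NAME appears
# as a whole entry in the comma-separated export option list OPTS.
has_export_option() {
    opts=$1
    name=$2
    case ",$opts," in
        *",$name,"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Value copied from the question's p_expfs_nfsshare_exports_HA primitive.
opts="rw,mountpoint"

if has_export_option "$opts" rw; then
    echo "rw is set explicitly"
fi
if ! has_export_option "$opts" no_root_squash; then
    echo "no_root_squash is not set, so root on the client is squashed"
fi
```

On the server, running exportfs -v as root lists the options the kernel is actually applying to each export, which is a useful cross-check against the resource definition.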