mount.ocfs2：安装时传输端点未连接……？

Question

噢耶！问题解决了。

注意UUID：

# mounted.ocfs2 -d
Device                FS     Stack  UUID                              Label
/dev/sdb              ocfs2  o2cb   12963EAF4E16484DB81ECB0251177C26  ocfs2_drbd1
/dev/drbd1            ocfs2  o2cb   12963EAF4E16484DB81ECB0251177C26  ocfs2_drbd1

但：

# ls -l /sys/kernel/config/cluster/cpc/heartbeat/
drwxr-xr-x 2 root root    0 Dec 24 22:53 72EF09EA3D0D4F51BDC00B47432B1EB2

这可能是因为我“意外”强制重新格式化了 OCFS2 卷。我面临的问题类似于这在 Ocfs2-user 邮件列表上。

这也是以下错误的原因：

ocfs2_hb_ctl：停止心跳时 ocfs2_lookup 未找到文件

因为无法在中ocfs2_hb_ctl找到具有 UUID 的设备。72EF09EA3D0D4F51BDC00B47432B1EB2/proc/partitions

我想到了一个想法：我可以更改 OCFS2 卷的 UUID 吗？

查看tunefs.ocfs2手册页：

Usage: tunefs.ocfs2 [options] <device> [new-size]
       tunefs.ocfs2 -h|--help
       tunefs.ocfs2 -V|--version
[options] can be any mix of:
        -U|--uuid-reset[=new-uuid]

所以我执行以下命令：

# tunefs.ocfs2 --uuid-reset=72EF09EA3D0D4F51BDC00B47432B1EB2 /dev/drbd1
WARNING!!! OCFS2 uses the UUID to uniquely identify a file system. 
Having two OCFS2 file systems with the same UUID could, in the least, 
cause erratic behavior, and if unlucky, cause file system damage. 
Please choose the UUID with care.
Update the UUID ?yes

核实：

# tunefs.ocfs2 -Q "%U\n" /dev/drbd1 
72EF09EA3D0D4F51BDC00B47432B1EB2

尝试再次杀死心跳区域以查看会发生什么：

# ocfs2_hb_ctl -K -u 72EF09EA3D0D4F51BDC00B47432B1EB2
# ocfs2_hb_ctl -I -u 72EF09EA3D0D4F51BDC00B47432B1EB2
72EF09EA3D0D4F51BDC00B47432B1EB2: 6 refs

继续杀戮，直到我看到然后0 refs关闭集群：

# /etc/init.d/o2cb offline cpc
Stopping O2CB cluster cpc: OK

并停止它：

# /etc/init.d/o2cb stop
Stopping O2CB cluster cpc: OK
Unloading module "ocfs2": OK
Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK

重新启动以查看新节点是否已更新：

# /etc/init.d/o2cb start
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster cpc: OK

# ls -l /sys/kernel/config/cluster/cpc/node/
total 0
drwxr-xr-x 2 root root 0 Dec 26 19:02 SVR022-293.localdomain
drwxr-xr-x 2 root root 0 Dec 26 19:02 SVR233NTC-3145.localdomain

OK，在对等节点（192.168.2.93）上，尝试启动OCFS2：

# /etc/init.d/ocfs2 start
Starting Oracle Cluster File System (OCFS2)                [  OK  ]

谢谢苏尼尔·穆什兰因为这线程帮助我解决了这个问题。

教训是：

IP 地址、端口等只能在集群离线时更改。请参阅常问问题。
切勿强制重新格式化 OCFS2 卷。

Answer 1