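On Ubuntu 22.04, I configured an SSD cache LV using lvmcache (type writecache), as described in the lvmcache man page.
The system boots fine, but the cached LVs always show the status NOT available.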
LV Path /dev/vg_fast3/disk3_lv
LV Name disk3_lv
VG Name vg_fast3
LV UUID UdAPKL-ZQQq-1ytp-CI54-FwVS-4iz5-EAJ0l0
LV Write Access read/write
LV Creation host, time minio1, 2022-05-07 18:13:22 +0200
LV Status NOT available
LV Size <9.10 TiB
Current LE 2384383
Segments 1
Allocation inherit
Read ahead sectors auto
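Then I tried to activate the LVs manually (I have 3). The first activation takes a little while and consumes no noticeable system memory, but the second takes much longer and consumes all host memory (32 GB). This is a fresh system, and RAM usage is otherwise quite low. When I try to activate the third LV, the OOM killer kicks in.
See how the system's memory usage changes when I activate the second LV.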
root@minio1 /etc/systemd/system # free
total used free shared buff/cache available
Mem: 32602580 3594536 28791428 1272 216616 28629324
Swap: 4194300 0 4194300
root@minio1 /etc/systemd/system # lvchange -ay -vvvv /dev/vg_fast2/disk2_lv
root@minio1 /etc/systemd/system # free
total used free shared buff/cache available
Mem: 32602580 29741396 2644252 1284 216932 2482304
Swap: 4194300 0 4194300
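For reference, the jump in the "used" column between the two `free` readings can be worked out with a small helper. This is only a sketch: `used_delta_gib` is a hypothetical function name, and it assumes `free` is printing KiB (its default unit).

```shell
# Hypothetical helper: difference between two "used" values from `free`
# (given in KiB), printed in GiB with one decimal place.
used_delta_gib() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / 1048576 }'
}

# "used" before and after `lvchange -ay` on disk2_lv, copied from above
used_delta_gib 3594536 29741396
```

With the values above this prints 24.9, so activating the second LV accounts for roughly 25 GiB of RAM.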
As you can see, most of the time (more than 2 minutes, from 12:08:48 to 12:11:38) is spent in the RESUME task.
12:08:48.119176 lvchange[1154] device_mapper/libdm-common.c:2677 Udev cookie 0xd4d1f00 (semid 1) assigned to RESUME task(5) with flags DISABLE_LIBRARY_FALLBACK (0x20)
12:08:48.119186 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm resume (253:5) [ noopencount flush ] [16384] (*1)
12:11:38.391470 lvchange[1154] device_mapper/libdm-common.c:1484 vg_fast2-disk2_lv: Stacking NODE_ADD (253,5) 0:6 0660 [trust_udev]
12:11:38.391498 lvchange[1154] device_mapper/libdm-common.c:1495 vg_fast2-disk2_lv: Stacking NODE_READ_AHEAD 256 (flags=1)
12:11:38.391514 lvchange[1154] activate/dev_manager.c:3686 Creating CLEAN tree for vg_fast2/disk2_lv.
12:11:38.391551 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc].
12:11:38.391576 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc [ opencount flush ] [16384] (*1)
12:11:38.391609 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm deps (253:5) [ opencount flush ] [16384] (*1)
12:11:38.391655 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm deps (253:3) [ opencount flush ] [16384] (*1)
12:11:38.391675 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm deps (253:4) [ opencount flush ] [16384] (*1)
12:11:38.391703 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv-real [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc-real].
12:11:38.391714 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc-real [ opencount flush ] [16384] (*1)
12:11:38.391728 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv-cow [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc-cow].
12:11:38.391737 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqieWtt9lFS16fwbF1XWqeoZhb7zyj3vDzc-cow [ opencount flush ] [16384] (*1)
12:11:38.391750 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-ssd2_lv_cvol-real [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiZmOf0I2TWMotwHgUKQs49RxDYIbrB8oq-real].
12:11:38.391760 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiZmOf0I2TWMotwHgUKQs49RxDYIbrB8oq-real [ opencount flush ] [16384] (*1)
12:11:38.391775 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv_wcorig [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-real].
12:11:38.391785 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-real [ opencount flush ] [16384] (*1)
12:11:38.391798 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv_wcorig-real [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-real].
12:11:38.391807 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-real [ opencount flush ] [16384] (*1)
12:11:38.391820 lvchange[1154] activate/dev_manager.c:817 Getting device info for vg_fast2-disk2_lv_wcorig-cow [LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-cow].
12:11:38.391828 lvchange[1154] device_mapper/ioctl/libdm-iface.c:1876 dm info LVM-Wm7lKJTqlwERGip92OXeIgBi5nEqzsqiq94eYXDdC8TGT1RVERIDZQNnf5iaJFx9-cow [ opencount flush ] [16384] (*1)
12:11:38.391844 lvchange[1154] mm/memlock.c:641 Leaving section (activated).
12:11:38.391856 lvchange[1154] mm/memlock.c:597 Unlock: Memlock counters: prioritized:1 locked:0 critical:0 daemon:0 suspended:0
12:11:38.391867 lvchange[1154] mm/memlock.c:506 Restoring original task priority 0.
12:11:38.391874 lvchange[1154] activate/fs.c:492 Syncing device names
12:11:38.391888 lvchange[1154] device_mapper/libdm-common.c:2479 Udev cookie 0xd4d1f00 (semid 1) decremented to 1
12:11:38.391896 lvchange[1154] device_mapper/libdm-common.c:2765 Udev cookie 0xd4d1f00 (semid 1) waiting for zero
12:11:38.603440 lvchange[1154] device_mapper/libdm-common.c:2494 Udev cookie 0xd4d1f00 (semid 1) destroyed
12:11:38.603459 lvchange[1154] device_mapper/libdm-common.c:1484 vg_fast2-ssd2_lv_cvol: Skipping NODE_ADD (253,3) 0:6 0660 [trust_udev]
12:11:38.603466 lvchange[1154] device_mapper/libdm-common.c:1495 vg_fast2-ssd2_lv_cvol: Processing NODE_READ_AHEAD 256 (flags=1)
12:11:38.603498 lvchange[1154] device_mapper/libdm-common.c:1249 vg_fast2-ssd2_lv_cvol (253:3): read ahead is 256
12:11:38.603506 lvchange[1154] device_mapper/libdm-common.c:1373 vg_fast2-ssd2_lv_cvol: retaining kernel read ahead of 256 (requested 256)
12:11:38.603510 lvchange[1154] device_mapper/libdm-common.c:1484 vg_fast2-disk2_lv_wcorig: Skipping NODE_ADD (253,4) 0:6 0660 [trust_udev]
12:11:38.603515 lvchange[1154] device_mapper/libdm-common.c:1495 vg_fast2-disk2_lv_wcorig: Processing NODE_READ_AHEAD 256 (flags=1)
12:11:38.603531 lvchange[1154] device_mapper/libdm-common.c:1249 vg_fast2-disk2_lv_wcorig (253:4): read ahead is 256
12:11:38.603537 lvchange[1154] device_mapper/libdm-common.c:1373 vg_fast2-disk2_lv_wcorig: retaining kernel read ahead of 256 (requested 256)
12:11:38.603542 lvchange[1154] device_mapper/libdm-common.c:1484 vg_fast2-disk2_lv: Skipping NODE_ADD (253,5) 0:6 0660 [trust_udev]
12:11:38.603547 lvchange[1154] device_mapper/libdm-common.c:1495 vg_fast2-disk2_lv: Processing NODE_READ_AHEAD 256 (flags=1)
12:11:38.603561 lvchange[1154] device_mapper/libdm-common.c:1249 vg_fast2-disk2_lv (253:5): read ahead is 256
12:11:38.603567 lvchange[1154] device_mapper/libdm-common.c:1373 vg_fast2-disk2_lv: retaining kernel read ahead of 256 (requested 256)
12:11:38.603576 lvchange[1154] misc/lvm-flock.c:84 Unlocking /run/lock/lvm/V_vg_fast2
12:11:38.603583 lvchange[1154] misc/lvm-flock.c:47 _undo_flock /run/lock/lvm/V_vg_fast2
12:11:38.603604 lvchange[1154] metadata/vg.c:79 Freeing VG vg_fast2 at 0x55a132b4b5a0.
12:11:38.603754 lvchange[1154] notify/lvmnotify.c:54 Nofify dbus at com.redhat.lvmdbus1.
12:11:38.604168 lvchange[1154] notify/lvmnotify.c:69 D-Bus notification failed: The name com.redhat.lvmdbus1 was not provided by any .service files
12:11:38.604210 lvchange[1154] cache/lvmcache.c:2091 Destroy lvmcache content
12:11:38.649830 lvchange[1154] lvmcmdline.c:3168 Completed: lvchange -ay -vvvv /dev/vg_fast2/disk2_lv
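To put a number on the gap in the log, the two timestamps around the `dm resume` call can be subtracted. A minimal sketch, assuming both timestamps fall on the same day; `elapsed_seconds` is a hypothetical helper, not part of LVM.

```shell
# Hypothetical helper: seconds between two HH:MM:SS.micro timestamps
# (same day assumed), printed with one decimal place.
elapsed_seconds() {
    LC_ALL=C awk -v start="$1" -v end="$2" 'BEGIN {
        split(start, s, ":"); split(end, e, ":")
        printf "%.1f\n", (e[1]*3600 + e[2]*60 + e[3]) - (s[1]*3600 + s[2]*60 + s[3])
    }'
}

# timestamp of "dm resume (253:5)" and of the next log line
elapsed_seconds 12:08:48.119186 12:11:38.391470
```

This prints 170.3, i.e. the resume ioctl blocked for about 2 minutes 50 seconds, while everything after it finished in well under a second.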
I was going to file a bug report, but decided to exchange some ideas here first.
Thanks in advance for your feedback.
Answer 1
I ran a few more tests and found that the memory is consumed while the cached LV is active. Deactivating the LV returns the memory.
--- Logical volume ---
LV Path /dev/vg_fast3/disk3_lv
LV Name disk3_lv
VG Name vg_fast3
LV UUID DiqET1-VMau-4zuc-4vVt-4WDk-yfeJ-KORpwq
LV Write Access read/write
LV Creation host, time minio2, 2022-05-08 13:36:15 +0200
LV Status available
# open 0
LV Size <9.10 TiB
Current LE 2384383
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:0
--- Logical volume ---
LV Path /dev/vg_fast2/disk2_lv
LV Name disk2_lv
VG Name vg_fast2
LV UUID H0FAlm-1KOe-fLeL-EsgX-iupN-JsYl-zsThtM
LV Write Access read/write
LV Creation host, time minio2, 2022-05-07 19:03:33 +0200
LV Status NOT available
LV Size <9.10 TiB
Current LE 2384383
Segments 1
Allocation inherit
Read ahead sectors auto
--- Logical volume ---
LV Path /dev/vg_fast1/disk1_lv
LV Name disk1_lv
VG Name vg_fast1
LV UUID yo32U4-IfzJ-W0sj-JTKn-Qj4B-syJd-zgu9qK
LV Write Access read/write
LV Creation host, time minio2, 2022-05-07 18:58:01 +0200
LV Status NOT available
LV Size <9.10 TiB
Current LE 2384383
Segments 1
Allocation inherit
Read ahead sectors auto
root@minio2 ~ # free
total used free shared buff/cache available
Mem: 32600636 26401620 5984508 1284 214508 5821340
Swap: 4194300 0 4194300
root@minio2 ~ # lvchange -an /dev/vg_fast3/disk3_lv
root@minio2 ~ # free
total used free shared buff/cache available
Mem: 32600636 291212 32095032 1272 214392 31931824
Swap: 4194300 0 4194300
I am using 160 GB as the write cache; maybe that is too much.
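As a rough back-of-the-envelope figure, the RAM released by `lvchange -an` (26401620 - 291212 = 26110408 KiB according to the two `free` readings above) can be related to the cache size. This is a sketch built on a single data point: it assumes memory use grows linearly with cache size, which I have not verified, and it takes "160 GB" at face value (GB vs. GiB unknown). `ram_mib_per_cache_gb` is a hypothetical helper name.

```shell
# Hypothetical helper: MiB of RAM per GB of write cache, from one measurement.
# $1 = RAM freed on deactivation in KiB, $2 = cache size in GB (as stated).
ram_mib_per_cache_gb() {
    LC_ALL=C awk -v freed_kib="$1" -v cache_gb="$2" \
        'BEGIN { printf "%.0f\n", freed_kib / 1024 / cache_gb }'
}

# 26401620 - 291212 KiB freed, 160 GB cache
ram_mib_per_cache_gb 26110408 160
```

This prints 159, so on this machine each GB of write cache appears to cost on the order of 160 MiB of RAM while the LV is active. If that ratio held, three such caches could not fit in 32 GB.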
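Answer 2
OK, it looks like the memory requirement scales with the size of the cache LV; a few hundred GB of cache used up all of my server's memory. I am not sure whether I missed something, but I decided to switch to bcache, which seems to be much less memory-hungry.
Now everything works as expected, without any noticeable memory consumption.
Thanks.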