Deleting files in Ceph does not free disk space

2024-6-2

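Ceph version: 16.2.13 (pacific). (I know pacific is deprecated, but the whole environment is legacy, e.g. CentOS 7.3, and I don't have permission to upgrade.) The cluster has 6 servers (22 OSDs, 97 PGs). There is one CephFS exported over NFS. Clients access the cluster via NFSv4.1 (NFS-Ganesha) and mount it with the following command: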

# mount -t nfs -o nfsvers=4.1,noauto,soft,sync,proto=tcp 172.20.0.31:/exports /cephmnt

I copied a folder (about 5.2 GB) to /cephmnt.

# cp -r sysdir /cephmnt

Used space grew as expected (confirmed from the output of df -Th and ceph df detail).

# df -Th | grep -i ceph
172.20.0.31:/exports    nfs4    26T    5.2G    26T    1%    /cephmnt

# ceph df | grep -i cephfs
cephfs.new_storage.meta    8    32    26 MiB    28      79 MiB    0      25 TiB
cephfs.new_storage.data    9    32    5.2 GiB   1.42k   15 GiB    0.02   25 TiB

However, when I deleted the folder, the used space did not shrink.

# rm -rf sysdir

# df -Th | grep -i ceph
172.20.0.31:/exports    nfs4    26T    5.2G    26T    1%    /cephmnt

# ceph df | grep -i cephfs
cephfs.new_storage.meta    8    32    26 MiB     28       79 MiB    0       25 TiB
cephfs.new_storage.data    9    32    5.2 GiB    1.42k    15 GiB    0.02    25 TiB
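
The before/after comparison above can be made less error-prone by saving each `ceph df` output to a file and comparing one pool's row. This is only a minimal sketch: it assumes the whitespace-separated column layout shown in the output above, and the function names and the file names before.txt and after.txt are hypothetical.

```shell
#!/bin/sh
# Print the STORED value and OBJECTS count that `ceph df` reports for one pool.
# Reads `ceph df` text on stdin; assumes the layout shown above:
#   NAME ID PGS STORED UNIT OBJECTS ...
pool_usage() {
    awk -v pool="$1" '$1 == pool { print $4, $5, $6 }'
}

# Compare two saved `ceph df` outputs for one pool.
# Prints "unchanged: ..." when STORED and OBJECTS are identical,
# otherwise "changed: <before> -> <after>".
compare_pool_usage() {
    pool=$1 before=$2 after=$3
    a=$(pool_usage "$pool" < "$before")
    b=$(pool_usage "$pool" < "$after")
    if [ "$a" = "$b" ]; then
        echo "unchanged: $a"
    else
        echo "changed: $a -> $b"
    fi
}
```

Possible usage: run `ceph df > before.txt` before the delete and `ceph df > after.txt` afterwards, then `compare_pool_usage cephfs.new_storage.data before.txt after.txt`.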

I can list the objects in the data pool with:

# rados -p cephfs.new_storage.data ls
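
To turn that object listing into something easier to reason about, the objects can be grouped by file. This sketch assumes CephFS names its data objects `<inode number in hex>.<object index>`, which is the usual naming scheme as far as I know; the function name `count_inodes` is hypothetical.

```shell
#!/bin/sh
# Count how many distinct inodes still own objects in the data pool.
# Reads `rados ls` output on stdin, one object name per line, and treats
# everything before the first "." as the inode number (an assumption).
count_inodes() {
    awk -F. 'NF { seen[$1] = 1 } END { n = 0; for (k in seen) n++; print n }'
}
```

Possible usage: `rados -p cephfs.new_storage.data ls | count_inodes`. A result that stays well above zero long after the delete would suggest the file objects are still present rather than only being counted in stale statistics.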

I'm fairly new to Ceph, so I don't know whether this is normal Ceph behaviour, though I suspect it is not, so I tried to do some digging.

Snapshots are disabled, and neither pool has any existing snapshots:

# ceph fs set new_storage allow_new_snaps false
# rados -p cephfs.new_storage.meta lssnap
0 snaps
# rados -p cephfs.new_storage.data lssnap
0 snaps

I read somewhere that if bdev_async_discard and bdev_enable_discard are set to true, BlueStore in the OSDs will automatically discard unused data, so I set both:

# ceph config get osd bdev_async_discard
true
# ceph config get osd bdev_enable_discard
true

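But this had no effect. I unmounted and remounted the NFS share several times (once it even stayed unmounted overnight), but each time I remounted it, df -Th and ceph df still showed the same space in use. I cd'ed into /cephmnt and ran sync. Still no effect.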

How do I free the space used by the deleted files?

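I did read here that CephFS has delayed deletion, but I don't know whether that is what is happening in my case or whether there is some other problem. If it is delayed deletion, how can I confirm that, and how can I trigger the actual deletion? If it is not delayed deletion, what is the actual problem?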
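
One possible way to check for delayed deletion, offered as a sketch rather than a confirmed procedure: as far as I understand it, the MDS keeps unlinked files as "strays" until their objects are purged, and exposes a `num_strays` counter in the `mds_cache` section of its perf counters. The helper below only extracts that counter from perf dump JSON on stdin; the name `mds_num_strays` is hypothetical, and how the dump is obtained (for example `ceph daemon mds.<name> perf dump` where the MDS admin socket is reachable) is an assumption that has not been verified on this cephadm setup.

```shell
#!/bin/sh
# Extract the num_strays counter from MDS perf dump JSON read on stdin.
# The closing quote in the pattern keeps it from matching longer counter
# names such as num_strays_delayed.
mds_num_strays() {
    sed -n 's/.*"num_strays": *\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

If the counter stays above zero and does not fall over time, that would point at strays not being purged (for instance because a client still holds references to the files) rather than at NFS or BlueStore.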

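Please ask if any other data is needed for troubleshooting. I have been working on this for almost 3 days and am completely out of ideas, so any help is greatly appreciated.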

Edit 1: Added more details

[root@cephserver1 ~]# ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE   DATA      OMAP     META     AVAIL    %USE  VAR   PGS  STATUS
 0    hdd  3.63869   1.00000  3.6 TiB   1.6 GiB   593 MiB    2 KiB  1.0 GiB  3.6 TiB  0.04  1.07   10      up
 1    hdd  3.63869   1.00000  3.6 TiB   1.1 GiB   544 MiB   19 KiB  559 MiB  3.6 TiB  0.03  0.71    9      up
 2    hdd  3.63869   1.00000  3.6 TiB   1.7 GiB   669 MiB    6 KiB  1.0 GiB  3.6 TiB  0.05  1.12   13      up
 4    hdd  3.63869   1.00000  3.6 TiB   1.6 GiB   742 MiB   26 KiB  918 MiB  3.6 TiB  0.04  1.07   13      up
13    hdd  3.63869   1.00000  3.6 TiB   1.7 GiB   596 MiB    4 KiB  1.2 GiB  3.6 TiB  0.05  1.15    8      up
 5    hdd  3.63869   1.00000  3.6 TiB   1.9 GiB   1.2 GiB   56 MiB  713 MiB  3.6 TiB  0.05  1.26   16      up
 6    hdd  3.63869   1.00000  3.6 TiB   1.6 GiB   407 MiB  124 MiB  1.1 GiB  3.6 TiB  0.04  1.04    9      up
 7    hdd  3.63869   1.00000  3.6 TiB   1.3 GiB   418 MiB   67 MiB  887 MiB  3.6 TiB  0.04  0.89   12      up
 8    hdd  3.63869   1.00000  3.6 TiB   1.1 GiB   667 MiB   73 MiB  372 MiB  3.6 TiB  0.03  0.72   15      up
 9    hdd  3.63869   1.00000  3.6 TiB   1.7 GiB   1.2 GiB    7 KiB  526 MiB  3.6 TiB  0.05  1.13   18      up
10    hdd  3.63869   1.00000  3.6 TiB   1.5 GiB   906 MiB    8 KiB  579 MiB  3.6 TiB  0.04  0.96   11      up
11    hdd  3.63869   1.00000  3.6 TiB   1.7 GiB   1.1 GiB    6 KiB  628 MiB  3.6 TiB  0.05  1.15   11      up
12    hdd  3.63869   1.00000  3.6 TiB   1.8 GiB   600 MiB   16 MiB  1.2 GiB  3.6 TiB  0.05  1.17   15      up
 3    hdd  3.63869   1.00000  3.6 TiB   2.8 GiB   1.6 GiB   37 MiB  1.2 GiB  3.6 TiB  0.08  1.86   17      up
14    hdd  3.63869   1.00000  3.6 TiB   1.6 GiB   857 MiB   37 KiB  781 MiB  3.6 TiB  0.04  1.06   12      up
15    hdd  3.63869   1.00000  3.6 TiB   1.9 GiB   1.4 GiB    2 KiB  499 MiB  3.6 TiB  0.05  1.26   12      up
16    hdd  3.63869   1.00000  3.6 TiB   2.2 GiB   972 MiB    1 KiB  1.2 GiB  3.6 TiB  0.06  1.44   15      up
17    hdd  3.63869   1.00000  3.6 TiB  1002 MiB   981 MiB    8 KiB   20 MiB  3.6 TiB  0.03  0.65   17      up
18    hdd  3.63869   1.00000  3.6 TiB   935 MiB   915 MiB    3 KiB   20 MiB  3.6 TiB  0.02  0.60   17      up
19    hdd  3.63869   1.00000  3.6 TiB   1.0 GiB  1006 MiB      0 B   28 MiB  3.6 TiB  0.03  0.67   10      up
20    hdd  3.63869   1.00000  3.6 TiB   866 MiB   835 MiB      0 B   31 MiB  3.6 TiB  0.02  0.56   20      up
21    hdd  3.63869   1.00000  3.6 TiB   731 MiB   709 MiB      0 B   22 MiB  3.6 TiB  0.02  0.47   11      up
                       TOTAL   80 TiB    33 GiB    19 GiB  374 MiB   14 GiB   80 TiB  0.04
MIN/MAX VAR: 0.47/1.86  STDDEV: 0.01

[root@cephserver1 ~]# ceph fs status
new_storage - 4 clients
======================
RANK  STATE                      MDS                        ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  new_storage.cephserver2.gvflgv  Reqs:    0 /s   161    163     52    154
               POOL                   TYPE     USED  AVAIL
cephfs.new_storage.meta  metadata  79.4M  25.3T
cephfs.new_storage.data    data    18.2G  25.3T
               STANDBY MDS
new_storage.cephserver3.wxrhxm
new_storage.cephserver4.xwpidi
new_storage.cephserver1.fwjpoi
MDS version: ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)

[root@cephserver1 ~]# ceph -s
  cluster:
    id:     dcad37bc-1185-11ee-88c0-7cc2556f5050
    health: HEALTH_WARN
            1 failed cephadm daemon(s)

  services:
    mon: 5 daemons, quorum cephserver1,cephserver2,cephserver3,cephserver4,cephserver5 (age 8d)
    mgr: cephserver2.sztiyq(active, since 2w), standbys: cephserver1.emjcaa
    mds: 1/1 daemons up, 3 standby
    osd: 22 osds: 22 up (since 3h), 22 in (since 8d)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 1.81k objects, 6.2 GiB
    usage:   33 GiB used, 80 TiB / 80 TiB avail
    pgs:     97 active+clean

  io:
    client:   462 B/s rd, 0 op/s rd, 0 op/s wr

[root@cephserver1 ~]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon grafana.cephserver1 on cephserver1 is in error state

Edit 2: I forgot to mention an important point: the whole storage cluster is in an isolated environment.

Edit 3: I tried compacting the OSDs online, as suggested in eblock's comment, and it partially worked. This is what ceph df showed before compaction:

[root@cephserver1 ~]# ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    80 TiB  80 TiB  33 GiB    33 GiB       0.04
TOTAL  80 TiB  80 TiB  33 GiB    33 GiB       0.04

--- POOLS ---
POOL                     ID  PGS   STORED  OBJECTS    USED  %USED  MAX AVAIL
device_health_metrics     1    1   17 MiB       29  50 MiB      0     25 TiB
cephfs.new_storage.meta   8   32   26 MiB       28  79 MiB      0     25 TiB
cephfs.new_storage.data   9   32  5.2 GiB    1.42k  15 GiB   0.02     25 TiB
.nfs                     10   32  1.7 KiB        7  40 KiB      0     25 TiB

After compaction, raw usage dropped from 33 GiB to 23 GiB, as shown here:

[root@cephserver1 ~]# ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    80 TiB  80 TiB  23 GiB    23 GiB       0.03
TOTAL  80 TiB  80 TiB  23 GiB    23 GiB       0.03

--- POOLS ---
POOL                     ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics     1    1   18 MiB       29   54 MiB      0     25 TiB
cephfs.new_storage.meta   8   32   26 MiB       28   79 MiB      0     25 TiB
cephfs.new_storage.data   9   32  5.2 GiB    1.42k   15 GiB   0.02     25 TiB
.nfs                     10   32   32 KiB        7  131 KiB      0     25 TiB

However, the data stored in the pools did not go down. Any suggestions are very welcome.
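
The post does not record the exact commands used for the online compaction, so the following is only a sketch of one way to do it: loop over the OSD ids from `ceph osd ls` and run `ceph tell osd.<id> compact` on each, one at a time. The function name `compact_all_osds` is hypothetical.

```shell
#!/bin/sh
# Ask every OSD to compact its RocksDB online, one OSD at a time.
# Stops at the first OSD whose compaction command fails.
compact_all_osds() {
    for id in $(ceph osd ls); do
        echo "compacting osd.$id"
        ceph tell "osd.$id" compact || return 1
    done
}
```

Compaction only reclaims space inside each OSD's own metadata store, which would be consistent with the raw usage dropping while the pool figures stayed the same.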

Edit 4: I mounted CephFS natively with the kernel client (i.e. without NFS in between) using the following command:

# mount -t ceph 172.30.0.31:6789,172.30.0.32:6789,172.30.0.33:6789:/ /cephmnt -o name=user1

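After mounting, I ran ls -a /cephmnt and did not see the old data. However, when I ran df -Th on the client with CephFS mounted, I could still see the space occupied by the old data (5.2 GB). So I suspect the problem is not NFS.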
