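I am trying to find the root cause in a customer case where 2 identical drives, formatted with the same command, differ by about 55GB in total disk space because of extra inode overhead.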
I would like to understand:
- the math of how 2x Inodes per group translates to 2x Inode count
- how Inodes per group is set when the lazy_itable_init flag is used
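Environment:
The 2 drives sit in 2 identical hardware servers running the same operating system. Details of the 2 drives follow (sensitive information redacted):
Drive A: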
=== START OF INFORMATION SECTION ===
Vendor: HPE
Product: <strip>
Revision: HPD4
Compliance: SPC-5
User Capacity: 7,681,501,126,656 bytes [7.68 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Logical Unit id: <strip>
Serial number: <strip>
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Mon Apr 25 07:39:27 2022 GMT
SMART support is: Available - device has SMART capability.
Drive B:
=== START OF INFORMATION SECTION ===
Vendor: HPE
Product: <strip>
Revision: HPD4
Compliance: SPC-5
User Capacity: 7,681,501,126,656 bytes [7.68 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Logical Unit id: <strip>
Serial number: <strip>
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Mon Apr 25 07:39:23 2022 GMT
SMART support is: Available - device has SMART capability.
The command used to format the drives is:
sudo mke2fs -F -m 1 -t ext4 -E lazy_itable_init,nodiscard /dev/sdc1
Problem:
The df -h output for drives A and B shows a size of 6.9T for drive A versus 7.0T for drive B:
/dev/sdc1 6.9T 89M 6.9T 1% /home/<strip>/data/<serial>
...
/dev/sdc1 7.0T 3.0G 6.9T 1% /home/<strip>/data/<serial>
Observations:
- The fdisk output on both drives shows that they have identical partitions.
Drive A:
Disk /dev/sdc: 7681.5 GB, 7681501126656 bytes, 15002931888 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disk label type: gpt
Disk identifier: 70627C8E-9F97-468E-8EE6-54E960492318
# Start End Size Type Name
1 2048 15002929151 7T Microsoft basic primary
Drive B:
Disk /dev/sdc: 7681.5 GB, 7681501126656 bytes, 15002931888 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disk label type: gpt
Disk identifier: 702A42FA-9A20-4CE4-B938-83D3AB3DCC49
# Start End Size Type Name
1 2048 15002929151 7T Microsoft basic primary
The contents of /etc/mke2fs.conf are identical on both systems, so there is nothing interesting here:
================== DriveA =================
[defaults]
base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr
enable_periodic_fsck = 1
blocksize = 4096
inode_size = 256
inode_ratio = 16384
[fs_types]
ext3 = {
features = has_journal
}
ext4 = {
features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
inode_size = 256
}
...
================== DriveB =================
[defaults]
base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr
enable_periodic_fsck = 1
blocksize = 4096
inode_size = 256
inode_ratio = 16384
[fs_types]
ext3 = {
features = has_journal
}
ext4 = {
features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
inode_size = 256
}
- If we compare the tune2fs -l output of the two drives, we see that Inodes per group on DriveA is 2x that of DriveB. We also see that Inode count on DriveA is 2x that of DriveB (full diff here).
DriveA:
Inode count: 468844544
Block count: 1875365888
Reserved block count: 18753658
Free blocks: 1845578463
Free inodes: 468843793
...
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
DriveB:
Inode count: 234422272 <----- Half of A
Block count: 1875365888
Reserved block count: 18753658
Free blocks: 1860525018
Free inodes: 234422261
...
Fragments per group: 32768
Inodes per group: 4096 <---------- Half of A
Inode blocks per group: 256 <---------- Half of A
Flex block group size: 16
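The figures reported above are internally consistent, which can be checked with a little arithmetic. The sketch below is plain Python, not anything taken from e2fsprogs. It assumes the usual ext4 relations (Inode count = number of block groups x Inodes per group, and Inode blocks per group = Inodes per group x inode size / block size), with the inode size of 256 and block size of 4096 taken from mke2fs.conf.

```python
# Values copied from the tune2fs -l output above.
drives = {
    "DriveA": {"inode_count": 468844544, "inodes_per_group": 8192},
    "DriveB": {"inode_count": 234422272, "inodes_per_group": 4096},
}
INODE_SIZE = 256   # bytes, from mke2fs.conf
BLOCK_SIZE = 4096  # bytes, from mke2fs.conf


def check(drive):
    """Return (block groups, remainder, inode table blocks per group)."""
    groups, remainder = divmod(drive["inode_count"], drive["inodes_per_group"])
    itable_blocks = drive["inodes_per_group"] * INODE_SIZE // BLOCK_SIZE
    return groups, remainder, itable_blocks


results = {name: check(drive) for name, drive in drives.items()}
for name, (groups, remainder, itable_blocks) in results.items():
    print(name, groups, remainder, itable_blocks)
```

Both drives come out at 57232 block groups with no remainder, and the computed inode table sizes (512 and 256 blocks) match the reported Inode blocks per group. With the group count fixed by the identical partition size, doubling Inodes per group necessarily doubles Inode count.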
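From "How is 'Inode blocks per group' calculated on an ext2 file system?" I understand that Inode blocks per group follows from Inodes per group.
From the mke2fs code (source), the value of Inodes per group seems to be referenced only in the function write_inode_tables, and only when lazy_itable_init is provided: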
write_inode_tables(fs, lazy_itable_init, itable_zeroed);
...
static void write_inode_tables(ext2_filsys fs, int lazy_flag, int itable_zeroed)
...
if (lazy_flag)
num = ext2fs_div_ceil((fs->super->s_inodes_per_group - <--------- here
ext2fs_bg_itable_unused(fs, i)) *
EXT2_INODE_SIZE(fs->super),
EXT2_BLOCK_SIZE(fs->super));
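Reading only the excerpt above, the expression consumes s_inodes_per_group rather than assigning it: it works out how many inode table blocks in a group are actually in use, presumably so that only those need to be written. The Python below is a rough transliteration under that reading; the function names and the example values of the unused-inode counter are hypothetical, and the defaults of 256 and 4096 come from mke2fs.conf.

```python
def div_ceil(a: int, b: int) -> int:
    """Integer ceiling division, the role ext2fs_div_ceil plays in the excerpt."""
    return -(-a // b)


def itable_blocks_in_use(inodes_per_group: int, itable_unused: int,
                         inode_size: int = 256, block_size: int = 4096) -> int:
    """Blocks occupied by the inodes of one group that are not marked unused."""
    used_inodes = inodes_per_group - itable_unused
    return div_ceil(used_inodes * inode_size, block_size)


# Hypothetical group where every inode is still unused: nothing to write.
all_unused = itable_blocks_in_use(8192, 8192)
# Hypothetical group with 11 inodes in use: they fit in a single block.
few_used = itable_blocks_in_use(8192, 8181)
# No inode marked unused: the whole inode table of the group.
none_unused = itable_blocks_in_use(8192, 0)
print(all_unused, few_used, none_unused)
```

If this reading is right, the lazy flag changes how much of the inode table is initialized at format time, not how large the table is.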
If we multiply the difference in inode count by the constant inode size (256), we get (468844544-234422272)*256 = 60012101632 bytes, i.e. about 55.9 GiB (roughly 60 GB) of extra inode overhead.
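The overhead figure can be reproduced directly from the two Inode count values. This is only the arithmetic from the paragraph above written out in Python, with the unit conversions made explicit.

```python
INODE_SIZE = 256          # bytes per inode, from mke2fs.conf
inodes_a = 468844544      # Inode count on DriveA
inodes_b = 234422272      # Inode count on DriveB

# Extra inode table space on DriveA compared to DriveB.
extra_bytes = (inodes_a - inodes_b) * INODE_SIZE
extra_gib = extra_bytes / 2**30   # binary gigabytes
extra_gb = extra_bytes / 10**9    # decimal gigabytes

print(extra_bytes, round(extra_gib, 2), round(extra_gb, 2))
```

The result is 60012101632 bytes, which is about 55.89 GiB or 60.01 GB depending on the unit.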
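Can anyone help me with the math of how Inode count grows to 2x when Inodes per group grows to 2x? Does lazy_itable_init influence the value chosen for Inodes per group at runtime, and if so, how can we work out what value it will set? (The flag is the only place in the code that refers to s_inodes_per_group.)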
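Answer 1
I found that the difference between the two cases comes from different e2fsprogs versions: 1.42.9 and 1.45.4. I did not think to check this and relied only on the mke2fs.conf file. Apologies for this obvious oversight, and thanks to @lustreone for the suggestion.
I would still love to know the math relating Inodes per group and Inode count.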
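On the math relating Inodes per group and Inode count, here is a simplified model that reproduces both drives' numbers. It is a sketch, not the actual mke2fs logic, and rests on several assumptions: a target inode count of filesystem bytes / inode_ratio, spread over the block groups and rounded up to fill whole inode table blocks; 32768 blocks per group (inferred from "Fragments per group: 32768"); and an effective ratio of 32768 for DriveB, which is reverse-engineered from its output and not taken from any configuration shown above. The real tool applies further limits and rounding that are ignored here.

```python
def inode_geometry(block_count, inode_ratio, block_size=4096,
                   inode_size=256, blocks_per_group=32768):
    """Simplified model of how an inode ratio turns into per-group figures."""
    groups = -(-block_count // blocks_per_group)        # ceil: last group may be partial
    wanted = block_count * block_size // inode_ratio    # one inode per inode_ratio bytes
    per_group = -(-wanted // groups)                    # spread evenly, rounding up
    per_block = block_size // inode_size                # inodes that fit in one block
    per_group = -(-per_group // per_block) * per_block  # fill whole inode table blocks
    return {
        "groups": groups,
        "inodes_per_group": per_group,
        "inode_count": per_group * groups,
        "inode_blocks_per_group": per_group // per_block,
    }


BLOCK_COUNT = 1875365888  # Block count from tune2fs -l, identical on both drives
drive_a = inode_geometry(BLOCK_COUNT, inode_ratio=16384)  # ratio from mke2fs.conf
drive_b = inode_geometry(BLOCK_COUNT, inode_ratio=32768)  # assumed effective ratio
print(drive_a)
print(drive_b)
```

Under this model DriveA matches inode_ratio = 16384 from mke2fs.conf, while DriveB behaves as if the ratio were 32768. Because Inode count is always groups x Inodes per group and the group count is the same on both drives, the two values scale together. Why the two e2fsprogs versions ended up with different effective ratios is not established here.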