XFS: difference between block size and sector size

mkfs.xfs has the following two options:

-b block_size_options
      This  option  specifies  the  fundamental  block  size  of  the  filesystem.    The   valid
      block_size_options  are:  log=value  or size=value and only one can be supplied.  The block
      size is specified either as a base two logarithm value with log=, or in bytes  with  size=.
      The  default  value is 4096 bytes (4 KiB), the minimum is 512, and the maximum is 65536 (64
      KiB).  Although mkfs.xfs will accept any of these values and create a valid filesystem, XFS
      on Linux can only mount filesystems with pagesize or smaller blocks.
      
      
-s sector_size
      This option specifies the fundamental sector size of the filesystem.   The  sector_size  is
      specified  either as a value in bytes with size=value or as a base two logarithm value with
      log=value.  The default sector_size is 512 bytes. The minimum value for sector size is 512;
      the maximum is 32768 (32 KiB). The sector_size must be a power of 2 size and cannot be made
      larger than the filesystem block size.      

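Hmm, this description does not say much. The only hint that a sector might be something used inside a block is "The sector_size must be a power of 2 size and cannot be made larger than the filesystem block size". Maybe sector here refers to the sector size of the underlying block device? The default value of 512 bytes suggests as much.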

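Obviously, this is just guesswork. I would like to know what the difference between blocks and sectors is in the context of XFS, and how they affect filesystem performance.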

Answer 1

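The sector size refers to the sector size of the underlying block device. It is the allocation unit of the disk, a "hardware" property of the disk. You can see it with the following command: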

lsblk -o NAME,PHY-SEC,LOG-SEC,MAJ:MIN,SIZE,RO,TYPE,MOUNTPOINTS,VENDOR,MODEL,SERIAL

The default sector size in mkfs.xfs is the sector size advertised by the device. If LOG-SEC is 512 and PHY-SEC is 4096, you should use 4096. When in doubt, use PHY-SEC for better performance.
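
If you prefer to read the two values in a script rather than from lsblk, the kernel also exposes them in sysfs. The following is a minimal sketch: the function name sector_sizes and the SYSFS_ROOT override are hypothetical and exist only for illustration and testing, and the argument is assumed to be a whole-disk name such as sda or loop0, not a partition.

```shell
#!/bin/sh
# sector_sizes DEV
# Print the logical and physical sector size that the kernel reports for DEV.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree; it defaults to the real /sys.
sector_sizes() {
    dev=$1
    q="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    log=$(cat "$q/logical_block_size") || return 1
    phy=$(cat "$q/physical_block_size") || return 1
    echo "$dev logical=$log physical=$phy"
}
```

For example, `sector_sizes loop0` on the 4096-byte loop device used in the test further down should report the same numbers as the LOG-SEC and PHY-SEC columns of lsblk.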

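Note that a filesystem cannot be copied block-for-block from a block device with a sector size of 512 to a block device with a physical sector size of 4096 (or 8192). You can copy the files, but you cannot add the new device as a PV to an LVM VG and use pvmove to move the data.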

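The block size is the allocation unit of the filesystem, also known as the cluster size. It is the smallest amount of space the filesystem can allocate for a file or for metadata.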

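The block size must be at least as large as the sector size, and both must be powers of 2. If you intend to use the filesystem only for large files, you may want to increase the block size; otherwise keep the default.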
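
The constraints can be written down as a small pre-flight check. This is only a sketch: check_sizes and is_pow2 are made-up helper names, the limits are the ones quoted from the man page in the question, and the check does not cover the separate rule that Linux can only mount filesystems whose block size is at most the page size.

```shell
#!/bin/sh
# Return 0 if the argument is a positive power of two.
is_pow2() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# check_sizes BLOCK_SIZE SECTOR_SIZE
# Validate a block size / sector size pair (both in bytes) against the
# limits quoted from the mkfs.xfs man page: block size 512..65536,
# sector size 512..32768, sector size not larger than block size.
check_sizes() {
    bs=$1
    ss=$2
    is_pow2 "$bs" || { echo "block size $bs is not a power of 2"; return 1; }
    is_pow2 "$ss" || { echo "sector size $ss is not a power of 2"; return 1; }
    [ "$bs" -ge 512 ] && [ "$bs" -le 65536 ] || { echo "block size $bs out of range"; return 1; }
    [ "$ss" -ge 512 ] && [ "$ss" -le 32768 ] || { echo "sector size $ss out of range"; return 1; }
    [ "$ss" -le "$bs" ] || { echo "sector size $ss exceeds block size $bs"; return 1; }
    echo "ok: bsize=$bs sectsz=$ss"
}
```

`check_sizes 4096 512` passes, while `check_sizes 512 4096` is rejected because the sector size would be larger than the block size.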

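If you use a RAID array or any other block device abstraction, you should follow the manufacturer's documentation to get the best performance.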

Aligning partitions is also important for performance. Most modern Linux tools create partitions aligned to 1 MiB, which is fine in most cases.
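
To check an existing partition, you can test whether its start offset is a multiple of 1 MiB. A rough sketch follows; is_aligned is a hypothetical helper, and it assumes the start offset is given in 512-byte units, which is how I understand the start file under /sys/block/<disk>/<partition>/ to be expressed.

```shell
#!/bin/sh
# is_aligned START_SECTORS [ALIGN_BYTES]
# Succeed if a partition starting at START_SECTORS (assumed to be counted
# in 512-byte units) begins on a multiple of ALIGN_BYTES (default 1 MiB).
is_aligned() {
    start=$1
    align=${2:-1048576}
    [ $(( start * 512 % align )) -eq 0 ]
}

# Usage (sda and sda1 are placeholder names):
#   is_aligned "$(cat /sys/block/sda/sda1/start)" && echo aligned
```

A partition starting at sector 2048 is aligned (2048 x 512 B = 1 MiB), whereas the old DOS-style start at sector 63 is not.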

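If you do not know what to choose, keep the defaults. They work well for normal use cases. If you want better performance, avoid disk storage: use RAM-based storage, use zram (compressed RAM-based swap), use SSDs.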

The sector size is enforced by the filesystem tools, and the man page is outdated. Here is my test:

[mvutcovi@laptop-rh ~]$ truncate --size=1G xfs-test.img
[mvutcovi@laptop-rh ~]$ ls -lh xfs-test.img 
-rw-r--r--. 1 mvutcovi mvutcovi 1.0G Jun 10 10:12 xfs-test.img
[mvutcovi@laptop-rh ~]$ 

[mvutcovi@laptop-rh ~]$ sudo losetup --sector-size=4096 --find --show xfs-test.img 
/dev/loop0
[mvutcovi@laptop-rh ~]$ lsblk -o NAME,PHY-SEC,LOG-SEC,MAJ:MIN,SIZE,RO,TYPE,MOUNTPOINTS,VENDOR,MODEL,SERIAL /dev/loop0 
NAME  PHY-SEC LOG-SEC MAJ:MIN SIZE RO TYPE MOUNTPOINTS VENDOR MODEL SERIAL
loop0    4096    4096   7:0     1G  0 loop                          
[mvutcovi@laptop-rh ~]$ 

[mvutcovi@laptop-rh ~]$ sudo mkfs.xfs /dev/loop0 
meta-data=/dev/loop0             isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
[mvutcovi@laptop-rh ~]$ 

[mvutcovi@laptop-rh ~]$ sudo wipefs -a /dev/loop0
/dev/loop0: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
[mvutcovi@laptop-rh ~]$ sudo mkfs.xfs -s size=4096 /dev/loop0 
meta-data=/dev/loop0             isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
[mvutcovi@laptop-rh ~]$

[mvutcovi@laptop-rh ~]$ sudo wipefs -a /dev/loop0
/dev/loop0: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
[mvutcovi@laptop-rh ~]$ sudo mkfs.xfs -s size=512 /dev/loop0 
illegal sector size 512; hw sector is 4096
Usage: mkfs.xfs
/* blocksize */     [-b size=num]
/* config file */   [-c options=xxx]
/* metadata */      [-m crc=0|1,finobt=0|1,uuid=xxx,rmapbt=0|1,reflink=0|1,
                inobtcount=0|1,bigtime=0|1]
/* data subvol */   [-d agcount=n,agsize=n,file,name=xxx,size=num,
                (sunit=value,swidth=value|su=num,sw=num|noalign),
                sectsize=num
/* force overwrite */   [-f]
/* inode size */    [-i perblock=n|size=num,maxpct=n,attr=0|1|2,
                projid32bit=0|1,sparse=0|1,nrext64=0|1]
/* no discard */    [-K]
/* log subvol */    [-l agnum=n,internal,size=num,logdev=xxx,version=n
                sunit=value|su=num,sectsize=num,lazy-count=0|1]
/* label */     [-L label (maximum 12 characters)]
/* naming */        [-n size=num,version=2|ci,ftype=0|1]
/* no-op info only */   [-N]
/* prototype file */    [-p fname]
/* quiet */     [-q]
/* realtime subvol */   [-r extsize=num,size=num,rtdev=xxx]
/* sectorsize */    [-s size=num]
/* version */       [-V]
            devicename
<devicename> is required unless -d name=xxx is given.
<num> is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
      xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
<value> is xxx (512 byte blocks).
[mvutcovi@laptop-rh ~]$ 




[mvutcovi@laptop-rh ~]$ sudo wipefs -a /dev/loop0
[mvutcovi@laptop-rh ~]$ sudo losetup --detach /dev/loop0
[mvutcovi@laptop-rh ~]$ sudo losetup --sector-size=512 --find --show xfs-test.img 
/dev/loop0
[mvutcovi@laptop-rh ~]$ sudo mkfs.xfs /dev/loop0 
meta-data=/dev/loop0             isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
[mvutcovi@laptop-rh ~]$ 

[mvutcovi@laptop-rh ~]$ sudo wipefs -a /dev/loop0
/dev/loop0: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
[mvutcovi@laptop-rh ~]$ sudo mkfs.xfs -s size=4096 /dev/loop0 
meta-data=/dev/loop0             isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
[mvutcovi@laptop-rh ~]$
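
To compare runs like the ones above without reading the whole table, the two interesting fields can be pulled out of the mkfs.xfs output. This is just a sketch: fs_geometry is a made-up name, it relies on grep -o being available, and it assumes the output layout shown above, where the first sectsz= belongs to the meta-data section and the first bsize= to the data section.

```shell
#!/bin/sh
# fs_geometry
# Read mkfs.xfs (or xfs_info) output on stdin and print the first
# sectsz= and bsize= values found in it.
fs_geometry() {
    out=$(cat)
    sectsz=$(printf '%s\n' "$out" | grep -o 'sectsz=[0-9]*' | head -n 1 | cut -d= -f2)
    bsize=$(printf '%s\n' "$out" | grep -o 'bsize=[0-9]*' | head -n 1 | cut -d= -f2)
    echo "sectsz=$sectsz bsize=$bsize"
}

# Usage: -N is the "no-op info only" switch from the usage text above,
# so nothing is written to the device:
#   sudo mkfs.xfs -N /dev/loop0 | fs_geometry
```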

Here is the relevant part of the code:

  /* set configured sector sizes in preparation for checks */
  if (!cli->sectorsize) {
    /*
     * Unless specified manually on the command line use the
     * advertised sector size of the device.  We use the physical
     * sector size unless the requested block size is smaller
     * than that, then we can use logical, but warn about the
     * inefficiency.
     *
     * Set the topology sectors if they were not probed to the
     * minimum supported sector size.
     */
    if (!ft->lsectorsize)
      ft->lsectorsize = dft->sectorsize;

Answer 2

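The short answer is that the block size is the minimum allocation size, while the sector size is the sector size of the underlying physical device. However, such a terse answer fails to convey the real difference between block size and sector size.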

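The key point to understand is that the sector size is the atomic write size of the underlying physical device. In other words, a write of that size is expected to either succeed completely or fail completely, with no intermediate result (i.e.: no partial writes). This concept is extremely important for the protection provided by the XFS journal: misconfiguring the sector size means venturing into dangerous territory.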

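The block size is a more "mundane" unit: it describes the minimum filesystem allocation for file data. On a filesystem with a 4K block size, writing a single byte of data (i.e.: echo -n 0 > /root/test.file) produces a file with a real size of 4K: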

[root@localhost ~]# echo -n 0 > test.file
[root@localhost ~]# stat test.file
  File: test.file
  Size: 1               Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 100664426   Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2023-06-10 17:47:50.973092242 +0200
Modify: 2023-06-10 17:47:50.974092238 +0200
Change: 2023-06-10 17:47:50.974092238 +0200
 Birth: 2023-06-10 17:47:50.973092242 +0200
[root@localhost ~]# du -hs test.file
4.0K    test.file

Side note: as the stat output shows, Linux internally accounts for sizes in 512 B "logical sectors" (in the example above, 8x 512 B "Linux" blocks = 1x 4K XFS block).
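
The same arithmetic can be done from a script. A small sketch, assuming GNU stat, where %b is the number of allocated blocks and %B is the size in bytes of each of those blocks; allocated_bytes is a name I made up for the example.

```shell
#!/bin/sh
# allocated_bytes FILE
# Print the space actually allocated to FILE, in bytes, by multiplying
# the block count reported by stat (%b) with the unit it is counted in (%B).
allocated_bytes() {
    blocks=$(stat -c %b "$1") || return 1
    unit=$(stat -c %B "$1") || return 1
    echo $(( blocks * unit ))
}
```

For the test.file above, stat reports 8 blocks of 512 B each, so the function would print 4096, matching the 4.0K shown by du.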

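The short summary is that while the block size is "merely" an optimization parameter, the sector size must be correct (which is why it is auto-detected); otherwise the filesystem may be corrupted by a crash or power loss.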
