每当我跑步时journalctl
,它都跑得慢得惊人。通常需要几分钟甚至几小时才能返回结果。我在受影响的机器上遇到了一些其他性能问题,这些问题似乎与 I/O 密集型应用程序有关;我最近更换了 SATA 电缆,这提高了性能,但并没有完全解决问题。我已经在所有驱动器上运行了 SMART 测试,结果都很好,基于这个清单预测磁盘故障的“关键”属性。
请注意,不幸的是,我将在几个月内无法物理访问受影响的计算机。
我以 root身份journalctl -xn
运行;strace -T
这是输出的示例:
fstatfs(23, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10246662, f_bavail=9257376, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000044>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 24 <0.000045>
fstat(24, {st_mode=S_IFREG|0640, st_size=67108864, ...}) = 0 <0.000044>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 24, 0) = 0x7fce771b7000 <0.000052>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 24, 0) = 0x7fce6be4c000 <0.000101>
fstatfs(24, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10246681, f_bavail=9257367, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000044>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 25 <0.000040>
fstat(25, {st_mode=S_IFREG|0640, st_size=83886080, ...}) = 0 <0.000040>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 25, 0) = 0x7fce771b6000 <0.000177>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 25, 0) = 0x7fce6b64c000 <0.000126>
fstatfs(25, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10246807, f_bavail=9257509, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000071>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 26 <0.000073>
fstat(26, {st_mode=S_IFREG|0640, st_size=67108864, ...}) = 0 <0.000046>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 26, 0) = 0x7fce771b5000 <0.554027>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 26, 0) = 0x7fce6ae4c000 <0.000155>
fstatfs(26, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10248655, f_bavail=9257373, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000091>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 27 <0.000082>
fstat(27, {st_mode=S_IFREG|0640, st_size=8388608, ...}) = 0 <0.000064>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 27, 0) = 0x7fce771b4000 <0.129685>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 27, 0) = 0x7fce6a64c000 <0.000119>
fstatfs(27, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10245778, f_bavail=9256592, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000043>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 28 <0.000043>
fstat(28, {st_mode=S_IFREG|0640, st_size=109051904, ...}) = 0 <0.000047>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 28, 0) = 0x7fce771b3000 <2.106414>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 28, 0) = 0x7fce69e4c000 <0.000099>
fstatfs(28, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10247093, f_bavail=9257207, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000037>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 29 <0.000039>
fstat(29, {st_mode=S_IFREG|0640, st_size=50331648, ...}) = 0 <0.000034>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 29, 0) = 0x7fce771b2000 <0.000043>
mmap(NULL, 8388608, PROT_READ, MAP_SHARED, 29, 0) = 0x7fce6964c000 <0.000026>
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_CLOEXEC) = 30 <0.000019>
read(30, "_\f\327\367\310\220\344\23U\307\23N@\310\323\226", 16) = 16 <0.000010>
close(30) = 0 <0.000010>
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_CLOEXEC) = 30 <0.000014>
read(30, "u\207\326\2360t\0T*\34\222\226\4\22\361w", 16) = 16 <0.000011>
close(30) = 0 <0.000010>
fstatfs(29, {f_type=0x9123683e, f_bsize=4096, f_blocks=272871862, f_bfree=10246938, f_bavail=9257024, f_files=0, f_ffree=0, f_fsid={-1736151745, 782050752}, f_namelen=255, f_frsize=4096}) = 0 <0.000015>
open("/var/log/journal/40b4c4d188900ed4b228417f53bce3a8/[email protected]~", O_RDONLY|O_CLOEXEC) = 30 <0.000014>
fstat(30, {st_mode=S_IFREG|0640, st_size=58720256, ...}) = 0 <0.000010>
mmap(NULL, 4096, PROT_READ, MAP_SHARED, 30, 0) = 0x7fce771b1000 <3.332606>
mmap()
对于看起来与上一个类似的调用,该过程似乎总是挂起一段时间。我曾经dd if=/dev/zero of=testfile.tmp
对磁盘 I/O 进行基准测试。执行此操作时,我观察到磁盘写入速度徘徊在 1000 M/s 左右(大多数时间+/- ~ 150 M/s),尽管它偶尔会下降到 300 M/s 或更低(一路下降到 ~ 70 M/s) /秒)。当我杀了它的时候dd
就报告了15689293824 bytes (16 GB) copied, 174.361 s, 90.0 MB/s
。值得注意的是,当向后运行相同的测试时 ( dd if=testfile.tmp of=/dev/null
),报告的磁盘读取性能iotop
没有超过 250 M/s,几乎总是低于 100 M/s,并且经常低于 50 M/s。杀戮这第二个dd
过程就产生了10573660160 bytes (11 GB) copied, 148.573 s, 71.2 MB/s
。
这里有一些更多的调试信息(注意这些是运行的后我杀死了该journalctl -xn
进程):
% find /var/log/journal | wc -l
115
% uname -a
Linux steevie 4.9.0-0.bpo.5-amd64 #1 SMP Debian 4.9.65-3+deb9u2~bpo8+1 (2017-01-05) x86_64 GNU/Linux
% vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 1495592 160420 80 4586516 199 179 15827 1391 15 8 7 5 37 51 0
请注意,文件系统位于 btrfs RAID1 阵列上,其中一个设备位于 LUKS 容器内(由于历史原因):
% sudo btrfs filesystem show /
Label: '[root]' uuid: 06c6565d-af6c-4123-9e42-3b6281f164e6
Total devices 4 FS bytes used 990.93GiB
devid 2 size 931.47GiB used 931.47GiB path /dev/mapper/sda2_crypt
devid 4 size 109.45GiB used 109.45GiB path /dev/sda6
devid 5 size 109.45GiB used 109.45GiB path /dev/sdb6
devid 6 size 931.47GiB used 931.47GiB path /dev/sdc2
% sudo btrfs device usage /
/dev/mapper/sda2_crypt, ID: 2
Device size: 931.47GiB
Device slack: 0.00B
Data,RAID1: 917.50GiB
Metadata,RAID1: 13.94GiB
System,RAID1: 32.00MiB
Unallocated: 1.02MiB
/dev/sda6, ID: 4
Device size: 109.45GiB
Device slack: 0.00B
Data,RAID1: 102.51GiB
Metadata,RAID1: 6.94GiB
Unallocated: 1.00MiB
/dev/sdb6, ID: 5
Device size: 109.45GiB
Device slack: 0.00B
Data,RAID1: 104.42GiB
Metadata,RAID1: 5.00GiB
System,RAID1: 32.00MiB
Unallocated: 3.00MiB
/dev/sdc2, ID: 6
Device size: 931.47GiB
Device slack: 0.00B
Data,RAID1: 927.47GiB
Metadata,RAID1: 4.00GiB
Unallocated: 1.02MiB
sda2_crypt
这里有误导性;它实际上是由/dev/sdd2
./dev/sda
和/dev/sdb
是SSD;/dev/sdc
和/dev/sdd
是 HDD。
我使用的是 Debian 8。如果需要,很乐意发布smartctl
日志,但它们有点长,这就是为什么我现在省略它们。