NFS openat is extremely slow

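I have an NFS server on Ubuntu 20.04 and a FreeIPA client, also on Ubuntu 20.04, with user home directories served from the NFS server. File access is extremely slow. When I traced the time spent in system calls, I found that openat sometimes takes more than 1 second on an NFS file (see the trace below). Needless to say, accessing the same files directly on the server shows no such problem. openat is the only slow operation.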
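As a cross-check outside of strace, the open latency can also be measured directly. The following is a minimal sketch, not part of the original setup: the helper names `time_open` and `slowest` are hypothetical, and it assumes a Linux client where a read-only `os.open` ends up as an `openat` system call, as in the trace.

```python
import os
import time


def time_open(path):
    """Return the seconds spent in a single read-only open of `path`."""
    start = time.perf_counter()
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return elapsed


def slowest(paths, top=5):
    """Time one open per path and return the `top` slowest as (seconds, path)."""
    timings = [(time_open(p), p) for p in paths]
    return sorted(timings, reverse=True)[:top]
```

Calling `slowest` on a list of `.pyc` paths under the NFS-mounted home directory, and again on the same files on the server, would show whether the delay is specific to the client side.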

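I have attached a histogram of the time spent in openat (I clipped the top bin so that the tail is visible). More than 800 openat calls complete within 10 ms, but the tail is what dominates the total time: many calls take more than 100 ms, which is unreasonable.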

I suspect this may be related to Kerberos authorization or something similar, but I don't know how to investigate it.

Options in /etc/exports:

/home   *(rw,sec=krb5:krb5i:krb5p,async,no_subtree_check)

Mount on the client:

server:/home/... on /home/... type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=xx.xx.xx.x1,local_lock=none,addr=xx.xx.xx.x2)

Any help or leads would be much appreciated,

Yuval.

0.000064 : stat("/home/.../lib/python3.8/site-packages/pandas/core", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
0.000040 : stat("/home/.../lib/python3.8/site-packages/pandas/core/nanops.py", {st_mode=S_IFREG|0664, st_size=50002, ...}) = 0
0.000095 : stat("/home/.../lib/python3.8/site-packages/pandas/core/nanops.py", {st_mode=S_IFREG|0664, st_size=50002, ...}) = 0
0.664737 : openat(AT_FDCWD, "/home/.../lib/python3.8/site-packages/pandas/core/__pycache__/nanops.cpython-38.pyc", O_RDONLY|O_CLOEXEC) = 6
0.000122 : fstat(6, {st_mode=S_IFREG|0664, st_size=36431, ...}) = 0
0.000116 : ioctl(6, TCGETS, 0x7ffed1278d60)        = -1 ENOTTY (Inappropriate ioctl for device)
0.000049 : lseek(6, 0, SEEK_CUR)                   = 0
0.000024 : lseek(6, 0, SEEK_CUR)                   = 0
0.000028 : fstat(6, {st_mode=S_IFREG|0664, st_size=36431, ...}) = 0
0.000052 : read(6, "U\r\r\n\0\0\0\0\216t\362aR\303\0\0\343\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 36432) = 36431
0.000024 : read(6, "", 1)                          = 0
0.000438 : close(6)                                = 0
0.000083 : mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f23fcbaf000
0.000100 : stat("/home/.../bin", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
0.000120 : stat("/usr/lib/python3.8", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
0.000122 : stat("/usr/lib/python3.8/lib-dynload", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
0.000037 : getcwd("/home/yuval/src/themis", 1024)  = 23
0.000046 : stat("/home/yuval/src/themis", {st_mode=S_IFDIR|0775, st_size=130, ...}) = 0
0.000037 : stat("/home/.../lib/python3.8/site-packages", {st_mode=S_IFDIR|0775, st_size=12288, ...}) = 0
0.000051 : stat("/home/.../lib/python3.8/site-packages/pandas/core/array_algos", {st_mode=S_IFDIR|0775, st_size=163, ...}) = 0
0.000041 : stat("/home/.../lib/python3.8/site-packages/pandas/core/array_algos/masked_reductions.py", {st_mode=S_IFREG|0664, st_size=3721, ...}) = 0
0.000085 : stat("/home/.../lib/python3.8/site-packages/pandas/core/array_algos/masked_reductions.py", {st_mode=S_IFREG|0664, st_size=3721, ...}) = 0
0.411113 : openat(AT_FDCWD, "/home/.../lib/python3.8/site-packages/pandas/core/array_algos/__pycache__/masked_reductions.cpython-38.pyc", O_RDONLY|O_CLOEXEC) = 6
0.000053 : fstat(6, {st_mode=S_IFREG|0664, st_size=3329, ...}) = 0
0.000027 : ioctl(6, TCGETS, 0x7ffed1278d60)        = -1 ENOTTY (Inappropriate ioctl for device)
0.000043 : lseek(6, 0, SEEK_CUR)                   = 0
0.000037 : lseek(6, 0, SEEK_CUR)                   = 0
0.000025 : fstat(6, {st_mode=S_IFREG|0664, st_size=3329, ...}) = 0
0.000032 : read(6, "U\r\r\n\0\0\0\0\216t\362a\211\16\0\0\343\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3330) = 3329
0.000025 : read(6, "", 1)                          = 0
0.000438 : close(6)                                = 0
0.000105 : stat("/home/.../lib/python3.8/site-packages/pandas/core/arrays", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
0.000102 : stat("/home/.../lib/python3.8/site-packages/pandas/core/arrays/categorical.py", {st_mode=S_IFREG|0664, st_size=94502, ...}) = 0
0.000101 : stat("/home/.../lib/python3.8/site-packages/pandas/core/arrays/categorical.py", {st_mode=S_IFREG|0664, st_size=94502, ...}) = 0
0.413090 : openat(AT_FDCWD, "/home/.../lib/python3.8/site-packages/pandas/core/arrays/__pycache__/categorical.cpython-38.pyc", O_RDONLY|O_CLOEXEC) = 6
0.000063 : fstat(6, {st_mode=S_IFREG|0664, st_size=77947, ...}) = 0
0.000041 : ioctl(6, TCGETS, 0x7ffed127b180)        = -1 ENOTTY (Inappropriate ioctl for device)
0.000037 : lseek(6, 0, SEEK_CUR)                   = 0
0.000023 : lseek(6, 0, SEEK_CUR)                   = 0
0.000031 : fstat(6, {st_mode=S_IFREG|0664, st_size=77947, ...}) = 0
0.000085 : read(6, "U\r\r\n\0\0\0\0\216t\362a&q\1\0\343\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 77948) = 77947

Histogram of the time spent in openat while starting Python and importing pytorch
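A histogram like this one can be rebuilt from a trace in the format shown above. The sketch below assumes each line starts with the time in seconds attributed to that call, followed by ` : ` and the syscall. The bucket edges, the function names, and the abbreviated sample lines are illustrative choices rather than anything taken from the original setup.

```python
import re

# Matches lines shaped like the trace in this post: "<seconds> : <syscall>(...".
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s*:\s*(\w+)\(")

# Bucket edges in seconds (an arbitrary choice): 1 ms, 10 ms, 100 ms, 1 s.
EDGES = (0.001, 0.01, 0.1, 1.0)


def parse(lines, syscall="openat"):
    """Return the durations (seconds) of every call to `syscall`."""
    durations = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == syscall:
            durations.append(float(m.group(1)))
    return durations


def bucket(durations, edges=EDGES):
    """Count durations per bucket; the last bucket holds everything >= edges[-1]."""
    counts = [0] * (len(edges) + 1)
    for d in durations:
        # The bucket index is the number of edges the duration has reached.
        counts[sum(d >= e for e in edges)] += 1
    return counts


# Timings copied from the trace; arguments abbreviated for illustration.
sample = [
    '0.000095 : stat("...", {...}) = 0',
    '0.664737 : openat(AT_FDCWD, "...", O_RDONLY|O_CLOEXEC) = 6',
    '0.000438 : close(6) = 0',
    '0.411113 : openat(AT_FDCWD, "...", O_RDONLY|O_CLOEXEC) = 6',
    '0.413090 : openat(AT_FDCWD, "...", O_RDONLY|O_CLOEXEC) = 6',
]

if __name__ == "__main__":
    durations = parse(sample)
    print("openat calls:", len(durations))
    print("bucket counts:", bucket(durations))
    print("total seconds:", round(sum(durations), 6))
```

Summing the durations per bucket instead of counting them would show directly how much of the total time the tail accounts for.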

Answer 1

Which kernel version is installed? This may be related to a known NFS client/server bug that was patched just a few weeks ago:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2009325
