如何调试系统中的 futex 拥塞

如何调试系统中的 futex 拥塞

我有使用多个 perl 库的 perl 多线程脚本(“assp antispam proxy”)。在我的其中一台服务器上,大约 2 小时后,脚本开始产生大量 futexed,这会消耗大量 CPU 电量。在其他服务器上不会发生这种情况,因此它可能是在服务器资源出现某些竞争条件后开始发生的。向开发人员询问有关如何调试此问题的任何信息,但没有得到答复。

使用 strace 或 sysdig 我发现这个 perl 脚本开始发出大量 futex 请求。我该如何调试它并找出所有这些内核 futex 是在哪个文件上生成的?我试过 sysdig 和 oprofile,但没有找到任何方法来检查它。

运行 strace 10 秒:

# strace -f -p 6130 -c
Process 6130 attached with 8 threads - interrupt to quit
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 57.44    3.265528       16493       198           nanosleep
 28.53    1.622291       19313        84           poll
  8.54    0.485535           1    787123    169255 futex
  5.28    0.300018       75005         4           restart_syscall
  0.21    0.012001        2400         5           select
  0.00    0.000209           0     35751           sched_yield
 ------ ----------- ----------- --------- --------- ----------------
100.00    5.685582                823355    169273 total

这些 futexes 的小例子:

    75853 13:09:36.167879875 2 perl (6248) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743049
75855 13:09:36.167880152 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75856 13:09:36.167880436 0 perl (6154) < futex res=0
75857 13:09:36.167880702 0 perl (6154) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75858 13:09:36.167881606 0 perl (6154) < futex res=1
75863 13:09:36.167882554 2 perl (6248) < futex res=0
75865 13:09:36.167882949 2 perl (6248) > futex addr=7F51EF8E0A20 op=128(FUTEX_PRIVATE_FLAG) val=2
75866 13:09:36.167883026 0 perl (6154) > futex addr=7F51EF8E0A58 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75869 13:09:36.167883226 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75870 13:09:36.167883231 0 perl (6154) < futex res=0
75873 13:09:36.167883503 0 perl (6154) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75877 13:09:36.167884305 0 perl (6154) < futex res=1
75885 13:09:36.167885569 2 perl (6248) < futex res=0
75887 13:09:36.167885810 2 perl (6248) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75888 13:09:36.167885945 2 perl (6248) < futex res=0
75890 13:09:36.167886138 2 perl (6248) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743051
75891 13:09:36.167886404 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75892 13:09:36.167886585 0 perl (6154) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75894 13:09:36.167887449 0 perl (6154) < futex res=1
75896 13:09:36.167888690 2 perl (6248) < futex res=0
75897 13:09:36.167888969 2 perl (6248) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75899 13:09:36.167889092 2 perl (6248) < futex res=0
75900 13:09:36.167889194 0 perl (6154) > futex addr=7F51EF8E0A58 op=128(FUTEX_PRIVATE_FLAG) val=2
75902 13:09:36.167889326 2 perl (6248) > futex addr=7F51EF8E0A58 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75903 13:09:36.167889510 2 perl (6248) < futex res=0
75904 13:09:36.167889619 0 perl (6154) < futex res=-11(EAGAIN)
75905 13:09:36.167889716 2 perl (6248) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743053
75906 13:09:36.167889945 0 perl (6154) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75907 13:09:36.167889977 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75908 13:09:36.167890792 0 perl (6154) < futex res=1
75910 13:09:36.167892302 2 perl (6248) < futex res=0
75911 13:09:36.167892572 2 perl (6248) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75912 13:09:36.167892712 2 perl (6248) < futex res=0
75913 13:09:36.167892744 0 perl (6154) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743055
75915 13:09:36.167893152 2 perl (6248) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75916 13:09:36.167893352 0 perl (6154) > switch next=0 pgft_maj=2 pgft_min=224301 vm_size=990864 vm_rss=669216 vm_swap=0
75917 13:09:36.167894180 2 perl (6248) < futex res=1
75921 13:09:36.167894912 0 perl (6154) < futex res=0
75922 13:09:36.167895247 0 perl (6154) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75924 13:09:36.167895461 0 perl (6154) < futex res=0
75925 13:09:36.167895929 2 perl (6248) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743057
75926 13:09:36.167895988 0 perl (6154) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75927 13:09:36.167896092 2 perl (6248) < futex res=-11(EAGAIN)
75929 13:09:36.167896259 0 perl (6154) < futex res=0
75930 13:09:36.167896312 2 perl (6248) > futex addr=7F51EF8E0A20 op=128(FUTEX_PRIVATE_FLAG) val=2
75931 13:09:36.167896547 0 perl (6154) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75933 13:09:36.167896605 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75939 13:09:36.167897494 0 perl (6154) < futex res=1
75944 13:09:36.167899107 2 perl (6248) < futex res=0
75945 13:09:36.167899365 2 perl (6248) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75946 13:09:36.167899498 2 perl (6248) < futex res=0
75947 13:09:36.167899541 0 perl (6154) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743059
75950 13:09:36.167900094 0 perl (6154) > switch next=0 pgft_maj=2 pgft_min=224301 vm_size=990864 vm_rss=669216 vm_swap=0
75952 13:09:36.167900811 2 perl (6248) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75953 13:09:36.167901474 2 perl (6248) < futex res=1
75958 13:09:36.167903232 0 perl (6154) < futex res=0
75959 13:09:36.167903727 2 perl (6248) > futex addr=7F51EF8E0A20 op=128(FUTEX_PRIVATE_FLAG) val=2
75960 13:09:36.167903788 0 perl (6154) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75961 13:09:36.167903875 2 perl (6248) < futex res=-11(EAGAIN)
75963 13:09:36.167904047 0 perl (6154) < futex res=0
75964 13:09:36.167904106 2 perl (6248) > futex addr=7F51EF8E0A58 op=128(FUTEX_PRIVATE_FLAG) val=2
75967 13:09:36.167904348 0 perl (6154) > futex addr=7F51EF8E0A58 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75968 13:09:36.167904374 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
75971 13:09:36.167905247 0 perl (6154) < futex res=1
75973 13:09:36.167905490 0 perl (6154) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743061
75978 13:09:36.167905988 0 perl (6154) > switch next=0 pgft_maj=2 pgft_min=224301 vm_size=990864 vm_rss=669216 vm_swap=0
75983 13:09:36.167906745 2 perl (6248) < futex res=0
75985 13:09:36.167906986 2 perl (6248) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1
75990 13:09:36.167907588 2 perl (6248) < futex res=1
75992 13:09:36.167907788 2 perl (6248) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
75993 13:09:36.167907927 2 perl (6248) < futex res=0
75998 13:09:36.167909046 0 perl (6154) < futex res=0
76002 13:09:36.167909432 0 perl (6154) > futex addr=7F51EF8E0A20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
76004 13:09:36.167909584 2 perl (6248) > futex addr=7F51EF8E0A58 op=128(FUTEX_PRIVATE_FLAG) val=2
76005 13:09:36.167909674 0 perl (6154) < futex res=0
76008 13:09:36.167909846 2 perl (6248) > switch next=0 pgft_maj=0 pgft_min=40961 vm_size=990864 vm_rss=669216 vm_swap=0
76009 13:09:36.167909969 0 perl (6154) > futex addr=7F51EF8E0A58 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
76011 13:09:36.167910704 0 perl (6154) < futex res=1
76012 13:09:36.167910946 0 perl (6154) > futex addr=7F51EF8E0A5C op=128(FUTEX_PRIVATE_FLAG) val=115743063
76015 13:09:36.167911454 0 perl (6154) > switch next=0 pgft_maj=2 pgft_min=224301 vm_size=990864 vm_rss=669216 vm_swap=0
76020 13:09:36.167912232 2 perl (6248) < futex res=0
76022 13:09:36.167912491 2 perl (6248) > futex addr=7F51EF8E0A5C op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1

答案1

在 GNU/Linux 上你可以尝试系统水龙头。需要事先做一些准备。请查阅在线文档关于这一点。

你会发现一个例子目标定位快速用户空间互斥系统调用,为了完整性,下面重现。

值得注意的是(从上面引用的两个来源来看):

来自维基百科文章:

正确编程的基于 futex 的锁不会使用系统调用,除非锁发生争用;由于大多数操作不需要进程之间的仲裁,因此在大多数情况下不会发生这种情况

来自systemtap文档:

简单来说,当多个进程试图同时访问同一个锁变量时,就会发生 futex 争用。这可能会导致性能不佳,因为锁会串行执行;一个进程获得锁,而其他进程必须等待锁变量再次可用。

来自 systemtap 示例的脚本:

#! /usr/bin/env stap

# This script tries to identify contended user-space locks by hooking
# into the futex system call.

global FUTEX_WAIT = 0 /*, FUTEX_WAKE = 1 */
global FUTEX_PRIVATE_FLAG = 128 /* linux 2.6.22+ */
global FUTEX_CLOCK_REALTIME = 256 /* linux 2.6.29+ */

global lock_waits # long-lived stats on (tid,lock) blockage elapsed time
global process_names # long-lived pid-to-execname mapping

probe syscall.futex.return {  
  if (($op & ~(FUTEX_PRIVATE_FLAG|FUTEX_CLOCK_REALTIME)) != FUTEX_WAIT) next
  process_names[pid()] = execname()
  elapsed = gettimeofday_us() - @entry(gettimeofday_us())
  lock_waits[pid(), $uaddr] <<< elapsed
}

probe end {
  foreach ([pid+, lock] in lock_waits) 
    printf ("%s[%d] lock %p contended %d times, %d avg us\n",
            process_names[pid], pid, lock, @count(lock_waits[pid,lock]),
            @avg(lock_waits[pid,lock]))
}

您还可以尝试使用开发者::NYTProf在你的 perl 程序上查明行为不当的代码。

相关内容