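We are facing an issue where the swapper process appears to take up about 40% of CPU utilization.
This is an HP DL360G8 server with 16 cores, hyperthreaded to 32 vCPUs, running Ubuntu 16.04.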
uname -a
Linux ubuntu 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
We run 10 Chrome instances at the same time. The overall average CPU utilization of Chrome is about 20%, but the utilization of the box reaches 50-60%.
Output of top:
top - 12:28:56 up 18:22, 1 user, load average: 26.06, 25.48, 26.25
Tasks: 531 total, 14 running, 517 sleeping, 0 stopped, 0 zombie
%Cpu(s): 51.8 us, 16.7 sy, 0.0 ni, 30.9 id, 0.0 wa, 0.0 hi, 0.6 si, 0.0 st
KiB Mem : 32903968 total, 5360712 free, 4363668 used, 23179588 buff/cache
KiB Swap: 33521660 total, 33521660 free, 0 used. 27271240 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16450 ubuntu 20 0 1272700 371316 90316 R 128.5 1.1 0:31.28 chrome
20228 ubuntu 20 0 1035308 231700 112168 R 118.0 0.7 0:08.68 chrome
17929 ubuntu 20 0 1168144 300908 78488 R 110.8 0.9 0:19.71 chrome
20236 ubuntu 20 0 976584 181224 71084 R 107.9 0.6 0:08.52 chrome
17364 ubuntu 20 0 1094608 222588 86896 R 104.3 0.7 0:14.03 chrome
18048 ubuntu 20 0 876428 153676 103216 R 81.6 0.5 0:07.96 chrome
18917 ubuntu 20 0 906296 111216 58764 R 77.0 0.3 0:06.04 chrome
17178 ubuntu 20 0 950044 124616 57396 R 69.8 0.4 0:08.81 chrome
20231 ubuntu 20 0 975728 155644 80336 R 62.6 0.5 0:05.00 chrome
16861 ubuntu 20 0 790176 143856 94232 R 62.3 0.4 0:13.35 chrome
18247 ubuntu 20 0 789240 144924 97188 R 60.0 0.4 0:05.93 chrome
19052 ubuntu 20 0 876516 94588 57136 R 59.0 0.3 0:03.85 chrome
20816 ubuntu 20 0 862732 119076 83696 S 52.5 0.4 0:01.68 chrome
20845 ubuntu 20 0 765556 116208 89024 S 47.5 0.4 0:01.45 chrome
20881 ubuntu 20 0 767336 116076 88860 S 47.5 0.4 0:01.45 chrome
15032 ubuntu 20 0 833228 154676 100316 S 40.3 0.5 0:18.29 chrome
19242 ubuntu 20 0 807540 141460 93840 S 40.0 0.4 0:07.46 chrome
16419 ubuntu 20 0 802776 151296 98852 S 34.4 0.5 0:12.39 chrome
19746 ubuntu 20 0 802080 143508 96040 S 34.4 0.4 0:03.45 chrome
19563 ubuntu 20 0 866784 102160 57740 S 32.8 0.3 0:04.11 chrome
15606 ubuntu 20 0 916200 134464 58336 S 28.9 0.4 0:11.71 chrome
16747 ubuntu 20 0 935936 91460 58256 S 22.3 0.3 0:07.39 chrome
21507 ubuntu 20 0 809420 77648 54356 S 21.3 0.2 0:00.65 chrome
45917 root 20 0 0 0 0 S 21.3 0.0 0:19.54 kworker/7:4
24222 root 20 0 0 0 0 S 20.0 0.0 2:32.20 kworker/11:3
21235 ubuntu 20 0 807404 77448 54220 S 19.7 0.2 0:00.60 chrome
52972 root 20 0 0 0 0 S 19.0 0.0 1:07.90 kworker/23:1
37232 root 20 0 0 0 0 S 18.7 0.0 1:01.20 kworker/3:2
48449 root 20 0 0 0 0 S 18.4 0.0 0:05.88 kworker/1:4
6863 root 20 0 0 0 0 S 17.7 0.0 0:39.47 kworker/9:1
26492 root 20 0 0 0 0 S 17.7 0.0 1:17.70 kworker/16:1
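To compare the per-process numbers with the %Cpu(s) summary line, the per-process %CPU values can be summed per command and divided by the number of logical CPUs. The following is only a rough sketch: the function name sum_cpu_by_command is made up for illustration, and it assumes top is in its default mode where 100% means one full core and where %CPU is column 9 and COMMAND is column 12, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: sum the %CPU column of `top -b` process rows per
# command name and also express the sum as a share of the whole machine.
# Assumes the default top layout shown above (%CPU = field 9, COMMAND = field 12).
sum_cpu_by_command() {
    ncpu="$1"   # number of logical CPUs (32 on this box)
    LC_ALL=C awk -v ncpu="$ncpu" '
        # Process rows start with a numeric PID and have at least 12 fields;
        # the header and summary lines of top do not match this.
        $1 ~ /^[0-9]+$/ && NF >= 12 {
            name = $12
            sub(/\/.*/, "", name)   # fold kworker/7:4, kworker/11:3, ... into "kworker"
            cpu[name] += $9
        }
        END {
            # Output columns: command, sum of %CPU (100 = one core), share of all CPUs
            for (n in cpu)
                printf "%s %.1f %.1f\n", n, cpu[n], cpu[n] / ncpu
        }
    ' | LC_ALL=C sort -k2,2nr
}
```

It could be used as `top -b -n 1 | sum_cpu_by_command "$(nproc)"`. Adding up only the chrome rows visible in the listing above by hand gives roughly 1463% of one core, which would be about 46% of 32 logical CPUs, so the 20% figure may be worth double-checking against this.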
perf:
sudo perf record -g -a sleep 10
- 33.36% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry ▒
- 33.36% cpu_startup_entry ▒
- 31.16% call_cpuidle ▒
- 31.16% cpuidle_enter ▒
- 23.98% cpuidle_enter_state ▒
23.51% intel_idle ▒
0.00% leave_mm ▒
0.00% poll_idle ▒
+ 0.00% ktime_get ▒
0.00% sched_idle_set_state ▒
0.00% read_tsc ▒
+ 6.02% apic_timer_interrupt ▒
+ 1.16% reschedule_interrupt ▒
+ 0.00% ret_from_intr ▒
+ 0.00% call_function_interrupt ▒
0.00% ktime_get ▒
0.00% intel_idle ▒
0.00% sched_idle_set_state ▒
0.00% native_irq_return_iret ▒
0.00% restore_c_regs_and_iret ▒
0.00% retint_kernel ▒
0.00% common_interrupt ▒
+ 0.00% call_function_single_interrupt ▒
0.00% native_iret ▒
0.00% cpuidle_enter_state ▒
+ 1.06% schedule_preempt_disabled ▒
0.61% cpuidle_not_available ▒
+ 0.53% cpuidle_select ▒
+ 0.00% tick_nohz_idle_enter ▒
+ 0.00% sched_ttwu_pending ▒
+ 0.00% tick_nohz_idle_exit ▒
0.00% rcu_idle_enter ▒
+ 0.00% arch_cpu_idle_enter ▒
0.00% rcu_idle_exit ▒
0.00% schedule ▒
The CPU backtraces for all cores show the same thing (except for the Chrome processes):
swapper process
[63714.313210] NMI backtrace for cpu 29
[63714.313221] CPU: 29 PID: 0 Comm: swapper/29 Not tainted 4.4.0-87-generic #110-Ubuntu
[63714.313227] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 07/01/2015
[63714.313232] task: ffff88082d9aaa00 ti: ffff88082d9c4000 task.ti: ffff88082d9c4000
[63714.313239] RIP: 0010:[<ffffffff8148bb98>] [<ffffffff8148bb98>] intel_idle+0xa8/0x130
[63714.313250] RSP: 0018:ffff88082d9c7e40 EFLAGS: 00000046
[63714.313257] RAX: 0000000000000030 RBX: 0000000000000010 RCX: 0000000000000001
[63714.313262] RDX: 0000000000000000 RSI: ffff88082d9c8000 RDI: 0000000001e0a000
[63714.313268] RBP: ffff88082d9c7e60 R08: 0000000000000981 R09: 0000000000000018
[63714.313273] R10: 0000000000000022 R11: 000000000000103a R12: 0000000000000004
[63714.313286] R13: 0000000000000005 R14: 0000000000000030 R15: ffffffff81eb3658
[63714.313292] FS: 0000000000000000(0000) GS:ffff88083f740000(0000) knlGS:0000000000000000
[63714.313298] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63714.313307] CR2: 000019aa9ecccee8 CR3: 0000000001e0a000 CR4: 00000000000406e0
[63714.313317] Stack:
[63714.313320] 0000000000000005 ffffffff81eb3460 ffffe8ffff940600 000039f31051e675
[63714.313322] ffff88082d9c7ea8 ffffffff816d4e67 000000003f753bc0 ffffffff81eb3460
[63714.313324] ffffffff81f38d00 ffff88082d9c8000 ffffe8ffff940600 ffffffff81eb3460
[63714.313325] Call Trace:
[63714.313326] [<ffffffff816d4e67>] cpuidle_enter_state+0xe7/0x2b0
[63714.313327] [<ffffffff816d5067>] cpuidle_enter+0x17/0x20
[63714.313329] [<ffffffff810c46d2>] call_cpuidle+0x32/0x60
[63714.313330] [<ffffffff816d5043>] ? cpuidle_select+0x13/0x20
[63714.313332] [<ffffffff810c4990>] cpu_startup_entry+0x290/0x350
[63714.313333] [<ffffffff810517b4>] start_secondary+0x154/0x190
[63714.313335] Code: 48 8b 34 25 c4 42 01 00 48 89 d1 48 8d 86 08 c0 ff ff 0f 01 c8 48 8b 86 08 c0 ff ff a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 c9 <65> 48 8b 04 25 c4 42 01 00 f0 80 a0 0a c0 ff ff df 0f ae f0 48
kworker process
[63548.059703] CPU: 31 PID: 22556 Comm: kworker/31:0 Not tainted 4.4.0-87-generic #110-Ubuntu
[63548.059705] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 07/01/2015
[63548.059707] task: ffff88079b7d5400 ti: ffff8807fa150000 task.ti: ffff8807fa150000
[63548.059709] RIP: 0010:[<ffffffff8183d72a>] [<ffffffff8183d72a>] __schedule+0x3ea/0xa30
[63548.059711] RSP: 0000:ffff8807fa153df8 EFLAGS: 00000003
[63548.059713] RAX: 000000000000001f RBX: ffff88083f7d6dc0 RCX: 0000000000000000
[63548.059715] RDX: ffffffff81f38d00 RSI: 0000000000000000 RDI: 0000000000000000
[63548.059717] RBP: ffff8807fa153e38 R08: 0000000000000000 R09: 0000000000000000
[63548.059721] R10: 00000000000002eb R11: 0000000000000000 R12: 0000000000000000
[63548.059722] R13: 0000000000016dc0 R14: ffff88079b7d5970 R15: ffff88083f7d6dc0
[63548.059724] FS: 0000000000000000(0000) GS:ffff88083f7c0000(0000) knlGS:0000000000000000
[63548.059726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63548.059728] CR2: 00000000000000b8 CR3: 00000007f5661000 CR4: 00000000000406e0
[63548.059730] Stack:
[63548.059732] ffff88083f7daf00 ffff8807e622b800 ffff88079b7d5400 ffff8807fa154000
[63548.059734] ffff88083f7d65c0 0000000000000008 ffff88083f7d65d8 ffff88082c6d7200
[63548.059736] ffff8807fa153e50 ffffffff8183dda5 ffff88083f7d65c0 ffff8807fa153eb8
[63548.059738] Call Trace:
[63548.059740] [<ffffffff8183dda5>] schedule+0x35/0x80
[63548.059742] [<ffffffff8109a9cb>] worker_thread+0xcb/0x4c0
[63548.059744] [<ffffffff8109a900>] ? process_one_work+0x480/0x480
[63548.059747] [<ffffffff810a0c85>] kthread+0xe5/0x100
[63548.059748] [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[63548.059750] [<ffffffff8184224f>] ret_from_fork+0x3f/0x70
[63548.059758] [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[63548.059767] Code: 00 00 0f 85 8e 03 00 00 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f 5d c3 65 8b 05 c3 ca 7c 7e 48 8b 15 44 47 1d 00 89 c0 48 0f a3 02 <19> c0 85 c0 0f 84 c9 fd ff ff 4c 8b 2d d5 41 6d 00 4d 85 ed 74
Looking at the ACPI interrupts with
grep . -r /sys/firmware/acpi/interrupts/
did not reveal anything suspicious either.
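For anyone who wants to repeat this check, the grep output is easier to scan when only the non-zero counters are kept and sorted by count. This is a small sketch, not a definitive tool: the function name acpi_nonzero_counters is hypothetical, and it assumes each line has the form "path: count [status]", which is what the files under that directory normally contain.

```shell
#!/bin/sh
# Hypothetical helper: read the output of
#   grep . -r /sys/firmware/acpi/interrupts/
# on stdin and print "count name" for every counter that is non-zero,
# largest count first.
acpi_nonzero_counters() {
    LC_ALL=C awk '{
        sub(/:/, " ")                  # separate "path:count" even without padding
        n = split($1, parts, "/")      # parts[n] is the file name, e.g. gpe13 or sci
        if ($2 + 0 > 0)
            print $2, parts[n]
    }' | LC_ALL=C sort -k1,1nr
}
```

Usage would be `grep . -r /sys/firmware/acpi/interrupts/ | acpi_nonzero_counters | head`. Running it twice a few seconds apart and comparing the counts shows whether any single counter is increasing quickly.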
I have no idea what is causing this behavior, so any help would be appreciated.
Thanks in advance!