I'm currently seeing high CPU usage from events/1, and I'd like to know how to find out what is causing it.
cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
0: 575075290 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge rtc0
9: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
18: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi ata_piix
19: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi ata_piix
20: 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1
21: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb5, uhci_hcd:usb8
22: 48 0 0 0 203 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb4, uhci_hcd:usb7
23: 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb6
32: 7511 0 0 0 0 0 0 169991 0 0 0 0 0 0 0 0 IO-APIC-fasteoi megasas
49: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0
50: 14013 79342 0 0 13699 5817 0 0 916084 0 0 0 43447 0 0 0 PCI-MSI-edge eth0-TxRx-0
51: 21 0 0 0 0 0 0 0 0 491153 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-1
52: 16 0 0 0 0 0 0 0 0 0 0 0 0 490363 0 0 PCI-MSI-edge eth0-TxRx-2
53: 15 0 0 0 0 0 0 0 0 0 512295 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-3
54: 14 0 0 0 0 0 0 0 0 0 0 0 0 0 2137545 0 PCI-MSI-edge eth0-TxRx-4
55: 14 0 0 0 0 0 0 0 0 0 0 472322 0 0 0 0 PCI-MSI-edge eth0-TxRx-5
56: 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1261400 PCI-MSI-edge eth0-TxRx-6
57: 46039 101974 16307 0 177472 22446 0 0 78146 0 0 0 3060 0 0 0 PCI-MSI-edge eth0-TxRx-7
NMI: 116990 104030 89718 76478 57282 42770 27106 7229 11177 12608 16074 17471 15123 17563 17220 9457 Non-maskable interrupts
LOC: 1240959 513359079 608524106 453845650 545193366 480747439 402785555 456482461 620998409 526207907 405289993 406272537 426647321 459716091 532029492 578607757 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 116990 104030 89718 76478 57282 42770 27106 7229 11177 12608 16074 17471 15123 17563 17220 9457 Performance monitoring interrupts
IWI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IRQ work interrupts
RES: 3518 3786 1864 535 5509 4985 1982 2083 2640 1854 1547 1067 2261 2259 1744 1742 Rescheduling interrupts
CAL: 57062 229 228 228 3541 227 228 225 222 210 222 226 212 212 224 217 Function call interrupts
TLB: 10184 9632 3623 3081 15017 12586 3242 2803 7624 33023 5225 4085 4565 45383 7271 4827 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 1185 Machine check polls
ERR: 0
MIS: 0
Could this be a hardware problem?
Edit:
ps auxf
Answer 1
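OK, thanks for the additional information. So events/1 is the problem. The name events/ stands for events/CPU_NO, so events/1 is the events thread bound to CPU 1. The events threads were introduced as a way for the kernel to wake up and carry out deferred work, replacing keventd. That said, in the v3.9-rc8 git tree I still see references to both keventd and events/per_cpu.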
I'm assuming IRQ 57 is what is hogging CPU 1. Apart from that, I don't see anything abnormal in /proc/interrupts; the workload is mostly balanced. Can you check what IRQ 57 is?
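Since the counters in /proc/interrupts are cumulative since boot, a single snapshot does not show which IRQ is firing on CPU 1 right now. A rough way to check is to take two snapshots a few seconds apart and compare the CPU 1 column. The sketch below is only an illustration: the function names `parse_interrupts` and `busiest_irqs` are made up for this example, and it assumes the usual layout of a CPU header line followed by one `label: counts... description` line per interrupt.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header line: CPU0 CPU1 ...
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        tokens = rest.split()
        counts = []
        for tok in tokens[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        if len(counts) != ncpu:           # skip summary rows such as ERR and MIS
            continue
        rows[label.strip()] = (counts, " ".join(tokens[ncpu:]))
    return rows


def busiest_irqs(before, after, cpu, top=5):
    """Return (delta, label, description) for the IRQs that grew most on one CPU."""
    deltas = []
    for label, (counts, desc) in after.items():
        old_counts = before.get(label, ([0] * len(counts), ""))[0]
        deltas.append((counts[cpu] - old_counts[cpu], label, desc))
    deltas.sort(reverse=True)
    return deltas[:top]
```

To use it, read /proc/interrupts, wait a few seconds, read it again, and call `busiest_irqs(first, second, cpu=1)`. An IRQ whose delta on CPU 1 is far larger than the rest would be the first thing to look at.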
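Also, to be honest, I don't think we can debug this with the traditional tools such as top and ps. By any measure, this calls for profiling. Let me know if you are able to do that. If you do, you will need to mail the results to me, since there is only so much data you can post here.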