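I have RHEL 5.5 installed on VMware with 64 GB of RAM. When we unzip files (part of a daily process), the CPU load sometimes spikes, leaving the server essentially unresponsive. The disks are thick provisioned, so they do not need to grow dynamically.
I have no access to the underlying hardware, but I need to find out what is causing this, because these database servers really should not be unresponsive for 30 minutes at a time.
Here is the load during the unzip (note that the cron jobs evidently backed up between 20:56 and 21:25). The unzip started at 20:51 and finished at 21:02: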
top - 20:50:02 up 4 days, 21:26, 0 users, load average: 0.62, 0.62, 0.75
top - 20:52:01 up 4 days, 21:28, 0 users, load average: 4.16, 1.48, 1.02
top - 20:54:02 up 4 days, 21:30, 0 users, load average: 11.28, 5.13, 2.41
top - 20:56:07 up 4 days, 21:33, 0 users, load average: 14.44, 8.57, 4.02
top - 21:25:29 up 4 days, 22:02, 0 users, load average: 154.01, 139.20, 99.18
top - 21:25:30 up 4 days, 22:02, 0 users, load average: 154.01, 139.20, 99.18
top - 21:25:30 up 4 days, 22:02, 0 users, load average: 154.01, 139.20, 99.18
top - 21:25:35 up 4 days, 22:02, 0 users, load average: 159.38, 140.56, 99.84
top - 21:25:36 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:36 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:37 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:37 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:35 up 4 days, 22:02, 0 users, load average: 159.38, 140.56, 99.84
top - 21:25:39 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:39 up 4 days, 22:02, 0 users, load average: 163.67, 141.76, 100.44
top - 21:25:41 up 4 days, 22:02, 0 users, load average: 171.47, 143.74, 101.31
top - 21:25:42 up 4 days, 22:02, 0 users, load average: 171.47, 143.74, 101.31
top - 21:26:02 up 4 days, 22:02, 0 users, load average: 137.27, 137.85, 100.28
top - 21:28:01 up 4 days, 22:04, 0 users, load average: 21.04, 94.49, 89.05
top - 21:30:02 up 4 days, 22:06, 1 user, load average: 3.09, 63.30, 78.28
top - 21:32:01 up 4 days, 22:08, 1 user, load average: 2.17, 43.05, 69.04
top - 21:34:02 up 4 days, 22:10, 1 user, load average: 0.79, 29.08, 60.77
top - 21:36:02 up 4 days, 22:12, 1 user, load average: 0.90, 19.76, 53.51
top - 21:38:01 up 4 days, 22:14, 1 user, load average: 0.71, 13.45, 47.10
cat /proc/interrupts shows the following:
           CPU0       CPU1       CPU2       CPU3
  0:  469063418          0          0          0  IO-APIC-edge   timer
  1:        131         64          0          0  IO-APIC-edge   i8042
  6:          5          0          0          0  IO-APIC-edge   floppy
  7:          0          0          0          0  IO-APIC-edge   parport0
  8:          1          0          0          0  IO-APIC-edge   rtc
  9:          0          0          0          0  IO-APIC-level  acpi
 12:        428          8          0        140  IO-APIC-edge   i8042
 15:    3577805     630018       6354        720  IO-APIC-edge   ide1
 51:    3158220     744800      30127      12814  IO-APIC-level  ioc0
 67:   20134707    6847632          0    5355226  PCI-MSI        eth0
 83:          0          0          0          0  PCI-MSI        vmci
NMI:          0          0          0          0
LOC:  469060990  469075818  469075515  469075350
ERR:          0
MIS:          0
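Answer 1
All else being equal, compression and decompression (like encryption and decryption) are CPU-intensive operations. You are seeing CPU spikes because the operation you are running puts a heavy load on the CPU.
If this makes your server unresponsive, you need to start thinking about CPU priority or a cap on CPU utilization while this task runs, and lower the priority of the unzip process.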
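One way to lower the priority of the unzip process is the standard nice utility. The following is only a minimal sketch: the wrapper name run_low_priority and the archive and target paths are hypothetical, and it assumes the daily job is started from a shell script. It changes CPU scheduling priority only, so it will not help if the real bottleneck turns out to be something other than CPU.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command at the lowest CPU scheduling priority.
run_low_priority() {
    # "nice -n 19" raises the niceness by 19, so other processes
    # (for example the database) are preferred by the CPU scheduler.
    nice -n 19 "$@"
}

# Example usage (paths are placeholders, not taken from the question):
# run_low_priority unzip -q /path/to/archive.zip -d /path/to/target
```

Running "run_low_priority nice" prints the niceness the wrapped command actually receives, which is a quick way to confirm the wrapper is in effect before pointing it at the real job.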