如何在服务器崩溃前知道内存是否用完

如何在服务器崩溃前知道内存是否用完

我有一台服务器,它总是崩溃。我知道服务器崩溃的原因有很多。但如果原因是系统在崩溃前内存不足,我该如何确认这是原因?我应该查看哪些日志文件?我应该查找哪些行/错误消息?我正在运行 CentOS。大量使用 php 解析 xml 文件,最多超过 2 GB。该服务器有 16GB 内存。

编辑1

[root@61540 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         16035       1526      14509          0         40       1002
-/+ buffers/cache:        483      15552
Swap:         8197          0       8197

编辑 2 /var/log/messages

Feb 17 20:38:26 61540 syslogd 1.4.1: restart.
Feb 17 20:38:26 61540 proftpd[3896]: 66.90.101.85 - received SIGHUP -- master server reparsing configuration file
Feb 17 22:23:06 61540 avahi-daemon[3984]: recvmsg(): Resource temporarily unavailable
Feb 17 23:07:37 61540 proftpd[10620] - (Several lines of ftp session)
Feb 18 23:03:48 61540 syslogd 1.4.1: restart.
Feb 18 23:03:48 61540 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 18 23:03:48 61540 kernel: Linux version 2.6.18-308.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Tue Feb 21 20:06:06 EST 2012
Feb 18 23:03:48 61540 kernel: Command line: ro root=LABEL=/
Feb 18 23:03:48 61540 kernel: BIOS-provided physical RAM map:
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000000010000 - 000000000009a000 (usable)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000000100000 - 00000000cfda0000 (usable)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfda0000 - 00000000cfdd1000 (ACPI NVS)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfdd1000 - 00000000cfe00000 (ACPI data)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfe00000 - 00000000cff00000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000100000000 - 000000042f000000 (usable)
Feb 18 23:03:48 61540 kernel: DMI 2.4 present.
Feb 18 23:03:48 61540 kernel: No NUMA configuration found
Feb 18 23:03:48 61540 kernel: Faking a node at 0000000000000000-000000042f000000
Feb 18 23:03:48 61540 kernel: Bootmem setup node 0 0000000000000000-000000042f000000
Feb 18 23:03:48 61540 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Feb 18 23:03:48 61540 kernel: disabling kdump
Feb 18 23:03:48 61540 kernel: ACPI: PM-Timer IO Port: 0x808
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Feb 18 23:03:48 61540 kernel: Processor #0 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Feb 18 23:03:48 61540 kernel: Processor #1 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Feb 18 23:03:48 61540 kernel: Processor #2 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Feb 18 23:03:48 61540 kernel: Processor #3 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
Feb 18 23:03:48 61540 kernel: Processor #4 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled)
Feb 18 23:03:48 61540 kernel: Processor #5 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
Feb 18 23:03:48 61540 kernel: Processor #6 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled)
Feb 18 23:03:48 61540 kernel: Processor #7 5:1 APIC version 16

答案1

您应该检查/var/log/messagesdmesg命令在这种情况下将无用,因为它仅显示内核消息自上次启动以来

“内存不足”通常不足以使 Linux 完全崩溃。当内存不足时,Linux 将开始终止进程​​(OOM 杀手)所以你可能会寻找一些内核恐慌。如果您正在使用less阅读日志,您可以按键进行搜索/

但最重要的是:您应该首先阅读 /var/log/messages。它是按时间排序的,因此很容易找到服务器上次启动的时间。检查在此之前发生了什么,导致您的服务器崩溃。

答案2

如果 Linux 内存不足,它通常会启动 OOM killer(内存不足)。这是一个内核进程,它会终止其他进程以释放内存。如果发生这种情况,您在输入时应该会看到相应的日志dmesg

试试这个:dmesg | grep -i oom。如果没有输出,OOM 杀手可能没有杀死你的进程。

相关内容