What caused my CentOS 6.4 machine to reboot on its own?

I noticed that the MySQL daemon running on my CentOS 6.4 machine had suddenly stopped. I checked the MySQL log but didn't see anything relevant:

121229 22:17:45 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
121229 22:17:50 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
121229 22:17:50  InnoDB: Initializing buffer pool, size = 8.0M
121229 22:17:50  InnoDB: Completed initialization of buffer pool
121229 22:17:50  InnoDB: Started; log sequence number 0 206087326
121229 22:17:50 [Note] Event Scheduler: Loaded 0 events
121229 22:17:50 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.66-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
130205 11:09:32 [Note] /usr/libexec/mysqld: Normal shutdown

130205 11:09:32 [Note] Event Scheduler: Purging the queue. 0 events
130205 11:09:34  InnoDB: Starting shutdown...
130205 11:09:36  InnoDB: Shutdown completed; log sequence number 0 529664030
130205 11:09:36 [Note] /usr/libexec/mysqld: Shutdown complete

130205 11:09:36 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
130205 11:09:37 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130205 11:09:37  InnoDB: Initializing buffer pool, size = 8.0M
130205 11:09:37  InnoDB: Completed initialization of buffer pool
130205 11:09:37  InnoDB: Started; log sequence number 0 529664030
130205 11:09:37 [Note] Event Scheduler: Loaded 0 events
130205 11:09:37 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.67-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
130310 11:33:12 [Note] /usr/libexec/mysqld: Normal shutdown

130310 11:33:12 [Note] Event Scheduler: Purging the queue. 0 events
130310 11:33:14  InnoDB: Starting shutdown...
130310 11:33:16  InnoDB: Shutdown completed; log sequence number 0 788753738
130310 11:33:16 [Note] /usr/libexec/mysqld: Shutdown complete

130310 11:33:16 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
130310 11:36:03 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130310 11:36:03  InnoDB: Initializing buffer pool, size = 8.0M
130310 11:36:03  InnoDB: Completed initialization of buffer pool
130310 11:36:04  InnoDB: Started; log sequence number 0 788753738
130310 11:36:04 [Note] Event Scheduler: Loaded 0 events
130310 11:36:04 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.67-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
130413 20:56:55 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130413 20:56:56  InnoDB: Initializing buffer pool, size = 8.0M
130413 20:56:56  InnoDB: Completed initialization of buffer pool
InnoDB: Log scan progressed past the checkpoint lsn 0 1139894636
130413 20:56:56  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 0 1139895853
130413 20:56:56  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 335782050, file name ./mysql-bin.000003
130413 20:56:57  InnoDB: Started; log sequence number 0 1139895853
130413 20:56:57 [Note] Recovering after a crash using mysql-bin
130413 20:56:59 [ERROR] Error in Log_event::read_log_event(): 'read error', data_len: 809, event_type: 2
130413 20:56:59 [Note] Starting crash recovery...
130413 20:56:59 [Note] Crash recovery finished.
130413 20:56:59 [Note] Event Scheduler: Loaded 0 events
130413 20:56:59 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.67-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution

I then checked /var/log/messages and found that the system had rebooted for some reason:

Apr  7 03:48:03 localhost rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1335" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Apr 13 17:19:07 localhost kernel: imklog 5.8.10, log source = /proc/kmsg started.
Apr 13 17:19:07 localhost rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1370" x-info="http://www.rsyslog.com"] start
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys cpuset
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys cpu
Apr 13 17:19:07 localhost kernel: Linux version 2.6.32-358.2.1.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Mar 13 00:26:49 UTC 2013
Apr 13 17:19:07 localhost kernel: Command line: ro root=/dev/mapper/VolGroup-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=VolGroup/lv_swap SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=VolGroup/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
Apr 13 17:19:07 localhost kernel: KERNEL supported cpus:
Apr 13 17:19:07 localhost kernel:  Intel GenuineIntel
Apr 13 17:19:07 localhost kernel:  AMD AuthenticAMD
Apr 13 17:19:07 localhost kernel:  Centaur CentaurHauls
Apr 13 17:19:07 localhost kernel: BIOS-provided physical RAM map:
Apr 13 17:19:07 localhost kernel: BIOS-e820: 0000000000000000 - 000000000009b000 (usable)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000000009b000 - 00000000000a0000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 0000000000100000 - 000000008bf64000 (usable)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008bf64000 - 000000008c051000 (ACPI NVS)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008c051000 - 000000008c13d000 (ACPI data)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008c13d000 - 000000008d53d000 (ACPI NVS)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008d53d000 - 000000008f602000 (ACPI data)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f602000 - 000000008f64f000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f64f000 - 000000008f6e4000 (ACPI data)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f6e4000 - 000000008f6ef000 (ACPI NVS)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f6ef000 - 000000008f6f1000 (ACPI data)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f6f1000 - 000000008f7cf000 (ACPI NVS)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f7cf000 - 000000008f800000 (ACPI data)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 000000008f800000 - 0000000090000000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 00000000a0000000 - 00000000b0000000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 00000000fc000000 - 00000000fd000000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 00000000fed1c000 - 00000000fed20000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
Apr 13 17:19:07 localhost kernel: BIOS-e820: 0000000100000000 - 0000000270000000 (usable)
Apr 13 17:19:07 localhost kernel: DMI 2.5 present.
Apr 13 17:19:07 localhost kernel: SMBIOS version 2.5 @ 0xF0440
Apr 13 17:19:07 localhost kernel: last_pfn = 0x270000 max_arch_pfn = 0x400000000
Apr 13 17:19:07 localhost kernel: x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
Apr 13 17:19:07 localhost kernel: last_pfn = 0x8bf64 max_arch_pfn = 0x400000000
Apr 13 17:19:07 localhost kernel: Using GB pages for direct mapping
Apr 13 17:19:07 localhost kernel: init_memory_mapping: 0000000000000000-000000008bf64000
Apr 13 17:19:07 localhost kernel: init_memory_mapping: 0000000100000000-0000000270000000
Apr 13 17:19:07 localhost kernel: RAMDISK: 3717b000 - 37fef73a
Apr 13 17:19:07 localhost kernel: ACPI: RSDP 00000000000f0410 00024 (v02 Cisco0)
Apr 13 17:19:07 localhost kernel: ACPI: XSDT 000000008f7fe120 0009C (v01 Cisco0 CiscoUCS 00000000      01000013)
Apr 13 17:19:07 localhost kernel: ACPI: FACP 000000008f7fc000 000F4 (v04 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: DSDT 000000008f7f6000 05DBE (v02 Cisco0 CiscoUCS 00000003 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: FACS 000000008f6f1000 00040
Apr 13 17:19:07 localhost kernel: ACPI: APIC 000000008f7f5000 001A8 (v02 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: MCFG 000000008f7f4000 0003C (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: HPET 000000008f7f3000 00038 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: SLIT 000000008f7f2000 00030 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: SPCR 000000008f7f1000 00050 (v01 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: WDDT 000000008f7f0000 00040 (v01 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: ACPI: SSDT 000000008f7d5000 1AFC4 (v02  Cisco SSDT  PM 00004000 INTL 20090730)
Apr 13 17:19:07 localhost kernel: ACPI: SSDT 000000008f7d4000 001D8 (v02  Cisco IPMI     00004000 INTL 20090730)
Apr 13 17:19:07 localhost kernel: ACPI: SSDT 000000008f7d3000 00962 (v02 CISCO  PMETER   00004000 INTL 20090730)
Apr 13 17:19:07 localhost kernel: ACPI: HEST 000000008f7d1000 000A8 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
Apr 13 17:19:07 localhost kernel: ACPI: BERT 000000008f7d0000 00030 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
Apr 13 17:19:07 localhost kernel: ACPI: ERST 000000008f7cf000 00230 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
Apr 13 17:19:07 localhost kernel: ACPI: EINJ 000000008f6f0000 00130 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
Apr 13 17:19:07 localhost kernel: ACPI: DMAR 000000008f6ef000 001A8 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
Apr 13 17:19:07 localhost kernel: Setting APIC routing to flat.
Apr 13 17:19:07 localhost kernel: No NUMA configuration found
Apr 13 17:19:07 localhost kernel: Faking a node at 0000000000000000-0000000270000000
Apr 13 17:19:07 localhost kernel: Bootmem setup node 0 0000000000000000-0000000270000000
Apr 13 17:19:07 localhost kernel:  NODE_DATA [000000000000b000 - 000000000003efff]
Apr 13 17:19:07 localhost kernel:  bootmap [000000000003f000 -  000000000008cfff] pages 4e
Apr 13 17:19:07 localhost kernel: (9 early reservations) ==> bootmem [0000000000 - 0270000000]
Apr 13 17:19:07 localhost kernel:  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
Apr 13 17:19:07 localhost kernel:  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
Apr 13 17:19:07 localhost kernel:  #2 [0001000000 - 000201b0a4]    TEXT DATA BSS ==> [0001000000 - 000201b0a4]
Apr 13 17:19:07 localhost kernel:  #3 [003717b000 - 0037fef73a]          RAMDISK ==> [003717b000 - 0037fef73a]
Apr 13 17:19:07 localhost kernel:  #4 [000009b000 - 0000100000]    BIOS reserved ==> [000009b000 - 0000100000]
Apr 13 17:19:07 localhost kernel:  #5 [000201c000 - 000201c2f8]              BRK ==> [000201c000 - 000201c2f8]
Apr 13 17:19:07 localhost kernel:  #6 [0000008000 - 000000a000]          PGTABLE ==> [0000008000 - 000000a000]
Apr 13 17:19:07 localhost kernel:  #7 [000000a000 - 000000b000]          PGTABLE ==> [000000a000 - 000000b000]
Apr 13 17:19:07 localhost kernel:  #8 [0000001000 - 0000001030]        ACPI SLIT ==> [0000001000 - 0000001030]
Apr 13 17:19:07 localhost kernel: found SMP MP-table at [ffff8800000fc640] fc640
Apr 13 17:19:07 localhost kernel: Reserving 129MB of memory at 48MB for crashkernel (System RAM: 9984MB)
Apr 13 17:19:07 localhost kernel: Zone PFN ranges:
Apr 13 17:19:07 localhost kernel:  DMA      0x00000001 -> 0x00001000
Apr 13 17:19:07 localhost kernel:  DMA32    0x00001000 -> 0x00100000
Apr 13 17:19:07 localhost kernel:  Normal   0x00100000 -> 0x00270000
Apr 13 17:19:07 localhost kernel: Movable zone start PFN for each node
Apr 13 17:19:07 localhost kernel: early_node_map[3] active PFN ranges
Apr 13 17:19:07 localhost kernel:    0: 0x00000001 -> 0x0000009b
Apr 13 17:19:07 localhost kernel:    0: 0x00000100 -> 0x0008bf64
Apr 13 17:19:07 localhost kernel:    0: 0x00100000 -> 0x00270000
Apr 13 17:19:07 localhost kernel: ACPI: PM-Timer IO Port: 0x408
Apr 13 17:19:07 localhost kernel: Setting APIC routing to flat.
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x12] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x14] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x01] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x03] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x13] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x15] enabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x09] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0a] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0b] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0c] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0d] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0e] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x0f] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x10] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x11] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x12] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x13] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x14] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x15] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x16] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC (acpi_id[0x17] lapic_id[0xff] disabled)
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x00] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x01] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x02] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x03] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x04] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x05] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x06] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x07] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x08] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x09] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0a] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0b] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0c] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0d] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0e] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x0f] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x10] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x11] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x12] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x13] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x14] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x15] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x16] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: LAPIC_NMI (acpi_id[0x17] high level lint[0x1])
Apr 13 17:19:07 localhost kernel: ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
Apr 13 17:19:07 localhost kernel: IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
Apr 13 17:19:07 localhost kernel: ACPI: IOAPIC (id[0x09] address[0xfec90000] gsi_base[24])
Apr 13 17:19:07 localhost kernel: IOAPIC[1]: apic_id 9, version 32, address 0xfec90000, GSI 24-47
Apr 13 17:19:07 localhost kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Apr 13 17:19:07 localhost kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Apr 13 17:19:07 localhost kernel: Using ACPI (MADT) for SMP configuration information
Apr 13 17:19:07 localhost kernel: ACPI: HPET id: 0x8086a401 base: 0xfed00000
Apr 13 17:19:07 localhost kernel: SMP: Allowing 24 CPUs, 16 hotplug CPUs
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000000009b000 - 00000000000a0000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008bf64000 - 000000008c051000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008c051000 - 000000008c13d000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008c13d000 - 000000008d53d000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008d53d000 - 000000008f602000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f602000 - 000000008f64f000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f64f000 - 000000008f6e4000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f6e4000 - 000000008f6ef000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f6ef000 - 000000008f6f1000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f6f1000 - 000000008f7cf000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f7cf000 - 000000008f800000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 000000008f800000 - 0000000090000000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 0000000090000000 - 00000000a0000000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000a0000000 - 00000000b0000000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000b0000000 - 00000000fc000000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000fc000000 - 00000000fd000000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000fd000000 - 00000000fed1c000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000fed1c000 - 00000000fed20000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000fed20000 - 00000000ff800000
Apr 13 17:19:07 localhost kernel: PM: Registered nosave memory: 00000000ff800000 - 0000000100000000
Apr 13 17:19:07 localhost kernel: Allocating PCI resources starting at b0000000 (gap: b0000000:4c000000)
Apr 13 17:19:07 localhost kernel: Booting paravirtualized kernel on bare hardware
Apr 13 17:19:07 localhost kernel: NR_CPUS:4096 nr_cpumask_bits:24 nr_cpu_ids:24 nr_node_ids:1
Apr 13 17:19:07 localhost kernel: PERCPU: Embedded 31 pages/cpu @ffff88002f800000 s94552 r8192 d24232 u131072
Apr 13 17:19:07 localhost kernel: pcpu-alloc: s94552 r8192 d24232 u131072 alloc=1*2097152
Apr 13 17:19:07 localhost kernel: pcpu-alloc: [0] 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 
Apr 13 17:19:07 localhost kernel: pcpu-alloc: [0] 16 17 18 19 20 21 22 23 -- -- -- -- -- -- -- -- 
Apr 13 17:19:07 localhost kernel: Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 2045459
Apr 13 17:19:07 localhost kernel: Policy zone: Normal
Apr 13 17:19:07 localhost kernel: Kernel command line: ro root=/dev/mapper/VolGroup-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=VolGroup/lv_swap SYSFONT=latarcyrheb-sun16 crashkernel=129M@0M rd_LVM_LV=VolGroup/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
Apr 13 17:19:07 localhost kernel: PID hash table entries: 4096 (order: 3, 32768 bytes)
Apr 13 17:19:07 localhost kernel: Checking aperture...
Apr 13 17:19:07 localhost kernel: No AGP bridge found
Apr 13 17:19:07 localhost kernel: PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Apr 13 17:19:07 localhost kernel: Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
Apr 13 17:19:07 localhost kernel: software IO TLB at phys 0x20000000 - 0x24000000
Apr 13 17:19:07 localhost kernel: Memory: 7972340k/10223616k available (5221k kernel code, 1901576k absent, 349700k reserved, 7121k data, 1264k init)
Apr 13 17:19:07 localhost kernel: Hierarchical RCU implementation.
Apr 13 17:19:07 localhost kernel: NR_IRQS:33024 nr_irqs:1008
Apr 13 17:19:07 localhost kernel: Extended CMOS year: 2000
Apr 13 17:19:07 localhost kernel: Console: colour VGA+ 80x25
Apr 13 17:19:07 localhost kernel: console [tty0] enabled
Apr 13 17:19:07 localhost kernel: allocated 33554432 bytes of page_cgroup
Apr 13 17:19:07 localhost kernel: please try 'cgroup_disable=memory' option if you don't want memory cgroups
Apr 13 17:19:07 localhost kernel: Fast TSC calibration using PIT
Apr 13 17:19:07 localhost kernel: Detected 2666.901 MHz processor.
Apr 13 17:19:07 localhost kernel: Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.80 BogoMIPS (lpj=2666901)
Apr 13 17:19:07 localhost kernel: pid_max: default: 32768 minimum: 301
Apr 13 17:19:07 localhost kernel: Security Framework initialized
Apr 13 17:19:07 localhost kernel: SELinux:  Initializing.
Apr 13 17:19:07 localhost kernel: Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Apr 13 17:19:07 localhost kernel: Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Apr 13 17:19:07 localhost kernel: Mount-cache hash table entries: 256
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys ns
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys cpuacct
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys memory
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys devices
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys freezer
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys net_cls
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys blkio
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys perf_event
Apr 13 17:19:07 localhost kernel: Initializing cgroup subsys net_prio
Apr 13 17:19:07 localhost kernel: CPU: Physical Processor ID: 0
Apr 13 17:19:07 localhost kernel: CPU: Processor Core ID: 0
Apr 13 17:19:07 localhost kernel: mce: CPU supports 9 MCE banks
Apr 13 17:19:07 localhost kernel: CPU0: Thermal monitoring enabled (TM1)
Apr 13 17:19:07 localhost kernel: using mwait in idle threads.
Apr 13 17:19:07 localhost kernel: ACPI: Core revision 20090903
Apr 13 17:19:07 localhost kernel: ftrace: converting mcount calls to 0f 1f 44 00 00
Apr 13 17:19:07 localhost kernel: ftrace: allocating 21430 entries in 85 pages
Apr 13 17:19:07 localhost kernel: dmar: Host address width 40
Apr 13 17:19:07 localhost kernel: dmar: DRHD base: 0x000000fe710000 flags: 0x1
Apr 13 17:19:07 localhost kernel: dmar: IOMMU 0: reg_base_addr fe710000 ver 1:0 cap c90780106f0462 ecap f020fe
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f62f000 end: 0x0000008f631fff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f61a000 end: 0x0000008f61afff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f617000 end: 0x0000008f617fff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f614000 end: 0x0000008f614fff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f611000 end: 0x0000008f611fff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f60e000 end: 0x0000008f60efff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f60b000 end: 0x0000008f60bfff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f608000 end: 0x0000008f608fff
Apr 13 17:19:07 localhost kernel: dmar: RMRR base: 0x0000008f605000 end: 0x0000008f605fff
Apr 13 17:19:07 localhost kernel: dmar: No ATSR found
Apr 13 17:19:07 localhost kernel: IOAPIC id 8 under DRHD base 0xfe710000
Apr 13 17:19:07 localhost kernel: IOAPIC id 9 under DRHD base 0xfe710000
Apr 13 17:19:07 localhost kernel: Enabled IRQ remapping in xapic mode
Apr 13 17:19:07 localhost kernel: Setting APIC routing to physical flat
Apr 13 17:19:07 localhost kernel: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
Apr 13 17:19:07 localhost kernel: CPU0: Intel(R) Xeon(R) CPU           E5640  @ 2.67GHz stepping 02
Apr 13 17:19:07 localhost kernel: Performance Events: PEBS fmt1+, Westmere events, Intel PMU driver.
Apr 13 17:19:07 localhost kernel: CPUID marked event: 'bus cycles' unavailable
Apr 13 17:19:07 localhost kernel: ... version:                3
Apr 13 17:19:07 localhost kernel: ... bit width:              48
Apr 13 17:19:07 localhost kernel: ... generic registers:      4
Apr 13 17:19:07 localhost kernel: ... value mask:             0000ffffffffffff
Apr 13 17:19:07 localhost kernel: ... max period:             000000007fffffff
Apr 13 17:19:07 localhost kernel: ... fixed-purpose events:   3
Apr 13 17:19:07 localhost kernel: ... event mask:             000000070000000f
Apr 13 17:19:07 localhost kernel: NMI watchdog enabled, takes one hw-pmu counter.
Apr 13 17:19:07 localhost kernel: Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7
Apr 13 17:19:07 localhost kernel: Brought up 8 CPUs
Apr 13 17:19:07 localhost kernel: Total of 8 processors activated (42670.41 BogoMIPS).
Apr 13 17:19:07 localhost kernel: devtmpfs: initialized
Apr 13 17:19:07 localhost kernel: PM: Registering ACPI NVS region at 8bf64000 (970752 bytes)
Apr 13 17:19:07 localhost kernel: PM: Registering ACPI NVS region at 8c13d000 (20971520 bytes)
Apr 13 17:19:07 localhost kernel: PM: Registering ACPI NVS region at 8f6e4000 (45056 bytes)
Apr 13 17:19:07 localhost kernel: PM: Registering ACPI NVS region at 8f6f1000 (909312 bytes)
Apr 13 17:19:07 localhost kernel: regulator: core version 0.5
Apr 13 17:19:07 localhost kernel: NET: Registered protocol family 16
Apr 13 17:19:07 localhost kernel: ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
Apr 13 17:19:07 localhost kernel: ACPI: bus type pci registered
Apr 13 17:19:07 localhost kernel: PCI: MCFG configuration 0: base a0000000 segment 0 buses 0 - 255
Apr 13 17:19:07 localhost kernel: PCI: MCFG area at a0000000 reserved in E820
Apr 13 17:19:07 localhost kernel: PCI: Using MMCONFIG at a0000000 - afffffff
Apr 13 17:19:07 localhost kernel: PCI: Using configuration type 1 for base access
Apr 13 17:19:07 localhost kernel: bio: create slab <bio-0> at 0
Apr 13 17:19:07 localhost kernel: ACPI: Interpreter enabled
Apr 13 17:19:07 localhost kernel: ACPI: (supports S0 S5)

How can I determine what caused the system to reboot?

Answer 1

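When Linux crashes, it crashes hard, and it rarely leaves behind a usable core that would do you any good. For a system like this, there are a few things you should do: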

  1. Beef up the syslog output. Your system will be better off using rsyslogd. Ultimately you want a rule such as:

    *.* /var/log/debug
    

    Make sure the kernel log module is enabled in rsyslogd. I have seen some installations with it commented out. It is:

    $modLoad imklog
    

    Make sure this file is rotated daily. You only need to keep 2 rotations. In /etc/logrotate.d/syslog:

    /var/log/debug
    {
      compress
      daily
      rotate 2
      sharedscripts
      postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
      endscript
    }
    

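    I also like to have the mark facility output every 5 minutes, but that is more useful on a network and when you are not doing any other monitoring.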

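  2. I created a script that captures the ps table. It runs every minute from cron. It is very good at identifying which processes have memory leaks and the like. The command I use for this is more or less: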

    /bin/ps -A --sort tty,comm,pid -ww -o pgrp:8,tty:7,pid,c,pmem:5,rss:8,sz:8,size:8=TSIZE,vsz:8,nlwp,lstart,args
    

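    I pipe this into a perl script that strips out the "useless" output, such as user shells and kernel threads. Leave a comment if you are interested in the full script. These files can grow to 12 MB a day, but daily compression reduces them to a few hundred kB.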

  3. I consider the sysstat package essential. Here is how I have it configured (/etc/cron.d/sysstat):

    */10 * * * * root /usr/lib64/sa/sa1 10 60
    58  23 * * * root rm -f /var/lib/sa/sa$(date +\%d --date=tomorrow)
    

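    That is, capture system data every 10 seconds, in runs of 10 minutes. Feel free to lower the sampling rate on a heavily loaded system. Also note that this takes a lot of disk space. To prevent sar file corruption, the file that tomorrow's data will be written to, which still holds last month's data for that day, is deleted shortly before midnight. (However, if the month ends on the 30th, the previous month's file for the 31st will stick around. Shrug.)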
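
The per-minute process snapshot from step 2 might be wrapped along these lines. This is only a sketch under assumptions: the script path, the `SNAP_DIR` directory, and the `awk` filter (standing in for the perl script, which is not shown in this answer) are all hypothetical. The `-o` list is split into several `-o` options because the ps man page warns that a custom header such as `=TSIZE` in the middle of a single list can be parsed differently depending on the personality in effect.

```shell
#!/bin/bash
# Hypothetical output directory; not part of the original answer.
SNAP_DIR="${SNAP_DIR:-/var/log/ps-snapshots}"

# Drop kernel threads and login shells from ps output read on stdin.
# Columns: pgrp tty pid c pmem rss sz size vsz nlwp (10 fields),
# lstart (5 fields), so the command line starts at field 16.
filter_snapshot() {
    awk '
        NF == 16 && $16 ~ /^\[.*\]$/ { next }   # kernel thread, e.g. [kthreadd]
        $16 ~ /^-(ba|z|k|c|tc)?sh$/  { next }   # login shell, e.g. -bash
        { print }
    '
}

# Append one timestamped, filtered snapshot to today's file.
take_snapshot() {
    mkdir -p "$SNAP_DIR" || return 1
    {
        date '+=== %F %T ==='
        /bin/ps -A --sort tty,comm,pid -ww \
            -o pgrp:8,tty:7,pid,c,pmem:5,rss:8,sz:8 \
            -o size:8=TSIZE \
            -o vsz:8,nlwp,lstart,args | filter_snapshot
    } >> "$SNAP_DIR/ps-$(date +%F).log"
}

# Only take a snapshot when called as: ps-snapshot.sh run
if [ "${1:-}" = "run" ]; then
    take_snapshot
fi
```

A cron entry such as `* * * * * root /usr/local/sbin/ps-snapshot.sh run` (path hypothetical) would then append one snapshot per minute, and the daily files could be compressed by logrotate in the same way as the debug log.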

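Put these three things in place on your system and, if you are unlucky again, you will have plenty of data to work with. Some of it may even turn out to be useful.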
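
Once the data is being collected, the window just before an unexpected reboot can be pulled back out of the sar files. A minimal sketch, assuming the /var/lib/sa location used by the cron job above and GNU date; the function names are made up for illustration.

```shell
#!/bin/bash
# Directory the sysstat cron job writes to (see /etc/cron.d/sysstat above).
SA_DIR="${SA_DIR:-/var/lib/sa}"

# Print the sar data file covering a given date (any form GNU date -d accepts).
sa_file_for() {
    local day
    day="$(date -d "$1" +%d)" || return 1
    printf '%s/sa%s\n' "$SA_DIR" "$day"
}

# Show memory (-r), run queue and load (-q) and CPU (-u) for a time window.
# Usage: sar_window DATE START END, times given as HH:MM:SS
sar_window() {
    local file report
    file="$(sa_file_for "$1")" || return 1
    for report in -r -q -u; do
        sar "$report" -f "$file" -s "$2" -e "$3"
    done
}
```

For a reboot like the one logged at 17:19:07 on Apr 13 in the question, something like `sar_window 2013-04-13 17:00:00 17:20:00` would show whether memory, load, or CPU usage was climbing in the minutes before the machine went down.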

Answer 2

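At this point you seem to have exhausted all the definitive sources for answering this question. I would say your best bet now is to try to reproduce the problem. That approach won't tell you exactly what happened, but it will verify whether you still have good hardware (which matters more than the "why" anyway). At a minimum I would run a full memory test and a drive test.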

Here is a good test script that exercises not only your memory but also your hardware's DMA access (which could well be the problem): http://people.redhat.com/dledford/memtest.shtml

I would definitely also check your hard drives for errors; even the video and sound hardware can be tested to be sure.
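
One possible way to do the drive check is with SMART. This is a sketch that assumes the smartmontools package is installed, which this answer does not mention; the device name and the helper functions are illustrative. Drives behind a hardware RAID controller usually need an extra `-d` device-type option for smartctl.

```shell
#!/bin/bash
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Basic SMART checks for one drive, e.g. check_disk /dev/sda
check_disk() {
    local dev="$1"
    run smartctl -H "$dev"          # overall health self-assessment
    run smartctl -l error "$dev"    # errors the drive itself has logged
    run smartctl -t long "$dev"     # start an extended self-test in the background
}
```

The extended self-test runs inside the drive and can take hours; its result can be read afterwards with `smartctl -l selftest /dev/sda`.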
