I got a disturbing crash while running "apachectl stop". It is a stock system:
$ uname -a
Linux www.example.com 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -d
Description: Ubuntu 14.04 LTS
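Disk, memory, and CPU all have plenty of spare capacity. This is an Amazon EC2 cloud instance, running today, May 7, 2014, at 1 PM, in zone us-east-1a: a medium instance with 3.7 GB of memory and 2 CPUs. Other instances in the same VPC and the same zone are running fine.
I have read elsewhere that with today's kernels a crash like this does not happen unless the hardware is failing. Surely Amazon's cloud hardware is unlikely to be failing? Or am I being too optimistic?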
Anyway, here is the dump from dmesg (the system keeps running, serving web pages and talking to the database, but new processes such as ps and ssh hang immediately):
[27917995.400499] general protection fault: 0000 [#1] SMP
[27917995.400515] Modules linked in: isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd
[27917995.400537] CPU: 0 PID: 1672 Comm: apache2 Not tainted 3.13.0-24-generic #46-Ubuntu
[27917995.400545] task: ffff8800020117f0 ti: ffff88005f012000 task.ti: ffff88005f012000
[27917995.400551] RIP: e030:[] [] devpts_kill_index+0x13/0x60
[27917995.400564] RSP: e02b:ffff88005f013d58 EFLAGS: 00010286
[27917995.400568] RAX: dc73af5e3df7dcab RBX: ffff880003f30400 RCX: 0000000181000079
[27917995.400574] RDX: 00000000ffffffff RSI: 0000000000000002 RDI: ffff8800aab76ff8
[27917995.400579] RBP: ffff88005f013d68 R08: 0000000000000000 R09: 0000000000000001
[27917995.400583] R10: ffffea0003a01180 R11: ffffffff8144a320 R12: 0000000000000002
[27917995.400588] R13: ffff8800e87a8001 R14: 0000000000000002 R15: 0000000000000001
[27917995.400598] FS: 00007f8d8b320780(0000) GS:ffff8800ef600000(0000) knlGS:00000000000000000
[27917995.400605] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[27917995.400610] CR2: 00007f8d79aea7e0 CR3: 0000000001c0e000 CR4: 0000000000002660
[27917995.400616] Stack:
[27917995.400619] ffff880003f30400 ffff880003f30800 ffff88005f013d78 ffffffff8144caa8
[27917995.400628] ffff88005f013d90 ffffffff81440e47 ffff880003f30400 ffff88005f013e38
[27917995.400636] ffffffff81443159 ffff880003f30610 ffff880003f30628 ffff880003f30630
[27917995.400645] Call Trace:
[27917995.400656] [] pty_unix98_shutdown+0x18/0x20
[27917995.400662] [] release_tty+0x37/0x140
[27917995.400668] [] tty_release+0x4b9/0x600
[27917995.400678] [] __fput+0xe4/0x260
[27917995.400684] [] ____fput+0xe/0x10
[27917995.400693] [] task_work_run+0xc4/0xe0
[27917995.400701] [] do_exit+0x2ab/0xa50
[27917995.400708] [] ? vtime_account_user+0x54/0x60
[27917995.400717] [] ? context_tracking_user_exit+0x4f/0xc0
[27917995.400723] [] do_group_exit+0x3f/0xa0
[27917995.400729] [] SyS_exit_group+0x14/0x20
[27917995.400738] [] tracesys+0xe1/0xe6
[27917995.400742] Code: 0f 1f 84 00 00 00 00 00 48 83 c4 08 b8 fb ff ff ff 5b 41 5c 5d c3 66 90 66 66 66 66 90 55 48 89 e5 41 54 41 89 f4 53 48 8b 47 28 81 78 58 d1 1c 00 00 74 0b 48 8b 05 44 bf d7 00 48 8b 40 08
[27917995.400796] RIP [] devpts_kill_index+0x13/0x60
[27917995.400803] RSP
[27917995.400811] ---[ end trace 5b24303912015285 ]---
[27917995.400815] Fixing recursive fault but reboot is needed!
Answer 1
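The cloud is still made of hardware, so a hardware failure is entirely possible. If you suspect a hardware problem, simply stop the instance and start it again. That should land you on a new host.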
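As a rough sketch of the stop-and-start step, assuming the AWS CLI is installed and configured and the instance is EBS-backed (an instance-store instance cannot be stopped), something like the following might do. The function name stop_start_instance, the AWS_CMD variable, and the instance ID in the usage line are hypothetical placeholders, not anything from the question.

```shell
#!/bin/sh
# Hypothetical helper: stop an EBS-backed EC2 instance, wait until it is
# fully stopped, then start it again and wait until it is running.
# AWS_CMD is an assumption made here so the CLI binary can be swapped out.
AWS_CMD="${AWS_CMD:-aws}"

stop_start_instance() {
    instance_id="${1:-}"
    if [ -z "$instance_id" ]; then
        echo "usage: stop_start_instance INSTANCE_ID" >&2
        return 2
    fi
    # Each step runs only if the previous one succeeded.
    "$AWS_CMD" ec2 stop-instances --instance-ids "$instance_id" &&
    "$AWS_CMD" ec2 wait instance-stopped --instance-ids "$instance_id" &&
    "$AWS_CMD" ec2 start-instances --instance-ids "$instance_id" &&
    "$AWS_CMD" ec2 wait instance-running --instance-ids "$instance_id"
}
```

Usage would be along the lines of: stop_start_instance i-0123456789abcdef0 (a made-up ID). Note that this is a stop followed by a start, not a reboot; a reboot normally keeps the instance on the same host. A public IP that is not an Elastic IP may change across the stop/start, and anything on instance-store volumes is lost.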