在日志中,我有这个(从 06:01:14 的完整内核消息日志中提取):
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.863038] BUG: unable to handle kernel NULL pointer dereference at 0000000000000015
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861081] Process telnet (pid: 20247, threadinfo ffff8800f8598000, task ffff8800024d4500)
然后服务器日志被这条消息淹没了:
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861547] tty_release_dev: pts0: read/write wait queue active!
最后,2 小时后,我不得不重新启动,因为它已经无法访问:负载已增长到 160%。
最后一个命令没有显示当时有任何人登录到 pts0。我也不知道这个 telnet 进程可能来自哪里。
这是一个运行 UBUNTU 10.04 LTS 的 EC2 实例。
完整日志如下:
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.863038] BUG: unable to handle kernel NULL pointer dereference at 0000000000000015
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861007] IP: [<ffffffff81363dde>] n_tty_read+0x2ce/0x970
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861019] PGD ee13d067 PUD f8698067 PMD 0
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861025] Oops: 0000 [#1] SMP
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861028] last sysfs file: /sys/devices/xen/vbd-2208/block/sdk/removable
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861032] CPU 0
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861034] Modules linked in: ipv6
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861040] Pid: 20247, comm: telnet Not tainted 2.6.32-312-ec2 #24-Ubuntu
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861042] RIP: e030:[<ffffffff81363dde>] [<ffffffff81363dde>] n_tty_read+0x2ce/0x970
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861047] RSP: e02b:ffff8800f8599d88 EFLAGS: 00010246
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861049] RAX: 0000000000000015 RBX: ffff8800f8598000 RCX: 0000000001aed069
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861052] RDX: 0000000000000000 RSI: ffff8800f8599e67 RDI: ffff8801dd833d1c
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861054] RBP: ffff8800f8599e98 R08: ffffffff8135eb10 R09: 7fffffffffffffff
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861057] R10: 0000000000000000 R11: 0000000000000246 R12: ffff8801dd833800
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861059] R13: 0000000000000000 R14: ffff8801dd833a68 R15: ffff8801dd833d1c
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861065] FS: 00007f90121f6720(0000) GS:ffff880002c40000(0000) knlGS:0000000000000000
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861068] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861070] CR2: 0000000000000015 CR3: 0000000032a59000 CR4: 0000000000002660
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861073] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861076] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861081] Process telnet (pid: 20247, threadinfo ffff8800f8598000, task ffff8800024d4500)
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861083] Stack:
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861085] 0000000000000000 0000000001aed069 ffff8801dd8339c8 ffff8800024d4500
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861089] <0> ffff8801dd8339c0 ffff8801dd833c90 0000000001aed027 ffff8800024d4500
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861094] <0> ffff8801dd8338d8 0000000000000000 ffff8800024d4500 0000000000000000
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861099] Call Trace:
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861107] [<ffffffff81034bc0>] ? default_wake_function+0x0/0x10
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861113] [<ffffffff8135ebb6>] tty_read+0xa6/0xf0
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861118] [<ffffffff810ee7e5>] vfs_read+0xb5/0x1a0
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861122] [<ffffffff810ee91c>] sys_read+0x4c/0x80
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861127] [<ffffffff81009ba8>] system_call_fastpath+0x16/0x1b
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861131] [<ffffffff81009b40>] ? system_call+0x0/0x52
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861133] Code: 85 d2 0f 84 92 00 00 00 45 8b ac 24 5c 02 00 00 f0 45 0f b3 2e 45 19 ed 49 63 84 24 5c 02 00 00 49 8b 94 24 50 02 00 00 4c 89 ff <0f> be 1c 02 e8 a9 d3 14 00 41 8b 94 24 5c 02 00 00 41 83 ac 24
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861171] RIP [<ffffffff81363dde>] n_tty_read+0x2ce/0x970
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861175] RSP <ffff8800f8599d88>
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861171] RIP [<ffffffff81363dde>] n_tty_read+0x2ce/0x970
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861175] RSP <ffff8800f8599d88>
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861177] CR2: 0000000000000015
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861205] ---[ end trace f10eee2057ff4f6b ]---
Apr 21 06:01:14 ip-10-49-109-107 kernel: [233185.861547] tty_release_dev: pts0: read/write wait queue active!
答案1
亚马逊遇到了 EBS 问题在 us-east-1 可用区域:
太平洋夏令时间凌晨 2:18 我们可以确认影响 EC2 实例的连接错误以及影响 US-EAST-1 区域多个可用区中 EBS 卷的延迟增加。错误率增加正在影响 EBS CreateVolume API 调用。我们将继续努力解决问题。
答案2
最新的内核安全更新为 2.6.32-314.27。我认为你应该更新一下,看看是否有帮助。