Errors in syslog and server freezes when using Docker

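My server froze and had to be rebooted twice, after which the following exceptions showed up in the log.

I can't tell whether the problem is related to Docker, but it happens every time I start some containers, and I can't find anything useful in the system log: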

Nov 24 15:21:30 shisoft-idc kernel: [25671.700452] Oops: 0000 [#2] SMP
Nov 24 15:21:30 shisoft-idc kernel: [25671.713472] Modules linked in: xt_nat xt_tcpudp veth xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc pf_ring(OX) aufs iptable_filter ip_tables x_tables nls_iso8859_1 gpio_ich mxm_wmi joydev mac_hid x86_pkg_temp_thermal intel_powerclamp coretemp mei_me mei sb_edac ioatdma lpc_ich edac_core dca wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ipmi_si lp parport hid_generic isci e1000e ahci libsas usbhid ptp hid libahci pps_core scsi_transport_sas megaraid_sas
Nov 24 15:21:30 shisoft-idc kernel: [25671.810245] CPU: 34 PID: 6 Comm: kworker/u80:0 Tainted: G      D W  OX 3.13.0-40-generic #69-Ubuntu
Nov 24 15:21:30 shisoft-idc kernel: [25671.838447] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.0a 08/08/2013
Nov 24 15:21:30 shisoft-idc kernel: [25671.853158] task: ffff880851354800 ti: ffff88085135e000 task.ti: ffff88085135e000
Nov 24 15:21:30 shisoft-idc kernel: [25671.867861] RIP: 0010:[<ffffffff8108bc00>]  [<ffffffff8108bc00>] kthread_data+0x10/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25671.883418] RSP: 0018:ffff88085135f960  EFLAGS: 00010002
Nov 24 15:21:30 shisoft-idc kernel: [25671.899320] RAX: 0000000000000000 RBX: 0000000000000022 RCX: 0000000000000000
Nov 24 15:21:30 shisoft-idc kernel: [25671.914928] RDX: 0000000000000001 RSI: 0000000000000022 RDI: ffff880851354800
Nov 24 15:21:30 shisoft-idc kernel: [25671.930186] RBP: ffff88085135f960 R08: 0000000000000000 R09: 0000000000000001
Nov 24 15:21:30 shisoft-idc kernel: [25671.945595] R10: ffffffff8106516c R11: ffffea002144d200 R12: ffff88183f394480
Nov 24 15:21:30 shisoft-idc kernel: [25671.960870] R13: 0000000000000022 R14: ffff8808513547f0 R15: ffff880851354800
Nov 24 15:21:30 shisoft-idc kernel: [25671.976402] FS:  0000000000000000(0000) GS:ffff88183f380000(0000) knlGS:0000000000000000
Nov 24 15:21:30 shisoft-idc kernel: [25671.992073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 15:21:30 shisoft-idc kernel: [25672.007445] CR2: 0000000000000028 CR3: 0000000001c0e000 CR4: 00000000001407e0
Nov 24 15:21:30 shisoft-idc kernel: [25672.023175] Stack:
Nov 24 15:21:30 shisoft-idc kernel: [25672.038455]  ffff88085135f978 ffffffff81084f51 ffff880851354800 ffff88085135f9d8
Nov 24 15:21:30 shisoft-idc kernel: [25672.054584]  ffffffff817233d9 ffff880851354800 ffff88085135ffd8 0000000000014480
Nov 24 15:21:30 shisoft-idc kernel: [25672.070539]  0000000000014480 ffff880851354800 ffff880851354e50 ffff8808513547f0
Nov 24 15:21:30 shisoft-idc kernel: [25672.086803] Call Trace:
Nov 24 15:21:30 shisoft-idc kernel: [25672.103266]  [<ffffffff81084f51>] wq_worker_sleeping+0x11/0x90
Nov 24 15:21:30 shisoft-idc kernel: [25672.119191]  [<ffffffff817233d9>] __schedule+0x589/0x7d0
Nov 24 15:21:30 shisoft-idc kernel: [25672.135594]  [<ffffffff81723649>] schedule+0x29/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.150643]  [<ffffffff8106a15f>] do_exit+0x6df/0xa50
Nov 24 15:21:30 shisoft-idc kernel: [25672.165683]  [<ffffffff817287f9>] oops_end+0xa9/0x150
Nov 24 15:21:30 shisoft-idc kernel: [25672.180692]  [<ffffffff810172ab>] die+0x4b/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.195466]  [<ffffffff8172818e>] do_general_protection+0x11e/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.210093]  [<ffffffff81727aa8>] general_protection+0x28/0x30
Nov 24 15:21:30 shisoft-idc kernel: [25672.224652]  [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.238830]  [<ffffffff8122a099>] ? remove_proc_entry+0x89/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.253390]  [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.268168]  [<ffffffffa0344d07>] ring_notifier+0x127/0x425 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.282725]  [<ffffffff816f02f8>] ? ip6mr_device_event+0xa8/0xc0
Nov 24 15:21:30 shisoft-idc kernel: [25672.296697]  [<ffffffff8172b83c>] notifier_call_chain+0x4c/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.310129]  [<ffffffff8108fd56>] raw_notifier_call_chain+0x16/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25672.323626]  [<ffffffff8161f055>] call_netdevice_notifiers_info+0x35/0x60
Nov 24 15:21:30 shisoft-idc kernel: [25672.336576]  [<ffffffff81620469>] rollback_registered_many+0x189/0x2a0
Nov 24 15:21:30 shisoft-idc kernel: [25672.349075]  [<ffffffff816205db>] unregister_netdevice_many+0x1b/0xb0
Nov 24 15:21:30 shisoft-idc kernel: [25672.362098]  [<ffffffff8162114d>] default_device_exit_batch+0x13d/0x160
Nov 24 15:21:30 shisoft-idc kernel: [25672.374600]  [<ffffffff810ab0a0>] ? prepare_to_wait_event+0x100/0x100
Nov 24 15:21:30 shisoft-idc kernel: [25672.386514]  [<ffffffff8161b8a3>] ops_exit_list.isra.1+0x53/0x60
Nov 24 15:21:30 shisoft-idc kernel: [25672.398854]  [<ffffffff8161c110>] cleanup_net+0x110/0x250
Nov 24 15:21:30 shisoft-idc kernel: [25672.411501]  [<ffffffff81083a52>] process_one_work+0x182/0x450
Nov 24 15:21:30 shisoft-idc kernel: [25672.425847]  [<ffffffff81084841>] worker_thread+0x121/0x410
Nov 24 15:21:30 shisoft-idc kernel: [25672.439753]  [<ffffffff81084720>] ? rescuer_thread+0x430/0x430
Nov 24 15:21:30 shisoft-idc kernel: [25672.454172]  [<ffffffff8108b562>] kthread+0xd2/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.467798]  [<ffffffff8108b490>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 24 15:21:30 shisoft-idc kernel: [25672.481698]  [<ffffffff8172fc7c>] ret_from_fork+0x7c/0xb0
Nov 24 15:21:30 shisoft-idc kernel: [25672.496083]  [<ffffffff8108b490>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 24 15:21:30 shisoft-idc kernel: [25672.509968] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 c0 03 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
Nov 24 15:21:30 shisoft-idc kernel: [25672.546519] RIP  [<ffffffff8108bc00>] kthread_data+0x10/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25672.557942]  RSP <ffff88085135f960>
Nov 24 15:21:30 shisoft-idc kernel: [25672.569003] CR2: ffffffffffffffd8
Nov 24 15:21:30 shisoft-idc kernel: [25672.580223] ---[ end trace f801ff82c5094880 ]---
Nov 24 15:21:30 shisoft-idc kernel: [25674.624052] Fixing recursive fault but reboot is needed!
Nov 24 15:21:30 shisoft-idc kernel: [25682.813069] docker0: port 14(veth_app-mine) entered forwarding state
Nov 24 15:21:49 shisoft-idc kernel: [25700.486840] BUG: soft lockup - CPU#20 stuck for 22s! [irqbalance:1544]
Nov 24 15:21:49 shisoft-idc kernel: [25700.498429] Modules linked in: xt_nat xt_tcpudp veth xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc pf_ring(OX) aufs iptable_filter ip_tables x_tables nls_iso8859_1 gpio_ich mxm_wmi joydev mac_hid x86_pkg_temp_thermal intel_powerclamp coretemp mei_me mei sb_edac ioatdma lpc_ich edac_core dca wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ipmi_si lp parport hid_generic isci e1000e ahci libsas usbhid ptp hid libahci pps_core scsi_transport_sas megaraid_sas
Nov 24 15:21:49 shisoft-idc kernel: [25700.590082] CPU: 20 PID: 1544 Comm: irqbalance Tainted: G      D W  OX 3.13.0-40-generic #69-Ubuntu
Nov 24 15:21:49 shisoft-idc kernel: [25700.618480] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.0a 08/08/2013
Nov 24 15:21:49 shisoft-idc kernel: [25700.633291] task: ffff88084e6f8000 ti: ffff88084d980000 task.ti: ffff88084d980000
Nov 24 15:21:49 shisoft-idc kernel: [25700.648956] RIP: 0010:[<ffffffff8172722a>]  [<ffffffff8172722a>] _raw_spin_lock+0x3a/0x50
Nov 24 15:21:49 shisoft-idc kernel: [25700.664427] RSP: 0018:ffff88084d981c50  EFLAGS: 00000206
Nov 24 15:21:49 shisoft-idc kernel: [25700.679149] RAX: 0000000000007bfa RBX: 0000000100000001 RCX: 00000000000020de
Nov 24 15:21:49 shisoft-idc kernel: [25700.694642] RDX: 00000000000020e0 RSI: 00000000000020e0 RDI: ffffffff81fb2a40
Nov 24 15:21:49 shisoft-idc kernel: [25700.709489] RBP: ffff88084d981c50 R08: 0000000000017a50 R09: 0000000000000001
Nov 24 15:21:49 shisoft-idc kernel: [25700.724954] R10: ffff880850b76026 R11: ffff880825f08b40 R12: 0000001400000013
Nov 24 15:21:49 shisoft-idc kernel: [25700.739837] R13: 0000000100000001 R14: 0000000000002df8 R15: 0000000000000000
Nov 24 15:21:49 shisoft-idc kernel: [25700.755476] FS:  00007fdf26b71780(0000) GS:ffff88085fa80000(0000) knlGS:0000000000000000
Nov 24 15:21:49 shisoft-idc kernel: [25700.770582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 15:21:49 shisoft-idc kernel: [25700.785925] CR2: 00007f860014a108 CR3: 000000084b8d6000 CR4: 00000000001407e0
Nov 24 15:21:49 shisoft-idc kernel: [25700.801644] Stack:
Nov 24 15:21:49 shisoft-idc kernel: [25700.818004]  ffff88084d981c80 ffffffff81229cf5 ffff88085f018040 ffff880825f08b40
Nov 24 15:21:49 shisoft-idc kernel: [25700.834178]  0000000000000101 ffff88085f008240 ffff88084d981c90 ffffffff81229dcb
Nov 24 15:21:49 shisoft-idc kernel: [25700.850069]  ffff88084d981cb8 ffffffff8122491c ffff880825f08b40 0000000000008000
Nov 24 15:21:49 shisoft-idc kernel: [25700.866236] Call Trace:
Nov 24 15:21:49 shisoft-idc kernel: [25700.881916]  [<ffffffff81229cf5>] proc_lookup_de+0x25/0xe0
Nov 24 15:21:49 shisoft-idc kernel: [25700.898082]  [<ffffffff81229dcb>] proc_lookup+0x1b/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25700.914531]  [<ffffffff8122491c>] proc_root_lookup+0x1c/0x40
Nov 24 15:21:49 shisoft-idc kernel: [25700.932091]  [<ffffffff811c75dd>] lookup_real+0x1d/0x50
Nov 24 15:21:49 shisoft-idc kernel: [25700.951143]  [<ffffffff811cc8e3>] do_last+0x983/0x1230
Nov 24 15:21:49 shisoft-idc kernel: [25700.969605]  [<ffffffff811ca561>] ? link_path_walk+0x71/0x870
Nov 24 15:21:49 shisoft-idc kernel: [25700.988492]  [<ffffffff813137ab>] ? apparmor_file_alloc_security+0x5b/0x180
Nov 24 15:21:49 shisoft-idc kernel: [25701.007602]  [<ffffffff812d5df6>] ? security_file_alloc+0x16/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25701.025095]  [<ffffffff811cd24b>] path_openat+0xbb/0x650
Nov 24 15:21:49 shisoft-idc kernel: [25701.039760]  [<ffffffff81012609>] ? __switch_to+0x169/0x4c0
Nov 24 15:21:49 shisoft-idc kernel: [25701.054466]  [<ffffffff811cd87f>] ? getname_flags+0x4f/0x190
Nov 24 15:21:49 shisoft-idc kernel: [25701.068670]  [<ffffffff811ce64a>] do_filp_open+0x3a/0x90
Nov 24 15:21:49 shisoft-idc kernel: [25701.082653]  [<ffffffff811db4d7>] ? __alloc_fd+0xa7/0x130
Nov 24 15:21:49 shisoft-idc kernel: [25701.096477]  [<ffffffff811bccc9>] do_sys_open+0x129/0x280
Nov 24 15:21:49 shisoft-idc kernel: [25701.109503]  [<ffffffff811bce3e>] SyS_open+0x1e/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25701.122350]  [<ffffffff8172fd2d>] system_call_fastpath+0x1a/0x1f
Nov 24 15:21:49 shisoft-idc kernel: [25701.134992] Code: 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 83 e8 01 74 0a 0f b7 0f <66> 39 ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb da 66 0f 1f 44 00

uname reports the following:

Linux shisoft-idc 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Docker version:

Client version: 1.3.1
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): 4e9bbfa
OS/Arch (client): linux/amd64
Server version: 1.3.1
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): 4e9bbfa

And docker info:

Containers: 19
Images: 343
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 382
Execution Driver: native-0.2
Kernel Version: 3.13.0-40-generic
Operating System: Ubuntu 14.04.1 LTS
Debug mode (server): false
Debug mode (client): true
Fds: 10
Goroutines: 10
EventsListeners: 0
Init Path: /usr/bin/docker

Answer 1

Judging by this line:

Nov 24 15:21:49 shisoft-idc kernel: [25700.486840] BUG: soft lockup - CPU#20 stuck for 22s! [irqbalance:1544]

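This more or less means that your CPU was stuck for at least 22 seconds...

I think the problem is with your Intel CPU. If the CPU is not dead or dying, it may simply be buggy, so look into a microcode update (the CPU's firmware). See: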

https://lists.debian.org/debian-user/2013/09/msg00126.html

http://wiki.gentoo.org/wiki/Intel_microcode

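At the very least, all Intel Xeon, i3, i5 and i7 CPUs need critical security fixes to their microcode.

Your Linux distribution probably provides a microcode update service.

Please reply and let us know whether the microcode update fixes the problem (if it doesn't... I'm afraid you will have to buy a new CPU).

Note that updating the microcode once at boot is not always enough: the update usually has to be re-injected by a running service every time the CPU is reset.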
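Before and after applying a microcode update it helps to confirm which revision the CPUs are actually running. The following is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo exposes a "microcode" field for each logical CPU; the function name microcode_revisions is made up for this illustration.

```python
import os
import re


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo text."""
    # Each logical CPU block on x86 contains a line such as "microcode : 0x...".
    return set(re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, re.MULTILINE))


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        revisions = microcode_revisions(f.read())
    # More than one revision would mean the CPUs are not all on the same microcode.
    print("microcode revisions:", sorted(revisions) or "not reported")
```

If the revision printed after a reboot or a CPU reset is older than the one seen right after the update, the update is not being re-applied.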

Answer 2

These lines from the first backtrace look like a kernel bug:

Nov 24 15:21:30 shisoft-idc kernel: [25672.210093]  [<ffffffff81727aa8>] general_protection+0x28/0x30
Nov 24 15:21:30 shisoft-idc kernel: [25672.224652]  [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.238830]  [<ffffffff8122a099>] ? remove_proc_entry+0x89/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.253390]  [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.268168]  [<ffffffffa0344d07>] ring_notifier+0x127/0x425 [pf_ring]
... skip ...
Nov 24 15:21:30 shisoft-idc kernel: [25674.624052] Fixing recursive fault but reboot is needed!
Nov 24 15:21:30 shisoft-idc kernel: [25682.813069] docker0: port 14(veth_app-mine) entered forwarding state

You may be able to work around it by changing Docker's network settings (for example, by disabling IPv6).

Or, if you have some spare time, you can try to resolve the address ffffffff816f6ff2 to a line of code and work out what could have caused the GPF.
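As a starting point for that kind of digging, the call trace can be split into structured frames. This is only a sketch: the regular expression is written against the frame format shown in the log above, and the names FRAME_RE and parse_frames are hypothetical. Frames printed with a leading "?" are ones the kernel's stack unwinder was not sure about, and a trailing name in square brackets is the module the function belongs to.

```python
import re

# Matches frames such as:
#   [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
#   [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Extract call-trace frames from kernel oops text."""
    frames = []
    for line in log_text.splitlines():
        if "RIP" in line:
            continue  # the RIP lines use a similar format but are not trace frames
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "addr": m.group("addr"),
                "symbol": m.group("symbol"),
                "offset": int(m.group("offset"), 16),
                "reliable": m.group("guess") is None,  # False for "?" frames
                "module": m.group("module"),  # None for code built into the kernel
            })
    return frames
```

The symbol and offset of each frame are what you would then look up in a kernel image built with debug symbols for that exact kernel version.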

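PS. Also, what you posted here is probably not the first error you hit, because the X and W flags are already set in your Tainted: G D W OX (and the header already reads Oops: 0000 [#2]).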
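The taint string can be decoded mechanically. The sketch below covers only the letters that appear in this log, with meanings taken from the kernel's tainted-kernels documentation; the meaning of "X" differs between kernel versions and distributions, so its entry is an explicit assumption. TAINT_FLAGS and decode_taint are hypothetical names.

```python
import re

# Only the flags seen in this log, plus "P" as the counterpart of "G".
TAINT_FLAGS = {
    "G": "only GPL-compatible modules are loaded",
    "P": "a proprietary module is loaded",
    "D": "the kernel has already oopsed (died) once",
    "W": "a kernel warning was issued earlier",
    "O": "an out-of-tree module is loaded",
    # Assumption: version- and distribution-specific, check your kernel's docs.
    "X": "distribution- or version-specific flag",
}


def decode_taint(line):
    """Return (flag, meaning) pairs for the 'Tainted:' field of an oops line."""
    # The flags are letters padded with spaces, followed by the kernel version.
    m = re.search(r"Tainted: ([A-Z ]+?)\s+\d", line)
    if not m:
        return []
    return [(c, TAINT_FLAGS.get(c, "unknown flag")) for c in m.group(1) if c != " "]


if __name__ == "__main__":
    header = ("CPU: 34 PID: 6 Comm: kworker/u80:0 "
              "Tainted: G      D W  OX 3.13.0-40-generic #69-Ubuntu")
    for flag, meaning in decode_taint(header):
        print(flag, "-", meaning)
```

A "D" or "W" flag in the first oops you are looking at means something had already gone wrong earlier, so the earliest trace since boot is usually the more informative one.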

Answer 3

I solved this by upgrading the kernel from 3.13 to 3.17.4.

On another machine, Docker running on the 3.13 kernel does not show the same problem, which is strange.
