CentOS 7 random panic/oops


After several configuration changes, I finally got a full crash dump that I can examine:

  KERNEL: /usr/lib/debug/lib/modules/3.10.0-327.28.3.el7.x86_64/vmlinux
DUMPFILE: vmcore  [PARTIAL DUMP]
    CPUS: 64
    DATE: Wed Aug 24 20:11:12 2016
  UPTIME: 02:16:27
LOAD AVERAGE: 1.00, 8.29, 7.26
   TASKS: 1175
NODENAME: dev1.soft.com
 RELEASE: 3.10.0-327.28.3.el7.x86_64
 VERSION: #1 SMP Thu Aug 18 19:05:49 UTC 2016
 MACHINE: x86_64  (2260 Mhz)
  MEMORY: 256 GB
   PANIC: "BUG: unable to handle kernel paging request at 00007f3b31c9a798"

In crash I ran bt:

crash> bt
PID: 11768  TASK: ffff8840173d0000  CPU: 38  COMMAND: "cp"
 #0 [ffff884017277660] machine_kexec at ffffffff81051e9b
 #1 [ffff8840172776c0] crash_kexec at ffffffff810f27c2
 #2 [ffff884017277790] oops_end at ffffffff8163f448
 #3 [ffff8840172777b8] no_context at ffffffff8162f588
 #4 [ffff884017277808] __bad_area_nosemaphore at ffffffff8162f61e
 #5 [ffff884017277850] bad_area at ffffffff8162f942
 #6 [ffff884017277878] __do_page_fault at ffffffff81642225
 #7 [ffff8840172778d8] do_page_fault at ffffffff81642353
 #8 [ffff884017277900] page_fault at ffffffff8163e648
    [exception RIP: radix_tree_next_chunk+323]
    RIP: ffffffff812f8e83  RSP: ffff8840172779b8  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000006
    RDX: 00007f3b31c9a770  RSI: ffff884017277a00  RDI: 0000000000000006
    RBP: ffff8840172779e0   R8: ffff884017277a00   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000220  R12: 0000000000000040
    R13: 0000000000000000  R14: 0000000000000012  R15: ffff880fddf9cfd8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff8840172779e8] __find_get_pages at ffffffff8116c065
#10 [ffff884017277a50] __pagevec_lookup at ffffffff8117880e
#11 [ffff884017277a68] truncate_inode_pages_range at ffffffff81179240
#12 [ffff884017277bb8] truncate_pagecache at ffffffff81179927
#13 [ffff884017277be0] ext4_setattr at ffffffffa0226889 [ext4]
#14 [ffff884017277c40] notify_change at ffffffff811fbd09
#15 [ffff884017277c88] do_truncate at ffffffff811dce03
#16 [ffff884017277d00] do_last at ffffffff811ec3a2
#17 [ffff884017277db0] path_openat at ffffffff811eed02
#18 [ffff884017277e48] do_filp_open at ffffffff811f04cb
#19 [ffff884017277f18] do_sys_open at ffffffff811dde73
#20 [ffff884017277f70] sys_open at ffffffff811ddf8e
#21 [ffff884017277f80] system_call_fastpath at ffffffff81646b49
    RIP: 00007fdad8f142b0  RSP: 00007ffdda059160  RFLAGS: 00010246
    RAX: 0000000000000002  RBX: ffffffff81646b49  RCX: 00007fdad8f13e64
    RDX: 0000000000000000  RSI: 0000000000000201  RDI: 00007ffdda05a385
    RBP: 00007ffdda059550   R8: 00007ffdda059730   R9: 00000000000001b4
    R10: 00007ffdda058ee0  R11: 0000000000000246  R12: ffffffff811ddf8e
    R13: ffff884017277f78  R14: 00007ffdda05a34c  R15: 00007ffdda059730
    ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
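
One detail can be checked directly from the numbers above: the faulting address in the PANIC line appears to be RDX plus a small offset, and both values fall in the user-space half of the address space rather than the kernel half. The snippet below is only a sketch of that arithmetic. It assumes standard x86_64 4-level paging (user addresses below 0x0000800000000000), and the helper name `describe_fault` is made up for illustration. It does not prove a root cause; it only suggests that the radix tree walk may have dereferenced a pointer that is not a valid kernel address.

```python
# Values copied verbatim from the crash output above.
FAULT_ADDR = 0x00007F3B31C9A798  # address in the PANIC line
RDX = 0x00007F3B31C9A770         # RDX in the exception frame

# Assumption: standard x86_64 4-level paging, where canonical
# user-space addresses lie below this boundary.
USER_SPACE_END = 0x0000800000000000


def describe_fault(fault_addr, base_reg):
    """Relate the faulting address to a candidate base register.

    Hypothetical helper for illustration; not part of the crash utility.
    """
    return {
        "offset_from_base": fault_addr - base_reg,
        "base_in_user_half": base_reg < USER_SPACE_END,
        "fault_in_user_half": fault_addr < USER_SPACE_END,
    }


if __name__ == "__main__":
    info = describe_fault(FAULT_ADDR, RDX)
    print("offset from RDX:", hex(info["offset_from_base"]))
    print("RDX in user-space half:", info["base_in_user_half"])
    print("fault address in user-space half:", info["fault_in_user_half"])
```

If that reading is right, the kernel was following what it believed to be a kernel pointer while walking the page cache for the file being truncated, and the value it loaded looks like a user-space address instead.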

crash> files
PID: 11768  TASK: ffff8840173d0000  CPU: 38  COMMAND: "cp"
ROOT: /    CWD: /home/dev/OPT-10.7.1/data
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff8830264e1e00 ffff882028c00240 ffff882029270850 CHR  /dev/null
  1 ffff8830264e1c00 ffff882028c00240 ffff882029270850 CHR  /dev/null
  2 ffff8830264e1c00 ffff882028c00240 ffff882029270850 CHR  /dev/null
  3 ffff8830287ed800 ffff8820185026c0 ffff88101f699940 REG  /home/dev/OPT-10.7.1/files/Ratings.txt

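The difficulty is that I have no idea why it fails to cp that file. It performs this copy every day, sometimes several times a day, and the file is about 7.5 GB.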

What can I do to get more information and recover the box?

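Update: I ran fsck on the /home path and it looks fine, exit code 0:

[Image: fsck results]

Update 2: I re-ran it with the -f flag and it again returned code 0:

[Image: command re-run with -f]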
