Sudden page allocation failures causing system crashes

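We run a platform that uses a Linux bridge to filter traffic and logs that activity to a MySQL server. Occasionally the appliance experiences very high latency, and shortly beforehand we often see repeated page allocation failures in the mpt3sas driver, logged to /var/log/messages. These seem to occur when system load is high, but they also occur on systems that appear to have plenty of memory. I don't have the expertise to read these logs properly and am hoping someone can shed some light on them.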

I have tried setting vm.min_free_kbytes = 65536 (we are running with vm.reclaim_mode = 1), but it does not seem to have alleviated the problem. Does anyone have any ideas? (Logs below:)

localhost kernel: [21572436.601597] sas3ircu: page allocation failure: order:4, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
localhost kernel: [21572436.601601] CPU: 2 PID: 22663 Comm: sas3ircu Tainted: G        W  O      #1
localhost kernel: [21572436.601602] Hardware name: XXXXXXXXXXX , BIOS 3.1 06/06/2018
localhost kernel: [21572436.601602] Call Trace:
localhost kernel: [21572436.601609]  dump_stack+0x7c/0x9c
localhost kernel: [21572436.601612]  warn_alloc.cold+0x7b/0xdf
localhost kernel: [21572436.601615]  ? _cond_resched+0x15/0x30
localhost kernel: [21572436.601617]  ? __alloc_pages_direct_compact+0x141/0x150
localhost kernel: [21572436.601618]  __alloc_pages_slowpath+0xd88/0xdc0
localhost kernel: [21572436.601622]  ? node_reclaim+0x2b1/0x310
localhost kernel: [21572436.601624]  ? get_page_from_freelist+0xaf/0x3a0
localhost kernel: [21572436.601625]  __alloc_pages_nodemask+0x2bf/0x310
localhost kernel: [21572436.601628]  __dma_direct_alloc_pages+0x137/0x220
localhost kernel: [21572436.601630]  dma_direct_alloc_pages+0x1c/0x80
localhost kernel: [21572436.601639]  _ctl_do_mpt_command+0x724/0xc40 [mpt3sas]
localhost kernel: [21572436.601642]  ? ima_file_check+0x59/0x80
localhost kernel: [21572436.601646]  _ctl_compat_mpt_command+0xd1/0x100 [mpt3sas]
localhost kernel: [21572436.601651]  _ctl_ioctl_main+0x4e0/0xb80 [mpt3sas]
localhost kernel: [21572436.601655]  ? __ia32_compat_sys_ioctl+0x189/0x210
localhost kernel: [21572436.601656]  __ia32_compat_sys_ioctl+0x189/0x210
localhost kernel: [21572436.601659]  do_int80_syscall_32+0x6e/0x1d0
localhost kernel: [21572436.601660]  entry_INT80_compat+0x85/0x90
localhost kernel: [21572436.601669] Mem-Info:
localhost kernel: [21572436.601672] active_anon:9743919 inactive_anon:513867 isolated_anon:0
localhost kernel: [21572436.601672]  active_file:35892 inactive_file:14339 isolated_file:0
localhost kernel: [21572436.601672]  unevictable:0 dirty:398 writeback:1 unstable:0
localhost kernel: [21572436.601672]  slab_reclaimable:51419 slab_unreclaimable:4912133
localhost kernel: [21572436.601672]  mapped:18355 shmem:22661 pagetables:53364 bounce:0
localhost kernel: [21572436.601672]  free:1065699 free_pcp:351 free_cma:0
localhost kernel: [21572436.601675] Node 0 active_anon:38975676kB inactive_anon:2055468kB active_file:143568kB inactive_file:57356kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:73420kB dirty:1592kB writeback:4kB shmem:90644kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
localhost kernel: [21572436.601675] Node 0 DMA free:15884kB min:12kB low:24kB high:36kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
localhost kernel: [21572436.601678] lowmem_reserve[]: 0 1784 64117 64117
localhost kernel: [21572436.601679] Node 0 DMA32 free:255804kB min:1892kB low:3788kB high:5684kB active_anon:170384kB inactive_anon:80484kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1965184kB managed:1899648kB mlocked:0kB kernel_stack:0kB pagetables:56kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
localhost kernel: [21572436.601682] lowmem_reserve[]: 0 0 62333 62333
localhost kernel: [21572436.601683] Node 0 Normal free:3991108kB min:63624kB low:127460kB high:191296kB active_anon:38805292kB inactive_anon:1974984kB active_file:143684kB inactive_file:57032kB unevictable:0kB writepending:1596kB present:65011712kB managed:63836092kB mlocked:0kB kernel_stack:5604kB pagetables:213400kB bounce:0kB free_pcp:1404kB local_pcp:232kB free_cma:0kB
localhost kernel: [21572436.601686] lowmem_reserve[]: 0 0 0 0
localhost kernel: [21572436.601687] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
localhost kernel: [21572436.601694] Node 0 DMA32: 14687*4kB (UME) 10010*8kB (UME) 7183*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (H) 0*4096kB = 255804kB
localhost kernel: [21572436.601697] Node 0 Normal: 297793*4kB (UM) 129409*8kB (UM) 110330*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3991724kB
localhost kernel: [21572436.601701] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
localhost kernel: [21572436.601702] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
localhost kernel: [21572436.601702] 107240 total pagecache pages
localhost kernel: [21572436.601707] 34281 pages in swap cache
localhost kernel: [21572436.601708] Swap cache stats: add 18740072, delete 18705912, find 159408767/161694352
localhost kernel: [21572436.601708] Free swap  = 4913860kB
localhost kernel: [21572436.601708] Total swap = 33554424kB
localhost kernel: [21572436.601709] 16748216 pages RAM
localhost kernel: [21572436.601709] 0 pages HighMem/MovableOnly
localhost kernel: [21572436.601709] 310310 pages reserved
localhost kernel: [21572436.601710] 0 pages cma reserved
localhost kernel: [21572436.601710] 0 pages hwpoisoned
localhost kernel: [21572436.601711] failure at drivers/scsi/mpt3sas/mpt3sas_ctl.c:763/_ctl_do_mpt_command()!
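
For reference, the failing request is order:4, which means 2^4 = 16 contiguous pages (64 kB with 4 kB pages). The per-zone free-list lines near the end of the log can be tallied to see how many free blocks of at least that size exist. The following is a minimal Python sketch and not part of the original report: the helper names free_blocks_by_order and blocks_at_or_above are hypothetical, and a 4 kB page size is assumed.

```python
import re

# Matches entries such as "297793*4kB" in a per-zone free-list line.
BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB")


def free_blocks_by_order(line, page_kb=4):
    """Return {order: free block count} for one 'Node N ZONE: ...' line.

    Assumes page_kb-sized pages, so a block of S kB has order log2(S / page_kb).
    """
    counts = {}
    for count, size_kb in BLOCK_RE.findall(line):
        pages = int(size_kb) // page_kb
        order = pages.bit_length() - 1  # exact log2 for power-of-two sizes
        counts[order] = int(count)
    return counts


def blocks_at_or_above(counts, order):
    """Count free blocks large enough to satisfy a request of this order."""
    return sum(n for o, n in counts.items() if o >= order)


# The "Node 0 Normal" free-list line copied from the log above.
normal = (
    "Node 0 Normal: 297793*4kB (UM) 129409*8kB (UM) 110330*16kB (UME) "
    "0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB "
    "= 3991724kB"
)

counts = free_blocks_by_order(normal)
print(blocks_at_or_above(counts, 4))  # prints 0
```

For the Normal zone this prints 0: roughly 3.9 GB is free, but all of it sits in blocks of 16 kB or smaller. That would be consistent with memory fragmentation rather than an outright shortage of free memory, although this is only a reading of the log and not a confirmed diagnosis.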
