Iterating over very large memory maps causes OOM

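I am writing a simulator that relies on generating (potentially) very large problem domains. Since the data cannot fit in RAM, I use four memory-mapped files to hold it. This is a 64-bit application running on 64-bit Linux with 8 GB of RAM.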

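My application iterates over the memory maps in multiple threads, reading from and writing to them. However, shortly after starting, my program triggers an OOM (no thrashing occurs):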

[  683.899682] Purging GPU memory, 25 pages freed, 12838 pages still pinned.
[  683.899683] 50 and 0 pages still available in the bound and unbound GPU page lists.
[  683.899732] Purging GPU memory, 0 pages freed, 12838 pages still pinned.
[  683.899732] 50 and 0 pages still available in the bound and unbound GPU page lists.
[  683.901441] gnome-shell invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0
[  683.901443] gnome-shell cpuset=/ mems_allowed=0
[  683.901446] CPU: 0 PID: 1714 Comm: gnome-shell Not tainted 4.8.8-300.fc25.x86_64 #1
[  683.901447] Hardware name: Dell Inc. XPS 13 9350/0PWNCR, BIOS 1.4.4 06/14/2016
[  683.901449]  0000000000000286 000000006699dcf4 ffff8c743292b588 ffffffff863e5dbd
[  683.901451]  ffff8c743292b748 ffff8c73d7431f00 ffff8c743292b5f0 ffffffff8624c1f8
[  683.901453]  000000006699dcf4 000000006699dcf4 ffffffff86e9cac0 0000000000000015
[  683.901454] Call Trace:
[  683.901459]  [<ffffffff863e5dbd>] dump_stack+0x63/0x86
[  683.901460]  [<ffffffff8624c1f8>] dump_header+0x5c/0x1d5
[  683.901463]  [<ffffffff861bd90c>] oom_kill_process+0x20c/0x3d0
[  683.901465]  [<ffffffff860aacfe>] ? has_capability_noaudit+0x1e/0x30
[  683.901466]  [<ffffffff861bde76>] out_of_memory+0x356/0x440
[  683.901468]  [<ffffffff861c3df0>] __alloc_pages_nodemask+0xe90/0xeb0
[  683.901470]  [<ffffffff8621a055>] alloc_pages_current+0x95/0x140
[  683.901472]  [<ffffffff861e54be>] kmalloc_order_trace+0x2e/0xd0
[  683.901508]  [<ffffffffc03962b6>] ? gen9_read32+0x166/0x3a0 [i915]
[  683.901510]  [<ffffffff8622782d>] __kmalloc+0x1cd/0x1f0
[  683.901525]  [<ffffffffc036eaae>] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
[  683.901537]  [<ffffffffc036eac7>] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
[  683.901552]  [<ffffffffc036eb9c>] gen8_alloc_va_range_3lvl+0x9c/0x9f0 [i915]
[  683.901553]  [<ffffffff861b896b>] ? find_lock_entry+0x5b/0x140
[  683.901555]  [<ffffffff86411003>] ? swiotlb_map_sg_attrs+0x53/0x130
[  683.901567]  [<ffffffffc036f88c>] gen8_alloc_va_range+0x23c/0x470 [i915]
[  683.901580]  [<ffffffffc0370e5b>] i915_vma_bind+0x9b/0x180 [i915]
[  683.901593]  [<ffffffffc03774fb>] i915_gem_object_do_pin+0x86b/0xa60 [i915]
[  683.901606]  [<ffffffffc037771d>] i915_gem_object_pin+0x2d/0x30 [i915]
[  683.901618]  [<ffffffffc0365acf>] i915_gem_execbuffer_reserve_vma.isra.20+0x9f/0x180 [i915]
[  683.901633]  [<ffffffffc0365f3b>] i915_gem_execbuffer_reserve.isra.21+0x38b/0x3b0 [i915]
[  683.901646]  [<ffffffffc03671d8>] i915_gem_do_execbuffer.isra.24+0x6b8/0x1200 [i915]
[  683.901648]  [<ffffffff861b87d0>] ? find_get_entry+0x20/0x160
[  683.901650]  [<ffffffff861d9b99>] ? shmem_getpage_gfp+0xd9/0xc90
[  683.901661]  [<ffffffffc0368944>] i915_gem_execbuffer2+0x104/0x260 [i915]
[  683.901691]  [<ffffffffc0250fa0>] drm_ioctl+0x200/0x4f0 [drm]
[  683.901704]  [<ffffffffc0368840>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[  683.901705]  [<ffffffff862692ff>] ? dput+0x21f/0x260
[  683.901707]  [<ffffffff86264cd3>] do_vfs_ioctl+0xa3/0x5f0
[  683.901708]  [<ffffffff86265299>] SyS_ioctl+0x79/0x90
[  683.901710]  [<ffffffff868027b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[  683.901711] Mem-Info:
[  683.901714] active_anon:95440 inactive_anon:114637 isolated_anon:0
                active_file:1284097 inactive_file:231741 isolated_file:0
                unevictable:31 dirty:107 writeback:155137 unstable:0
                slab_reclaimable:34416 slab_unreclaimable:15787
                mapped:1373800 shmem:49734 pagetables:32919 bounce:0
                free:26810 free_pcp:0 free_cma:0
[  683.901717] Node 0 active_anon:381760kB inactive_anon:458548kB active_file:5136388kB inactive_file:926964kB unevictable:124kB isolated(anon):0kB isolated(file):0kB mapped:5495200kB dirty:428kB writeback:620548kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 198936kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
[  683.901718] Node 0 DMA free:15872kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:24kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  683.901720] lowmem_reserve[]: 0 1829 7809 7809 7809
[  683.901723] Node 0 DMA32 free:39676kB min:15796kB low:19744kB high:23692kB active_anon:2196kB inactive_anon:180kB active_file:1467288kB inactive_file:152024kB unevictable:0kB writepending:120700kB present:1958284kB managed:1892644kB mlocked:0kB slab_reclaimable:34852kB slab_unreclaimable:4228kB kernel_stack:32kB pagetables:27600kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  683.901725] lowmem_reserve[]: 0 0 5980 5980 5980
[  683.901727] Node 0 Normal free:51692kB min:51648kB low:64560kB high:77472kB active_anon:379564kB inactive_anon:458368kB active_file:3669100kB inactive_file:774940kB unevictable:124kB writepending:500276kB present:6275072kB managed:6128020kB mlocked:124kB slab_reclaimable:102812kB slab_unreclaimable:58896kB kernel_stack:8288kB pagetables:104076kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  683.901729] lowmem_reserve[]: 0 0 0 0 0
[  683.901731] Node 0 DMA: 0*4kB 0*8kB 2*16kB (U) 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[  683.901738] Node 0 DMA32: 6599*4kB (UME) 1573*8kB (UM) 51*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 39796kB
[  683.901744] Node 0 Normal: 11012*4kB (UMEH) 955*8kB (UMH) 11*16kB (H) 1*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 51896kB
[  683.901750] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  683.901751] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  683.901752] 1565598 total pagecache pages
[  683.901752] 0 pages in swap cache
[  683.901753] Swap cache stats: add 0, delete 0, find 0/0
[  683.901754] Free swap  = 8126460kB
[  683.901754] Total swap = 8126460kB
[  683.901754] 2062334 pages RAM
[  683.901755] 0 pages HighMem/MovableOnly
[  683.901755] 53194 pages reserved
[  683.901755] 0 pages cma reserved
[  683.901756] 0 pages hwpoisoned
[  683.901756] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[  683.901771] [  771]     0   771    19588     1983      34       3        0             0 systemd-journal
[  683.901773] [  802]     0   802    32268     1616      29       3        0             0 lvmetad
[  683.901775] [  808]     0   808    11972     1901      25       3        0         -1000 systemd-udevd
[  683.901778] [  987]     0   987    13888      834      27       3        0         -1000 auditd
[  683.901779] [  998]     0   998    21136      443      12       3        0             0 audispd
[  683.901781] [ 1001]     0  1001    10907      520      26       3        0             0 sedispatch
[  683.901782] [ 1011]     0  1011     1104      180       8       3        0             0 rngd
[  683.901783] [ 1012]     0  1012     4220      343      13       3        0             0 alsactl
[  683.901784] [ 1013]     0  1013    98858     1754      45       3        0             0 accounts-daemon
[  683.901785] [ 1015]     0  1015    10882     1188      23       3        0             0 bluetoothd
[  683.901786] [ 1017]     0  1017   104437     2240      73       4        0             0 ModemManager
[  683.901787] [ 1018]   172  1018    46710      823      28       3        0             0 rtkit-daemon
[  683.901788] [ 1020]     0  1020     1642      478       9       3        0             0 mcelog
[  683.901790] [ 1022]    81  1022    14507     1493      27       3        0          -900 dbus-daemon
[  683.901791] [ 1026]   990  1026    28014      782      25       3        0             0 chronyd
[  683.901792] [ 1036]     0  1036    51991      804      38       3        0             0 gssproxy
[  683.901794] [ 1048]     0  1048   167266     8834     105       4        0             0 firewalld
[  683.901795] [ 1050]     0  1050    12558     1987      28       3        0             0 systemd-logind
[  683.901797] [ 1052]    70  1052    12579      990      28       3        0             0 avahi-daemon
[  683.901798] [ 1057]   995  1057   133403     3831      56       3        0             0 polkitd
[  683.901799] [ 1059]     0  1059   111734     2268      64       3        0             0 abrtd
[  683.901800] [ 1066]    70  1066    12547       89      27       3        0             0 avahi-daemon
[  683.901801] [ 1086]     0  1086   156436     3838      85       3        0             0 NetworkManager
[  683.901802] [ 1110]     0  1110   228634     8292     174       4        0             0 libvirtd
[  683.901803] [ 1122]     0  1122   102461     2029      47       3        0             0 gdm
[  683.901804] [ 1125]     0  1125    33234      810      19       3        0             0 crond
[  683.901805] [ 1126]     0  1126     6490      534      19       3        0             0 atd
[  683.901806] [ 1213]     0  1213    93009     2339      67       4        0             0 gdm-session-wor
[  683.901807] [ 1222]     0  1222    16576     1873      36       3        0             0 wpa_supplicant
[  683.901808] [ 1236]    42  1236    16516     1785      36       3        0             0 systemd
[  683.901809] [ 1238]    42  1238    24758      780      47       3        0             0 (sd-pam)
[  683.901810] [ 1242]    42  1242   112470     2665      94       3        0             0 gdm-wayland-ses
[  683.901811] [ 1244]    42  1244    14131     1121      28       3        0             0 dbus-daemon
[  683.901812] [ 1247]    42  1247   172808     3320     110       4        0             0 gnome-session-b
[  683.901813] [ 1255]    42  1255   403569    27848     305       5        0             0 gnome-shell
[  683.901814] [ 1270]     0  1270   107218     2323      53       4        0             0 upowerd
[  683.901816] [ 1277]     0  1277   261220     3428     344       4        0             0 abrt-dump-journ
[  683.901817] [ 1279]     0  1279   262477     3538     345       4        0             0 abrt-dump-journ
[  683.901818] [ 1338]    99  1338    12274       92      26       3        0             0 dnsmasq
[  683.901819] [ 1339]     0  1339    12267       92      26       3        0             0 dnsmasq
[  683.901820] [ 1420]    42  1420    62218    11072     107       3        0             0 Xwayland
[  683.901822] [ 1426]    42  1426    86174     1417      36       3        0             0 at-spi-bus-laun
[  683.901823] [ 1431]    42  1431    14074      920      27       3        0             0 dbus-daemon
[  683.901824] [ 1434]    42  1434    55841     1513      42       4        0             0 at-spi2-registr
[  683.901825] [ 1440]    42  1440   164934     2682      84       4        0             0 pulseaudio
[  683.901826] [ 1453]    42  1453   115040     2138      39       3        0             0 ibus-daemon
[  683.901827] [ 1456]    42  1456    95634     1424      38       3        0             0 ibus-dconf
[  683.901828] [ 1459]    42  1459   126126     6717     129       3        0             0 ibus-x11
[  683.901829] [ 1465]    42  1465   109804     2173      59       3        0             0 xdg-permission-
[  683.901830] [ 1473]     0  1473   196027    18444     150       3        0             0 packagekitd
[  683.901831] [ 1477]    42  1477   302617     9107     200       5        0             0 gnome-settings-
[  683.901832] [ 1497]    42  1497    77184     1362      34       3        0             0 ibus-engine-sim
[  683.901833] [ 1538]   993  1538   103715     2539      54       3        0             0 colord
[  683.901834] [ 1583]     0  1583    98709     2395      77       3        0             0 gdm-session-wor
[  683.901835] [ 1592]     0  1592    21780     4717      46       3        0             0 dhclient
[  683.901836] [ 1641]     0  1641    84698     2398      50       4        0             0 nm-openvpn-serv
[  683.901837] [ 1645]   988  1645    17970     1854      40       3        0             0 openvpn
[  683.901838] [ 1648]  1000  1648    16517     1756      34       4        0             0 systemd
[  683.901839] [ 1656]  1000  1656    24794      801      47       3        0             0 (sd-pam)
[  683.901840] [ 1666]  1000  1666   118117     2041      44       3        0             0 gnome-keyring-d
[  683.901841] [ 1669]  1000  1669   112470     2807      97       3        0             0 gdm-wayland-ses
[  683.901843] [ 1671]  1000  1671    14340     1294      30       3        0             0 dbus-daemon
[  683.901844] [ 1674]  1000  1674   172877     3440     114       4        0             0 gnome-session-b
[  683.901845] [ 1686]  1000  1686    98759     1747      42       3        0             0 gvfsd
[  683.901846] [ 1691]  1000  1691   104451     1327      37       3        0             0 gvfsd-fuse
[  683.901847] [ 1714]  1000  1714   447111    39711     335       5        0             0 gnome-shell
[  683.901848] [ 1729]  1000  1729    62884    11611     111       3        0             0 Xwayland
[  683.901849] [ 1735]  1000  1735    86177     1474      36       4        0             0 at-spi-bus-laun
[  683.901850] [ 1740]  1000  1740    14106     1108      29       3        0             0 dbus-daemon
[  683.901851] [ 1743]  1000  1743    55841     1516      46       3        0             0 at-spi2-registr
[  683.901852] [ 1749]  1000  1749   173283     3029     102       4        0             0 pulseaudio
[  683.901853] [ 1765]  1000  1765   221322     8359     193       4        0             0 gnome-shell-cal
[  683.901854] [ 1766]  1000  1766   115001     2100      41       4        0             0 ibus-daemon
[  683.901855] [ 1770]  1000  1770    95645     1397      37       3        0             0 ibus-dconf
[  683.901855] [ 1772]  1000  1772   126126     6529     126       4        0             0 ibus-x11
[  683.901856] [ 1782]  1000  1782   229523     9384     213       4        0             0 evolution-sourc
[  683.901857] [ 1785]  1000  1785   109804     2197      62       4        0             0 xdg-permission-
[  683.901858] [ 1794]  1000  1794   105580     2119      55       3        0             0 gvfs-udisks2-vo
[  683.901859] [ 1801]     0  1801    97029     2083      54       3        0             0 udisksd
[  683.901860] [ 1808]  1000  1808   222729     7688     172       4        0             0 goa-daemon
[  683.901861] [ 1811]  1000  1811    98629     1391      40       3        0             0 gvfs-gphoto2-vo
[  683.901862] [ 1817]  1000  1817    94552     1372      35       3        0             0 gvfs-goa-volume
[  683.901864] [ 1827]  1000  1827   136872     2734     108       3        0             0 goa-identity-se
[  683.901865] [ 1829]  1000  1829    96354     1300      38       3        0             0 gvfs-mtp-volume
[  683.901866] [ 1836]  1000  1836   122361     2095      53       4        0             0 gvfs-afc-volume
[  683.901867] [ 1848]  1000  1848   322422    10974     240       5        0             0 gnome-settings-
[  683.901868] [ 1861]  1000  1861   269205    10731     229       4        0             0 evolution-calen
[  683.901870] [ 1889]  1000  1889    77225     1429      35       3        0             0 ibus-engine-sim
[  683.901871] [ 1981]  1000  1981   290746     9893     206       4        0             0 evolution-calen
[  683.901872] [ 1993]  1000  1993   163552     5140      89       4        0             0 tracker-miner-f
[  683.901873] [ 1995]  1000  1995   158399     4233      78       4        0             0 tracker-miner-a
[  683.901874] [ 1999]  1000  1999   185785     4846      96       4        0             0 tracker-extract
[  683.901875] [ 2010]  1000  2010   140701     4027      79       4        0             0 tracker-miner-u
[  683.901876] [ 2011]     0  2011    52365     1984      53       3        0             0 cupsd
[  683.901878] [ 2017]  1000  2017    46916     1268      27       4        0             0 dconf-service
[  683.901879] [ 2021]  1000  2021   265817     9429     221       4        0             0 evolution-addre
[  683.901880] [ 2023]  1000  2023   272624     9896     198       4        0             0 evolution-calen
[  683.901881] [ 2027]  1000  2027   133874     3898      60       4        0             0 tracker-store
[  683.901882] [ 2043]  1000  2043   160491     3462     122       4        0             0 gsd-printer
[  683.901883] [ 2051]  1000  2051   168185     6168     137       3        0             0 abrt-applet
[  683.901884] [ 2054]  1000  2054    83177     4194     112       3        0             0 seapplet
[  683.901885] [ 2071]  1000  2071   308067     9309     203       4        0             0 evolution-addre
[  683.901886] [ 2127]  1000  2127   657040    48104     313       6        0             0 insync
[  683.901887] [ 2133]     0  2133    85913     2519      64       4        0          -900 abrt-dbus
[  683.901888] [ 2163]  1000  2163    45120     1422      39       3        0             0 gconfd-2
[  683.901890] [ 2200]  1000  2200    77445     1596      35       4        0             0 gvfsd-metadata
[  683.901891] [ 2215]  1000  2215   203657    11515     145       4        0             0 gnome-terminal-
[  683.901892] [ 2235]  1000  2235   117765     1710      47       3        0             0 gvfsd-trash
[  683.901893] [ 2309]  1000  2309    30789     1208      15       3        0             0 bash
[  683.901895] [ 2397]  1000  2397   170605    13763     148       4        0             0 gnome-system-mo
[  683.901896] [ 3036]  1000  3036 12251152  1306699   23712      50        0             0 SyrenProcessor_
[  683.901897] [ 3042]  1000  3042    30763     1084      14       3        0             0 bash
[  683.901899] [ 3122]  1000  3122    30267      247      12       3        0             0 dmesg
[  683.901900] Out of memory: Kill process 3036 (SyrenProcessor_) score 329 or sacrifice child
[  683.901928] Killed process 3036 (SyrenProcessor_) total-vm:49004608kB, anon-rss:1368kB, file-rss:5226416kB, shmem-rss:0kB

As you can see, my application holds about 50 GB in memory maps.
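The 50 GB figure can be cross-checked against the process table in the log, where total_vm and rss are given in pages. The sketch below assumes 4 kB pages, which is consistent with the Mem-Info block (active_file:1284097 pages is reported as 5136388kB); the helper name pages_to_kb is made up for illustration.

```python
# Convert the page counts from the OOM process table into kB.
# PAGE_KB = 4 is an assumption (the usual x86_64 base page size),
# consistent with the Mem-Info figures in the log above.
PAGE_KB = 4


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Return the size in kB of `pages` pages of `page_kb` kB each."""
    return pages * page_kb


# Values copied from the SyrenProcessor_ row (pid 3036).
total_vm_kb = pages_to_kb(12251152)
rss_kb = pages_to_kb(1306699)

print(total_vm_kb)  # 49004608, matches total-vm:49004608kB in the kill line
print(rss_kb)       # 5226796
print(round(total_vm_kb * 1024 / 1e9, 1))  # 50.2 (decimal GB)
```

The rss figure lands close to, but not exactly on, the anon-rss plus file-rss values in the kill line, presumably because the two were sampled at slightly different moments.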

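My assumption is that when my application accesses data in its virtual address space, Linux works out which page the address falls in and, if that page is not in RAM, raises a page fault to fetch it from disk. Under that assumption, only as many pages as there are threads accessing the memory maps need to be in RAM at once. I know that in practice Linux caches as many pages as it can for performance, but it should evict less active pages as needed, so they should not be a problem. That is why I assumed OOM errors could not occur when using memory maps (provided you have the disk space and stay within the 64-bit address range). Clearly that is not the case, so can someone correct my assumption?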
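To make the access pattern concrete, here is a minimal sketch of walking a shared file mapping chunk by chunk, flushing each chunk's dirty pages and then hinting to the kernel that the chunk is no longer needed. It is not the simulator's code: process_in_chunks, the 1 MiB chunk size, and the "add one to every byte" workload are all assumptions for illustration. Whether such hints would have prevented the OOM above has not been verified.

```python
import mmap
import os

CHUNK = 1 << 20  # 1 MiB; assumed, must be a multiple of mmap.PAGESIZE


def process_in_chunks(path, chunk=CHUNK):
    """Increment every byte of `path` through a shared mapping, one chunk at a time.

    Returns the file size in bytes. The file must not be empty.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as mm:
        for start in range(0, size, chunk):
            end = min(start + chunk, size)
            # Touching the slice faults the pages in on demand.
            block = mm[start:end]
            mm[start:end] = bytes((b + 1) % 256 for b in block)
            # Write this chunk's dirty pages back to the file (msync).
            mm.flush(start, end - start)
            # Hint that the chunk can be dropped from this process's
            # resident set; madvise needs Python 3.8+ and OS support.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_DONTNEED"):
                mm.madvise(mmap.MADV_DONTNEED, start, end - start)
    return size
```

The point of the design is that dirty, mapped pages are released in bounded steps instead of accumulating until the kernel has to reclaim them under pressure. With several threads, each thread would apply the same pattern to its own range.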

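In the last line, anon-rss:1368kB seems reasonable, since my application uses very little RAM for its own functionality, and file-rss:5226416kB shows that it was trying to keep 5.2 GB of memory-mapped data in RAM, and that much memory was available anyway.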

So why was the OOM killer triggered?

Answer 1

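Linux allows memory to be overcommitted because most programs request more memory than they actually need. It only deals with the consequences later: when it finds that memory really has been overcommitted, it starts the OOM killer.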

The sysctl vm.overcommit_memory may help: setting it to 2 makes the operating system return an error when you ask for something it cannot handle, rather than hoping for the best.
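Before changing the setting it can be useful to see what it currently is. The sketch below reads the value from /proc/sys/vm/overcommit_memory, the procfs view of that sysctl. The function name and the wording of the mode descriptions are illustrative. It only inspects the setting and does not show that overcommit is what triggered the OOM killer here.

```python
from pathlib import Path

# The three modes documented for vm.overcommit_memory.
MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return (value, description) for the overcommit mode stored at `path`."""
    value = int(Path(path).read_text().strip())
    return value, MODES.get(value, "unknown")


if __name__ == "__main__":
    # Only meaningful on Linux, where the procfs file exists.
    print(read_overcommit_mode())
```

Changing the value needs root, for example `sysctl vm.overcommit_memory=2` as the answer suggests.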

Having a swap partition or swap file large enough to hold the data might make things work as expected.

Alternatively, the problem could be that the files themselves are too large, in which case mmap2() might be a better choice.