Reserved RAM: how to determine why main memory is reserved?


On a server, the Linux kernel reports the RAM setup at boot. It shows that of the physical 512GiB RAM (536409480kiB), only about 503GiB (527942676kiB) is available.

root@ada:~# dmesg | grep Memory:
[    5.891484] Memory: 527942676K/536409480K available (10252K kernel code, 1241K rwdata, 3320K rodata, 1592K init, 2272K bss, 8466804K reserved, 0K cma-reserved)
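
The size of the gap can be read straight off that line. A minimal sketch in shell, assuming the Memory: line keeps the available/total format shown above; mem_gap_kib is a helper name made up for this illustration:

```shell
# Hypothetical helper: reads a dmesg "Memory:" line on stdin and prints
# the difference between total and available memory in KiB.
mem_gap_kib() {
    sed -n 's/.*Memory: \([0-9]*\)K\/\([0-9]*\)K available.*/\1 \2/p' |
    while read -r avail total; do
        echo $(( total - avail ))
    done
}

# The line quoted above, used as sample input.
line='[    5.891484] Memory: 527942676K/536409480K available (10252K kernel code, 1241K rwdata, 3320K rodata, 1592K init, 2272K bss, 8466804K reserved, 0K cma-reserved)'
printf '%s\n' "$line" | mem_gap_kib   # prints: 8466804
```

On a live system the helper could be fed with dmesg | grep Memory: instead of the sample line. The result matches the 8466804K that the kernel itself reports as reserved.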

Some memory is expected to be unavailable, given the I/O regions reserved by the BIOS.

root@ada:~# dmesg | grep reserved
[    0.000000] BIOS-e820: [mem 0x000000000009c000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000004f66f000-0x0000000057677fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000006cdcf000-0x000000006efcefff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000070000000-0x000000008fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000107f380000-0x000000107fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000207ff80000-0x000000207fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000307ff80000-0x000000307fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000407ff80000-0x000000407fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000507ff80000-0x000000507fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000607ff80000-0x000000607fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000707ff80000-0x000000707fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000807ff80000-0x000000807fffffff] reserved

However, that amounts to no more than 100MiB in total.

How can I investigate what causes this memory to be reserved, and what is the reason?

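Without knowing what the reserved memory is used for, it looks like ~9GiB of RAM is simply lost. Since the system serves as a virtualization host, this "loss" is compounded, because each virtualized guest in turn also "reserves" a similar share of its dedicated RAM.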

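Another question suggested that such memory may be reserved as "shared memory" for a graphics card, so I checked, but the adapter present seems to use only ~50MiB at most.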

root@ada:~# lspci  | grep -i vga
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
root@ada:~# lspci -s 03:00.0 -vvv
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04) (prog-if 00 [VGA controller])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (4000ns min, 8000ns max)
        Interrupt: pin A routed to IRQ 243
        NUMA node: 0
        Region 0: Memory at eb000000 (32-bit, prefetchable) [size=16M]
        Region 1: Memory at f9808000 (32-bit, non-prefetchable) [size=16K]
        Region 2: Memory at f9000000 (32-bit, non-prefetchable) [size=8M]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: mgag200
        Kernel modules: mgag200

Update

Here is the full dmesg output

Update: it is a Dell PowerEdge server, booted in BIOS mode (not UEFI)

Update

Here is the output of /proc/iomem

Update

Here is the evaluated output of /proc/iomem

root@ada:~# cat /proc/iomem | tr [a-z] [A-Z] | while IFS='-: ' read AD1 AD2 REST;
> do echo "$(( $(echo "obase=10; ibase=16; ( $AD2 - $AD1 ) " | bc) >> 20))MB for  $REST" ; 
> done | sort -h  

[...]
14MB for  ACPI NON-VOLATILE STORAGE                                                                                                        
15MB for  0000:03:00.0
15MB for  MGADRMFB_VRAM
15MB for  PCI BUS 0000:02
15MB for  PCI BUS 0000:03
33MB for  RESERVED
128MB for  RESERVED
207MB for  PCI BUS 0000:20
207MB for  PCI BUS 0000:40
207MB for  PCI BUS 0000:60
207MB for  PCI BUS 0000:80
207MB for  PCI BUS 0000:A0
207MB for  PCI BUS 0000:C0
207MB for  PCI BUS 0000:E0
255MB for  PCI MMCONFIG 0000 [BUS 00-FF]
255MB for  PNP 00:00
315MB for  PCI BUS 0000:00
343MB for  SYSTEM RAM
511MB for  RESERVED
543MB for  RESERVED
1269MB for  SYSTEM RAM
63475MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
65535MB for  SYSTEM RAM
915967MB for  PCI BUS 0000:00
915967MB for  PCI BUS 0000:20
915967MB for  PCI BUS 0000:40
915967MB for  PCI BUS 0000:60
915967MB for  PCI BUS 0000:80
915967MB for  PCI BUS 0000:A0
915967MB for  PCI BUS 0000:C0
915967MB for  PCI BUS 0000:E0
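
Note that the pipeline above computes end minus start without adding one, so the 512MiB range 0x70000000-0x8fffffff from the BIOS-e820 map, for example, would come out as 511MB. A sketch of the same evaluation using only shell arithmetic and inclusive range sizes follows; iomem_sizes is a hypothetical helper name, and the sample line is that e820 range written in /proc/iomem notation:

```shell
# Hypothetical helper: reads /proc/iomem-style lines ("start-end : label")
# on stdin and prints the inclusive size of each range in MiB.
iomem_sizes() {
    # Default IFS strips the indentation of nested entries; the label
    # (which may contain spaces) ends up in the last variable.
    while read -r range _sep label; do
        start=${range%-*}
        end=${range#*-}
        echo "$(( (0x$end - 0x$start + 1) >> 20 )) MiB  $label"
    done
}

# Sample line: a reserved range taken from the BIOS-e820 map above.
printf '%s\n' '70000000-8fffffff : Reserved' | iomem_sizes   # prints: 512 MiB  Reserved
```

On the server itself this would be run as root, for example iomem_sizes < /proc/iomem | sort -n, since on recent kernels unprivileged users see all addresses as zero.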

The dmidecode output on the system is as follows (I include it because I expect it to be relevant to the RAM availability question):

Getting SMBIOS data from sysfs.
SMBIOS 3.2 present.
81 structures occupying 6778 bytes.
Table at 0x6E8AD000.

Handle 0xDA00, DMI type 218, 11 bytes
OEM-specific Type
        Header and Data:
                DA 0B 00 DA B2 00 17 20 0E 10 03

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: Dell Inc.
        Version: 1.14.3
        Release Date: 07/17/2020
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 0 MB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Function key-initiated network boot is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.14

Handle 0x0100, DMI type 1, 27 bytes
System Information
        Manufacturer: Dell Inc.
        Product Name: PowerEdge R7425
        Version: Not Specified
        Serial Number: XXXXXX
        UUID: XXXXXXXX-XXXX-4a10-8048-c3c04f593533
        Wake-up Type: Power Switch
        SKU Number: SKU=NotProvided;ModelName=PowerEdge R7425
        Family: PowerEdge

The full dmidecode output, which may contain information about the mysterious unavailability of about 9GiB of RAM, can be seen here: https://pastebin.com/nHYyuH7h


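Answer 1

I think I know the answer (but I am not 100% sure).

Linux keeps a struct page structure for every page of memory. The size of this structure varies by architecture: on 32-bit x86 Linux it is 40 bytes, and on 64-bit, I believe, it is 64 bytes because of the larger pointer size.

You can find it here: https://elixir.bootlin.com/linux/v6.1/source/include/linux/mm_types.h

These structures can be organized in different ways (see https://lwn.net/Articles/789304/), but in the end every page of memory in the system gets one such structure.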

This means the kernel needs to reserve 64 bytes for every 4096 bytes (the page size). For 512G that comes to ((512×1024³)/4096)×64 = 8589934592 bytes, or 8388608K, i.e. 8G.
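
The arithmetic is easy to re-run for other memory sizes. A minimal sketch, assuming 4096-byte pages and a 64-byte struct page as stated above; page_struct_overhead is a name invented for this example:

```shell
# Hypothetical helper: bytes of struct page needed to describe RAM.
# Usage: page_struct_overhead <ram_bytes> [page_size] [struct_size]
page_struct_overhead() {
    ram_bytes=$1
    page_size=${2:-4096}     # assumed page size in bytes
    struct_size=${3:-64}     # assumed sizeof(struct page) on 64-bit
    echo $(( ram_bytes / page_size * struct_size ))
}

ram=$(( 512 * 1024 * 1024 * 1024 ))          # 512G in bytes
bytes=$(page_struct_overhead "$ram")
echo "$bytes bytes = $(( bytes / 1024 ))K"   # prints: 8589934592 bytes = 8388608K
```

Under these assumptions the overhead is always 64/4096 = 1/64 of RAM, about 1.56%, regardless of the total size.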

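Add to that the code, ro/rwdata, init and bss sections (10252K + 1241K + 3320K + 1592K + 2272K = 18677K) and you get 8407285K, which is very close to the 8466804K we are trying to explain here.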
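
To see how much of the reserved figure this estimate leaves unexplained, the numbers can be added up. A sketch using only the values from the dmesg Memory: line in the question and the 8388608K estimate for struct page; the variable names are chosen for this illustration:

```shell
# All values in KiB, taken from the dmesg "Memory:" line in the question.
struct_pages=8388608    # estimate: 64 bytes per 4096-byte page of 512G
kernel_code=10252
rwdata=1241
rodata=3320
init=1592
bss=2272
reserved=8466804        # what the kernel reports as reserved

explained=$(( struct_pages + kernel_code + rwdata + rodata + init + bss ))
echo "explained:   ${explained}K"                  # prints: explained:   8407285K
echo "unexplained: $(( reserved - explained ))K"   # prints: unexplained: 59519K
```

The remainder of roughly 58MiB is under 1% of the reserved total; what accounts for it is not established here.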

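You can also boot the server with a kernel parameter such as mem=256G. This limits the usable memory to 256G, and you should see the reserved size shrink by almost a factor of 2, which would support the theory above. Unfortunately, given how tightly information is packed into that structure and how essential it is, I don't think there is a way to make this memory available for other uses.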
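
After rebooting for that experiment, it is worth confirming that the kernel actually received the parameter before comparing the Memory: lines. A small sketch; cmdline_mem_limit is a hypothetical helper name and the sample command line is made up for illustration:

```shell
# Hypothetical helper: reads a kernel command line on stdin and prints
# the value of the mem= parameter, if present.
cmdline_mem_limit() {
    # One parameter per line, then keep only the one starting with "mem=".
    tr ' ' '\n' | sed -n 's/^mem=//p'
}

# Made-up sample command line; on a real system use /proc/cmdline instead.
printf '%s\n' 'BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro mem=256G quiet' | cmdline_mem_limit   # prints: 256G
```

On the server this would be run as cmdline_mem_limit < /proc/cmdline. Empty output means no mem= limit is in effect.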
