On a server, the Linux kernel reports the RAM layout at boot. It shows that of the 512 GiB of physical RAM (536409480 KiB), only about 503 GiB (527942676 KiB) is available.
root@ada:~# dmesg | grep Memory:
[ 5.891484] Memory: 527942676K/536409480K available (10252K kernel code, 1241K rwdata, 3320K rodata, 1592K init, 2272K bss, 8466804K reserved, 0K cma-reserved)
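As a quick sanity check, the gap between the two figures in that line can be compared with the "reserved" figure the kernel prints. The following is only a sketch: the helper name memory_gap_kib is made up for illustration, and it assumes the "Memory:" line has the same layout as the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: given a kernel "Memory:" line, print
# "<total - available> <reserved>", both in KiB.
memory_gap_kib() {
    printf '%s\n' "$1" |
        # Pull out the available, total and reserved numbers.
        sed -n 's/.*Memory: \([0-9]*\)K\/\([0-9]*\)K available.* \([0-9]*\)K reserved.*/\1 \2 \3/p' |
        while read -r avail total reserved; do
            echo "$(( total - avail )) $reserved"
        done
}

# Usage on a live system (reading the kernel log may require root):
#   memory_gap_kib "$(dmesg | grep 'Memory:')"
```

For the line above, both numbers come out as 8466804, so the "missing" memory is exactly what the kernel accounts for as reserved.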
Some memory is expected to be unavailable because of the I/O regions reserved by the BIOS.
root@ada:~# dmesg | grep reserved
[ 0.000000] BIOS-e820: [mem 0x000000000009c000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000004f66f000-0x0000000057677fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000006cdcf000-0x000000006efcefff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000070000000-0x000000008fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000107f380000-0x000000107fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000207ff80000-0x000000207fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000307ff80000-0x000000307fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000407ff80000-0x000000407fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000507ff80000-0x000000507fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000607ff80000-0x000000607fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000707ff80000-0x000000707fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000807ff80000-0x000000807fffffff] reserved
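However, I estimate that these add up to no more than 100 MiB.
How can I investigate what causes this memory to be reserved? What is the reason?
Without knowing what the reserved memory is used for, it simply looks as if ~9 GiB of memory is lost. Since the system serves as a virtualization host, this "loss" is compounded, because each virtualized guest in turn "reserves" a similar share of its own dedicated RAM.
Because an answer to another question suggested that such memory might be reserved as "shared memory" for the graphics card, I checked, but the adapter in this machine appears to use only about 50 MiB at most.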
root@ada:~# lspci | grep -i vga
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
root@ada:~# lspci -s 03:00.0 -vvv
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04) (prog-if 00 [VGA controller])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (4000ns min, 8000ns max)
Interrupt: pin A routed to IRQ 243
NUMA node: 0
Region 0: Memory at eb000000 (32-bit, prefetchable) [size=16M]
Region 1: Memory at f9808000 (32-bit, non-prefetchable) [size=16K]
Region 2: Memory at f9000000 (32-bit, non-prefetchable) [size=8M]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [dc] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: mgag200
Kernel modules: mgag200
Update
Here is the full dmesg output.
Update: it is a Dell PowerEdge server, booted in BIOS mode (not UEFI).
Update
Here is the output of /proc/iomem.
Update
Here is the processed output of /proc/iomem:
root@ada:~# cat /proc/iomem | tr [a-z] [A-Z] | while IFS='-: ' read AD1 AD2 REST;
> do echo "$(( $(echo "obase=10; ibase=16; ( $AD2 - $AD1 ) " | bc) >> 20))MB for $REST" ;
> done | sort -h
[...]
14MB for ACPI NON-VOLATILE STORAGE
15MB for 0000:03:00.0
15MB for MGADRMFB_VRAM
15MB for PCI BUS 0000:02
15MB for PCI BUS 0000:03
33MB for RESERVED
128MB for RESERVED
207MB for PCI BUS 0000:20
207MB for PCI BUS 0000:40
207MB for PCI BUS 0000:60
207MB for PCI BUS 0000:80
207MB for PCI BUS 0000:A0
207MB for PCI BUS 0000:C0
207MB for PCI BUS 0000:E0
255MB for PCI MMCONFIG 0000 [BUS 00-FF]
255MB for PNP 00:00
315MB for PCI BUS 0000:00
343MB for SYSTEM RAM
511MB for RESERVED
543MB for RESERVED
1269MB for SYSTEM RAM
63475MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
65535MB for SYSTEM RAM
915967MB for PCI BUS 0000:00
915967MB for PCI BUS 0000:20
915967MB for PCI BUS 0000:40
915967MB for PCI BUS 0000:60
915967MB for PCI BUS 0000:80
915967MB for PCI BUS 0000:A0
915967MB for PCI BUS 0000:C0
915967MB for PCI BUS 0000:E0
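Note that the one-liner above computes AD2 - AD1 although the ranges in /proc/iomem are inclusive, which is why a 256 MiB window shows up as 255MB and a 64 GiB block as 65535MB. A minimal sketch that adds the missing byte and sums only the top-level "System RAM" entries could look like this. The function name system_ram_kib is hypothetical, and the sketch assumes the usual "start-end : name" layout with nested entries indented by spaces.

```shell
#!/bin/sh
# Hypothetical helper: sum the top-level "System RAM" ranges of an
# iomem-style file and print the total in KiB.
# Defaults to /proc/iomem, which shows real addresses only to root.
system_ram_kib() {
    local file="${1:-/proc/iomem}" line range name start end total=0
    while IFS= read -r line; do
        # Nested entries are indented; only count top-level ranges.
        case $line in " "*) continue ;; esac
        range=${line%% : *}
        name=${line#* : }
        [ "$name" = "System RAM" ] || continue
        start=${range%-*}
        end=${range#*-}
        # Ranges are inclusive, so the size is end - start + 1.
        total=$(( total + 0x$end - 0x$start + 1 ))
    done < "$file"
    echo $(( total / 1024 ))
}

# Usage (as root): system_ram_kib
```

The result can then be compared with the total in the kernel's "Memory:" line.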
The dmidecode output on this system is as follows (in the hope that it sheds some light on the RAM availability):
Getting SMBIOS data from sysfs.
SMBIOS 3.2 present.
81 structures occupying 6778 bytes.
Table at 0x6E8AD000.
Handle 0xDA00, DMI type 218, 11 bytes
OEM-specific Type
Header and Data:
DA 0B 00 DA B2 00 17 20 0E 10 03
Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
Vendor: Dell Inc.
Version: 1.14.3
Release Date: 07/17/2020
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 0 MB
Characteristics:
ISA is supported
PCI is supported
PNP is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
EDD is supported
Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
5.25"/360 kB floppy services are supported (int 13h)
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
CGA/mono video services are supported (int 10h)
ACPI is supported
USB legacy is supported
BIOS boot specification is supported
Function key-initiated network boot is supported
Targeted content distribution is supported
UEFI is supported
BIOS Revision: 1.14
Handle 0x0100, DMI type 1, 27 bytes
System Information
Manufacturer: Dell Inc.
Product Name: PowerEdge R7425
Version: Not Specified
Serial Number: XXXXXX
UUID: XXXXXXXX-XXXX-4a10-8048-c3c04f593533
Wake-up Type: Power Switch
SKU Number: SKU=NotProvided;ModelName=PowerEdge R7425
Family: PowerEdge
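The full dmidecode output, which may contain information about the mysteriously unavailable ~9 GiB of RAM, can be viewed here: https://pastebin.com/nHYyuH7h
Answer 1
I think I know the answer (though I am not 100% sure).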
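Linux keeps a struct page structure for every page of memory. The size of this structure depends on the architecture: on 32-bit x86 Linux it is 40 bytes, and on 64-bit I believe it is 64 bytes because of the larger pointers.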
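You can find it here: https://elixir.bootlin.com/linux/v6.1/source/include/linux/mm_types.h
These structures can be organized in different ways (see https://lwn.net/Articles/789304/), but in the end every page of memory in the system gets one such structure.
This means the kernel has to set aside 64 bytes for every 4096 bytes (the page size). For 512G it will therefore reserve ((512×1024³)/4096)×64 = 8589934592 bytes, or 8388608K, i.e. 8G.
Add the code, ro/rwdata, init and bss sections and you get 8407285K, which is very close to the 8466804K we are trying to explain here.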
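The arithmetic can be reproduced with a few lines of shell. This is only a sketch of the estimate: the 64-byte structure size and 4096-byte page size are the assumptions stated above rather than values read from the running kernel, and the function name struct_page_overhead_kib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: estimated struct page overhead in KiB for a given
# amount of RAM in GiB, assuming one struct page per page of memory.
# Defaults (assumptions from the text): 4096-byte pages, 64-byte struct page.
struct_page_overhead_kib() {
    local ram_gib="$1" page_size="${2:-4096}" struct_size="${3:-64}"
    local pages=$(( ram_gib * 1024 * 1024 * 1024 / page_size ))
    echo $(( pages * struct_size / 1024 ))
}

memmap_kib=$(struct_page_overhead_kib 512)
# Kernel sections from the dmesg "Memory:" line, in KiB:
# code + rwdata + rodata + init + bss
sections_kib=$(( 10252 + 1241 + 3320 + 1592 + 2272 ))

echo "struct page array: ${memmap_kib}K"
echo "kernel sections:   ${sections_kib}K"
echo "estimated total:   $(( memmap_kib + sections_kib ))K (kernel reports 8466804K reserved)"
```

Calling struct_page_overhead_kib 256 gives half the first figure, which is what the estimate predicts for a machine limited to 256G.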
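You can also boot the server with a kernel parameter such as mem=256G. This limits the usable memory to 256G, and you will see the reserved size shrink by almost a factor of two, which also supports the theory above. Unfortunately, given how tightly the information in that structure is packed and how essential it is, I don't think there is any way to make this memory available for other uses.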