Why does this bus error occur when reading a shared memory object?

Versions: Ubuntu 22.04.3; GCC 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

Hello,

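I am trying to write a program that uses a very large shared memory object. It runs on an AWS EC2 instance with 130GB of RAM. When creating shared memory objects of up to 120GB, shm_open(), ftruncate(), and mmap() report no errors. However, when reading sequentially through every memory location in the shared memory object, a bus error occurs. I have written a small test program, included with this post, that reproduces the problem every time.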

Note that shmmax = 18446744073692774399, shmall = 18446744073692774399, and shmmni = 8092.
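One thing that may be worth checking alongside these values: shmmax, shmall and shmmni are System V IPC limits, whereas on Linux an object created with shm_open() is normally a file on the tmpfs mounted at /dev/shm, and ftruncate() only sets its size without reserving pages. Assuming that is the setup here (the post does not say), a page that cannot be allocated on first touch is typically reported as SIGBUS, so comparing the requested size with what the filesystem can hold is a cheap diagnostic. The sketch below uses the real statvfs() call; the helper names and the /dev/shm path are assumptions for illustration.

```cpp
// Minimal diagnostic sketch; helper names are hypothetical.
#include <sys/statvfs.h>
#include <cassert>
#include <cstdint>
#include <optional>

// Total capacity in bytes of the filesystem containing `path`,
// or std::nullopt if statvfs() fails.
std::optional<std::uint64_t> filesystem_capacity(const char* path) {
    assert(path != nullptr);
    struct statvfs vfs {};
    if (statvfs(path, &vfs) != 0) {
        return std::nullopt;
    }
    return static_cast<std::uint64_t>(vfs.f_frsize) * vfs.f_blocks;
}

// Bytes currently available to an unprivileged process on that filesystem,
// or std::nullopt if statvfs() fails.
std::optional<std::uint64_t> filesystem_available(const char* path) {
    assert(path != nullptr);
    struct statvfs vfs {};
    if (statvfs(path, &vfs) != 0) {
        return std::nullopt;
    }
    return static_cast<std::uint64_t>(vfs.f_frsize) * vfs.f_bavail;
}

// True if `wanted` bytes could be fully backed by the filesystem at `path`
// right now. A false result includes the case where the query itself failed.
bool could_back_fully(const char* path, std::uint64_t wanted) {
    const auto avail = filesystem_available(path);
    return avail.has_value() && *avail >= wanted;
}
```

Calling could_back_fully("/dev/shm", 80000000000) before ftruncate() would show whether the 80GB object can actually be populated. If it returns false, the mapping can still succeed and then fault partway through the read loop.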

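Reading upward from the bottom of the shared memory object, the bus error occurs at 66,936,954,880. Reading downward from the top (79,999,999,999), the bus error occurs after 13,063,041,023 reads, which puts it at 66,936,958,976 from the bottom. So there is a one-page (4096-byte) gap between the two locations where the bus error occurs.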
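These figures can be cross-checked at compile time. The sketch below is illustrative only: the constant and function names are made up here, the 4096-byte page size is the value quoted above, and 13,063,041,023 is treated as the loop index at the fault in the descending run, which is how the gdb output in this post reports it.

```cpp
// Compile-time sanity check of the quoted figures; all names are illustrative.
#include <cstdint>

constexpr std::uint64_t kPageSize   = 4096;            // page size quoted in the post
constexpr std::uint64_t kObjectSize = 80'000'000'000;  // mem_amt in the test program
constexpr std::uint64_t kUpFault    = 66'936'954'880;  // faulting index, ascending run
constexpr std::uint64_t kDownFault  = 13'063'041'023;  // faulting index, descending run (assumed from gdb)

// Bytes read successfully before the fault when counting up from index 0.
constexpr std::uint64_t bytes_read_ascending(std::uint64_t fault) {
    return fault;
}

// Bytes read successfully before the fault when counting down from size - 1.
constexpr std::uint64_t bytes_read_descending(std::uint64_t size, std::uint64_t fault) {
    return (size - 1) - fault;
}

// Number of whole pages covered by `bytes`.
constexpr std::uint64_t full_pages(std::uint64_t bytes) {
    return bytes / kPageSize;
}

// The ascending fault lands on the first byte of a page.
static_assert(kUpFault % kPageSize == 0, "ascending fault is page-aligned");

// The descending fault lands on the last byte of a page.
static_assert((kDownFault + 1) % kPageSize == 0, "descending fault ends a page");

// The two runs differ by exactly one page of successfully read data.
static_assert(bytes_read_descending(kObjectSize, kDownFault)
                  - bytes_read_ascending(kUpFault) == kPageSize,
              "runs differ by one page");
```

Read this way, the ascending run gets through 16,342,030 pages and the descending run through 16,342,031 before faulting, so both stop after touching nearly the same amount of memory rather than at one fixed address.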

Any idea what is going on?

Thanks,

Gene

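Here is a very simple C/C++ test program that demonstrates the problem. The shared memory object is simply hard-coded to 80GB. Swap the commented-out for line to walk through the shared memory object in increasing or decreasing order.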

// g++ -std=c++20 -O3 test2.cpp -W -Wall -Wextra -pedantic -pthread -o test2

#include <cstdint>   // uint_fast64_t
#include <cstdlib>   // exit, EXIT_FAILURE
#include <iostream>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

int main() {

    uint_fast64_t mem_amt = 80000000000;
    std::cout << "mem_amt = " << mem_amt << "\n";

    int fd;
    std::string shmpath = "/foo";

    // Remove any existing shared memory object
    shm_unlink(shmpath.c_str());
    // Create the shared memory object with read-write access.
    fd = shm_open(shmpath.c_str(), O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);

    if (fd == -1) {
        std::cerr << "\nshm_open shmbuf failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }

    // Truncate (set) the size.
    if (ftruncate64(fd, mem_amt) == -1) {
        std::cerr << "\nftruncate shmbuf failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }

    // Map the shared memory object.
    char* pool = (char*)mmap(NULL, mem_amt, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pool == MAP_FAILED) {
        std::cerr << "\nmmap pool failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }

    std::cout << "pool = " << (uint_fast64_t)pool << "\n";

    char temp;
    for (uint_fast64_t i=0; i<mem_amt; i++) {
//    for (uint_fast64_t i=mem_amt-1; i>0; i--) {
        temp = pool[i];
        if (i % 5000000000 == 0) {
            std::cout << "i = " << i << "\n";
        }
    }
    std::cout << "temp = " << temp << "\n";
}

gdb output for the core files from the ascending and descending runs, respectively:

Core was generated by `./test2'.
Program terminated with signal SIGBUS, Bus error.
#0  0x00005570b7fd1373 in main () at test2.cpp:47
47          temp = pool[i];
(gdb) bt full
#0  0x00005570b7fd1373 in main () at test2.cpp:47
        i = 66936954880
        mem_amt = 80000000000
        fd = <optimized out>
        shmpath = "/foo"
        pool = 0x7fa09da0e000 ""
        temp = <optimized out>
(gdb)

Core was generated by `./test2'.
Program terminated with signal SIGBUS, Bus error.
#0  0x000055e242fdc379 in main () at test2.cpp:47
47          temp = pool[i];
(gdb) bt full
#0  0x000055e242fdc379 in main () at test2.cpp:47
        i = 13063041023
        mem_amt = 80000000000
        fd = <optimized out>
        shmpath = "/foo"
        pool = 0x7f7366a0e000 ""
        temp = <optimized out>
(gdb)
