perf report from perf_events is missing stack symbols despite compiling with -fno-omit-frame-pointer

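Despite reading many tutorials on the subject and doing (I think) everything necessary, I am still struggling to get perf_events to give me stack traces with symbols. Could my local perf installation (details below) be broken somehow? Anyway, here is what I did:

main.cpp is a simple C++ program that calls a few functions defined in the same file, allocates some memory and frees it, then prints something.

Compile command: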

gcc -std=c++11 -lstdc++ main.cpp -Og -fno-omit-frame-pointer -fno-inline -o arr_test

Profiling command:

perf record -a -g -- ./arr_test && perf report --stdio

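I do get the following warning about kernel symbols, but I don't think it matters, since for now I only care about the symbols in my application: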

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.052 MB perf.data (~2285 samples) ]
[kernel.kallsyms] with build id e22966849c48748782a1be4fe0ce94db6838b806 not found, continuing without symbols
[kernel.kallsyms] with build id e22966849c48748782a1be4fe0ce94db6838b806 not found, continuing without symbols
Warning:
Kernel address maps (/proc/{kallsyms,modules}) were restricted.

Check /proc/sys/kernel/kptr_restrict before running 'perf record'.

As no suitable kallsyms nor vmlinux was found, kernel samples
can't be resolved.

Samples in kernel modules can't be resolved as well.

Here is a snippet of the output:

# Overhead   Command      Shared Object
# ........  ........  .................
#
    83.27%  arr_test  arr_test         
            |          
            |--34.12%-- 0x400908
            |          0x7fe72b381ec5
            |          
            |--10.48%-- 0x400903
            |          0x7fe72b381ec5
            |          
            |--10.08%-- 0x4008b8
            |          0x7fe72b381ec5
            |          
            |--9.22%-- 0x4008e5
            |          0x7fe72b381ec5
            |          
            |--9.05%-- 0x4008da
            |          0x7fe72b381ec5
            |          
            |--8.49%-- 0x4008f0
            |          0x7fe72b381ec5
            |          
            |--6.87%-- 0x4008d5
            |          0x7fe72b381ec5
            |          
            |--6.23%-- 0x4008c2
            |          0x7fe72b381ec5
            |          
            |--4.76%-- 0x4008fd
            |          0x7fe72b381ec5
             --0.70%-- [...]

     8.02%  arr_test  [kernel.kallsyms]
            |          
            |--4.87%-- 0xffffffff81140b64
            |          0xffffffff81146646
            |          0xffffffff81182751
            |          0xffffffff811829eb
            |          0xffffffff8173317d
            |          0x7fe72bab86a7
            |          0x7fe72baa7e00

File information (shows "not stripped"):

$ file arr_test 
arr_test: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

Details of my perf installation (could these warnings be what prevents me from seeing symbols in the stack?):

Auto-detecting system features:
...                     backtrace: [ on  ]
...                         dwarf: [ OFF ]
...                fortify-source: [ on  ]
...                         glibc: [ on  ]
...                          gtk2: [ on  ]
...                  gtk2-infobar: [ on  ]
...                      libaudit: [ OFF ]
...                        libbfd: [ OFF ]
...                        libelf: [ OFF ]
...             libelf-getphdrnum: [ OFF ]
...                   libelf-mmap: [ OFF ]
...                       libnuma: [ on  ]
...                       libperl: [ on  ]
...                     libpython: [ on  ]
...             libpython-version: [ on  ]
...                      libslang: [ on  ]
...                     libunwind: [ OFF ]
...                       on-exit: [ on  ]
...                stackprotector: [ on  ]
...            stackprotector-all: [ on  ]
...                       timerfd: [ on  ]

config/Makefile:264: No libelf found, disables 'probe' tool, please install elfutils-libelf-devel/libelf-dev
config/Makefile:329: No libunwind found, disabling post unwind support. Please install libunwind-dev[el] >= 1.1
config/Makefile:354: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev

How can I get perf to show my symbols?

Answer 1

I compile with more debug options:

-Og -ggdb3 -fno-omit-frame-pointer

Then, when recording, I do not use the -a option (which collects system-wide, across all processes); instead I use

perf record -e cycles -g --call-graph fp -- ./your_app your_args

Finally, to display the results I use

perf report -g graph

The output looks as expected (note: I am on Debian 9, and the perf report output is ncurses-based)

-   92.18%     0.00%  stsm     stsm                  [.] main
   - main
      - 91.77% STSM::run
         + 56.86% STSM::generateCandidates
         - 25.22% STSM::detectBlocksOfAllSolidSequences
            + 23.42% STSM::detectSolidSequenceBlocksFromSolidSequence
              0.81% Segment::unify
         + 5.25% STSM::updateKernelsOfAllCandidates
           1.80% RangedSequence::range
         + 1.45% STSM::updateMatchingPositions
           0.99% Segment::intersects
+   92.18%     0.00%  stsm     libc-2.24.so          [.] __libc_start_main
+   92.18%     0.00%  stsm     [unknown]             [k] 0x4d96258d4c544155
+   91.77%     0.00%  stsm     stsm                  [.] STSM::run
+   56.86%     6.74%  stsm     stsm                  [.] STSM::generateCandidates
+   49.99%    49.99%  stsm     stsm                  [.] Segment::intersects
+   25.22%     0.00%  stsm     stsm                  [.] STSM::detectBlocksOfAllSolidSequences
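
The steps in this answer can be chained in one small script. This is only a sketch under a few assumptions: run and profile_app are hypothetical helper names, g++ is assumed as the compiler driver, and --stdio is added so the report prints to the terminal instead of opening the ncurses view. Setting DRY_RUN=1 prints the commands without executing them.

```shell
# run CMD...: echo the command, then execute it unless DRY_RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# profile_app SRC BIN: build with frame pointers and debug info,
# record only the launched process (no -a), then print the call graph.
profile_app() {
    local src="$1" bin="$2"
    run g++ -std=c++11 -Og -ggdb3 -fno-omit-frame-pointer "$src" -o "$bin" || return
    run perf record -e cycles --call-graph fp -- "./$bin" || return
    run perf report -g graph --stdio
}
```

With the file names from the question this would be called as profile_app main.cpp arr_test, or as DRY_RUN=1 profile_app main.cpp arr_test to only see the three commands.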

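Answer 2

This is an old topic, but I am sure it still happens to hundreds of people, and since this StackExchange question still ranks high in Google results, I am sharing an answer I found on StackOverflow that worked for me: https://stackoverflow.com/questions/33137543/linux-perf-top-kernel-symbol-not-found

Basically:

Before starting perf record: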

echo 0 > /proc/sys/kernel/kptr_restrict
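
The write above has to happen as root, and a plain "sudo echo 0 > ..." will not work because the redirection is performed by the unprivileged shell. Below is a small sketch for checking what the current value means before changing it. describe_kptr_restrict and current_kptr_restrict are hypothetical helper names, not part of perf, and the descriptions paraphrase the kernel's sysctl documentation.

```shell
# describe_kptr_restrict VALUE: explain a kptr_restrict setting in words.
describe_kptr_restrict() {
    case "$1" in
        0) echo "0: least restrictive" ;;
        1) echo "1: kernel addresses hidden unless the reader has CAP_SYSLOG" ;;
        2) echo "2: kernel addresses hidden from everyone" ;;
        *) echo "unexpected value: $1" ;;
    esac
}

# current_kptr_restrict: print the live setting, or "unknown" if unreadable.
current_kptr_restrict() {
    cat /proc/sys/kernel/kptr_restrict 2>/dev/null || echo unknown
}

# Example:
#   describe_kptr_restrict "$(current_kptr_restrict)"
# To lower it until the next reboot (needs root), either of:
#   sudo sysctl -w kernel.kptr_restrict=0
#   sudo sh -c 'echo 0 > /proc/sys/kernel/kptr_restrict'
```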

You may also need to install the following development packages.

For RHEL/CentOS/Fedora/etc.:

yum install -y elfutils-libelf-devel libunwind-devel audit-libs-devel slang-devel

Or, for Debian/Ubuntu/etc.:

apt-get install libelf-dev libunwind-dev libaudit-dev libslang-dev
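
If perf was built from source, as the build messages in the question suggest, it probably needs to be rebuilt after installing these packages, since that feature detection happens at build time. As a rough check of which of these libraries a perf binary actually links against, something like the following could be used. It is a sketch only: linked_symbol_libs is a hypothetical helper, and it assumes a dynamically linked perf and ldd-style input.

```shell
# linked_symbol_libs: read ldd-style output on stdin and print the
# symbol/unwind libraries (libelf, libdw, libunwind, libbfd) found in it.
linked_symbol_libs() {
    grep -oE 'lib(elf|dw|unwind|bfd)[^ ]*' | sort -u
}

# Example (assumes perf is on PATH):
#   ldd "$(command -v perf)" | linked_symbol_libs
# No output means none of these libraries is linked in.
```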
