Starting with CUDA 10.1, users need sudo privileges to collect advanced metrics with the CUDA profiling tools (for example nvprof or Nsight Compute's ncu).
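An alternative way around this is described here:
The link above mentions that CAP_SYS_ADMIN can be used to enable collection of these metrics.
To understand the issue, I found this insightful Stack Overflow answer: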
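Correct me if I'm wrong, but to go down the CAP_SYS_ADMIN route I would have to enable the capability for both the application and the user (if it is a non-root user).
I'm not familiar with Linux capabilities, and I'm not sure whether it is better to grant CAP_SYS_ADMIN to the user/application or simply to give the user sudo access. Why would one be better than the other?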
Edit: As of now, I still can't get this to work.
# First I executed
$ sudo setcap cap_sys_admin+ep /usr/local/cuda/bin/nvprof
# This is the command that I am executing after installing the CUDA toolkit 10.2.
$ /usr/local/cuda/bin/nvprof -o output-detailed.nvvp -f --analysis-metrics /usr/local/cuda/extras/demo_suite/vectorAdd
[Vector addition of 50000 elements]
==142443== NVPROF is profiling process 142443, command: /usr/local/cuda/extras/demo_suite/vectorAdd
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
==142443== Some kernel(s) will be replayed on device 0 in order to collect all events/metrics.
==142443== Warning: ERR_NVGPUCTRPERM - The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/ERR_NVGPUCTRPERM
Failed to launch vectorAdd kernel (error code unknown error)!
==142443== Warning: ERR_NVGPUCTRPERM - The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/ERR_NVGPUCTRPERM
==142443== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.
==142443== Generated result file: /results/nvprof/output-detailed.nvvp
It only works when I run it with sudo.
$ sudo /usr/local/cuda/bin/nvprof -o output-detailed.nvvp -f --analysis-metrics /usr/local/cuda/extras/demo_suite/vectorAdd
[Vector addition of 50000 elements]
==142687== NVPROF is profiling process 142687, command: /usr/local/cuda/extras/demo_suite/vectorAdd
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
==142687== Some kernel(s) will be replayed on device 0 in order to collect all events/metrics.
Replaying kernel "vectorAdd(float const *, float const *, float*, int)" (done)
Copy output data from the CUDA device to the host memory
Test PASSED
Done
==142687== Generated result file: /home/agostini/Development/nvprof/output-detailed.nvvp
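Why is it not enough to grant the capability to the executable and have a user from the sudo group run the application without sudo? Is a PAM configuration really required, even for users in the sudo group?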
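One way to investigate this is to check which capabilities a process actually holds at run time, rather than only what was set on the file. The sketch below is a hedged illustration, not a diagnosis: it reads the CapEff mask from /proc/<pid>/status and tests bit 21, which is the number Linux assigns to CAP_SYS_ADMIN. The helper names mask_has_sys_admin and effective_caps are made up for this example.

```shell
#!/usr/bin/env bash
# CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>.
CAP_SYS_ADMIN_BIT=21

# Print "yes" if the hex capability mask (as shown on the CapEff line of
# /proc/<pid>/status) has the CAP_SYS_ADMIN bit set, "no" if it does not,
# and "unknown" if no mask was supplied.
# (Hypothetical helper name, used only for this illustration.)
mask_has_sys_admin() {
    local mask="$1"
    if [ -z "$mask" ]; then
        echo "unknown"
        return 0
    fi
    if (( (0x$mask >> CAP_SYS_ADMIN_BIT) & 1 )); then
        echo "yes"
    else
        echo "no"
    fi
}

# Print the effective capability mask of a process (default: this shell).
# (Hypothetical helper name, used only for this illustration.)
effective_caps() {
    local pid="${1:-$$}"
    awk '$1 == "CapEff:" { print $2 }' "/proc/$pid/status" 2>/dev/null
}

# Report whether the current shell itself holds CAP_SYS_ADMIN.
mask_has_sys_admin "$(effective_caps)"
```

Passing the PID of the running profiler or of the profiled application to effective_caps shows what each process really has. Running getcap /usr/local/cuda/bin/nvprof confirms whether the file capability was stored at all. Note that the profiled program (vectorAdd here) is a separate executable, and on Linux a file capability set on one binary is not automatically carried over to other programs it executes; whether that is what happens in this case is an assumption worth checking, not something the output above proves.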
Answer 1
I don't know exactly how CAP_SYS_ADMIN works, but the following is probably easier: it makes the profiler usable by non-root users.
echo 'options nvidia "NVreg_RestrictProfilingToAdminUsers=0"' | sudo tee -a /etc/modprobe.d/nvidia.conf
sudo update-initramfs -u
sudo reboot
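After the reboot, it may be worth confirming that the driver picked up the option before rerunning nvprof. The sketch below assumes the loaded driver lists its parameters in /proc/driver/nvidia/params with a line such as "RmProfilingAdminOnly: 0"; the function name profiling_mode is made up for this example.

```shell
#!/usr/bin/env bash
# Report whether the loaded NVIDIA driver restricts profiling to admin users.
# Assumption: the driver exposes its parameters in /proc/driver/nvidia/params
# with a line of the form "RmProfilingAdminOnly: 0" or "RmProfilingAdminOnly: 1".
# (profiling_mode is a hypothetical helper name for this illustration.)
profiling_mode() {
    local params_file="${1:-/proc/driver/nvidia/params}"
    if [ ! -r "$params_file" ]; then
        # No readable params file (driver not loaded, or a different layout).
        echo "unknown"
        return 0
    fi
    local value
    value=$(awk -F': *' '$1 == "RmProfilingAdminOnly" { print $2; exit }' "$params_file")
    case "$value" in
        0) echo "unrestricted" ;;  # non-admin users may profile
        1) echo "restricted" ;;    # profiling limited to admin users
        *) echo "unknown" ;;       # parameter missing or unexpected value
    esac
}

profiling_mode
```

If this prints "restricted" after the reboot, the option was probably not applied, for example because the file in /etc/modprobe.d was not included in the initramfs or the module was loaded with different options.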