cuda-samples error when building with Nvidia-Docker?

I'm using Ubuntu 20 and CUDA 11.1.

I followed the steps here: https://developer.nvidia.com/blog/nvidia-docker-gpu-server-application-deployment-made-easy/

When I get to this step:

sudo nvidia-docker build -t device-query .

it says:

unable to locate package cuda-samples
the command '/bin/sh -c apt-get update && apt-get install -y --no-install-recommends
cuda-samples:-$CUDA_PKG_VERSION && -rm -rf /var/lib/apt/lists/*'
returned a non-zero code: 100

Dockerfile

# FROM defines the base image
FROM nvidia/cuda

# RUN executes a shell command
# You can chain multiple commands together with && 
# A \ is used to split long lines to help with readability
# This particular instruction installs the source files 
# for deviceQuery by installing the CUDA samples via apt
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-samples-$CUDA_PKG_VERSION && \
    rm -rf /var/lib/apt/lists/*

# set the working directory 
WORKDIR /usr/local/cuda-11.1/samples/1_Utilities/deviceQuery

RUN make

# CMD defines the default command to be run in the container 
# CMD is overridden by supplying a command + arguments to 
# `docker run`, e.g. `nvcc --version` or `bash`
CMD ./deviceQuery

Can anyone help?

Answer 1

Yes, the NVIDIA tutorial isn't the best. You have to change the Dockerfile like this (see man diff):

2c2
< FROM nvidia/cuda
---
> FROM nvidia/cuda:11.1-devel-ubuntu20.04
9,11c9,11
< RUN apt-get update && apt-get install -y --no-install-recommends \
<         cuda-samples-$CUDA_PKG_VERSION && \
<     rm -rf /var/lib/apt/lists/*
---
> RUN apt-get update && \
>         apt-get install -y --no-install-recommends cuda-samples-11.1 && \
>         rm -rf /var/lib/apt/lists/*
14c14
< WORKDIR /usr/local/cuda-11.1/samples/1_Utilities/deviceQuery
---
> WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery
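
For reference, applying the diff above to the question's Dockerfile should give the file below. This is only a sketch: the image tag and the package name are copied verbatim from the diff and were not re-checked against Docker Hub or the CUDA apt repository, so they may need adjusting for other CUDA versions. The shell snippet simply writes the patched file into the current directory.

```shell
#!/bin/sh
# Write the patched Dockerfile (question's file + the answer's diff).
# WARNING: this overwrites any Dockerfile in the current directory.
# The quoted 'EOF' delimiter keeps backslashes and backticks literal.
cat > Dockerfile <<'EOF'
# FROM defines the base image
FROM nvidia/cuda:11.1-devel-ubuntu20.04

# RUN executes a shell command
# You can chain multiple commands together with &&
# A \ is used to split long lines to help with readability
# This particular instruction installs the source files
# for deviceQuery by installing the CUDA samples via apt
RUN apt-get update && \
        apt-get install -y --no-install-recommends cuda-samples-11.1 && \
        rm -rf /var/lib/apt/lists/*

# set the working directory
WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery

RUN make

# CMD defines the default command to be run in the container
# CMD is overridden by supplying a command + arguments to
# `docker run`, e.g. `nvcc --version` or `bash`
CMD ./deviceQuery
EOF
```

Run it in an otherwise empty build directory, since it replaces whatever Dockerfile is already there.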

Then rebuild the image (same build command as before) and start the container:

nvidia-docker container run --rm -ti --gpus all device-query
