I am using Ubuntu 20 and CUDA 11.1.
I followed the steps here: https://developer.nvidia.com/blog/nvidia-docker-gpu-server-application-deployment-made-easy/
When I get to
sudo nvidia-docker build -t device-query .
it says:
unable to locate package cuda-samples
the command '/bin/sh -c apt-get update && apt-get install -y --no-install-recommends
cuda-samples-$CUDA_PKG_VERSION && rm -rf /var/lib/apt/lists/*'
returned a non-zero code: 100
This is the Dockerfile:
# FROM defines the base image
FROM nvidia/cuda
# RUN executes a shell command
# You can chain multiple commands together with &&
# A \ is used to split long lines to help with readability
# This particular instruction installs the source files
# for deviceQuery by installing the CUDA samples via apt
RUN apt-get update && apt-get install -y --no-install-recommends \
cuda-samples-$CUDA_PKG_VERSION && \
rm -rf /var/lib/apt/lists/*
# set the working directory
WORKDIR /usr/local/cuda-11.1/samples/1_Utilities/deviceQuery
RUN make
# CMD defines the default command to be run in the container
# CMD is overridden by supplying a command + arguments to
# `docker run`, e.g. `nvcc --version` or `bash`
CMD ./deviceQuery
Can anyone help?
Answer 1
Yes, the NVIDIA tutorial is not the best. You have to change the Dockerfile as shown in this diff (see man diff for the format):
2c2
< FROM nvidia/cuda
---
> FROM nvidia/cuda:11.1-devel-ubuntu20.04
9,11c9,11
< RUN apt-get update && apt-get install -y --no-install-recommends \
< cuda-samples-$CUDA_PKG_VERSION && \
< rm -rf /var/lib/apt/lists/*
---
> RUN apt-get update && \
> apt-get install -y --no-install-recommends cuda-samples-11.1 && \
> rm -rf /var/lib/apt/lists/*
14c14
< WORKDIR /usr/local/cuda-11.1/samples/1_Utilities/deviceQuery
---
> WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery
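For reference, applying that diff to the Dockerfile from the question should give roughly the file below. This is only a sketch assembled from the two listings in this thread: the original comments are shortened, and it assumes the nvidia/cuda:11.1-devel-ubuntu20.04 tag and the cuda-samples-11.1 package are still available, which may have changed since this was written.

```dockerfile
# Pin the base image to a specific CUDA 11.1 devel tag instead of the untagged
# nvidia/cuda (tag taken from the diff; its availability is an assumption)
FROM nvidia/cuda:11.1-devel-ubuntu20.04

# Install the CUDA samples with the version written out explicitly,
# rather than relying on $CUDA_PKG_VERSION being set in the base image
RUN apt-get update && \
    apt-get install -y --no-install-recommends cuda-samples-11.1 && \
    rm -rf /var/lib/apt/lists/*

# Use the unversioned /usr/local/cuda path, as in the diff
WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery
RUN make

# Default command; can be overridden on the `docker run` command line
CMD ./deviceQuery
```

The image has to be rebuilt with the same build command from the question (sudo nvidia-docker build -t device-query .) before the change takes effect.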
Then start the container with:
nvidia-docker container run --rm -ti --gpus all device-query