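I'm trying to make my Docker image smaller (it is 4GB+ because the deep learning Python packages are large).
So far I have two ideas:
- Switching from python:bullseye to python:slim-bullseye saves 300MB.
- Installing packages with --no-cache-dir so that pip does not keep a cache of the downloads.
Any other ideas?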
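As a rough sketch, those two changes would look something like this. The slim tag is assumed to be the variant matching the base image used in the Dockerfile below, and the slim image may be missing system packages that the full image provides.

```dockerfile
# Idea 1: use the slim variant of the same Python/Debian base image.
FROM python:3.7.12-slim-bullseye

# Idea 2: tell pip not to keep its download/wheel cache in the image layer.
COPY requirements.txt /
RUN pip install --no-cache-dir --default-timeout=100 -r requirements.txt
```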
Dockerfile
# Debian OS
FROM python:3.7.12-bullseye
# ========= ROOT COMMANDS =========
# `jupyter lab` won't run as root, and root is bad practice.
# So we create a regular user.
RUN apt update
RUN apt install sudo
# Create user;
RUN useradd --create-home --password RapidRigorReproduce aiqc_usr
# Make that user an admin; can't install apt-get dependencies without `sudo` prefix otherwise.
RUN usermod -aG sudo aiqc_usr
# Give that user permissions within their home directory, /var for apt-get, /usr/local for python packages.
RUN chown -R aiqc_usr /home/aiqc_usr /var /usr/local /usr/bin/dpkg /var/cache
# Switch to that user; root user's apt-get binaries are not shared w new user.
RUN su - aiqc_usr
# ========= USER COMMANDS =========
# Can't install nodejs without updating package manager.
# Only need to use the pw once when running sudo commands.
RUN echo "RapidRigorReproduce" | sudo -S apt update
RUN sudo apt upgrade -y
RUN sudo apt update
# Add the registry that contains node
RUN sudo apt -y install curl dirmngr apt-transport-https lsb-release ca-certificates
RUN curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
# Install node
RUN sudo apt -y install nodejs
# For Sphinx documentation.
RUN sudo apt -y install pandoc
RUN pip install --upgrade pip
# Developer packages
# Contains JupyterLab and I want this installed prior to plotly.
# Docker build paths can't access parent directories.
COPY requirements_dev.txt /
RUN pip install --default-timeout=100 -r requirements_dev.txt
RUN rm requirements_dev.txt
# pip packages
# Installing plotly>=5.0.0 includes the prebuilt jupyter extension.
COPY requirements.txt /
RUN pip install --default-timeout=100 -r requirements.txt
RUN rm requirements.txt
# Create a place to mount the source code so that it can be imported.
RUN mkdir /home/aiqc_usr/AIQC
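Answer 1
The answer is a two-stage (multi-stage) build. The image gets large when it includes all the tools needed for compilation (native code is often compiled during pip install without you noticing).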
https://pythonspeed.com/articles/multi-stage-docker-python/
Pseudocode
FROM big AS big-image
RUN apt-get install gcc
FROM small AS runtime-image
COPY --from=big-image /tmp .
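
A slightly more concrete version of the pseudocode above might look like the sketch below. It is not tested against the asker's requirements. The virtualenv path /opt/venv and the slim tag are assumptions chosen for illustration. The idea is to install into a single directory in the big image and copy only that directory into the small one.

```dockerfile
# Build stage: the full image already ships compilers and headers,
# so packages that compile native code during pip install can build here.
FROM python:3.7.12-bullseye AS build-image

# Install into a virtualenv so everything lives under one directory
# (/opt/venv is a hypothetical path, pick any location you like).
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install --no-cache-dir --default-timeout=100 -r requirements.txt

# Runtime stage: slim image without the build toolchain.
FROM python:3.7.12-slim-bullseye AS runtime-image

# Copy only the installed packages from the build stage.
COPY --from=build-image /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

Both stages use the same Python version so the copied virtualenv still points at a valid interpreter. If a package needs a shared system library at runtime, that library would still have to be installed with apt-get in the runtime stage.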