Best way to install GDAL for rasterio in an aws/lambda/python Dockerfile

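The Python dependencies of our AWS Lambda application have grown past Lambda's 250 MB limit. One of those dependencies is rasterio, which depends on GDAL. I am trying to build a Docker image to get around the 250 MB limit and deploy our code to AWS Lambda (using serverless.com).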
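To see how far over the limit a dependency folder is before packaging it, a small check along these lines can help. It is only a sketch: the byte count used for "250 MB" (250 * 1024 * 1024) is an assumption about how the limit is measured, and the function names are made up for illustration.

```python
import os

# Assumption: the 250 MB limit is counted as 250 * 1024 * 1024 bytes of unzipped content.
LIMIT_BYTES = 250 * 1024 * 1024


def unzipped_size(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_zip_package(root, limit=LIMIT_BYTES):
    """True if the folder's contents are within the given size limit."""
    return unzipped_size(root) <= limit
```

Pointing `unzipped_size` at the folder where the dependencies were installed gives a number to compare against the limit before attempting a deploy.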

Approach 1: pip install rasterio

Currently I have this Dockerfile:

FROM public.ecr.aws/lambda/python:3.10
RUN pip install rasterio # Fails with error (see below)
WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
      ERROR: A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
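
This message is raised when pip builds rasterio from source, because the build looks for the gdal-config tool. A quick way to see what the build environment looks like is the sketch below, meant to be run inside the container. The function name is hypothetical; it only reports the CPU architecture, the Python version, and whether gdal-config is on the PATH.

```python
import platform
import shutil


def build_environment_report():
    """Collect the facts that decide whether a source build of rasterio can start."""
    return {
        "machine": platform.machine(),              # e.g. x86_64 or aarch64
        "python": platform.python_version(),
        "gdal_config": shutil.which("gdal-config"),  # None when GDAL dev tools are absent
    }


if __name__ == "__main__":
    for key, value in build_environment_report().items():
        print(f"{key}: {value}")
```

If `gdal_config` comes back as None, a source build of rasterio cannot succeed in that image without installing GDAL first.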

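Approach 2: yum install gdal-devel

tl;dr: "No package gdal-devel available."

Approach 3: build GDAL from source

tl;dr: lots of dependencies. I am worried that those dependencies will in turn have dependencies that need to be built.

Approach 4: yum install epel-release, then install gdal-devel

  • Needs Fortran: yum -y install libgfortran works, but it installs libgfortran.so.4
  • yum -y install gdal-devel still fails, e.g. "Error: Package: openblas-openmp-0.3.3-2.el7.aarch64 (epel) Requires: libgfortran.so.3(GFORTRAN_1.0)(64bit)"
  • I am not sure whether the problem is that libgfortran is version 4 rather than version 3, but I cannot easily install libgfortran.so.3

Approach 5: use the aws/sam/build-python container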

service: aws-python-docker-demo
frameworkVersion: "3"

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    usePipenv: true
    layer: true

provider:
  name: aws
  runtime: python3.10
  deploymentBucket:
    blockPublicAccess: true

functions:
  hello:
    handler: src/main.lambda_handler
    layers:
      - !Ref PythonRequirementsLambdaLayer
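  • The serverless-python-requirements plugin appears to use the Docker container public.ecr.aws/sam/build-python3.10 to install the Python dependencies and zip them up for Lambda

    • (this then fails because the Lambda's dependencies plus code hit the >= 250 MB size limit)
  • Plan:

    1. Figure out how serverless-python-requirements
      1. installs the Python dependencies inside the public.ecr.aws/sam/build-python3.10 container
      2. zips the Python dependencies (the result will be larger than 250 MB)
    2. Copy that zip into the Docker image for AWS Lambda.
    3. ... ?

I am not sure this is a good approach, and I am sure there is a better solution. Any suggestions are welcome.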

** UPDATE ** with a new approach (number 6) and a response to @Rob's kind answer.

Approach 6: try using an older gdal/lambda Docker image

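Work in progress, using https://hub.docker.com/r/remotepixel/amazonlinux-gdal/. Next steps: get it working, then iterate from there to:

  • update GDAL
  • use the latest Lambda container
  • use Python 3.10 (which our application requires)

I currently plan to re-answer or update the answer to this StackOverflow question: https://stackoverflow.com/questions/36772111/how-can-i-install-a-recent-version-of-gdal-on-amazon-linux#comment135429542_44907360

The current error is: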

{
    "errorType": "Runtime.InvalidEntrypoint",
    "errorMessage": "RequestId: 2cda4291-3b02-4079-8d59-f1ab111f8dab Error: exec: \"main.lambda_handler\": executable file not found in $PATH"
}
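
The wording of this error suggests that the handler string main.lambda_handler was executed as if it were a program, which tends to happen when the image's entrypoint is not a Lambda runtime. Whether that is the case for the remotepixel/amazonlinux-gdal image is an assumption that has not been verified here. For reference, a handler that the string main.lambda_handler would resolve to could look like the sketch below; the file name main.py and the response shape are illustrative only.

```python
import json


def lambda_handler(event, context):
    """Report whether rasterio (and therefore GDAL) can be imported."""
    try:
        # Imported lazily so that a missing or broken GDAL shows up in the response
        # instead of failing at module load time.
        import rasterio
    except ImportError as exc:
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}
    return {
        "statusCode": 200,
        "body": json.dumps({"rasterio": rasterio.__version__}),
    }
```

Calling `lambda_handler({}, None)` locally inside the image is a cheap way to confirm that the rasterio import works before deploying.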

Response to Rob's possible answer

When I run it I get the following error:

$ cat Dockerfile2
FROM public.ecr.aws/lambda/python:3.10
RUN pip install rasterio
$ docker --version
Docker version 24.0.6, build ed223bc

macOS 12.7.2

$ docker build -t testing-run-api-dependencies-2 -f ./Dockerfile2 . --progress=plain --no-cache
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s

#2 [internal] load build definition from Dockerfile2
#2 transferring dockerfile: 101B done
#2 DONE 0.0s

#3 [internal] load metadata for public.ecr.aws/lambda/python:3.10
#3 DONE 1.1s

#4 [1/2] FROM public.ecr.aws/lambda/python:3.10@sha256:f95780930513037d252b6b6165720381a1014096c3be9f2eac620776c8f0d167
#4 CACHED

#5 [2/2] RUN pip install rasterio
#5 1.173 Collecting rasterio
#5 1.229   Downloading rasterio-1.3.9.tar.gz (411 kB)
#5 1.309      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 411.7/411.7 kB 5.5 MB/s eta 0:00:00
#5 1.406   Installing build dependencies: started
#5 8.663   Installing build dependencies: finished with status 'done'
#5 8.666   Getting requirements to build wheel: started
#5 8.934   Getting requirements to build wheel: finished with status 'error'
#5 8.939   error: subprocess-exited-with-error
#5 8.939   
#5 8.939   × Getting requirements to build wheel did not run successfully.
#5 8.939   │ exit code: 1
#5 8.939   ╰─> [3 lines of output]
#5 8.939       <string>:22: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
#5 8.939       WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
#5 8.939       ERROR: A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
#5 8.939       [end of output]
#5 8.939   
#5 8.939   note: This error originates from a subprocess, and is likely not a problem with pip.
#5 8.942 error: subprocess-exited-with-error
#5 8.942 
#5 8.942 × Getting requirements to build wheel did not run successfully.
#5 8.942 │ exit code: 1
#5 8.942 ╰─> See above for output.
#5 8.942 
#5 8.942 note: This error originates from a subprocess, and is likely not a problem with pip.
#5 8.947 
#5 8.947 [notice] A new release of pip is available: 23.0.1 -> 24.0
#5 8.947 [notice] To update, run: pip install --upgrade pip
#5 ERROR: process "/bin/sh -c pip install rasterio" did not complete successfully: exit code: 1
------
 > [2/2] RUN pip install rasterio:
8.942 error: subprocess-exited-with-error
8.942 
8.942 × Getting requirements to build wheel did not run successfully.
8.942 │ exit code: 1
8.942 ╰─> See above for output.
8.942 
8.942 note: This error originates from a subprocess, and is likely not a problem with pip.
8.947 
8.947 [notice] A new release of pip is available: 23.0.1 -> 24.0
8.947 [notice] To update, run: pip install --upgrade pip
------
Dockerfile2:2
--------------------
   1 |     FROM public.ecr.aws/lambda/python:3.10
   2 | >>> RUN pip install rasterio
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install rasterio" did not complete successfully: exit code: 1

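Answer 1

Maybe I am misunderstanding, but this may be an issue specific to your build machine or Docker version. When I try Approach 1 above verbatim and build the container locally, it succeeds: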

$ cat Dockerfile
FROM public.ecr.aws/lambda/python:3.10
RUN pip install rasterio
$ docker build . --progress=plain --no-cache
#1 [internal] load build definition from Dockerfile
#1 sha256:d1ebfcb0fe353fccdb071e1d06a6f600aac8465c1fd5a4883664ca2701cb4bdc
#1 transferring dockerfile: 106B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:f43dde8419761808ee740a10052d2fccd8c242389cc1a3d9d1e8a894dec623b0
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load metadata for public.ecr.aws/lambda/python:3.10
#3 sha256:4d27f73d29144c07cb21fedb31129fd6d4bf13e6d609a2728602ed5805b8d5cf
#3 DONE 0.7s

#4 [1/2] FROM public.ecr.aws/lambda/python:3.10@sha256:f95780930513037d252b6b6165720381a1014096c3be9f2eac620776c8f0d167
#4 sha256:39fff7d5ce9d7fffbcb71cd9476e781934b22456974c40953be4ee60fbb44a02
#4 CACHED

#5 [2/2] RUN pip install rasterio
#5 sha256:e902f6dbbb4dd6e5c6fa56ded9099fd59b7ccccdc52a40348207aa252b773b32
#5 0.749 Collecting rasterio
#5 0.878   Downloading rasterio-1.3.9-cp310-cp310-manylinux2014_x86_64.whl (20.6 MB)
#5 3.194      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.6/20.6 MB 8.8 MB/s eta 0:00:00
#5 3.284 Requirement already satisfied: setuptools in /var/lang/lib/python3.10/site-packages (from rasterio) (65.5.1)
#5 3.333 Collecting click>=4.0
#5 3.357   Downloading click-8.1.7-py3-none-any.whl (97 kB)
#5 3.372      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB 7.2 MB/s eta 0:00:00
#5 3.430 Collecting attrs
#5 3.454   Downloading attrs-23.2.0-py3-none-any.whl (60 kB)
#5 3.486      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.8/60.8 kB 1.8 MB/s eta 0:00:00
#5 3.534 Collecting cligj>=0.5
#5 3.559   Downloading cligj-0.7.2-py3-none-any.whl (7.1 kB)
#5 3.592 Collecting click-plugins
#5 3.616   Downloading click_plugins-1.1.1-py2.py3-none-any.whl (7.5 kB)
#5 3.653 Collecting affine
#5 3.678   Downloading affine-2.4.0-py3-none-any.whl (15 kB)
#5 3.725 Collecting certifi
#5 3.748   Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
#5 3.770      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 8.2 MB/s eta 0:00:00
#5 3.804 Collecting snuggs>=1.4.1
#5 3.828   Downloading snuggs-1.4.7-py3-none-any.whl (5.4 kB)
#5 4.264 Collecting numpy
#5 4.291   Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
#5 6.311      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 8.9 MB/s eta 0:00:00
#5 6.439 Collecting pyparsing>=2.1.6
#5 6.462   Downloading pyparsing-3.1.2-py3-none-any.whl (103 kB)
#5 6.478      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.2/103.2 kB 7.3 MB/s eta 0:00:00
#5 6.659 Installing collected packages: pyparsing, numpy, click, certifi, attrs, affine, snuggs, cligj, click-plugins, rasterio
#5 9.264 Successfully installed affine-2.4.0 attrs-23.2.0 certifi-2024.2.2 click-8.1.7 click-plugins-1.1.1 cligj-0.7.2 numpy-1.26.4 pyparsing-3.1.2 rasterio-1.3.9 snuggs-1.4.7
#5 9.264 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#5 9.398 
#5 9.398 [notice] A new release of pip is available: 23.0.1 -> 24.0
#5 9.398 [notice] To update, run: pip install --upgrade pip
#5 DONE 9.7s

#6 exporting to image
#6 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#6 exporting layers
#6 exporting layers 0.9s done
#6 writing image sha256:f66b9b3349811656692b09d941ecd3234af229da4e7ee14974e08c7d8b8bb3f3 done
#6 DONE 0.9s

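Now, in theory, a docker build should not depend on anything on the local machine, and a build that works in one place should work in another, but maybe you have a stale cached dependency? If you have access to another environment, maybe try the docker build on another machine, or run docker system prune -a (expensive: it clears all unused cached images) and rebuild.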
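
One difference is visible in the two logs: the failing build downloads rasterio-1.3.9.tar.gz, a source distribution that has to be compiled against GDAL, while the successful build downloads rasterio-1.3.9-cp310-cp310-manylinux2014_x86_64.whl, a prebuilt wheel. Together with the aarch64 package name in the yum error, this suggests, though the logs do not prove it, that the failing build runs for an arm64 platform for which pip found no matching wheel. The helper below is a hypothetical sketch that classifies the file names pip prints, to make that difference easy to spot.

```python
def describe_download(filename):
    """Classify a pip download as a prebuilt wheel or a source distribution."""
    if filename.endswith(".whl"):
        # Wheel names end in {python tag}-{abi tag}-{platform tag}.whl
        platform_tag = filename[: -len(".whl")].split("-")[-1]
        return {"kind": "wheel", "platform": platform_tag}
    if filename.endswith((".tar.gz", ".zip")):
        # Source distributions carry no platform tag and must be built locally.
        return {"kind": "sdist", "platform": None}
    return {"kind": "unknown", "platform": None}


print(describe_download("rasterio-1.3.9.tar.gz"))
print(describe_download("rasterio-1.3.9-cp310-cp310-manylinux2014_x86_64.whl"))
```

If the platform is indeed the cause, building the image for x86_64, for example with docker build --platform linux/amd64, may be worth trying; this has not been tested here.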
