CredentialRetrievalError inside an ECS container

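Problem

I have deployed an ECS cluster and run a job orchestration platform on it. One of the platform's containers uses the Python Docker API to pull a container from our private ECR repository and execute a job inside it. When the job starts running, it eventually fails because it cannot find the assumed-role credentials defined in /root/.aws/config inside the container, which sets credential_source=EcsContainer. The failure occurs after the code attempts to call S3.

Why is this happening? The credential source is defined in the container. Why can't it be found?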

Details

Error

......

The above exception was caused by the following exception:
botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from EcsContainer: No credentials found in credential_source referenced in profile default
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/utils.py", line 42, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.6/site-packages/dagster/utils/__init__.py", line 383, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/opt/dagster/app/solids/files.py", line 33, in stream_url_to_s3
    with smart.open(f's3://{s3_bucket}/{s3_key}', 'wb', transport_params=tp) as s3location:
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 235, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 398, in _open_binary_stream
    fobj = submodule.open_uri(uri, mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 224, in open_uri
    return open(parsed_uri['bucket_id'], parsed_uri['key_id'], mode, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 308, in open
    writebuffer=writebuffer,
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 757, in __init__
    _initialize_boto3(self, client, client_kwargs, bucket, key)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 528, in _initialize_boto3
    client = boto3.client('s3', **init_kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/__init__.py", line 91, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 826, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 431, in get_credentials
    'credential_provider').load_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1962, in load_credentials
    creds = provider.load()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1395, in load
    return self._load_creds_via_assume_role(self._profile_name)
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1410, in _load_creds_via_assume_role
    role_config, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1566, in _resolve_source_credentials
    credential_source, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1623, in _resolve_credentials_from_source
    'in profile %s' % profile_name
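
The final frames show botocore resolving credential_source = EcsContainer and finding nothing. botocore's container provider only returns credentials when AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI is present in the process environment, so a reasonable first check is whether either is set inside the container that actually runs the job. A minimal sketch of that check follows; the helper name container_credential_env is made up for illustration:

```python
import os

# Environment variables that botocore's container credential provider reads.
CONTAINER_CREDENTIAL_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def container_credential_env(environ=None):
    """Return the container-credential variables that are set in `environ`.

    Hypothetical helper for diagnosis only; defaults to the current process
    environment when no mapping is passed.
    """
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in CONTAINER_CREDENTIAL_VARS
        if name in environ
    }


if __name__ == "__main__":
    found = container_credential_env()
    if found:
        for name, value in found.items():
            print(f"{name}={value}")
    else:
        print("No container credential variables are set in this process.")
```

If the script reports that neither variable is set when run inside the job container, the EcsContainer source has nothing to read there, which would be consistent with the error above.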

Configuration

Container role:

  EcsTaskRole:
    Type: AWS::IAM::Role
    Properties:
      Description: The role assumed by the containers, allowing them to call AWS services.
      RoleName: !Sub ecs-task-trans-role-development
      AssumeRolePolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - ecs-tasks.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
      - PolicyName: !Sub 's3-access-${EnvironmentName}-${AWS::StackName}'
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
              - s3:*
            Resource:
              - "*"

/root/.aws/config in the container:

[default]
role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development
credential_source = EcsContainer

There is no /root/.aws/credentials file, because the point of assuming the role from the config file is to retrieve temporary credentials. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html
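
To make explicit what this profile asks for, the config can be inspected with Python's standard configparser. This is only a sketch: describe_profile is a hypothetical helper, the config text is copied from above with the account placeholder left as-is, and botocore's own loader may differ in detail.

```python
import configparser

# Copied from /root/.aws/config above; "<my account>" is left as a placeholder.
CONFIG_TEXT = """\
[default]
role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development
credential_source = EcsContainer
"""


def describe_profile(config_text, profile="default"):
    """Summarize how a profile expects to obtain credentials (illustrative only)."""
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    section = dict(parser[profile]) if parser.has_section(profile) else {}
    return {
        # A role_arn means the profile goes through an assume-role step.
        "assumes_role": "role_arn" in section,
        # Where the credentials used for that assume-role call should come from.
        "credential_source": section.get("credential_source"),
        "source_profile": section.get("source_profile"),
    }


if __name__ == "__main__":
    print(describe_profile(CONFIG_TEXT))
```

The profile sets role_arn together with credential_source and no source_profile, so the assume-role call depends entirely on the container credential source yielding credentials first.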

Partial TaskDefinition:

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ...
      ContainerDefinitions:
          ...
          MountPoints:
            - ContainerPath: "/var/run/docker.sock"
              SourceVolume: docker_sock
              ReadOnly: true
            - ContainerPath: "/root/.docker"
              SourceVolume: docker_dir
              ReadOnly: true
            - ContainerPath: "/usr/bin/docker-credential-ecr-login"
              SourceVolume: docker_creds
              ReadOnly: true

What I've tried

  1. Using the taskExecutionRole instead of the container role.
  2. Exporting AWS_PROFILE=default in the container.
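
One more thing that may be worth ruling out: containers started through the mounted /var/run/docker.sock are created by the host's Docker daemon rather than by the ECS agent, so they may not receive the credential variables that the agent injects into task containers. Under that assumption, a possible experiment is to forward those variables from the platform container when launching the job. The sketch below builds such an environment; job_environment and FORWARDED_VARS are hypothetical names, and whether the credentials endpoint is reachable from the job container depends on its network setup, which is not shown here.

```python
import os

# Variables to copy from the platform container into the job container.
# The first two are read by botocore's container credential provider;
# the region variables are included only as a convenience.
FORWARDED_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
    "AWS_DEFAULT_REGION",
    "AWS_REGION",
)


def job_environment(environ=None, extra=None):
    """Build an environment dict for the job container (illustrative only)."""
    environ = os.environ if environ is None else environ
    env = {name: environ[name] for name in FORWARDED_VARS if name in environ}
    env.update(extra or {})
    return env


# Hypothetical usage with the Docker SDK for Python; the image name is a
# placeholder and is not taken from the original setup:
#
#   import docker
#   client = docker.from_env()
#   client.containers.run(
#       "<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>",
#       environment=job_environment(),
#       detach=True,
#   )
```

If the job then obtains credentials, that would point to the missing environment variables as the cause rather than to the role or the config file.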
