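Problem
I have deployed an ECS cluster and run a job orchestration platform on it. One of the platform's containers uses the Python Docker API to pull images from our private ECR repository and execute jobs inside the resulting containers. Once a job starts running, it eventually fails because it cannot find the assumed-role credentials configured in the container's /root/.aws/config (credential_source = EcsContainer).
This happens after the code tries to call S3.
Why does this happen? The credential source is defined in the container. Why can't it be found?
Details
Error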
......
The above exception was caused by the following exception:
botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from EcsContainer: No credentials found in credential_source referenced in profile default
File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/utils.py", line 42, in solid_execution_error_boundary
yield
File "/usr/local/lib/python3.6/site-packages/dagster/utils/__init__.py", line 383, in iterate_with_context
next_output = next(iterator)
File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
File "/opt/dagster/app/solids/files.py", line 33, in stream_url_to_s3
with smart.open(f's3://{s3_bucket}/{s3_key}', 'wb', transport_params=tp) as s3location:
File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 235, in open
binary = _open_binary_stream(uri, binary_mode, transport_params)
File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 398, in _open_binary_stream
fobj = submodule.open_uri(uri, mode, transport_params)
File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 224, in open_uri
return open(parsed_uri['bucket_id'], parsed_uri['key_id'], mode, **kwargs)
File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 308, in open
writebuffer=writebuffer,
File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 757, in __init__
_initialize_boto3(self, client, client_kwargs, bucket, key)
File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 528, in _initialize_boto3
client = boto3.client('s3', **init_kwargs)
File "/usr/local/lib/python3.6/site-packages/boto3/__init__.py", line 91, in client
return _get_default_session().client(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
aws_session_token=aws_session_token, config=config)
File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 826, in create_client
credentials = self.get_credentials()
File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 431, in get_credentials
'credential_provider').load_credentials()
File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1962, in load_credentials
creds = provider.load()
File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1395, in load
return self._load_creds_via_assume_role(self._profile_name)
File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1410, in _load_creds_via_assume_role
role_config, profile_name
File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1566, in _resolve_source_credentials
credential_source, profile_name
File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1623, in _resolve_credentials_from_source
'in profile %s' % profile_name
Configuration
Container role:
EcsTaskRole:
  Type: AWS::IAM::Role
  Properties:
    Description: The role assumed by the containers, allowing them to call AWS services.
    RoleName: !Sub ecs-task-trans-role-development
    AssumeRolePolicyDocument:
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - ecs-tasks.amazonaws.com
          Action:
            - sts:AssumeRole
    Policies:
      - PolicyName: !Sub 's3-access-${EnvironmentName}-${AWS::StackName}'
        PolicyDocument:
          Statement:
            - Effect: Allow
              Action:
                - s3:*
              Resource:
                - "*"
/root/.aws/config in the container:
[default]
role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development
credential_source = EcsContainer
There is no /root/.aws/credentials file, because the point of assuming the role from the config file is to retrieve temporary credentials.
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html
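To narrow down why the EcsContainer source comes back empty, it may help to look at what the failing process can actually see. As far as I understand botocore, its container credential provider only activates when AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI is present in the environment. The sketch below is a diagnostic only; the helper name container_credential_env is hypothetical and not part of any library.

```python
import os

# Environment variables that botocore's container credential provider reads
# (assumption based on botocore's documented behaviour, not on the post).
CONTAINER_CREDENTIAL_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def container_credential_env(environ=None):
    """Return the container-credential variables that are set and non-empty.

    Hypothetical helper for diagnosis; `environ` defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in CONTAINER_CREDENTIAL_VARS
        if environ.get(name)
    }


if __name__ == "__main__":
    found = container_credential_env()
    if found:
        for name, value in found.items():
            print(f"{name} is set: {value}")
    else:
        print("No container credential variables are set in this process.")
```

Running this inside the job container (the one that raises the error), rather than inside the orchestration container, should show whether the variables are missing there.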
Part of the TaskDefinition:
TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ...
    ContainerDefinitions:
      ...
        MountPoints:
          - ContainerPath: "/var/run/docker.sock"
            SourceVolume: docker_sock
            ReadOnly: true
          - ContainerPath: "/root/.docker"
            SourceVolume: docker_dir
            ReadOnly: true
          - ContainerPath: "/usr/bin/docker-credential-ecr-login"
            SourceVolume: docker_creds
            ReadOnly: true
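Because /var/run/docker.sock is mounted, job containers are started through the host's Docker daemon rather than by ECS, so they may not inherit the environment that ECS injects into the task's own containers. One possible experiment, sketched below under that assumption, is to forward the container-credential variables from the orchestration container to each job container. The helper job_container_environment is hypothetical, and whether the job container can reach the credentials endpoint depends on its network mode, which I have not verified.

```python
import os

# Variables that botocore's container credential provider looks for
# (assumption: ECS injects them only into containers it starts itself).
FORWARDED_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def job_container_environment(extra=None, environ=None):
    """Build the environment dict for a job container (hypothetical helper).

    Starts from `extra` and adds any container-credential variables found in
    `environ` (defaults to os.environ) without overriding values in `extra`.
    """
    environ = os.environ if environ is None else environ
    env = dict(extra or {})
    for name in FORWARDED_VARS:
        if environ.get(name):
            env.setdefault(name, environ[name])
    return env
```

The returned dict could then be passed as the environment argument when starting the job container, for example via the environment parameter of containers.run in the Docker SDK for Python.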
What I have tried
- Using the taskExecutionRole instead of the container role.
- Exporting AWS_PROFILE=default in the container.