ECS containers on EC2 cannot mount an EFS volume

2024-6-2

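I created an ECS cluster backed by an EC2 Auto Scaling group and launched a service in it that uses EFS for NFS storage. The service runs in awsvpc network mode so that I can control traffic to and from it. One security group allows TCP 2049 (NFSv4) from itself and, for troubleshooting, from 0.0.0.0/0; it is attached to both the EFS mount target and the ECS service. The EFS file system and the ECS/EC2 instances are all in the same VPC and the same three subnets.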

However, the service cannot deploy tasks in ECS. The tasks crash with the following error:

Error response from daemon: create ecs-service-1-images-d6b491fbece8ddc34b00: VolumeDriver.Create: mounting volume failed:
Mount attempt 1/3 failed due to timeout after 15 sec, wait 0 sec before next attempt.
Mount attempt 2/3 failed due to timeout after 15 sec, wait 0 sec before next attempt.
'mount.nfs4: Connection reset by peer'

Mounting the EFS volume directly on the EC2 ECS host, however, works:

[ec2-user@ip-100-xxx ~]$ sudo mount -t efs fs-05dexxxxxxxx /mnt/efs
[ec2-user@ip-100-xxx ~]$ mount | grep fs-05de
fs-05dexxxxxxx.efs.eu-central-1.amazonaws.com:/ on /mnt/efs type nfs4

What causes this behavior? All resources are defined in Terraform:

resource "aws_autoscaling_group" "ecs-infrastructure-asg" {
  name = "ecs-infrastructure-asg"
  vpc_zone_identifier = [
    data.aws_subnet.PrivateA.id,
    data.aws_subnet.PrivateB.id,
    data.aws_subnet.PrivateC.id
  ]
}
resource "aws_ecs_capacity_provider" "ecs-infrastructure-cp" {
  name = "infrastructure-cp"
  auto_scaling_group_provider {
    auto_scaling_group_arn = aws_autoscaling_group.ecs-infrastructure-asg.arn
  }
}
resource "aws_ecs_cluster" "infrastructure" {
  name = "infrastructure"
}
resource "aws_ecs_cluster_capacity_providers" "infrastructure-ccp" {
  cluster_name = aws_ecs_cluster.infrastructure.name
  capacity_providers = [aws_ecs_capacity_provider.ecs-infrastructure-cp.name]
  default_capacity_provider_strategy {
    capacity_provider = aws_ecs_capacity_provider.ecs-infrastructure-cp.name
  }
}

resource "aws_security_group" "passbolt-allow-nfs-inbound" {
  name        = "passbolt-allow-nfs-inbound"
  vpc_id      = data.aws_vpc.VPC01.id
  ingress {
    from_port        = 2049
    to_port          = 2049
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  ingress {
    from_port = 2049
    to_port   = 2049
    protocol  = "tcp"
    self      = true
  }
}
resource "aws_efs_file_system" "passbolt-efs-fs" {
}
resource "aws_efs_mount_target" "passbolt-efs-mt-priva" {
  file_system_id  = aws_efs_file_system.passbolt-efs-fs.id
  subnet_id       = data.aws_subnet.PrivateA.id
  security_groups = [aws_security_group.passbolt-allow-nfs-inbound.id]
}
resource "aws_ecs_task_definition" "passbolt-task" {
  family       = "service"
  network_mode = "awsvpc"
  container_definitions = jsonencode([
    {
      name      = "passbolt-app"
      mountPoints = [
        {
          sourceVolume  = "images"
          containerPath = "/usr/share/php/passbolt/webroot/img/public"
          readOnly      = false
        }
      ]
    },
  ])
  volume {
    name = "images"
    efs_volume_configuration {
      file_system_id     = aws_efs_file_system.passbolt-efs-fs.id
      root_directory     = "/images"
    }
  }
}

resource "aws_ecs_service" "infrastructure-passbolt" {
  name            = "infrastructure-passbolt"
  cluster         = aws_ecs_cluster.infrastructure.id
  task_definition = aws_ecs_task_definition.passbolt-task.arn
  desired_count   = 1
  capacity_provider_strategy {
    capacity_provider = aws_ecs_capacity_provider.ecs-infrastructure-cp.name
    weight            = 100
  }
  network_configuration {
    subnets = [
      data.aws_subnet.PrivateA.id,
      data.aws_subnet.PrivateB.id,
      data.aws_subnet.PrivateC.id
    ]
    security_groups = [
      aws_security_group.passbolt-allow-nfs-inbound.id,
    ]
  }
}

Answer 1

Found the cause: because the security group was created by Terraform, it had no egress rules. Adding a rule that allows egress to 0.0.0.0/0 fixed the connection.
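
For reference, the fix might look roughly like the sketch below. The answer does not say which ports or protocols the new rule covered, so the all-protocol egress block here is an assumption; the resource name and ingress blocks are copied from the question. Terraform removes the default allow-all egress rule that AWS adds to a new security group, so egress has to be declared explicitly. This would also be consistent with the host mount succeeding: the host presumably sends traffic through the instance's own security group, while the awsvpc task uses the one attached to the service.

```hcl
resource "aws_security_group" "passbolt-allow-nfs-inbound" {
  name   = "passbolt-allow-nfs-inbound"
  vpc_id = data.aws_vpc.VPC01.id

  # Ingress rules as in the question.
  ingress {
    from_port        = 2049
    to_port          = 2049
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  ingress {
    from_port = 2049
    to_port   = 2049
    protocol  = "tcp"
    self      = true
  }

  # Assumed shape of the added rule: allow all outbound IPv4 traffic.
  # Terraform does not keep AWS's default allow-all egress rule,
  # so without this block the task ENI cannot reach the EFS mount target.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Once the mount works, the egress rule could probably be narrowed to what the task actually needs (for example TCP 2049 toward the mount target), and the troubleshooting ingress from 0.0.0.0/0 removed.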
