How do I create an EKS cluster with the VPC CNI add-on through CloudFormation?

I created an EKS cluster (1.24) through CloudFormation. It runs fine without the CNI add-on, but it fails when I add the vpc-cni add-on:


  AddonCNI:
    Type: 'AWS::EKS::Addon'
    Properties:
      AddonName: vpc-cni
      AddonVersion: v1.12.0-eksbuild.1
      ClusterName: !Ref ControlPlane
      ResolveConflicts: OVERWRITE
      ServiceAccountRoleArn: !GetAtt
      - CNIRole
      - Arn
      Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}/AddonCNI'
    DependsOn:
    - CNIRole

  CNIRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Action:
          - 'sts:AssumeRole'
          Effect: Allow
          Principal:
            Service:
            - !FindInMap
              - ServicePrincipalPartitionMap
              - !Ref 'AWS::Partition'
              - EKS
        Version: 2012-10-17
      ManagedPolicyArns:
      - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy'
      - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly'
      Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}/CNIRole'

I also attached AmazonEKS_CNI_Policy to the node role.

The nodes are in NotReady status:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Logs from the aws-node pod:

Defaulted container "aws-node" out of: aws-node, aws-vpc-cni-init (init)
{"level":"info","ts":"2023-01-06T16:24:55.411Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2023-01-06T16:24:55.412Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2023-01-06T16:24:55.424Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2023-01-06T16:24:55.425Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2023-01-06T16:24:56.430Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2023-01-06T16:24:57.435Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}

Contents of /host/var/log/aws-routed-eni/ipamd.log, read from inside the aws-node pod container (partial):

Defaulted container "aws-node" out of: aws-node, aws-vpc-cni-init (init)
{"level":"info","ts":"2023-01-06T13:19:47.759Z","caller":"logger/logger.go:52","msg":"Constructed new logger instance"}
{"level":"info","ts":"2023-01-06T13:19:47.759Z","caller":"eniconfig/eniconfig.go:61","msg":"Initialized new logger as an existing instance was not found"}
{"level":"info","ts":"2023-01-06T13:19:48.012Z","caller":"aws-k8s-agent/main.go:28","msg":"Starting L-IPAMD   ..."}
{"level":"info","ts":"2023-01-06T13:19:48.020Z","caller":"aws-k8s-agent/main.go:39","msg":"Testing communication with server"}
{"level":"info","ts":"2023-01-06T13:19:48.063Z","caller":"wait/wait.go:211","msg":"Successful communication with the Cluster! Cluster Version is: v1.24+. git version: v1.24.8-eks-ffeb93d. git tree state: clean. commit: abb98ec0631dfe573ec5eae40dc48fd8f2017424. platform: linux/amd64"}
{"level":"warn","ts":"2023-01-06T13:19:48.083Z","caller":"awssession/session.go:64","msg":"HTTP_TIMEOUT env is not set or set to less than 10 seconds, defaulting to httpTimeout to 10sec"}
{"level":"debug","ts":"2023-01-06T13:19:48.085Z","caller":"ipamd/ipamd.go:379","msg":"Discovered region: us-east-1"}
{"level":"info","ts":"2023-01-06T13:19:48.085Z","caller":"ipamd/ipamd.go:379","msg":"Custom networking enabled false"}
{"level":"debug","ts":"2023-01-06T13:19:48.085Z","caller":"awsutils/awsutils.go:415","msg":"Found availability zone: us-east-1c "}
{"level":"debug","ts":"2023-01-06T13:19:48.086Z","caller":"awsutils/awsutils.go:415","msg":"Discovered the instance primary IPv4 address: 10.0.66.216"}
{"level":"debug","ts":"2023-01-06T13:19:48.086Z","caller":"awsutils/awsutils.go:415","msg":"Found instance-id: i-06b7496334df06d96 "}
{"level":"debug","ts":"2023-01-06T13:19:48.087Z","caller":"awsutils/awsutils.go:415","msg":"Found instance-type: c5.xlarge "}
{"level":"debug","ts":"2023-01-06T13:19:48.088Z","caller":"awsutils/awsutils.go:415","msg":"Found primary interface's MAC address: 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.088Z","caller":"awsutils/awsutils.go:415","msg":"eni-05097e0aa87b119d5 is the primary ENI of this instance"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"awsutils/awsutils.go:415","msg":"Found subnet-id: subnet-0e9870d3f07c0c322 "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:388","msg":"Using WARM_ENI_TARGET 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:391","msg":"Using WARM_PREFIX_TARGET 1"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:409","msg":"Prefix Delegation enabled false"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:414","msg":"Start node init"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:446","msg":"Max ip per ENI 14 and max prefixes per ENI 0"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:456","msg":"Setting up host network... "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Trying to find primary interface that has mac : 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered interface: eth0, mac: 0a:3f:bd:93:2a:8d"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered primary interface: eth0"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:456","msg":"Skip updating RPF for primary interface: net/ipv4/conf/eth0/rp_filter"}
{"level":"info","ts":"2023-01-06T13:19:48.090Z","caller":"awsutils/awsutils.go:1643","msg":"Will attempt to clean up AWS CNI leaked ENIs after waiting 4m41s."}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:307","msg":"Found the Link that uses mac address 0a:3f:bd:93:2a:8d and its index is 2 (attempt 1/5)"}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Trying to find primary interface that has mac : 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered interface: eth0, mac: 0a:3f:bd:93:2a:8d"}
{"level":"info","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered primary interface: eth0"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Adding 10.0.0.0/16 CIDR to NAT chain"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Total CIDRs to program - 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-0 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.189Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-1 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -A POSTROUTING -m comment --comment \"AWS SNAT CHAIN\" -j AWS-SNAT-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -A AWS-SNAT-CHAIN-0 ! -d {10.0.0.0/16 %!s(bool=false)} -t nat -j AWS-SNAT-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:714","msg":"Setup Host Network: loading existing iptables nat rules with chain prefix AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.237Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-SNAT-CHAIN-0: [-N AWS-SNAT-CHAIN-0]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-SNAT-CHAIN-1: [-N AWS-SNAT-CHAIN-1]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: computing stale iptables rules for %s table with chain prefix %s"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: active chain found: AWS-SNAT-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: active chain found: AWS-SNAT-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:403","msg":"iptableRules: [nat/POSTROUTING rule first SNAT rules for non-VPC outbound traffic shouldExist true rule [-m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-0] nat/AWS-SNAT-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-1] nat/AWS-SNAT-CHAIN-1 rule last SNAT rule for non-VPC outbound traffic shouldExist true rule [! -o vlan+ -m comment --comment AWS, SNAT -m addrtype ! --dst-type LOCAL -j SNAT --to-source 10.0.66.216 --random-fully] mangle/PREROUTING rule connmark for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eth0 -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-mark 0x80/0x80] mangle/PREROUTING rule connmark restore for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eni+ -j CONNMARK --restore-mark --mask 0x80] mangle/PREROUTING rule connmark restore for primary ENI from vlan shouldExist true rule [-m comment --comment AWS, primary ENI -i vlan+ -j CONNMARK --restore-mark --mask 0x80]]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : first SNAT rules for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.240Z","caller":"networkutils/network.go:407","msg":"rule nat/POSTROUTING rule first SNAT rules for non-VPC outbound traffic shouldExist true rule [-m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-0] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.241Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : [0] AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.242Z","caller":"networkutils/network.go:407","msg":"rule nat/AWS-SNAT-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-1] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.243Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : last SNAT rule for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.244Z","caller":"networkutils/network.go:407","msg":"rule nat/AWS-SNAT-CHAIN-1 rule last SNAT rule for non-VPC outbound traffic shouldExist true rule [! -o vlan+ -m comment --comment AWS, SNAT -m addrtype ! --dst-type LOCAL -j SNAT --to-source 10.0.66.216 --random-fully] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.245Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark for primary ENI"}
{"level":"debug","ts":"2023-01-06T13:19:48.250Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eth0 -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-mark 0x80/0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.251Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark restore for primary ENI"}
{"level":"debug","ts":"2023-01-06T13:19:48.252Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark restore for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eni+ -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.253Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark restore for primary ENI from vlan"}
{"level":"debug","ts":"2023-01-06T13:19:48.254Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark restore for primary ENI from vlan shouldExist true rule [-m comment --comment AWS, primary ENI -i vlan+ -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.255Z","caller":"networkutils/network.go:411","msg":"Total CIDRs to exempt from connmark rules - 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.255Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -N AWS-CONNMARK-CHAIN-0 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.256Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -N AWS-CONNMARK-CHAIN-1 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -t nat -A PREROUTING -i eni+ -m comment --comment \"AWS, outbound connections\" -m state --state NEW -j AWS-CONNMARK-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -A AWS-CONNMARK-CHAIN-0 ! -d 10.0.0.0/16 -t nat -j AWS-CONNMARK-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:714","msg":"Setup Host Network: loading existing iptables nat rules with chain prefix AWS-CONNMARK-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.259Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-CONNMARK-CHAIN-0: [-N AWS-CONNMARK-CHAIN-0]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-CONNMARK-CHAIN-1: [-N AWS-CONNMARK-CHAIN-1]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: computing stale iptables rules for %s table with chain prefix %s"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: active chain found: AWS-CONNMARK-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: active chain found: AWS-CONNMARK-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:411","msg":"iptableRules: [nat/PREROUTING rule connmark rule for non-VPC outbound traffic shouldExist true rule [-i eni+ -m comment --comment AWS, outbound connections -m state --state NEW -j AWS-CONNMARK-CHAIN-0] nat/AWS-CONNMARK-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS CONNMARK CHAIN, VPC CIDR -j AWS-CONNMARK-CHAIN-1] nat/AWS-CONNMARK-CHAIN-1 rule connmark rule for external  outbound traffic shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --set-xmark 0x80/0x80] nat/PREROUTING rule connmark to fwmark copy shouldExist false rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] nat/PREROUTING rule connmark to fwmark copy shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80]]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark rule for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.264Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark rule for non-VPC outbound traffic shouldExist true rule [-i eni+ -m comment --comment AWS, outbound connections -m state --state NEW -j AWS-CONNMARK-CHAIN-0] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.266Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : [0] AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.267Z","caller":"networkutils/network.go:415","msg":"rule nat/AWS-CONNMARK-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS CONNMARK CHAIN, VPC CIDR -j AWS-CONNMARK-CHAIN-1] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.267Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark rule for external  outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.269Z","caller":"networkutils/network.go:415","msg":"rule nat/AWS-CONNMARK-CHAIN-1 rule connmark rule for external  outbound traffic shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --set-xmark 0x80/0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.270Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark to fwmark copy"}
{"level":"debug","ts":"2023-01-06T13:19:48.271Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark to fwmark copy shouldExist false rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.271Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark to fwmark copy"}
{"level":"debug","ts":"2023-01-06T13:19:48.272Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark to fwmark copy shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.273Z","caller":"awsutils/awsutils.go:1140","msg":"Total number of interfaces found: 1 "}
{"level":"debug","ts":"2023-01-06T13:19:48.273Z","caller":"awsutils/awsutils.go:592","msg":"Found ENI MAC address: 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.276Z","caller":"awsutils/awsutils.go:592","msg":"Found ENI: eni-05097e0aa87b119d5, MAC 0a:3f:bd:93:2a:8d, device 0"}
{"level":"error","ts":"2023-01-06T13:20:38.949Z","caller":"ipamd/ipamd.go:462","msg":"Failed to call ec2:DescribeNetworkInterfaces for [eni-05097e0aa87b119d5]: WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\n\tstatus code: 400, request id: 5afe536f-4b18-4f21-a5ad-bc2d0341da2e"}
....

In the logs above, this error looks odd (the XXXXX part is redacted):

Failed to call ec2:DescribeNetworkInterfaces for [eni-05097e0aa87b119d5]: WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\n\tstatus code: 400, request id: 5afe536f-4b18-4f21-a5ad-bc2d0341da2e

My guess is that this is related to the sts:AssumeRole action.
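
One possible reading of the InvalidIdentityToken message, offered as an untested assumption rather than a confirmed fix: when ServiceAccountRoleArn is set, the aws-node pod authenticates with a web identity token, so IAM would need an OIDC provider registered for the cluster's issuer URL, and the role would need to trust that provider through sts:AssumeRoleWithWebIdentity instead of a service principal through sts:AssumeRole. A minimal sketch of what that might look like follows. OidcProvider and OidcThumbprint are hypothetical names that do not exist in my template, and the trust policy is deliberately left without a Condition block.

```yaml
  # Hypothetical resource: registers the cluster's OIDC issuer with IAM.
  OidcProvider:
    Type: 'AWS::IAM::OIDCProvider'
    Properties:
      Url: !GetAtt ControlPlane.OpenIdConnectIssuerUrl
      ClientIdList:
      - sts.amazonaws.com
      ThumbprintList:
      - !Ref OidcThumbprint   # hypothetical template parameter

  # Same CNIRole as above, but trusting the OIDC provider instead of a service principal.
  CNIRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Federated: !Ref OidcProvider   # Ref returns the provider ARN
          Action:
          - 'sts:AssumeRoleWithWebIdentity'
      ManagedPolicyArns:
      - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy'
```

If this is the right direction, the trust policy should probably also be restricted to the kube-system/aws-node service account with a Condition on the issuer's sub and aud claims, and the add-on would need to be created after the provider.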
