Solutions for AWS EKS, which is thirstier for IPv4 addresses than for water

Tram Ho

I. Situation

The default networking model of an Amazon Elastic Container Service for Kubernetes (EKS) cluster at creation:

(Figure: situation.png — the default EKS networking model)

  • Normally, a Kubernetes CNI allocates each pod an address from an intra-cluster IP range, without consuming addresses from the host network range.
  • However, AWS EKS uses the Amazon VPC CNI by default, which assigns each pod an IP address from the primary subnet (the subnet the primary ENI is attached to), which is usually the host/node's network range. This is a problem when the planned host subnet's CIDR range does not contain enough IPv4 addresses for both the pods and all other resources.
  • Not only that, the VPC CNI plugin also reserves a certain number of IP addresses on each node for quick assignment to new pods. Each instance type has its own limit on the number of network interfaces (detailed list here), which results in a specific max-pods limit for each instance type (details of max-pods here).
  • In addition, in terms of security, by default a separate security group cannot be used for the primary network interface (the node's network port) versus the secondary network interfaces (the network ports representing the pods), because the VPC CNI plugin automatically shares the same security group across both types of interfaces.

II. Quenching EKS's IPv4 thirst

By extending the VPC with a secondary CIDR range combined with CNI custom networking.

1. Secondary CIDR block

Amazon Elastic Container Service for Kubernetes (EKS) supports clusters created in a VPC with a second IPv4 CIDR block, and the 100.64.0.0/10 and 198.19.0.0/16 ranges can be used exclusively for pods. These ranges lie outside the commonly used intranet ranges and are not routable on the internet (non-routable). This increases the number of IPs available to the cluster without overlapping its internal IP ranges.

2. Combination of secondary-CIDR-block and CNI-Custom-networking

The CNI custom networking feature is the solution to all the problems outlined in the situation section. Moreover, it also supports the case where nodes in public subnets need to place pods in private subnets. Specifically, CNI custom networking assigns pod IPs from the VPC's secondary CIDR block while the nodes keep their addresses in the primary subnets; it does this by letting you define ENIConfig resources that specify which subnets and security groups the pods' ENIs are allowed to use.

(Figure: EKS-solution.drawio.png — secondary CIDR block combined with CNI custom networking)

3. Step-by-step instructions for converting an existing system

3.1. Prerequisites:

  • The user account has sufficient rights to operate on VPC and EKS.
  • Credentials are preconfigured for the AWS CLI, and the kubectl context points at the EKS cluster.
  • We need to make sure the Amazon VPC CNI is on version 1.6.3-eksbuild.2 or later, by running the following command to check:
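(A sketch, assuming the default aws-node DaemonSet name that the EKS VPC CNI add-on uses.)

```bash
kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2
```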

If the version is older, you need to follow the instructions to update the CNI first.

3.2. Configuring the secondary CIDR block

Before configuring EKS, we need to add secondary CIDR blocks to the VPC and make sure they are properly configured with tags and route tables.

There are some limitations on secondary CIDR blocks used to extend a VPC; see details here.

3.2.1. Using the CLI

Add secondary CIDR blocks

The following two commands add the 100.64.0.0/16 CIDR range to the EKS cluster's VPC. Change my-eks-cluster to the name of your existing EKS cluster:
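A sketch of those two commands, assuming the cluster is named my-eks-cluster:

```bash
# Look up the VPC ID of the existing EKS cluster
VPC_ID=$(aws eks describe-cluster --name my-eks-cluster \
  --query "cluster.resourcesVpcConfig.vpcId" --output text)

# Associate the 100.64.0.0/16 secondary CIDR block with that VPC
aws ec2 associate-vpc-cidr-block --vpc-id $VPC_ID --cidr-block 100.64.0.0/16
```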

Create subnets

With an environment where instances sit in 3 subnets (3 different AZs), create a pod subnet for each of the 3 corresponding AZs. Change my-eks-cluster to the name of your existing EKS cluster:
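A sketch with hypothetical variable names (POD_AZS, CGNAT_SNET1..3) and an arbitrary /19 split of the secondary range; adjust the CIDRs to your own plan:

```bash
# Collect the AZs used by the VPC's existing subnets
POD_AZS=($(aws ec2 describe-subnets --filters "Name=vpc-id,Values=$VPC_ID" \
  --query 'Subnets[*].AvailabilityZone' --output text | tr '\t' '\n' | sort -u))

# Create one pod subnet per AZ inside the secondary CIDR block
CGNAT_SNET1=$(aws ec2 create-subnet --vpc-id $VPC_ID --availability-zone ${POD_AZS[0]} \
  --cidr-block 100.64.0.0/19 --query 'Subnet.SubnetId' --output text)
CGNAT_SNET2=$(aws ec2 create-subnet --vpc-id $VPC_ID --availability-zone ${POD_AZS[1]} \
  --cidr-block 100.64.32.0/19 --query 'Subnet.SubnetId' --output text)
CGNAT_SNET3=$(aws ec2 create-subnet --vpc-id $VPC_ID --availability-zone ${POD_AZS[2]} \
  --cidr-block 100.64.64.0/19 --query 'Subnet.SubnetId' --output text)

# Tag the new subnets following the standard EKS subnet-discovery convention
aws ec2 create-tags --resources $CGNAT_SNET1 $CGNAT_SNET2 $CGNAT_SNET3 \
  --tags Key=kubernetes.io/cluster/my-eks-cluster,Value=shared
```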

3.2.2. Using the console

Add secondary CIDR blocks: in the VPC console, select the VPC, choose Actions → Edit CIDRs, and add the new IPv4 CIDR, then create the corresponding subnets in each AZ.

3.3. Configure Kubernetes

3.3.1 Configure Custom networking

3.3.1.1 Security group

Create a separate security group and get its ID, or get the existing node/cluster security group ID, then run the following command:
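A sketch, assuming the cluster security group is reused for the pod ENIs (substitute your own security group ID if you created a separate one):

```bash
# Get the cluster security group ID (or substitute a dedicated SG ID)
SG_ID=$(aws eks describe-cluster --name my-eks-cluster \
  --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" --output text)

# Enable CNI custom networking so pods use the ENIConfig subnets
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
```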

3.3.1.2 Subnet-ID

Note: If you used the CLI in section 3.2.1, skip this step.

  • If you used the console, record the AZ names and the IDs of the newly created pod subnets, for example:
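These variable names are hypothetical, chosen only to match the yaml templates in the next step; replace the placeholder values with the real AZs and subnet IDs from the console:

```bash
# Placeholder values; copy the real AZ names and subnet IDs from the VPC console
export az_1=ap-southeast-1a
export az_2=ap-southeast-1b
export az_3=ap-southeast-1c
export subnet_az_1=subnet-aaaaaaaa   # pod subnet created in $az_1
export subnet_az_2=subnet-bbbbbbbb   # pod subnet created in $az_2
export subnet_az_3=subnet-cccccccc   # pod subnet created in $az_3
```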

3.3.1.3 Creating and deploying yaml files

Note: If you used the CLI in section 3.2.1, change $az_1 to ${POD_AZS[0]} and, correspondingly, $az_3 to ${POD_AZS[2]}. The three files are sketched after this list:

  • Subnet1-AZ1

  • Subnet2-AZ2

  • Subnet3-AZ3
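A minimal sketch of the first file (Subnet1-AZ1); the other two are analogous, substituting $az_2/$subnet_az_2 and $az_3/$subnet_az_3. The ENIConfig name must match the AZ name so the label-based automation in section 3.3.4 can find it:

```bash
# Create and apply the ENIConfig for AZ 1 (subnet and SG IDs from the earlier steps)
cat <<EOF | kubectl apply -f -
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: $az_1
spec:
  subnet: $subnet_az_1
  securityGroups:
    - $SG_ID
EOF
```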

3.3.1.4 Check

Verify the ENIConfigs were created; you should get a similar result:
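(A sketch; the output values are illustrative, with names taken from the ENIConfigs created above.)

```bash
kubectl get eniconfigs
# NAME              AGE
# ap-southeast-1a   1m
# ap-southeast-1b   1m
# ap-southeast-1c   1m
```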

3.3.4. Automatic configuration with AZ labels (Availability Zone labels)

You can let Kubernetes automatically apply the corresponding ENIConfig to each worker node according to its Availability Zone (AZ). This works because the ENIConfig names match the AZ names of the subnets (ap-southeast-1a, ap-southeast-1b, ap-southeast-1c), and Kubernetes automatically adds the label topology.kubernetes.io/zone to each worker node with its AZ as the value. You can check with the command:
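(One way to list the nodes with their zone label; -L adds the label value as a column.)

```bash
kubectl get nodes -L topology.kubernetes.io/zone
```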

The output will look similar to the following (node names, versions, and ages are illustrative):
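```
NAME                                           STATUS   ROLES    AGE   VERSION               ZONE
ip-10-0-1-10.ap-southeast-1.compute.internal   Ready    <none>   7d    v1.21.5-eks-bc4871b   ap-southeast-1a
ip-10-0-2-11.ap-southeast-1.compute.internal   Ready    <none>   7d    v1.21.5-eks-bc4871b   ap-southeast-1b
ip-10-0-3-12.ap-southeast-1.compute.internal   Ready    <none>   7d    v1.21.5-eks-bc4871b   ap-southeast-1c
```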

We therefore take advantage of this label to automatically apply the ENIConfig matching each node's AZ with the command:
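(This uses the VPC CNI plugin's documented ENI_CONFIG_LABEL_DEF variable.)

```bash
# Tell the VPC CNI which node label selects the ENIConfig to apply
kubectl set env daemonset aws-node -n kube-system \
  ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
```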

Note: the label failure-domain.beta.kubernetes.io/zone can also be used here, but it is deprecated and planned for removal, so always prefer the label topology.kubernetes.io/zone.

3.4 Apply configuration to worker-nodes

We need to replace all running worker nodes with new ones so that they start up with the newly created network configuration. In a dev/test environment, where downtime is not a concern, you can freely delete all the nodes and launch new ones. In a production environment, please consider the suggestions below.

NOTE ON DOWNTIME: with EKS clusters running production workloads, the nodes must be terminated, so be very careful at this point:

  • Deploy during off-peak hours.
  • Make sure each deployment keeps a minimum number of pods spread across the nodes.
  • Rolling-update the nodes one at a time: launch one new node, then delete one old node. It is best to first scale out to twice the number of existing nodes, then scale in to the current number and let the Auto Scaling process handle it (by default it deletes the oldest nodes first, and the running pods are moved to other nodes beforehand). A minimal drain sequence is sketched below.
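A sketch of draining one old node before it is terminated; the node name is illustrative, and replacement capacity is assumed to come from the Auto Scaling group:

```bash
# Stop new pods from scheduling onto the old node
kubectl cordon ip-10-0-1-10.ap-southeast-1.compute.internal

# Evict the running pods so they reschedule onto the new nodes
kubectl drain ip-10-0-1-10.ap-southeast-1.compute.internal \
  --ignore-daemonsets --delete-emptydir-data
```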

Source: Viblo