When I create an EKS cluster with a single node group using Terraform, I run into a kubelet certificate problem: the CSRs are stuck in the Pending state, like this:
NAME        AGE     SIGNERNAME                      REQUESTOR          REQUESTEDDURATION   CONDITION
csr-8qmz5   4m57s   kubernetes.io/kubelet-serving   kubernetes-admin   <none>              Pending
csr-mq9rx   5m      kubernetes.io/kubelet-serving   kubernetes-admin   <none>              Pending
As you can see, the REQUESTOR here is kubernetes-admin, and I'm really not sure why. My Terraform code for the cluster itself:
resource "aws_eks_cluster" "eks" {
name = var.eks_cluster_name
role_arn = var.eks_role_arn
version = var.k8s_version
vpc_config {
endpoint_private_access = "true"
endpoint_public_access = "true"
subnet_ids = var.eks_public_network_ids
security_group_ids = var.eks_security_group_ids
}
kubernetes_network_config {
ip_family = "ipv4"
service_ipv4_cidr = "10.100.0.0/16"
}
}
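For context, var.eks_role_arn points at the cluster service role. A minimal sketch of how such a role could be defined (resource and role names here are hypothetical, not my actual code); the control plane role needs a trust relationship with eks.amazonaws.com and the AmazonEKSClusterPolicy managed policy:

# Hypothetical sketch of the role behind var.eks_role_arn
resource "aws_iam_role" "eks_cluster" {
  name = "eks-cluster-role" # hypothetical name

  # Let the EKS control plane assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "eks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  role       = aws_iam_role.eks_cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}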
Terraform code for the node group:
resource "aws_eks_node_group" "aks-NG" {
depends_on = [aws_ec2_tag.eks-subnet-cluster-tag, aws_key_pair.eks-deployer]
cluster_name = aws_eks_cluster.eks.name
node_group_name = "aks-dev-NG"
ami_type = "AL2_x86_64"
node_role_arn = var.eks_role_arn
subnet_ids = var.eks_public_network_ids
capacity_type = "ON_DEMAND"
instance_types = var.eks_nodepool_instance_types
disk_size = "50"
scaling_config {
desired_size = 2
max_size = 2
min_size = 2
}
tags = {
Name = "${var.eks_cluster_name}-node"
"kubernetes.io/cluster/${var.eks_cluster_name}" = "owned"
}
remote_access {
ec2_ssh_key = "eks-deployer-key"
}
}
As far as I understand, this is a very basic configuration.
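Note that role_arn and node_role_arn both point at the same var.eks_role_arn, so that single role also has to carry the worker-node policies. A sketch of the node-side attachments it would need (hypothetical resource names, continuing the sketch above; the role must additionally be assumable by ec2.amazonaws.com):

# Hypothetical sketch; "eks-cluster-role" is the role behind var.eks_role_arn
resource "aws_iam_role_policy_attachment" "node_worker" {
  role       = "eks-cluster-role"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}

resource "aws_iam_role_policy_attachment" "node_cni" {
  role       = "eks-cluster-role"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}

resource "aws_iam_role_policy_attachment" "node_ecr" {
  role       = "eks-cluster-role"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}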
Now, when I create the cluster and node group via the AWS Management Console with exactly the SAME parameters, i.e. the same cluster IAM role and node group IAM role as in Terraform, everything is fine:
NAME        AGE     SIGNERNAME                      REQUESTOR                                     REQUESTEDDURATION   CONDITION
csr-86qtg   6m20s   kubernetes.io/kubelet-serving   system:node:ip-172-31-201-140.ec2.internal    <none>              Approved,Issued
csr-np42b   6m43s   kubernetes.io/kubelet-serving   system:node:ip-172-31-200-199.ec2.internal    <none>              Approved,Issued
Here the certificate requestor is the node itself (as I understand it). So what is the problem here? Why is the requestor different in this case, what is the difference between creating these resources from the AWS Management Console and using Terraform, and how do I fix this issue? Please help.
UPD.
I found that this problem appears when I create the cluster using Terraform via an assumed role created for Terraform. When I create the cluster using Terraform with regular IAM user credentials with the same permission set, everything is fine. This doesn't answer the root cause, but it's something to consider. Right now it looks like a weird EKS bug.
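For reference, "via assumed role" means the AWS provider is configured roughly like this (region and role ARN are placeholders):

provider "aws" {
  region = "us-east-1" # placeholder

  # Terraform authenticates through this role, so the assumed role (not my
  # IAM user) ends up as the cluster creator identity in EKS.
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/terraform-deployer" # placeholder
  }
}

As far as I know, the IAM principal that creates an EKS cluster is automatically granted system:masters in the cluster, so the creator identity differing between the two runs may be related.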