
Terraform Creation failing following https://learnk8s.io/terraform-eks: Error: Kubernetes cluster unreachable: Get #2

Open · marcellodesales opened this issue Oct 18, 2020 · 2 comments


marcellodesales commented Oct 18, 2020

Hi there,

Thank you for the awesome tutorial at https://learnk8s.io/terraform-eks#you-can-provision-an-eks-cluster-with-terraform-too. It's very useful, as I was looking for an example to provision different clusters per environment (I only need two). Really appreciate your work!

I just hit an error creating the cluster while following step 6. I had updated a couple of properties, shown below, but here's the error.

Error

I'm getting the following error:

module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515980130000000a]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159789200000008]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515988710000000b]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_additional_policies[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159794400000009]

Error: Kubernetes cluster unreachable: Get https://44C5045D2C00520DBF55914A260A17C8.
   gr7.sa-east-1.eks.amazonaws.com/version?timeout=32s: dial tcp: lookup 
   44C5045D2C00520DBF55914A260A17C8.gr7.sa-east-1.eks.amazonaws.com on 192.168.1.1:53: 
   read udp 192.168.1.35:54700->192.168.1.1:53: i/o timeout

At this point, I know I can ping amazonaws.com, but maybe a security group is missing? The cluster itself did get created...
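For what it's worth, the read udp ...:53 i/o timeout part of the error suggests the DNS lookup itself is timing out against the local resolver (192.168.1.1), so it may not be a security group at all. If it were endpoint access, the relevant knobs live on the EKS module; a minimal sketch, with module and variable names assumed from the tutorial's layout rather than taken from it:

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = var.cluster_name
  cluster_version = "1.18"
  subnets         = module.vpc.private_subnets
  vpc_id          = module.vpc.vpc_id

  # If Terraform runs outside the VPC, the public endpoint must stay enabled;
  # it can optionally be restricted to known CIDRs.
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["0.0.0.0/0"]
}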

Environment

$ terraform version
Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/aws v3.11.0
+ provider registry.terraform.io/hashicorp/helm v1.3.1
+ provider registry.terraform.io/hashicorp/kubernetes v1.13.2
+ provider registry.terraform.io/hashicorp/local v2.0.0
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/random v3.0.0
+ provider registry.terraform.io/hashicorp/template v2.2.0

Setup

  • The UI lists the clusters

(Screenshot: the EKS console listing both clusters)

  • I can also list them from the command line
$ aws eks list-clusters
{
    "clusters": [
        "eks-prd-super-cash-example-com",
        "eks-ppd-super-cash-example-com"
    ]
}

Missing step to install the authenticator

ATTENTION: The article doesn't mention installing the aws-iam-authenticator (a provider-level alternative is sketched at the end of this section)

  • All the generated kubeconfig files depend on the authenticator
$ kubectl get pods --all-namespaces
Unable to connect to the server: getting credentials: exec: exec: "aws-iam-authenticator": executable file not found in $PATH

$ brew install aws-iam-authenticator
  • Just got the list of files
$ ls -la kubeconfig_eks-p*
-rw-r--r--  1 marcellodesales  staff  2056 Oct 18 01:52 kubeconfig_eks-ppd-super-cash-example-com
-rw-r--r--  1 marcellodesales  staff  2056 Oct 18 01:51 kubeconfig_eks-prd-super-cash-example-com

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE
default       ingress-aws-alb-ingress-controller-6ccd59df99-8lsvh   0/1     Pending   0          29m
kube-system   coredns-59dcf49c5-5wkkf                               0/1     Pending   0          32m
kube-system   coredns-59dcf49c5-hbqtl                               0/1     Pending   0          32m
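For reference, the generated kubeconfig authenticates through an exec plugin, which is why kubectl needs aws-iam-authenticator on the PATH. On the Terraform side, the kubernetes/helm providers can avoid that dependency entirely by taking a token from the aws_eks_cluster_auth data source; a rough sketch (the eks module name and this wiring are assumptions, not necessarily what the article does):

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}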

Other changes made to the original

  • Changed Kubernetes version from 1.17 to 1.18
  • Changed the subnets so private subnets use odd third octets and public subnets use even ones. Not sure if that would affect access...
  private_subnets      = ["172.16.1.0/24", "172.16.3.0/24", "172.16.5.0/24"]
  public_subnets       = ["172.16.2.0/24", "172.16.4.0/24", "172.16.6.0/24"]

API server SSL certs might be wrong

  • I'm not sure if the problem is related to the certs. Even though the error says unreachable, curl can't verify the server certificate either (see the note after the curl output):
$ curl -v  https://DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps
*   Trying 54.207.147.62...
* TCP_NODELAY set
* Connected to DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com (54.207.147.62) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
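That curl failure is arguably expected rather than a sign of a bad certificate: the API server cert is issued by the cluster's own CA, not a public one, so curl without that CA bundle will always report "unknown CA". Terraform can read the CA straight from the cluster; a small sketch (cluster name taken from the list above):

# The API server certificate is signed by the cluster's private CA;
# this data source exposes that CA so clients can verify the endpoint.
data "aws_eks_cluster" "prd" {
  name = "eks-prd-super-cash-example-com"
}

output "prd_cluster_ca" {
  description = "Base64-encoded cluster CA certificate"
  value       = data.aws_eks_cluster.prd.certificate_authority[0].data
}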

Thank you
Marcello


marcellodesales commented Oct 18, 2020

t2.micro problems - 0/1 nodes are available: 1 Too many pods.

  • When creating a dev cluster I got 0/1 nodes are available: 1 Too many pods, even though there's an autoscaling group for the cluster. Not sure of the reason; changing the instance type to t2.medium resolved it (see the sketch after the pod description below).
$ kubectl get pods
NAME                                                READY   STATUS    RESTARTS   AGE
ingress-aws-alb-ingress-controller-66f95d8d-v9n6m   0/1     Pending   0          114s

(shell prompt: kube context eks_eks-ppd-super-cash-example-com, working directory ~/dev/github.com/k-mitevski/terraform-k8s/06_terraform_envs_customised/environments/ppd)
$ kubectl describe pod ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Name:           ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/instance=ingress
                app.kubernetes.io/name=aws-alb-ingress-controller
                pod-template-hash=66f95d8d
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/ingress-aws-alb-ingress-controller-66f95d8d
Containers:
  aws-alb-ingress-controller:
    Image:      docker.io/amazon/aws-alb-ingress-controller:v1.1.8
    Port:       10254/TCP
    Host Port:  0/TCP
    Args:
      --cluster-name=eks-ppd-super-cash-example-com
      --ingress-class=alb
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ingress-aws-alb-ingress-controller-token-bgv6p (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  ingress-aws-alb-ingress-controller-token-bgv6p:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-aws-alb-ingress-controller-token-bgv6p
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  35s (x5 over 2m6s)  default-scheduler  0/1 nodes are available: 1 Too many pods.
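For context, a t2.micro node only fits a handful of pods (the ENI/IP address limit caps it at 4), which likely explains the Too many pods message once the system daemonsets, CoreDNS, and the ingress controller all want a slot. The instance-type change maps to the node group definition in the EKS module; a minimal sketch (group name, sizes, and key names are assumptions for the module version the article uses):

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = "eks-ppd-super-cash-example-com"
  cluster_version = "1.18"
  subnets         = module.vpc.private_subnets
  vpc_id          = module.vpc.vpc_id

  node_groups = {
    first = {
      desired_capacity = 1
      max_capacity     = 3
      min_capacity     = 1
      instance_type    = "t2.medium" # t2.micro only allows ~4 pods per node
    }
  }
}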

Missing the installation of the cluster autoscaler
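Since the helm provider is already part of the stack, the autoscaler could be installed the same way as the ingress controller; a rough sketch (chart repository, values, and the extra IAM permissions/ASG tags it needs are assumptions):

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  namespace  = "kube-system"

  set {
    name  = "autoDiscovery.clusterName"
    value = "eks-ppd-super-cash-example-com"
  }

  set {
    name  = "awsRegion"
    value = "sa-east-1"
  }
}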

marcellodesales (Author) commented

Error when Re-running

  • Just got this error after re-running:
Error: error creating EKS Node Group (eks-ppd-super-cash-example-com:eks-ppd-super-cash-example-com-first-grand-primate): InvalidParameterException: Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared
{
  RespMetadata: {
    StatusCode: 400,
    RequestID: "249ff5ae-e506-40aa-a56f-ecc3441e856e"
  },
  ClusterName: "eks-ppd-super-cash-example-com",
  Message_: "Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared",
  NodegroupName: "eks-ppd-super-cash-example-com-first-grand-primate"
}
  • I noticed that the subnet tags were not prefixed with eks-, while the actual cluster name is; I hard-coded the prefix to fix it (a way to keep the two in sync is sketched after the diff below).

FROM

  public_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                    = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"           = "1"
  }

TO

  public_subnet_tags = {
    "kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
    "kubernetes.io/role/elb"                        = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
    "kubernetes.io/role/internal-elb"               = "1"
  }
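To keep the subnet tags and the actual cluster name from drifting apart again, the name could be defined once and referenced from both places; a sketch assuming env_domain already exists in the environment's locals:

locals {
  # env_domain is assumed to already be defined for the environment,
  # e.g. "ppd-super-cash-example-com"
  cluster_name = "eks-${local.env_domain}"
}

# The EKS module then uses cluster_name = local.cluster_name and the subnet
# tags use "kubernetes.io/cluster/${local.cluster_name}" = "shared", so the
# tags can never fall out of sync with the cluster's real name.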
