
Submariner is not working for LKE #2489

Closed
tamalsaha opened this issue May 22, 2023 · 13 comments
@tamalsaha

I have created 3 clusters in Linode LKE: (1) a hub cluster for the broker, (2) a us-east cluster, and (3) a us-west cluster. I have collected the output of the following commands in the git repo below:

calicoctl get ippools -o wide --allow-version-mismatch

subctl diagnose all
subctl show all
subctl gather

Logs: https://github.com/tamalsaha/lke-submariner

Can you please help me make this work? Thanks a lot!

@yboaron yboaron added the Calico label May 22, 2023
@yboaron yboaron self-assigned this May 22, 2023

tamalsaha commented May 22, 2023

I think we might have found the source of the issue from Yossi's comment below:

https://kubernetes.slack.com/archives/C010RJV694M/p1684767007974559?thread_ts=1684727924.394349&cid=C010RJV694M

I see that Submariner does not discover the podCIDR correctly in my case.

findPodIPRangeKubeController does not work for me, because:

$ k get pods -n kube-system --show-labels

NAME                                       READY   STATUS    RESTARTS   AGE     LABELS
calico-kube-controllers-689746b4fc-zxmcb   1/1     Running   0          4h6m    k8s-app=calico-kube-controllers,pod-template-hash=689746b4fc
calico-node-7b64f                          1/1     Running   0          4h6m    controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
calico-node-dv5hn                          1/1     Running   0          4h6m    controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
calico-node-xvfz7                          1/1     Running   0          4h5m    controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
coredns-75fd9f59f7-74j6p                   1/1     Running   0          4h6m    k8s-app=kube-dns,pod-template-hash=75fd9f59f7
coredns-75fd9f59f7-rw28c                   1/1     Running   0          4h6m    k8s-app=kube-dns,pod-template-hash=75fd9f59f7
csi-linode-controller-0                    4/4     Running   0          4h6m    app=csi-linode-controller,controller-revision-hash=csi-linode-controller-857df897b9,role=csi-linode,statefulset.kubernetes.io/pod-name=csi-linode-controller-0
csi-linode-node-8p5r4                      2/2     Running   0          4h6m    app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
csi-linode-node-954m7                      2/2     Running   0          4h5m    app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
csi-linode-node-xmtgr                      2/2     Running   0          4h6m    app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
kube-proxy-kpkbk                           1/1     Running   0          4h6m    controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-wtczf                           1/1     Running   0          4h6m    controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-xh2p5                           1/1     Running   0          4h5m    controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
root-ssh-manager-7gplb                     1/1     Running   0          3h10m   app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1
root-ssh-manager-gqqfj                     1/1     Running   0          3h10m   app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1
root-ssh-manager-hqfms                     1/1     Running   0          3h10m   app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1

LKE is a managed cluster, so kube-controller-manager is not visible.

Then Submariner tries to find the podCIDR range from the kube-proxy pod's command[]. There are 2 problems here:

  1. The pod label selector component=kube-proxy is not present in my case.
  2. The command[] does not contain --cluster-cidr. Instead, kube-proxy mounts a ConfigMap that holds this data in YAML format.
$ k get pods -n kube-system kube-proxy-kpkbk -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-05-22T13:22:51Z"
  generateName: kube-proxy-
  labels:
    controller-revision-hash: 7fc48f5466
    k8s-app: kube-proxy
    pod-template-generation: "1"
  name: kube-proxy-kpkbk
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: kube-proxy
    uid: 197c7858-48f6-4ca6-b33a-cfa668a58c16
  resourceVersion: "709"
  uid: eda3a8f0-ad10-4c37-8978-a95260b59542
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - lke109600-163534-646b6c270f06
  containers:
  - command:
    - /usr/local/bin/kube-proxy
    - --config=/var/lib/kube-proxy/config.conf
    - --hostname-override=$(NODE_NAME)
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    image: linode/kube-proxy-amd64:v1.26.3
    imagePullPolicy: IfNotPresent
    name: kube-proxy
    resources: {}
    securityContext:
      privileged: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/kube-proxy
      name: kube-proxy
    - mountPath: /run/xtables.lock
      name: xtables-lock
    - mountPath: /lib/modules
      name: lib-modules
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-7jj2s
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: lke109600-163534-646b6c270f06
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: kube-proxy
  serviceAccountName: kube-proxy
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/pid-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - configMap:
      defaultMode: 420
      name: kube-proxy
    name: kube-proxy
  - hostPath:
      path: /run/xtables.lock
      type: FileOrCreate
    name: xtables-lock
  - hostPath:
      path: /lib/modules
      type: ""
    name: lib-modules
  - name: kube-api-access-7jj2s
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-22T13:22:54Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-22T13:22:57Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-22T13:22:57Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-22T13:22:51Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://68fbbc9b7141da97296a02b9ae49e3ff5733d1cc5ff61a8f40aada3e40e6523e
    image: docker.io/linode/kube-proxy-amd64:v1.26.3
    imageID: docker.io/linode/kube-proxy-amd64@sha256:9fd18772468841f2eb567b07ed7022cda7aab2365b4539d32a08c525d2d94c1a
    lastState: {}
    name: kube-proxy
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-22T13:22:57Z"
  hostIP: 192.168.148.168
  phase: Running
  podIP: 192.168.148.168
  podIPs:
  - ip: 192.168.148.168
  qosClass: BestEffort
  startTime: "2023-05-22T13:22:54Z"
$ k get cm -n kube-system kube-proxy -o yaml
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.2.0.0/16 # This must be the PodCIDR for LKE!
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      networkName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://67d96b4d-7bf7-4328-80ee-450b32e2830e.us-east-1.linodelke.net:443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  creationTimestamp: "2023-05-22T13:21:43Z"
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "303"
  uid: b95b5e9d-88eb-45c8-9656-c60ceb8cc8a8

Notice the clusterCIDR: 10.2.0.0/16 line (this must be the pod CIDR for LKE!).
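One possible workaround (a sketch only, not Submariner's actual code): read clusterCIDR straight from the kube-proxy ConfigMap. The sample config.conf line below is hard-coded from the dump above; on a live cluster it would come from the kubectl command in the comment:

```shell
# Sketch: parse clusterCIDR out of kube-proxy's config.conf.
# On a live cluster the input would come from:
#   kubectl get cm -n kube-system kube-proxy -o jsonpath='{.data.config\.conf}'
config_conf='clusterCIDR: 10.2.0.0/16 # This must be the PodCIDR for LKE!'
cidr=$(printf '%s\n' "$config_conf" | sed -n 's/^[[:space:]]*clusterCIDR:[[:space:]]*\([0-9./]*\).*/\1/p')
echo "cluster pod CIDR: $cidr"
```

This sidesteps both problems above: no dependency on the component=kube-proxy label, and no dependency on a --cluster-cidr flag in command[].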

Once this fails, the code tries to find the podCIDR range from the node spec, which has the following data:

apiVersion: v1
kind: Node
metadata:
  name: lke109600-163534-646b6c270f06
spec:
  podCIDR: 10.2.1.0/24
  podCIDRs:
  - 10.2.1.0/24
  providerID: linode://46358710
apiVersion: v1
kind: Node
metadata:
  name: lke109600-163534-646b6c272e97
spec:
  podCIDR: 10.2.0.0/24
  podCIDRs:
  - 10.2.0.0/24
  providerID: linode://46358711
apiVersion: v1
kind: Node
metadata:
  name: lke109600-163534-646b6c274e5a
spec:
  podCIDR: 10.2.2.0/24
  podCIDRs:
  - 10.2.2.0/24
  providerID: linode://46358712

So it seems that a node's spec.podCIDR only defines that node's podCIDR range, not the cluster-wide podCIDR. This also looks like a bug:
https://github.com/submariner-io/submariner-operator/blob/devel/pkg/discovery/network/generic.go#L182-L191

The Submariner code just returns the first node's podCIDR as the cluster podCIDR.
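To illustrate the mismatch, here is a minimal sketch using the three node podCIDRs above as hard-coded input; on a live cluster they would come from the kubectl command in the comment:

```shell
# Sketch: each node gets a /24 carved out of the cluster-wide 10.2.0.0/16,
# so any single node's spec.podCIDR under-reports the cluster pod CIDR.
# On a live cluster the node CIDRs would come from:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.spec.podCIDR}{"\n"}{end}'
cluster_cidr=10.2.0.0/16
for node_cidr in 10.2.1.0/24 10.2.0.0/24 10.2.2.0/24; do
  if [ "$node_cidr" != "$cluster_cidr" ]; then
    echo "$node_cidr is a per-node slice, not the cluster CIDR $cluster_cidr"
  fi
done
```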

I tried to override this by passing the pod CIDR to the subctl join command, but it is ignored with the following message:

$ subctl join --kubeconfig /home/tamal/Downloads/tamal-us-east-kubeconfig.yaml broker-info.subm \
                                       --clusterid cluster-us-east \
                                       --globalnet \
                                       --clustercidr 10.2.0.0/16
 ✓ broker-info.subm indicates broker is at https://246aea18-562e-400f-989d-32f9c7e85b49.cpc1-us-central.linodelke.net:443
 ✓ Discovering network details 
        Network plugin:  calico
        Service CIDRs:   [10.128.0.0/16]
        Cluster CIDRs:   [10.2.1.0/24]
 ⚠ The provided pod CIDR for the cluster (10.2.0.0/16) does not match the discovered CIDR (10.2.1.0/24)
 ✓ Retrieving the gateway nodes
 ✓ Retrieving all worker nodes 

Proposal:

  1. Fix the obvious bugs/issues/limitations in the auto network discovery code.
  2. For the Calico CNI, the code should also check spec.cidr of the default-ipv4-ippool IPPool:
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  creationTimestamp: "2023-05-22T13:23:10Z"
  generation: 1
  name: default-ipv4-ippool
  resourceVersion: "863"
  uid: f03a4a24-da85-4bbc-a79e-e71f92a94b9c
spec:
  allowedUses:
  - Workload
  - Tunnel
  blockSize: 26
  cidr: 10.2.0.0/16
  ipipMode: Always
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Never
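A minimal sketch of what proposal (2) could look like, parsing the IPPool spec shown above (the kubectl command in the comment is how it would be fetched on a live cluster):

```shell
# Sketch of proposal (2): read the cluster pod CIDR from Calico's default
# IPPool. On a live cluster:
#   kubectl get ippools.crd.projectcalico.org default-ipv4-ippool -o jsonpath='{.spec.cidr}'
# Here we parse a fragment of the spec dumped above instead.
ippool_yaml='spec:
  blockSize: 26
  cidr: 10.2.0.0/16
  ipipMode: Always'
pool_cidr=$(printf '%s\n' "$ippool_yaml" | sed -n 's/^[[:space:]]*cidr:[[:space:]]*\([0-9./]*\).*/\1/p')
echo "IPPool CIDR: $pool_cidr"
```

This agrees with the clusterCIDR in the kube-proxy ConfigMap (10.2.0.0/16), unlike any single node's spec.podCIDR.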

@tamalsaha

After passing --clustercidr 10.2.0.0/16 to the subctl join command, the basic curl nginx.nginx-test.svc.clusterset.local:8080 command works. 🎉

But when I run the full test suite, some tests are still failing. You can see it here: https://github.com/tamalsaha/lke-submariner/blob/master/e2e-tests.md

Can you help me fix those tests?


yboaron commented May 22, 2023

@tamalsaha Very nice explanation of the problem, description of the root cause, and even possible solutions. Well done!

As for the subctl verify failures, it appears that some TCP connectivity tests have failed. Could you please run subctl verify after changing the Calico default-ipv4-ippool to VXLAN overlay on both clusters?
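A sketch of one way to make that switch, assuming kubectl access to the Calico IPPool CRD (the commands are printed here rather than executed, and the resource and label names are taken from elsewhere in this thread):

```shell
# Sketch only (not executed here): switch the Calico default IPPool from IPIP
# to VXLAN overlay, then restart the Submariner route agents so they pick up
# the change. calicoctl apply on an edited IPPool would work as well.
cmds=$(cat <<'EOF'
kubectl patch ippools.crd.projectcalico.org default-ipv4-ippool --type=merge \
  -p '{"spec":{"ipipMode":"Never","vxlanMode":"Always"}}'
kubectl delete pod -n submariner-operator -l app=submariner-routeagent
EOF
)
echo "$cmds"
```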

@tamalsaha

Hello, I tried to switch to VXLAN mode by editing the IPPool, but then even curl to a local nginx service no longer works.


yboaron commented May 23, 2023

OK. After setting the Calico overlay to VXLAN, please restart all Submariner route-agent pods (you can use the command in [1]) and retry.

If that doesn't help, set the Calico overlay back to IPIP, restart all Submariner route-agent pods, and attach the subctl gather output.

[1]
kubectl delete pod -n submariner-operator -l app=submariner-routeagent

@tamalsaha

Hi,
I tried as you suggested, but VXLAN is not working. I have attached the output of the subctl gather command.

submariner-20230523090055.zip


yboaron commented Aug 2, 2023

Hi @tamalsaha , sorry for the late response,

any update on this ?

@tamalsaha

I am waiting to hear from you. I was not able to get it to work on LKE.


yboaron commented Aug 2, 2023

Ack. I checked the logs and they seem OK.

I noticed that only the connectivity tests (in subctl verify) between pods running on non-GW nodes failed. Submariner uses VXLAN (UDP port 4800) for intra-cluster communication between the GW node and other nodes; could you please check whether UDP port 4800 is allowed between nodes?

Also, could you please add the --packet-size 500 parameter when you run the subctl verify command (to verify that you aren't hitting an MTU issue)?


yboaron commented Aug 2, 2023

Sorry, there is no need to check UDP port 4800; it was covered by subctl diagnose [1] and seems OK. Just try adding the --packet-size 500 parameter to the subctl verify command.

[1]
The firewall configuration allows intra-cluster VXLAN traffic

@sridhargaddam

This issue seems to be similar to #2660.


eremcan commented Dec 25, 2023

Is there any update so far? I am stuck at the same place. @tamalsaha


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further
activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Apr 24, 2024
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) May 1, 2024