Submariner is not working for LKE #2489
Comments
I think we might have found the source of the issue from Yossi's comment below: I see that Submariner does not discover the podCIDR correctly in my case.
$ k get pods -n kube-system --show-labels
NAME READY STATUS RESTARTS AGE LABELS
calico-kube-controllers-689746b4fc-zxmcb 1/1 Running 0 4h6m k8s-app=calico-kube-controllers,pod-template-hash=689746b4fc
calico-node-7b64f 1/1 Running 0 4h6m controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
calico-node-dv5hn 1/1 Running 0 4h6m controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
calico-node-xvfz7 1/1 Running 0 4h5m controller-revision-hash=7cf99d9574,k8s-app=calico-node,pod-template-generation=1
coredns-75fd9f59f7-74j6p 1/1 Running 0 4h6m k8s-app=kube-dns,pod-template-hash=75fd9f59f7
coredns-75fd9f59f7-rw28c 1/1 Running 0 4h6m k8s-app=kube-dns,pod-template-hash=75fd9f59f7
csi-linode-controller-0 4/4 Running 0 4h6m app=csi-linode-controller,controller-revision-hash=csi-linode-controller-857df897b9,role=csi-linode,statefulset.kubernetes.io/pod-name=csi-linode-controller-0
csi-linode-node-8p5r4 2/2 Running 0 4h6m app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
csi-linode-node-954m7 2/2 Running 0 4h5m app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
csi-linode-node-xmtgr 2/2 Running 0 4h6m app=csi-linode-node,controller-revision-hash=d59df455b,pod-template-generation=1,role=csi-linode
kube-proxy-kpkbk 1/1 Running 0 4h6m controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-wtczf 1/1 Running 0 4h6m controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-xh2p5 1/1 Running 0 4h5m controller-revision-hash=7fc48f5466,k8s-app=kube-proxy,pod-template-generation=1
root-ssh-manager-7gplb 1/1 Running 0 3h10m app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1
root-ssh-manager-gqqfj 1/1 Running 0 3h10m app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1
root-ssh-manager-hqfms 1/1 Running 0 3h10m app=root-ssh-manager,controller-revision-hash=5446488d7b,pod-template-generation=1

LKE is a managed cluster, so kube-controller-manager is not visible.
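A quick way to confirm this (assuming the pods would carry the upstream component=kube-controller-manager label; on a managed offering like LKE this should come back empty):

# No controller-manager pod is visible on LKE, so this lists nothing:
$ k get pods -n kube-system -l component=kube-controller-manager

With that ruled out, the next place to look is the kube-proxy pod: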
$ k get pods -n kube-system kube-proxy-kpkbk -o yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2023-05-22T13:22:51Z"
generateName: kube-proxy-
labels:
controller-revision-hash: 7fc48f5466
k8s-app: kube-proxy
pod-template-generation: "1"
name: kube-proxy-kpkbk
namespace: kube-system
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: DaemonSet
name: kube-proxy
uid: 197c7858-48f6-4ca6-b33a-cfa668a58c16
resourceVersion: "709"
uid: eda3a8f0-ad10-4c37-8978-a95260b59542
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- lke109600-163534-646b6c270f06
containers:
- command:
- /usr/local/bin/kube-proxy
- --config=/var/lib/kube-proxy/config.conf
- --hostname-override=$(NODE_NAME)
env:
- name: NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
image: linode/kube-proxy-amd64:v1.26.3
imagePullPolicy: IfNotPresent
name: kube-proxy
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/kube-proxy
name: kube-proxy
- mountPath: /run/xtables.lock
name: xtables-lock
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7jj2s
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostNetwork: true
nodeName: lke109600-163534-646b6c270f06
preemptionPolicy: PreemptLowerPriority
priority: 2000001000
priorityClassName: system-node-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: kube-proxy
serviceAccountName: kube-proxy
terminationGracePeriodSeconds: 30
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- operator: Exists
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/disk-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/pid-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/unschedulable
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/network-unavailable
operator: Exists
volumes:
- configMap:
defaultMode: 420
name: kube-proxy
name: kube-proxy
- hostPath:
path: /run/xtables.lock
type: FileOrCreate
name: xtables-lock
- hostPath:
path: /lib/modules
type: ""
name: lib-modules
- name: kube-api-access-7jj2s
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2023-05-22T13:22:54Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2023-05-22T13:22:57Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2023-05-22T13:22:57Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2023-05-22T13:22:51Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://68fbbc9b7141da97296a02b9ae49e3ff5733d1cc5ff61a8f40aada3e40e6523e
image: docker.io/linode/kube-proxy-amd64:v1.26.3
imageID: docker.io/linode/kube-proxy-amd64@sha256:9fd18772468841f2eb567b07ed7022cda7aab2365b4539d32a08c525d2d94c1a
lastState: {}
name: kube-proxy
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2023-05-22T13:22:57Z"
hostIP: 192.168.148.168
phase: Running
podIP: 192.168.148.168
podIPs:
- ip: 192.168.148.168
qosClass: BestEffort
startTime: "2023-05-22T13:22:54Z"
$ k get cm -n kube-system kube-proxy -o yaml
apiVersion: v1
data:
config.conf: |-
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
acceptContentTypes: ""
burst: 0
contentType: ""
kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
qps: 0
clusterCIDR: 10.2.0.0/16 # This must be the PodCIDR for LKE!
configSyncPeriod: 0s
conntrack:
maxPerCore: null
min: null
tcpCloseWaitTimeout: null
tcpEstablishedTimeout: null
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: ""
iptables:
masqueradeAll: false
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: ""
strictARP: false
syncPeriod: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: ""
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
udpIdleTimeout: 0s
winkernel:
enableDSR: false
networkName: ""
sourceVip: ""
kubeconfig.conf: |-
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
server: https://67d96b4d-7bf7-4328-80ee-450b32e2830e.us-east-1.linodelke.net:443
name: default
contexts:
- context:
cluster: default
namespace: default
user: default
name: default
current-context: default
users:
- name: default
user:
tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
creationTimestamp: "2023-05-22T13:21:43Z"
labels:
app: kube-proxy
name: kube-proxy
namespace: kube-system
resourceVersion: "303"
uid: b95b5e9d-88eb-45c8-9656-c60ceb8cc8a8

Notice the clusterCIDR above: on LKE it lives in the kube-proxy ConfigMap rather than being passed as a --cluster-cidr command-line flag, so Submariner's discovery does not pick it up. Once this fails, the code tries to find the podCIDR range from the node spec. So it seems that the Submariner code just returns the first node's podCIDR as the cluster podCIDR. I tried to override this by passing the podCIDR explicitly (see the sketch below).
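For illustration, here is a hedged sketch of both the fallback source and the override. The jsonpath call simply lists each node's podCIDR; the subctl join flags, cluster ID, and broker-info file name are assumptions based on standard subctl usage, not the exact command used in this setup:

# Per-node podCIDR values, i.e. what the node-spec fallback sees:
$ k get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

# Hypothetical override at join time, using the cluster-wide CIDR from the kube-proxy ConfigMap:
$ subctl join broker-info.subm --clusterid lke-us-east --clustercidr 10.2.0.0/16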
Proposal:
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
creationTimestamp: "2023-05-22T13:23:10Z"
generation: 1
name: default-ipv4-ippool
resourceVersion: "863"
uid: f03a4a24-da85-4bbc-a79e-e71f92a94b9c
spec:
allowedUses:
- Workload
- Tunnel
blockSize: 26
cidr: 10.2.0.0/16
ipipMode: Always
natOutgoing: true
nodeSelector: all()
vxlanMode: Never
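For reference, the pool's CIDR can be read straight from the Calico CRD, which is roughly what this proposal amounts to (a sketch only):

# Returns 10.2.0.0/16, per the IPPool spec above:
$ k get ippools.crd.projectcalico.org default-ipv4-ippool -o jsonpath='{.spec.cidr}'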
After passing the podCIDR explicitly, things improved, but when I run the full test suite, some tests are still failing. You can see them here: https://github.com/tamalsaha/lke-submariner/blob/master/e2e-tests.md Can you help me fix those tests?
@tamalsaha Very nice explanation of the problem, description of the root cause, and even possible solutions, well done! As for the subctl verify failures, it appears that some TCP connectivity tests have failed.
Hello, I have tried to switch to VXLAN mode by editing the IPPool, but if I do that then even curl to a local nginx service no longer works.
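(For reference, switching an existing pool to VXLAN can be done with a patch along these lines; this is a sketch, not necessarily the exact edit that was made:)

$ k patch ippools.crd.projectcalico.org default-ipv4-ippool --type merge \
    -p '{"spec":{"ipipMode":"Never","vxlanMode":"Always"}}'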
OK. In case that didn't help, set the Calico overlay back to IPIP, restart all submariner route-agent pods, and attach the subctl gather output [1].
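A sketch of those steps, assuming a default Submariner install (the namespace and DaemonSet name may differ by version):

# Revert the pool to IPIP, restart the route agents, then collect diagnostics:
$ k patch ippools.crd.projectcalico.org default-ipv4-ippool --type merge \
    -p '{"spec":{"ipipMode":"Always","vxlanMode":"Never"}}'
$ k -n submariner-operator rollout restart daemonset submariner-routeagent
$ subctl gather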
Hi,
Hi @tamalsaha, sorry for the late response, any update on this?
I am waiting to hear from you. I was not able to get it to work on LKE. |
Ack. Also, could you please add a check that UDP port 4800 is open?
Sorry, there is no need to check UDP port 4800; it was covered by 'subctl diagnose' [1] and seems OK. Just try adding the --packet-size 500 parameter to the subctl verify command [1].
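A hedged example of the suggested invocation (the context names are placeholders and the exact verify flags depend on the subctl version in use):

$ subctl verify --context lke-us-east --tocontext lke-us-west --only connectivity --packet-size 500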
This issue seems to be similar to #2660.
Is there any update so far? I got stuck at the same place. @tamalsaha
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
I have created 3 clusters in Linode / LKE: (1) a hub cluster for the broker, (2) a us-east cluster, and (3) a us-west cluster. I have collected the output of the relevant commands in the git repo below:
Logs: https://github.com/tamalsaha/lke-submariner
Can you please help me make this work? Thanks a lot!
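For context, the overall shape of such a deployment usually looks like the following; the cluster IDs and broker-info file name are placeholders, and the exact commands and their output are in the repo linked above:

# On the hub/broker cluster:
$ subctl deploy-broker

# On each workload cluster, using the broker-info.subm produced above:
$ subctl join broker-info.subm --clusterid lke-us-east
$ subctl join broker-info.subm --clusterid lke-us-west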