
Virtual-kubelet skipping sync pod #14

Open
lmq1999 opened this issue Jul 23, 2020 · 34 comments

@lmq1999

lmq1999 commented Jul 23, 2020

I have installed virtual-kubelet and run it, but it keeps skipping pods like this:

root@controller:~# virtual-kubelet 
ERRO[0000] TLS certificates not provided, not setting up pod http server  certPath= keyPath= node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
INFO[0000] Initialized                                   node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
INFO[0000] Pod cache in-sync                             node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
INFO[0000] starting workers                              node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
INFO[0000] started workers                               node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
INFO[0000] Updated pod in provider                       key=kube-system/kube-proxy-zv842 method=createOrUpdatePod name=kube-proxy-zv842 namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-zv842 provider=openstack reason= uid=20cafefa-3f14-45d3-98dc-43cf31a3d94a watchedNamespace= workerId=0
WARN[0005] skipping sync of pod "kube-system/kube-proxy-zv842" in "Failed" phase  key=kube-system/kube-proxy-zv842 method=syncPodInProvider name=kube-proxy-zv842 namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Failed provider=openstack reason= uid=20cafefa-3f14-45d3-98dc-43cf31a3d94a watchedNamespace= workerId=9
INFO[0007] Created pod in provider                       key=kube-system/kube-proxy-zb9xz method=createOrUpdatePod name=kube-proxy-zb9xz namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-zb9xz provider=openstack reason= uid=2146469d-5c85-4e20-b2b9-c58dc00b1926 watchedNamespace= workerId=2
INFO[0010] Updated pod in provider                       key=kube-system/kube-proxy-zb9xz method=createOrUpdatePod name=kube-proxy-zb9xz namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-zb9xz provider=openstack reason= uid=2146469d-5c85-4e20-b2b9-c58dc00b1926 watchedNamespace= workerId=3
WARN[0020] skipping sync of pod "kube-system/kube-proxy-zb9xz" in "Failed" phase  key=kube-system/kube-proxy-zb9xz method=syncPodInProvider name=kube-proxy-zb9xz namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Failed provider=openstack reason= uid=2146469d-5c85-4e20-b2b9-c58dc00b1926 watchedNamespace= workerId=6
INFO[0024] Created pod in provider                       key=kube-system/kube-proxy-k6tvn method=createOrUpdatePod name=kube-proxy-k6tvn namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-k6tvn provider=openstack reason= uid=2e46dde1-289f-4497-9764-aaa6e8ad084e watchedNamespace= workerId=8
INFO[0025] Updated pod in provider                       key=kube-system/kube-proxy-k6tvn method=createOrUpdatePod name=kube-proxy-k6tvn namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-k6tvn provider=openstack reason= uid=2e46dde1-289f-4497-9764-aaa6e8ad084e watchedNamespace= workerId=0
WARN[0035] skipping sync of pod "kube-system/kube-proxy-k6tvn" in "Failed" phase  key=kube-system/kube-proxy-k6tvn method=syncPodInProvider name=kube-proxy-k6tvn namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Failed provider=openstack reason= uid=2e46dde1-289f-4497-9764-aaa6e8ad084e watchedNamespace= workerId=9
INFO[0039] Created pod in provider                       key=kube-system/kube-proxy-lszzc method=createOrUpdatePod name=kube-proxy-lszzc namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-lszzc provider=openstack reason= uid=5025454f-a95c-4e71-82ec-d385fb3a8354 watchedNamespace= workerId=2
INFO[0040] Updated pod in provider                       key=kube-system/kube-proxy-lszzc method=createOrUpdatePod name=kube-proxy-lszzc namespace=kube-system node=virtual-kubelet operatingSystem=Linux phase=Pending pod=kube-proxy-lszzc provider=openstack reason= uid=5025454f-a95c-4e71-82ec-d385fb3a8354 watchedNamespace= workerId=3

I installed the k8s master on the OpenStack controller node:

root@controller:~# kubectl get nodes
NAME              STATUS   ROLES    AGE     VERSION
controller        Ready    master   16m     v1.18.6
virtual-kubelet   Ready    agent    3m59s   v1.14.3

So how do I deal with this problem?

@hongbin
Collaborator

hongbin commented Jul 23, 2020

@lmq1999 the failed pod is the kube-proxy pod created by k8s itself. Since we are using a virtual node, we don't need the kube-proxy process, so that failed pod is not important. However, let me know if you create a normal pod and it fails.
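
If the kube-proxy noise bothers you, one rough option (assuming the virtual node is registered with the usual type=virtual-kubelet label; check with "kubectl get node virtual-kubelet --show-labels") is to add a node affinity to the kube-proxy DaemonSet so it skips virtual nodes, for example:

kubectl patch daemonset kube-proxy -n kube-system -p '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"type","operator":"NotIn","values":["virtual-kubelet"]}]}]}}}}}}}'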

@lmq1999
Author

lmq1999 commented Jul 27, 2020

Hmmm, I run into problems with normal pods as well.

I tried 2 pods:

The 1st is the one in the README.md of this repo.

The 2nd is kubia.yaml from Kubernetes in Action (Luksa): https://github.com/luksa/kubernetes-in-action

Normally they would be scheduled to a k8s worker node, but here they get stuck in creating:

NAMESPACE     NAME                                 READY   STATUS     RESTARTS   AGE
default       kubia-9j5wn                          0/1     Pending    0          93s
default       myapp-pod                            0/1     Creating   0          5m6s

myapp-pod has already been creating for 5 minutes.

Debug-mode output:

...KubeletHasSufficientDisk kubelet has sufficient disk space available} {MemoryPressure False 2020-07-27 03:58:21 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2020-07-27 03:58:21 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {NetworkUnavailable False 2020-07-27 03:58:21 +0000 UTC 2020-07-27 03:50:30 +0000 UTC RouteCreated RouteController created a route}]" node.UID=0979c831-9910-4a7a-8658-4b7ea979aa84 node.cluster= node.name=virtual-kubelet node.resourceVersion=1926 node.taints="virtual-kubelet.io/provider=openstack:NoSchedule" operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0471] Successful node ping                          node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0471] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0471] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0471] Pod status update loop start                  node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0476] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0476] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0476] Pod status update loop start                  node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0480] Got queue object                              method=handleQueueItem node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=0
DEBU[0480] sync handled                                  key=default/myapp-pod method=syncHandler node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=0
DEBU[0480] Got queue object                              method=handleQueueItem node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=4
DEBU[0480] sync handled                                  key=kube-system/kube-proxy-4fzx9 method=syncHandler node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=4
DEBU[0480] Processed queue item                          method=handleQueueItem node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=0
DEBU[0480] Processed queue item                          method=handleQueueItem node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace= workerId=4
DEBU[0481] got node from api server                      method=UpdateNodeStatus node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0481] updated node status in api server             method=UpdateNodeStatus node=virtual-kubelet node.Status.Conditions="[{Ready True 2020-07-27 03:58:31 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletReady kubelet is ready.} {OutOfDisk False 2020-07-27 03:58:31 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletHasSufficientDisk kubelet has sufficient disk space available} {MemoryPressure False 2020-07-27 03:58:31 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2020-07-27 03:58:31 +0000 UTC 2020-07-27 03:50:30 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {NetworkUnavailable False 2020-07-27 03:58:31 +0000 UTC 2020-07-27 03:50:30 +0000 UTC RouteCreated RouteController created a route}]" node.UID=0979c831-9910-4a7a-8658-4b7ea979aa84 node.cluster= node.name=virtual-kubelet node.resourceVersion=1949 node.taints="virtual-kubelet.io/provider=openstack:NoSchedule" operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0481] Successful node ping                          node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0481] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0481] Skipping pod status update                    method=syncProviderWrapper.syncPodStatuses nPods=2 node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=
DEBU[0481] Pod status update loop start                  node=virtual-kubelet operatingSystem=Linux provider=openstack watchedNamespace=

I created the k8s master node using kubeadm as well:

kubeadm init --apiserver-advertise-address 192.168.122.146 --pod-network-cidr=10.10.10.0/24

The 10.10.10.0/24 range is the same as the external network subnet from OpenStack:

root@controller:~# openstack subnet list
+--------------------------------------+-------------+--------------------------------------+------------------+
| ID                                   | Name        | Network                              | Subnet           |
+--------------------------------------+-------------+--------------------------------------+------------------+
| 5c04d6e5-7f35-4331-a9e8-7a5f23f5a846 | selfservice | a75c38ec-b7ff-4d14-8236-ef68e8a90ec4 | 10.10.10.0/24    |
| e01c75b7-ba01-4622-a1a3-305a368fa2e1 | provider    | 5a97816d-c2cc-4c0c-86e2-f6f91cc65431 | 192.168.122.0/24 |
+--------------------------------------+-------------+--------------------------------------+------------------+

@hongbin
Collaborator

hongbin commented Jul 27, 2020

@lmq1999, several things to check (example commands follow the list):

  • whether Zun itself is functioning correctly: https://docs.openstack.org/zun/latest/install/verify.html
  • whether the pod is created in Zun (use the command "openstack capsule list")
  • if the pod is not created in Zun, whether virtual-kubelet can communicate with Zun
  • if the pod is created but fails, whether there are any error messages in the logs of the Zun processes (zun-api, zun-compute)
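
A rough sketch of those checks from the controller node (the journalctl unit names are an assumption about how Zun is deployed; adjust to your setup):

root@controller:~# openstack appcontainer service list        # is zun-compute up?
root@controller:~# openstack capsule list                      # was a capsule created for the pod?
root@controller:~# openstack capsule show <capsule-name>       # status and status_reason if it failed
root@controller:~# journalctl -u zun-api -u zun-compute | tail -n 50   # provider-side errors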

@lmq1999
Author

lmq1999 commented Jul 27, 2020

Zun runs correctly.

The pod is created in Zun but fails:
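
(The capsule details below are presumably from something like the following; the capsule name is taken from the "name" field in the output.)

root@controller:~# openstack capsule show default-myapp-pod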

+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| containers      | [{'uuid': '2744d973-e666-4354-ab6b-bbc195a04331', 'name': 'capsule-default-myapp-pod-eta-13', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'image': 'busybox', 'cpu': None, 'cpu_policy': 'shared', 'memory': None, 'command': ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600'], 'status': 'Creating', 'status_reason': None, 'task_state': None, 'environment': {'KUBERNETES_PORT': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP_ADDR': '10.96.0.1', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_SERVICE_HOST': '10.96.0.1', 'KUBERNETES_SERVICE_PORT': '443', 'KUBERNETES_SERVICE_PORT_HTTPS': '443'}, 'workdir': None, 'auto_remove': False, 'ports': [], 'hostname': None, 'labels': {}, 'addresses': {}, 'restart_policy': {'MaximumRetryCount': '0', 'Name': 'always'}, 'status_detail': None, 'interactive': False, 'tty': False, 'image_driver': None, 'security_groups': [], 'disk': 0, 'auto_heal': False, 'healthcheck': {}, 'registry_id': None, 'entrypoint': []}] |
| init_containers | []                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| uuid            | 0eeda473-16c4-4eb5-9ad8-376bb8673264                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| links           | [{'href': 'http://controller:9517/v1/capsules/0eeda473-16c4-4eb5-9ad8-376bb8673264', 'rel': 'self'}, {'href': 'http://controller:9517/capsules/0eeda473-16c4-4eb5-9ad8-376bb8673264', 'rel': 'bookmark'}]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| name            | default-myapp-pod                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| project_id      | 73f9f1d50afe40829c65d17cafa42353                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| user_id         | 157a1532cf2e4cdd8ce73a6982153ef2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| cpu             | 0.0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| memory          | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| status          | Error                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| status_reason   | No 'zun.vif_translators' driver found, looking for 'bridge'                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| labels          | {'ClusterName': '', 'CreationTimestamp': '2020-07-27 07:03:01 +0000 UTC', 'Namespace': 'default', 'NodeName': 'virtual-kubelet', 'PodName': 'myapp-pod', 'UID': '6e8c4abc-e862-4d36-9f7f-3e672fc1eec5'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| addresses       | 10.10.10.137                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| restart_policy  | {'MaximumRetryCount': '0', 'Name': 'always'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| annotations     | None                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| created_at      | 2020-07-27 07:04:08                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| updated_at      | 2020-07-27 07:04:19                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| networks        | a75c38ec-b7ff-4d14-8236-ef68e8a90ec4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

zun-compute logs:

2020-07-27 07:13:20.723 6394 WARNING stevedore.named [req-aed2264b-16e7-4da3-8e0c-7dba5d2fa23f - - - - -] Could not load bridge
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager [req-aed2264b-16e7-4da3-8e0c-7dba5d2fa23f - - - - -] Unexpected exception: No 'zun.vif_translators' driver found, looking for 'bridge': stevedore.exception.NoMatches: No 'zun.vif_translators' driver found, looking for 'bridge'
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager Traceback (most recent call last):
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/network/os_vif_util.py", line 236, in neutron_to_osvif_vif
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     mgr = _VIF_MANAGERS[vif_translator]
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager KeyError: 'bridge'
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager 
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager During handling of the above exception, another exception occurred:
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager 
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager Traceback (most recent call last):
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/compute/manager.py", line 368, in _do_container_create_base
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     requested_volumes)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 52, in create_capsule
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     self._create_pod_sandbox(context, capsule, requested_networks)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 79, in _create_pod_sandbox
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     self._write_cni_metadata(context, capsule, requested_networks)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 116, in _write_cni_metadata
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     subnets)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/network/os_vif_util.py", line 240, in neutron_to_osvif_vif
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     name=vif_translator, invoke_on_load=False)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/lib/python3/dist-packages/stevedore/driver.py", line 61, in __init__
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     warn_on_missing_entrypoint=warn_on_missing_entrypoint
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/lib/python3/dist-packages/stevedore/named.py", line 89, in __init__
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     self._init_plugins(extensions)
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager   File "/usr/lib/python3/dist-packages/stevedore/driver.py", line 113, in _init_plugins
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager     (self.namespace, name))
2020-07-27 07:13:20.724 6394 ERROR zun.compute.manager stevedore.exception.NoMatches: No 'zun.vif_translators' driver found, looking for 'bridge'

zun-api logs:

2020-07-27 07:13:12.551 13724 INFO zun.api.controllers.v1.capsules [req-d43ac4a7-5636-42e0-a72a-fafd9dd796b5 - - - - -] Policy doesn't support image_pull_policy
2020-07-27 07:13:14.061 13724 INFO eventlet.wsgi.server [req-d43ac4a7-5636-42e0-a72a-fafd9dd796b5 - - - - -] 10.0.0.125 "POST /v1/capsules HTTP/1.1" status: 202  len: 2420 time: 1.7096784
2020-07-27 07:13:16.124 13724 INFO eventlet.wsgi.server [req-0ac19f24-50fe-4eac-a551-b2fdcbda71a0 - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2339 time: 0.0457959
2020-07-27 07:13:16.189 13724 INFO eventlet.wsgi.server [req-d95c1ad9-759c-48ed-9ef2-8dee7d3a23ca - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2339 time: 0.0453143
2020-07-27 07:13:21.167 13725 INFO eventlet.wsgi.server [req-3b4fe0c1-1e26-41f5-9879-88073cd1cec8 - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2603 time: 0.03993

@lmq1999
Author

lmq1999 commented Jul 27, 2020

I read the code in /usr/local/lib/python3.6/dist-packages/zun/network/os_vif_util.py and only see OVS. Doesn't it support Linux bridge?
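
For reference, a quick way to see which VIF translators the installed zun package actually registers (based on the stevedore lookup shown in the traceback above) is something like:

root@controller:~# python3 -c "from pkg_resources import iter_entry_points; print([ep.name for ep in iter_entry_points('zun.vif_translators')])"

If only 'ovs' shows up there, a linuxbridge-backed port (vif_type 'bridge') has no translator, which matches the NoMatches error above.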

@lmq1999
Author

lmq1999 commented Jul 27, 2020

Latest update:

After reconfiguring OpenStack from linuxbridge to OVS:

+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| containers      | [{'uuid': '7594ff37-f036-4ba0-bed4-77bf7fa90c8b', 'name': 'capsule-default-myapp-pod-pi-13', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'image': 'busybox', 'cpu': None, 'cpu_policy': 'shared', 'memory': None, 'command': ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600'], 'status': 'Creating', 'status_reason': None, 'task_state': None, 'environment': {'KUBERNETES_PORT': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP_ADDR': '10.96.0.1', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_SERVICE_HOST': '10.96.0.1', 'KUBERNETES_SERVICE_PORT': '443', 'KUBERNETES_SERVICE_PORT_HTTPS': '443'}, 'workdir': None, 'auto_remove': False, 'ports': [], 'hostname': None, 'labels': {}, 'addresses': {}, 'restart_policy': {'MaximumRetryCount': '0', 'Name': 'always'}, 'status_detail': None, 'interactive': False, 'tty': False, 'image_driver': None, 'security_groups': [], 'disk': 0, 'auto_heal': False, 'healthcheck': {}, 'registry_id': None, 'entrypoint': []}] |
| init_containers | []                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| uuid            | 8e0f709f-99f8-438b-be30-32f1b83e2e3a                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| links           | [{'href': 'http://controller:9517/v1/capsules/8e0f709f-99f8-438b-be30-32f1b83e2e3a', 'rel': 'self'}, {'href': 'http://controller:9517/capsules/8e0f709f-99f8-438b-be30-32f1b83e2e3a', 'rel': 'bookmark'}]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| name            | default-myapp-pod                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| project_id      | 73f9f1d50afe40829c65d17cafa42353                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| user_id         | 157a1532cf2e4cdd8ce73a6982153ef2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| cpu             | 0.0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| memory          | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| status          | Error                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| status_reason   | <_InactiveRpcError of RPC that terminated with:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                 | 	status = StatusCode.UNAVAILABLE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                 | 	details = "failed to connect to all addresses"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                 | 	debug_error_string = "{"created":"@1595845092.954850424","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595845059.361132817","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                 | >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| labels          | {'ClusterName': '', 'CreationTimestamp': '2020-07-27 10:18:02 +0000 UTC', 'Namespace': 'default', 'NodeName': 'virtual-kubelet', 'PodName': 'myapp-pod', 'UID': 'da0507bd-7411-4bf9-aa27-d51f9b46cfa1'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| addresses       | 10.10.10.231                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| restart_policy  | {'MaximumRetryCount': '0', 'Name': 'always'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| annotations     | None                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| created_at      | 2020-07-27 10:18:03                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| updated_at      | 2020-07-27 10:18:12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| networks        | a4322dbc-4ffa-4a38-895d-16a32ee1deb9                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

zun-api.log

2020-07-27 10:18:03.743 16291 INFO zun.api.controllers.v1.capsules [req-40283159-dcf3-435a-9020-e342defc1f16 - - - - -] Policy doesn't support image_pull_policy
2020-07-27 10:18:05.720 16291 INFO eventlet.wsgi.server [req-40283159-dcf3-435a-9020-e342defc1f16 - - - - -] 10.0.0.125 "POST /v1/capsules HTTP/1.1" status: 202  len: 2419 time: 2.3464501
2020-07-27 10:18:06.414 16291 INFO eventlet.wsgi.server [req-0e891c97-1d77-4c28-8f6b-939b001622c8 - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2338 time: 0.0382912
2020-07-27 10:18:06.548 16291 INFO eventlet.wsgi.server [req-d22700a0-4bab-45d0-a2e1-61120404c979 - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2338 time: 0.1146283
2020-07-27 10:18:11.495 16291 INFO eventlet.wsgi.server [req-90e60174-0924-4f09-9d11-5485ed708299 - - - - -] 10.0.0.125 "GET /v1/capsules/default-myapp-pod HTTP/1.1" status: 200  len: 2338 time: 0.0784547

zun-compute:

2020-07-27 10:12:05.841 4633 INFO zun.cmd.compute [-] Starting server in PID 4633
2020-07-27 10:12:05.919 4633 INFO zun.container.driver [-] Loading container driver 'docker'
2020-07-27 10:12:05.999 4633 INFO zun.volume.driver [-] Loading volume driver 'cinder'
2020-07-27 10:12:06.000 4633 INFO zun.volume.driver [-] Loading volume driver 'local'
2020-07-27 10:12:06.163 4633 INFO zun.image.driver [-] Loading container image driver 'glance'
2020-07-27 10:12:06.164 4633 INFO zun.image.driver [-] Loading container image driver 'docker'
2020-07-27 10:12:06.578 4633 INFO zun.volume.driver [-] Loading volume driver 'cinder'
2020-07-27 10:12:06.579 4633 INFO zun.volume.driver [-] Loading volume driver 'local'
2020-07-27 10:12:06.997 4633 WARNING zun.compute.compute_node_tracker [req-2b7a4024-7d42-42a1-99bc-7d20b818af1d - - - - -] No compute node record for: worker1: zun.common.exception.ComputeNodeNotFound: Compute node worker1 could not be found.
2020-07-27 10:12:07.049 4633 INFO zun.compute.compute_node_tracker [req-2b7a4024-7d42-42a1-99bc-7d20b818af1d - - - - -] Node created for :worker1
2020-07-27 10:12:50.039 4633 WARNING zun.compute.manager [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Ignore the configured default disk size because the driver does not support disk quota.
2020-07-27 10:12:50.084 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Attempting claim: memory 512, cpu 1.00 CPU, disk 0
2020-07-27 10:12:50.085 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total memory: 7976 MB, used: 0.00 MB
2020-07-27 10:12:50.085 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] memory limit: 7976.00 MB, free: 7976.00 MB
2020-07-27 10:12:50.086 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total vcpu: 2 VCPU, used: 0.00 VCPU
2020-07-27 10:12:50.086 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] vcpu limit: 2.00 VCPU, free: 2.00 VCPU
2020-07-27 10:12:50.086 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total disk: 17 GB, used: 0.00 GB
2020-07-27 10:12:50.086 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] disk limit not specified, defaulting to unlimited
2020-07-27 10:12:50.087 4633 INFO zun.compute.claims [req-9c52d922-6e55-4198-8d0e-5f1fd812ed7a 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Claim successful
2020-07-27 10:16:23.879 4633 WARNING zun.compute.manager [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Ignore the configured default disk size because the driver does not support disk quota.
2020-07-27 10:16:23.908 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Attempting claim: memory 0, cpu 0.00 CPU, disk 0
2020-07-27 10:16:23.909 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total memory: 7976 MB, used: 512.00 MB
2020-07-27 10:16:23.909 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] memory limit: 7976.00 MB, free: 7464.00 MB
2020-07-27 10:16:23.910 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total vcpu: 2 VCPU, used: 1.00 VCPU
2020-07-27 10:16:23.910 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] vcpu limit not specified, defaulting to unlimited
2020-07-27 10:16:23.910 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total disk: 17 GB, used: 0.00 GB
2020-07-27 10:16:23.911 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] disk limit not specified, defaulting to unlimited
2020-07-27 10:16:23.911 4633 INFO zun.compute.claims [req-3f6e0118-f8c2-45f2-bf15-dfac6226dbc5 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Claim successful
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager [req-d94ed33f-4e9a-4268-bddd-404de7322507 - - - - -] Unexpected exception: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1595844993.492038743","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595844993.492036009","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
>: grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1595844993.492038743","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595844993.492036009","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
>
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager Traceback (most recent call last):
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/compute/manager.py", line 368, in _do_container_create_base
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager     requested_volumes)
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 52, in create_capsule
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager     self._create_pod_sandbox(context, capsule, requested_networks)
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 83, in _create_pod_sandbox
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager     runtime_handler=runtime,
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 826, in __call__
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager     return _end_unary_response_blocking(state, call, False, None)
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager     raise _InactiveRpcError(state)
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager 	status = StatusCode.UNAVAILABLE
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager 	details = "failed to connect to all addresses"
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager 	debug_error_string = "{"created":"@1595844993.492038743","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595844993.492036009","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager >
2020-07-27 10:16:33.492 4633 ERROR zun.compute.manager 
2020-07-27 10:18:05.723 4633 WARNING zun.compute.manager [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Ignore the configured default disk size because the driver does not support disk quota.
2020-07-27 10:18:05.744 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Attempting claim: memory 0, cpu 0.00 CPU, disk 0
2020-07-27 10:18:05.745 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total memory: 7976 MB, used: 512.00 MB
2020-07-27 10:18:05.745 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] memory limit: 7976.00 MB, free: 7464.00 MB
2020-07-27 10:18:05.746 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total vcpu: 2 VCPU, used: 1.00 VCPU
2020-07-27 10:18:05.746 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] vcpu limit not specified, defaulting to unlimited
2020-07-27 10:18:05.746 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Total disk: 17 GB, used: 0.00 GB
2020-07-27 10:18:05.747 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] disk limit not specified, defaulting to unlimited
2020-07-27 10:18:05.747 4633 INFO zun.compute.claims [req-261a1b41-629c-46e2-92fc-8d57702b0247 157a1532cf2e4cdd8ce73a6982153ef2 73f9f1d50afe40829c65d17cafa42353 default - -] Claim successful
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager [req-16c30d4e-17af-4c17-b51f-efdfa3b6b3d6 - - - - -] Unexpected exception: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1595845092.954850424","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595845059.361132817","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
>: grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1595845092.954850424","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595845059.361132817","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
>
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager Traceback (most recent call last):
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/compute/manager.py", line 368, in _do_container_create_base
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager     requested_volumes)
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 52, in create_capsule
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager     self._create_pod_sandbox(context, capsule, requested_networks)
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/zun/container/cri/driver.py", line 83, in _create_pod_sandbox
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager     runtime_handler=runtime,
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 826, in __call__
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager     return _end_unary_response_blocking(state, call, False, None)
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager   File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager     raise _InactiveRpcError(state)
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager 	status = StatusCode.UNAVAILABLE
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager 	details = "failed to connect to all addresses"
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager 	debug_error_string = "{"created":"@1595845092.954850424","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3948,"referenced_errors":[{"created":"@1595845059.361132817","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager >
2020-07-27 10:18:12.955 4633 ERROR zun.compute.manager 

@hongbin
Collaborator

hongbin commented Jul 27, 2020

@lmq1999,

As for linux bridge support, I will add it.

For the "failed to connect to all addresses" error, it seems your zun-cni-daemon is not installed correctly. Could you double check (systemctl status zun-cni-daemon)?

@lmq1999
Author

lmq1999 commented Jul 28, 2020

Everything seems fine, except that the VIF shown in the log is linux-bridge :/

zun-cni-daemon status:

root@worker1:~# service zun-cni-daemon status
● zun-cni-daemon.service - OpenStack Container Service CNI daemon
   Loaded: loaded (/etc/systemd/system/zun-cni-daemon.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-07-28 04:00:58 UTC; 11min ago
 Main PID: 1088 (zun-cni-daemon:)
    Tasks: 13 (limit: 4915)
   CGroup: /system.slice/zun-cni-daemon.service
           ├─1088 zun-cni-daemon: master process [/usr/local/bin/zun-cni-daemon]
           ├─3488 zun-cni-daemon: master process [/usr/local/bin/zun-cni-daemon]
           ├─3497 zun-cni-daemon: watcher worker(0)
           └─3500 zun-cni-daemon: server worker(0)

Jul 28 04:00:58 worker1 systemd[1]: Started OpenStack Container Service CNI daemon.
Jul 28 04:04:32 worker1 zun-cni-daemon[1088]:  * Serving Flask app "zun-cni-daemon" (lazy loading)
Jul 28 04:04:32 worker1 zun-cni-daemon[1088]:  * Environment: production
Jul 28 04:04:32 worker1 zun-cni-daemon[1088]:    WARNING: This is a development server. Do not use it in a production deployment.
Jul 28 04:04:32 worker1 zun-cni-daemon[1088]:    Use a production WSGI server instead.
Jul 28 04:04:32 worker1 zun-cni-daemon[1088]:  * Debug mode: off

zun-cni log:

root@worker1:~# tail -f /var/log/zun/zun-cni-daemon.log
2020-07-27 10:04:17.184 20605 INFO cotyledon._service [req-d51bab0b-3ed0-46d2-aae1-6c794a0982e3 - - - - -] Caught SIGTERM signal, graceful exiting of service server(0) [20605]
2020-07-27 10:04:17.189 20573 INFO cotyledon._service_manager [-] Caught SIGTERM signal, graceful exiting of master process
2020-07-27 10:04:17.190 20602 INFO cotyledon._service [req-9b9cc37c-8c69-4d1f-939d-8caa44a8ee92 - - - - -] Caught SIGTERM signal, graceful exiting of service watcher(0) [20602]
2020-07-27 10:05:42.622 1054 INFO os_vif [-] Loaded VIF plugins: linux_bridge, noop, ovs
2020-07-27 10:05:42.797 2811 INFO werkzeug [-]  * Running on http://127.0.0.1:9036/ (Press CTRL+C to quit)
2020-07-27 10:34:35.290 2811 INFO cotyledon._service [req-d3390ab8-0991-4fdb-bf8e-4f6986a1032f - - - - -] Caught SIGTERM signal, graceful exiting of service server(0) [2811]
2020-07-27 10:34:35.310 1054 INFO cotyledon._service_manager [-] Caught SIGTERM signal, graceful exiting of master process
2020-07-27 10:34:35.293 2808 INFO cotyledon._service [req-8130e990-0954-4a6b-ad92-be2854f71dcf - - - - -] Caught SIGTERM signal, graceful exiting of service watcher(0) [2808]
2020-07-28 04:04:32.349 1088 INFO os_vif [-] Loaded VIF plugins: linux_bridge, noop, ovs
2020-07-28 04:04:32.774 3500 INFO werkzeug [-]  * Running on http://127.0.0.1:9036/ (Press CTRL+C to quit)

When I enable debug mode:

zun-cni log:

root@worker1:~# tail -f /var/log/zun/zun-cni-daemon.log
2020-07-28 04:23:06.736 18114 DEBUG futurist.periodics [-] Submitting periodic callback 'zun.cni.daemon.service.CNIDaemonWatcherService.poll_vif_status' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:642
2020-07-28 04:23:07.737 18114 DEBUG futurist.periodics [-] Submitting periodic callback 'zun.cni.daemon.service.CNIDaemonWatcherService.poll_vif_status' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:642
2020-07-28 04:23:08.739 18114 DEBUG futurist.periodics [-] Submitting periodic callback 'zun.cni.daemon.service.CNIDaemonWatcherService.poll_vif_status' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:642
2020-07-28 04:23:51.630 18114 DEBUG futurist.periodics [-] Submitting periodic callback 'zun.cni.daemon.service.CNIDaemonWatcherService.sync_capsules' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:642
2020-07-28 04:23:51.631 18114 DEBUG zun.cni.daemon.service [-] Start syncing capsule states. sync_capsules /usr/local/lib/python3.6/dist-packages/zun/cni/daemon/service.py:153

zun-compute log:

2020-07-28 04:24:54.590 19208 DEBUG eventlet.wsgi.server [-] (19208) accepted ('10.0.0.121', 53618) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:985
2020-07-28 04:24:54.601 19208 INFO eventlet.wsgi.server [req-34aab59b-c246-439f-982a-9b8134719a7d - - - - -] 10.0.0.121 "GET / HTTP/1.1" status: 200  len: 641 time: 0.0097344
2020-07-28 04:24:54.885 19208 DEBUG zun.common.policy [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] Policy check for container:get_one:image_pull_policy failed with credentials {'is_admin': False, 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'user_domain_id': None, 'system_scope': None, 'domain_id': 'default', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'project_domain_id': None, 'roles': ['myrole'], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} authorize /usr/local/lib/python3.6/dist-packages/zun/common/policy.py:147
2020-07-28 04:24:54.886 19208 DEBUG zun.common.policy [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] Policy check for container:get_one:host failed with credentials {'is_admin': False, 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'user_domain_id': None, 'system_scope': None, 'domain_id': 'default', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'project_domain_id': None, 'roles': ['myrole'], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} authorize /usr/local/lib/python3.6/dist-packages/zun/common/policy.py:147
2020-07-28 04:24:54.886 19208 DEBUG zun.common.policy [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] Policy check for container:get_one:runtime failed with credentials {'is_admin': False, 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'user_domain_id': None, 'system_scope': None, 'domain_id': 'default', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'project_domain_id': None, 'roles': ['myrole'], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} authorize /usr/local/lib/python3.6/dist-packages/zun/common/policy.py:147
2020-07-28 04:24:54.887 19208 DEBUG zun.common.policy [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] Policy check for container:get_one:privileged failed with credentials {'is_admin': False, 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'user_domain_id': None, 'system_scope': None, 'domain_id': 'default', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'project_domain_id': None, 'roles': ['myrole'], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} authorize /usr/local/lib/python3.6/dist-packages/zun/common/policy.py:147
2020-07-28 04:24:54.888 19208 DEBUG zun.common.policy [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] Policy check for capsule:get:host failed with credentials {'is_admin': False, 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'user_domain_id': None, 'system_scope': None, 'domain_id': 'default', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'project_domain_id': None, 'roles': ['myrole'], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} authorize /usr/local/lib/python3.6/dist-packages/zun/common/policy.py:147
2020-07-28 04:24:54.889 19208 INFO eventlet.wsgi.server [req-17829629-ff67-41bd-8679-0b472373e81f - - - - -] 10.0.0.121 "GET /v1/capsules/4fa2515a-e49d-445f-a0fa-0e8fea28244d HTTP/1.1" status: 200  len: 3129 time: 0.2853196


@hongbin
Collaborator

hongbin commented Jul 28, 2020

zun-cni-daemon doesn't receive any requests from containerd. Could you check the status of containerd (systemctl status containerd)?

Perhaps containerd doesn't have the right permissions to accept gRPC requests. Could you check containerd's config file as well (/etc/containerd/config.toml)?
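For example, a minimal permission check on the compute node might look like this (a sketch, assuming the default socket path /run/containerd/containerd.sock and that zun-compute runs as the zun user):

systemctl status containerd
ls -l /run/containerd/containerd.sock            # note the socket's owner and group
sudo -u zun test -w /run/containerd/containerd.sock && echo "zun can write to the socket"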

@lmq1999
Author

lmq1999 commented Jul 28, 2020

containerd service status:

   Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-07-28 04:01:23 UTC; 31min ago
     Docs: https://containerd.io
  Process: 1602 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 1720 (containerd)
    Tasks: 189
   CGroup: /system.slice/containerd.service
           ├─ 1720 /usr/bin/containerd
           ├─ 4329 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/b8a856568f73b1dbf0bb0698cc0edab732f8e1c4e13a2d2a8b59832575f3076c -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4391 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/03157db729d273231f46cb7f543d53420dd6a115dc17d97d3eeb4124af0653b9 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4401 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/fe1184f49e49d98d93fe85d7733c5da5770baeabbcc3d1ff7e28fce375969dd2 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4478 /pause
           ├─ 4489 /pause
           ├─ 4495 /pause
           ├─ 4528 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/21edf2d4a24e28b269953b8c89f8919192d2e2bba4fde971b173c75afaa22a7b -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4556 /pause
           ├─ 4714 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/0c62d1ba49fb9ea754c8339b720053a636d9db21e968320c621575ddca0e8815 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4738 etcd --advertise-client-urls=https://10.0.0.125:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://10.0.0.125:2380 --initial-cluster=k8s-master=
           ├─ 4762 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/9d62ca5b1cf4347d4a5f2fc36f9df601e6856700eed8e400d774c44b59af41c0 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4787 kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/
           ├─ 4812 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/64478258df3706ab6bfa5aa3e32a442b7b49d9f89ed104a2746f418dbf1c2e62 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4835 kube-scheduler --authentication-kubeconfig=/etc/kubernetes/scheduler.conf --authorization-kubeconfig=/etc/kubernetes/scheduler.conf --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true
           ├─ 4871 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/f11aef97036ab74e081786493b2c969af6e574e6ba6fa3a92fca3c0da245fd18 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 4902 kube-apiserver --advertise-address=10.0.0.125 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-caf
           ├─ 7975 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/288ece3b3c4a7f0283d1bbf3d6d04eaf81ab1622d743b196852756c0c3fb5aee -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 7999 /pause
           ├─ 8265 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/69ae396832550fe6c61d689339c1e81fede6568d61d6093fc131c45b9cbf3768 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 8325 /pause
           ├─ 8735 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/0c63a0f0b721595f3d52bf38c4eee4bc15f98f533b25dd4153a3f94e1c450050 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 8836 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=k8s-master
           ├─ 9217 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/193a5e32393733fb51dc7ae7f5a0b31924b61f3acbb7b32b1dfe8578f84d8ca9 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 9242 /opt/bin/flanneld --ip-masq --kube-subnet-mgr
           ├─ 9904 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/33676735f514c7ea05521317020e0d4a6153ae89a7fb91012a2c5a49c62a368a -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─ 9948 /pause
           ├─10097 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/888a4dc06dc0c9c973730e54005ec95a2c3a16f47d689c9fcb901d44145d3a07 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─10124 /pause
           ├─10327 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/913229ee11594da79db4b80a369d79805dbbaaff08613a551d56a3bc960fa246 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           ├─10413 /coredns -conf /etc/coredns/Corefile
           ├─10514 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/f134aeb3d70efcc1755b8efaae339b2f0354f49ecba2fb49ff7ed25f0c679838 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/co
           └─10541 /coredns -conf /etc/coredns/Corefile

Jul 28 04:06:21 k8s-master containerd[1720]: time="2020-07-28T04:06:21.170296081Z" level=info msg="shim reaped" id=55db7cab4c0b650e20d9dfdd6111a78fdf331879420a7de63438bef2d6307752
Jul 28 04:06:26 k8s-master containerd[1720]: time="2020-07-28T04:06:26.629739409Z" level=info msg="shim containerd-shim started" address=/containerd-shim/bfec9c282a83fb0be840e0edc96c01198ad723d90bb73d6e096dc26a94b71616.sock debug=false pid=9187
Jul 28 04:06:26 k8s-master containerd[1720]: time="2020-07-28T04:06:26.746569650Z" level=info msg="shim containerd-shim started" address=/containerd-shim/ad13efc632aa50e62c92e2fd9baa2be4c7270e083d5e14ee5e6ccdee1717a323.sock debug=false pid=9217
Jul 28 04:06:28 k8s-master containerd[1720]: time="2020-07-28T04:06:28.238123304Z" level=info msg="shim containerd-shim started" address=/containerd-shim/40ff38c3977e740f26be17caf1263899aed09e8ed68e6873e908935e5f80c934.sock debug=false pid=9344
Jul 28 04:06:28 k8s-master containerd[1720]: time="2020-07-28T04:06:28.694249680Z" level=info msg="shim reaped" id=dd0d9cf7a5bc826d02ea5d96000ece44231cfc6f345759532b6148fbb880de52
Jul 28 04:06:29 k8s-master containerd[1720]: time="2020-07-28T04:06:29.592948522Z" level=info msg="shim reaped" id=ecf75fe29006355f3b45c3cbdf692cb405026cf9587a907926b35d691b771b9e
Jul 28 04:06:37 k8s-master containerd[1720]: time="2020-07-28T04:06:37.742604617Z" level=info msg="shim containerd-shim started" address=/containerd-shim/6d1688b78e46a28ae3b303f0fc8b40b3f04f30efaa8974c919eb5896e46ad136.sock debug=false pid=9904

The config file (/etc/containerd/config.toml):

#   Copyright 2018-2020 Docker Inc.

#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at

#       http://www.apache.org/licenses/LICENSE-2.0

#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.

disabled_plugins = ["cri"]

#root = "/var/lib/containerd"
#state = "/run/containerd"
#subreaper = true
#oom_score = 0

#[grpc]
#  address = "/run/containerd/containerd.sock"
#  uid = 0
#  gid = 0

#[debug]
#  address = "/run/containerd/debug.sock"
#  uid = 0
#  gid = 0
#  level = "info"

Well, I only see this file on the OpenStack compute node, not on the k8s node.

@hongbin
Collaborator

hongbin commented Jul 28, 2020

Right. This file needs to be configured on the compute nodes. Does it work if you configure the "gid" and then restart the containerd process?

[grpc]
...
gid = ZUN_GROUP_ID

Replace ZUN_GROUP_ID with the actual group ID of the zun user. You can retrieve the ID with, for example:

getent group zun | cut -d: -f3
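Putting the steps together, a rough sketch of the change on a compute node (service and file names assumed from the defaults above):

# 1. Look up the zun group ID
getent group zun | cut -d: -f3          # e.g. 997

# 2. Set it in /etc/containerd/config.toml:
#      [grpc]
#        gid = <that ID>

# 3. Restart containerd and check that the socket group changed
systemctl restart containerd
ls -l /run/containerd/containerd.sock   # group should now be zun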

@lmq1999
Author

lmq1999 commented Jul 28, 2020

Hmmm, I changed it as you said, to gid = 997:

disabled_plugins = ["cri"]

#root = "/var/lib/containerd"
#state = "/run/containerd"
#subreaper = true
#oom_score = 0

[grpc]
#  address = "/run/containerd/containerd.sock"
#uid = 0
gid = 997

#[debug]
#  address = "/run/containerd/debug.sock"
#  uid = 0
#  gid = 0
#  level = "info"

root@worker1:~# getent group zun | cut -d: -f3
997

After creating a pod via virtual-kubelet, this new error appears:

+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| containers      | [{'uuid': 'f749866d-77bb-4fba-9ff3-0eb4bcdbd101', 'name': 'capsule-default-myapp-pod-rho-18', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'image': 'busybox', 'cpu': None, 'cpu_policy': 'shared', 'memory': None, 'command': ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600'], 'status': 'Creating', 'status_reason': None, 'task_state': None, 'environment': {'KUBERNETES_PORT': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP_ADDR': '10.96.0.1', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_SERVICE_HOST': '10.96.0.1', 'KUBERNETES_SERVICE_PORT': '443', 'KUBERNETES_SERVICE_PORT_HTTPS': '443'}, 'workdir': None, 'auto_remove': False, 'ports': [], 'hostname': None, 'labels': {}, 'addresses': {}, 'restart_policy': {'MaximumRetryCount': '0', 'Name': 'always'}, 'status_detail': None, 'interactive': False, 'tty': False, 'image_driver': None, 'security_groups': [], 'disk': 0, 'auto_heal': False, 'healthcheck': {}, 'registry_id': None, 'entrypoint': []}] |
| init_containers | []                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| uuid            | 071ea789-df74-4348-81bf-4140448e7187                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| links           | [{'href': 'http://controller:9517/v1/capsules/071ea789-df74-4348-81bf-4140448e7187', 'rel': 'self'}, {'href': 'http://controller:9517/capsules/071ea789-df74-4348-81bf-4140448e7187', 'rel': 'bookmark'}]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| name            | default-myapp-pod                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| project_id      | 73f9f1d50afe40829c65d17cafa42353                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| user_id         | 157a1532cf2e4cdd8ce73a6982153ef2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| cpu             | 0.0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| memory          | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| status          | Error                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| status_reason   | <_InactiveRpcError of RPC that terminated with:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|                 | 	status = StatusCode.UNIMPLEMENTED                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                 | 	details = "unknown service runtime.v1alpha2.RuntimeService"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                 | 	debug_error_string = "{"created":"@1595911426.107134797","description":"Error received from peer unix:/run/containerd/containerd.sock","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"unknown service runtime.v1alpha2.RuntimeService","grpc_status":12}"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                 | >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| labels          | {'ClusterName': '', 'CreationTimestamp': '2020-07-28 04:43:38 +0000 UTC', 'Namespace': 'default', 'NodeName': 'virtual-kubelet', 'PodName': 'myapp-pod', 'UID': '52c65dbc-980c-4205-bd04-c84b3c492557'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| addresses       | 10.10.10.97                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| restart_policy  | {'MaximumRetryCount': '0', 'Name': 'always'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| annotations     | None                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| created_at      | 2020-07-28 04:43:38                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| updated_at      | 2020-07-28 04:43:46                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| networks        | a4322dbc-4ffa-4a38-895d-16a32ee1deb9                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
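For reference, a gRPC StatusCode.UNIMPLEMENTED with "unknown service runtime.v1alpha2.RuntimeService" usually means containerd is answering on the socket but its CRI plugin was never loaded, for example when /etc/containerd/config.toml still contains disabled_plugins = ["cri"]. A quick way to check, assuming containerd's bundled ctr client is available on the worker:

# the cri plugin should be listed with status "ok" if it is loaded
ctr plugins ls | grep cri

# a minimal packaged config that ships with the cri plugin disabled will show it here
grep disabled_plugins /etc/containerd/config.toml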

containerd logs:

Jul 28 04:42:47 worker1 systemd[1]: Stopping containerd container runtime...
Jul 28 04:42:47 worker1 systemd[1]: Stopped containerd container runtime.
Jul 28 04:42:47 worker1 systemd[1]: Starting containerd container runtime...
Jul 28 04:42:47 worker1 systemd[1]: Started containerd container runtime.
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.124072976Z" level=info msg="starting containerd" revision=7ad184331fa3e55e52b890ea95e65ba581ae3429 version=1.2.13
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.124481607Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.124508591Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.124672575Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/cont
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.124769284Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127220693Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotte
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127250094Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapsho
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127300400Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127456631Z" level=info msg="skip loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshot
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127470019Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127481935Z" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/containerd/i
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127488848Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/containerd
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127577552Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127592720Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127626742Z" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.se
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127638201Z" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.servi
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127647284Z" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127657153Z" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." type=io.containerd.servic
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127667192Z" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.servic
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127676906Z" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.se
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127685417Z" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.ser
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127694671Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127735439Z" level=info msg="loading plugin "io.containerd.runtime.v2.task"..." type=io.containerd.runtime.v2
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.127765051Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129126076Z" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129155279Z" level=info msg="loading plugin "io.containerd.internal.v1.restart"..." type=io.containerd.internal.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129197648Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129209509Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129218399Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129226319Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129234557Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129244152Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129252118Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129261247Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129269694Z" level=info msg="loading plugin "io.containerd.internal.v1.opt"..." type=io.containerd.internal.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129298882Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129309788Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129318602Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.129327065Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.130184755Z" level=info msg=serving... address="/run/containerd/containerd.sock"
Jul 28 04:42:47 worker1 containerd[6239]: time="2020-07-28T04:42:47.130204661Z" level=info msg="containerd successfully booted in 0.006635s"


lmq1999 commented Jul 28, 2020

Latest update:

I regenerated a default containerd config (containerd config default > config.toml)

and also changed the GID to 997 like you said.
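For clarity, a minimal sketch of what I assume those two steps amount to (the config path and the exact gid value depend on the setup):

# regenerate a full default config, which leaves the cri plugin enabled
containerd config default > /etc/containerd/config.toml

# in /etc/containerd/config.toml, set the socket group to the zun group id, e.g.:
#   [grpc]
#     gid = 997

# restart containerd to pick up the new config
systemctl restart containerd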

After that I got the error below; it looks like we are back to a CNI problem:

+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| containers      | [{'uuid': '0df03acf-785a-4c31-a925-cc59622cca88', 'name': 'capsule-default-myapp-pod-psi-9', 'project_id': '73f9f1d50afe40829c65d17cafa42353', 'user_id': '157a1532cf2e4cdd8ce73a6982153ef2', 'image': 'busybox', 'cpu': None, 'cpu_policy': 'shared', 'memory': None, 'command': ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600'], 'status': 'Creating', 'status_reason': None, 'task_state': None, 'environment': {'KUBERNETES_PORT': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP': 'tcp://10.96.0.1:443', 'KUBERNETES_PORT_443_TCP_ADDR': '10.96.0.1', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_SERVICE_HOST': '10.96.0.1', 'KUBERNETES_SERVICE_PORT': '443', 'KUBERNETES_SERVICE_PORT_HTTPS': '443'}, 'workdir': None, 'auto_remove': False, 'ports': [], 'hostname': None, 'labels': {}, 'addresses': {}, 'restart_policy': {'MaximumRetryCount': '0', 'Name': 'always'}, 'status_detail': None, 'interactive': False, 'tty': False, 'image_driver': None, 'security_groups': [], 'disk': 0, 'auto_heal': False, 'healthcheck': {}, 'registry_id': None, 'entrypoint': []}] |
| init_containers | []                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| uuid            | e990ea6c-abdd-48d4-b45b-870dc1d3b6a2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| links           | [{'href': 'http://controller:9517/v1/capsules/e990ea6c-abdd-48d4-b45b-870dc1d3b6a2', 'rel': 'self'}, {'href': 'http://controller:9517/capsules/e990ea6c-abdd-48d4-b45b-870dc1d3b6a2', 'rel': 'bookmark'}]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| name            | default-myapp-pod                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| project_id      | 73f9f1d50afe40829c65d17cafa42353                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| user_id         | 157a1532cf2e4cdd8ce73a6982153ef2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| cpu             | 0.0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| memory          | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| status          | Error                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| status_reason   | <_InactiveRpcError of RPC that terminated with:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                 | 	status = StatusCode.UNKNOWN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                 | 	details = "failed to setup network for sandbox "2b9b60ff3162aa28722e6cdd92490defed31994c820c379259d224066d9eb194": Got invalid status code from CNI daemon; Traceback (most recent call last):                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                 |   File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 75, in run                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                 |     vif = self._add(params)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                 |   File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 129, in _add                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                 |     resp = self._make_request('addNetwork', params, httplib.ACCEPTED)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                 |   File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 167, in _make_request                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                 |     raise exception.CNIError('Got invalid status code from CNI daemon')                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                 | zun.common.exception.CNIError: Got invalid status code from CNI daemon                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                 | "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                 | 	debug_error_string = "{"created":"@1595912825.832394318","description":"Error received from peer unix:/run/containerd/containerd.sock","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"failed to setup network for sandbox "2b9b60ff3162aa28722e6cdd92490defed31994c820c379259d224066d9eb194": Got invalid status code from CNI daemon; Traceback (most recent call last):\n  File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 75, in run\n    vif = self._add(params)\n  File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 129, in _add\n    resp = self._make_request('addNetwork', params, httplib.ACCEPTED)\n  File "/usr/local/lib/python3.6/dist-packages/zun/cni/api.py", line 167, in _make_request\n    raise exception.CNIError('Got invalid status code from CNI daemon')\nzun.common.exception.CNIError: Got invalid status code from CNI daemon\n","grpc_status":2}"                                                                                                                                                                                               |
|                 | >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| labels          | {'ClusterName': '', 'CreationTimestamp': '2020-07-28 05:06:47 +0000 UTC', 'Namespace': 'default', 'NodeName': 'virtual-kubelet', 'PodName': 'myapp-pod', 'UID': '9df92ccf-cc1d-473e-af12-707d1e559757'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| addresses       | 10.10.10.20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| restart_policy  | {'MaximumRetryCount': '0', 'Name': 'always'}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| annotations     | None                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| created_at      | 2020-07-28 05:06:47                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| updated_at      | 2020-07-28 05:07:05                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| networks        | a4322dbc-4ffa-4a38-895d-16a32ee1deb9                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

@lmq1999
Author

lmq1999 commented Jul 28, 2020

CNI daemon logs:

2020-07-28 07:13:08.837 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmp_pnf0hwm/privsep.sock']
2020-07-28 07:13:09.623 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                              
2020-07-28 07:13:09.508 23272 INFO oslo.privsep.daemon [-] privsep daemon starting 
2020-07-28 07:13:09.520 23272 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                                                     
2020-07-28 07:13:09.523 23272 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN/CAP_NET_ADMIN/none
2020-07-28 07:13:09.524 23272 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23272                                                           
2020-07-28 07:13:10.154 23254 INFO os_vif [-] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:02:1a:bb,bridge_name='qbr8cd87d70-5c',has_traffic_filtering=True,id=8cd
87d70-5cb7-4397-a5a1-7478a676de91,network=Network(a4322dbc-4ffa-4a38-895d-16a32ee1deb9),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap8cd8
7d70-5c')                                                                                                   
2020-07-28 07:13:10.157 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'zun.common.privileged.cni', '--privsep_sock_path', '/tmp/tmpxuvn729u/privsep.sock']
2020-07-28 07:13:10.938 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                 
2020-07-28 07:13:10.820 23382 INFO oslo.privsep.daemon [-] privsep daemon starting
2020-07-28 07:13:10.823 23382 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                  
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/CAP_NET_ADMIN|CAP_SYS_$DMIN|CAP_SYS_PTRACE/none                                                                                                                                 
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23382
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service [-] Error when processing addNetwork request. CNI Params: {'CNI_COMMAND': 'ADD', 'CNI_CONTAINERID': '995dfcab5c850280ddb01f
c407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_NETNS': '/var/run/netns/cni-0e5ba6ca-74a9-0709-3973-db740eb0634e', 'CNI_ARGS': 'IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NA
ME=5973904c-be15-4c30-9df0-b0d48e8f98f2;K8S_POD_INFRA_CONTAINER_ID=995dfcab5c850280ddb01fc407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_IFNAME': 'eth0', 'CNI_PATH': '/opt/cni/bin'
}: pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service Traceback (most recent call last):                                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/daemon/service.py", line 66, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vif = self.plugin.add(params)                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 46, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vifs = self._do_work(params, b_base.connect)                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 146, in _do_work
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     container_id=params.CNI_CONTAINERID)                                             
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/base.py", line 132, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     driver.connect(vif, ifname, netns, container_id)                                          
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/bridge.py", line 88, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     h_br.add_port(host_ifname)                                                                
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/transactional.py", line 209, in __exit__
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     self.commit()                                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 1078, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                                                       
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 769, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     run(nl.link, 'update', index=i, master=self['index'])               
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 504, in _run
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 499, in _run                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return cmd(*argv, **kwarg)                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/iproute/linux.py", line 1332, in link
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     msg_flags=msg_flags)
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_request(*argv, **kwarg))                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     callback=callback):                                                                                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 376, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_get(*argv, **kwarg))
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 701, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise msg['header']['error']                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service                        
2020-07-28 07:13:12.333 23254 INFO werkzeug [-] 127.0.0.1 - - [28/Jul/2020 07:13:12] "POST /addNetwork HTTP/1.1" 500 -

@lmq1999
Author

lmq1999 commented Jul 28, 2020

I also have a question. My Kubernetes setup uses:

kubeadm init --apiserver-advertise-address 10.0.0.251 --pod-network-cidr=10.244.0.0/16

That pod-network-cidr does not belong to any external or internal OpenStack network.
I also use flannel for k8s. Is there anything wrong with that setup when using VK, or not?

@hongbin
Collaborator

hongbin commented Jul 29, 2020

CNI daemon logs:

2020-07-28 07:13:08.837 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmp_pnf0hwm/privsep.sock']
2020-07-28 07:13:09.623 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                              
2020-07-28 07:13:09.508 23272 INFO oslo.privsep.daemon [-] privsep daemon starting 
2020-07-28 07:13:09.520 23272 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                                                     
2020-07-28 07:13:09.523 23272 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN/CAP_NET_ADMIN/none
2020-07-28 07:13:09.524 23272 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23272                                                           
2020-07-28 07:13:10.154 23254 INFO os_vif [-] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:02:1a:bb,bridge_name='qbr8cd87d70-5c',has_traffic_filtering=True,id=8cd
87d70-5cb7-4397-a5a1-7478a676de91,network=Network(a4322dbc-4ffa-4a38-895d-16a32ee1deb9),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap8cd8
7d70-5c')                                                                                                   
2020-07-28 07:13:10.157 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'zun.common.privileged.cni', '--privsep_sock_path', '/tmp/tmpxuvn729u/privsep.sock']
2020-07-28 07:13:10.938 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                 
2020-07-28 07:13:10.820 23382 INFO oslo.privsep.daemon [-] privsep daemon starting
2020-07-28 07:13:10.823 23382 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                  
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/CAP_NET_ADMIN|CAP_SYS_$DMIN|CAP_SYS_PTRACE/none                                                                                                                                 
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23382
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service [-] Error when processing addNetwork request. CNI Params: {'CNI_COMMAND': 'ADD', 'CNI_CONTAINERID': '995dfcab5c850280ddb01f
c407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_NETNS': '/var/run/netns/cni-0e5ba6ca-74a9-0709-3973-db740eb0634e', 'CNI_ARGS': 'IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NA
ME=5973904c-be15-4c30-9df0-b0d48e8f98f2;K8S_POD_INFRA_CONTAINER_ID=995dfcab5c850280ddb01fc407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_IFNAME': 'eth0', 'CNI_PATH': '/opt/cni/bin'
}: pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service Traceback (most recent call last):                                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/daemon/service.py", line 66, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vif = self.plugin.add(params)                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 46, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vifs = self._do_work(params, b_base.connect)                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 146, in _do_work
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     container_id=params.CNI_CONTAINERID)                                             
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/base.py", line 132, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     driver.connect(vif, ifname, netns, container_id)                                          
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/bridge.py", line 88, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     h_br.add_port(host_ifname)                                                                
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/transactional.py", line 209, in __exit__
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     self.commit()                                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 1078, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                                                       
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 769, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     run(nl.link, 'update', index=i, master=self['index'])               
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 504, in _run
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 499, in _run                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return cmd(*argv, **kwarg)                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/iproute/linux.py", line 1332, in link
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     msg_flags=msg_flags)
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_request(*argv, **kwarg))                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     callback=callback):                                                                                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 376, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_get(*argv, **kwarg))
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 701, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise msg['header']['error']                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service                        
2020-07-28 07:13:12.333 23254 INFO werkzeug [-] 127.0.0.1 - - [28/Jul/2020 07:13:12] "POST /addNetwork HTTP/1.1" 500 -

It looks like there is a bug in the neutron hybrid plug, which I have to fix on the Zun side.

The easiest workaround is to use openvswitch as the firewall driver on the neutron side:

[securitygroup]
firewall_driver = openvswitch

Otherwise, you can wait for my fix for the OVS hybrid driver.
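
For reference, a minimal sketch of applying that workaround. The config path and service name below are assumptions (they differ across distros); adjust them for your deployment:

# On each compute node running the neutron Open vSwitch agent
# (path and service name are assumptions; adjust for your distro):
#
#   /etc/neutron/plugins/ml2/openvswitch_agent.ini
#   [securitygroup]
#   firewall_driver = openvswitch
#
sudo systemctl restart neutron-openvswitch-agent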

@lmq1999
Author

lmq1999 commented Jul 29, 2020

Thank you very much!

It is working perfectly now.

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------+---------+--------------+
| uuid                                 | name                         | status  | addresses    |
+--------------------------------------+------------------------------+---------+--------------+
| 99339455-3483-46b7-b1ce-89a18855bf00 | kube-system-kube-proxy-r4hgv | Error   | 10.10.10.179 |
| 132da630-0ef0-4d36-8a4d-dd29793122cb | default-myapp-pod            | Running | 10.10.10.175 |
+--------------------------------------+------------------------------+---------+--------------+

@hongbin
Collaborator

hongbin commented Jul 29, 2020

I also have a question. My Kubernetes setup uses:

kubeadm init --apiserver-advertise-address 10.0.0.251 --pod-network-cidr=10.244.0.0/16

That pod-network-cidr does not belong to any external or internal OpenStack network.
I also use flannel for k8s. Is there anything wrong with that setup when using VK, or not?

I guess there is nothing wrong. However, the --pod-network-cidr option applies to normal nodes. The virtual node created by virtual-kubelet doesn't use that option.

I also assume it is OK to use flannel together with virtual-kubelet. Again, flannel applies to normal pods (not pods created by virtual-kubelet).
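
As an illustration (a hedged sketch, not something from this issue): pods only land on the virtual node if they tolerate its taint and are pinned to it, so flannel-backed pods on normal nodes and VK pods can coexist. The taint key below is the virtual-kubelet default and the pod/image names are examples:

# Hedged example: run one pod on the virtual node; everything else stays on normal nodes.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-on-vk            # example name
spec:
  nodeName: virtual-kubelet    # pin to the virtual node
  tolerations:
  - key: virtual-kubelet.io/provider   # default virtual-kubelet taint key; may differ in your setup
    operator: Exists
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx
EOF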

@hongbin
Collaborator

hongbin commented Jul 29, 2020

Thank you very much!

It is working perfectly now.

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------+---------+--------------+
| uuid                                 | name                         | status  | addresses    |
+--------------------------------------+------------------------------+---------+--------------+
| 99339455-3483-46b7-b1ce-89a18855bf00 | kube-system-kube-proxy-r4hgv | Error   | 10.10.10.179 |
| 132da630-0ef0-4d36-8a4d-dd29793122cb | default-myapp-pod            | Running | 10.10.10.175 |
+--------------------------------------+------------------------------+---------+--------------+

Cool!

@lmq1999
Author

lmq1999 commented Jul 29, 2020

Hmm, I have another question; it's about networking in VK.

I use the Kubernetes wordpress demo: https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/

Of course, I edited some tolerations so it can run:

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| a8f7b91c-ecf5-4e59-b0d0-aeb43d97d047 | default-wordpress-75dfd8796-58lql        | Running | 192.168.122.213 |
| daa42d67-708e-49da-a718-c831bca0669b | default-wordpress-mysql-69c9d5854b-gdjfb | Running | 192.168.122.205 |
+--------------------------------------+------------------------------------------+---------+-----------------+

When I check for svc:

root@k8s-master:~/wordpress# kubectl get svc -o wide
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE   SELECTOR
kubernetes        ClusterIP   10.96.0.1       <none>        443/TCP        41h   <none>
wordpress         NodePort    10.101.115.93   <none>        80:30502/TCP   12m   app=wordpress,tier=frontend
wordpress-mysql   ClusterIP   None            <none>        3306/TCP       12m   app=wordpress,tier=mysql

So how to "access" to this pod, I already setup security group for port 22 and 80 but connection refuse

@hongbin
Collaborator

hongbin commented Jul 29, 2020

Hmm, I have another question; it's about networking in VK.

I use the Kubernetes wordpress demo: https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/

Of course, I edited some tolerations so it can run:

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| a8f7b91c-ecf5-4e59-b0d0-aeb43d97d047 | default-wordpress-75dfd8796-58lql        | Running | 192.168.122.213 |
| daa42d67-708e-49da-a718-c831bca0669b | default-wordpress-mysql-69c9d5854b-gdjfb | Running | 192.168.122.205 |
+--------------------------------------+------------------------------------------+---------+-----------------+

When I check for svc:

root@k8s-master:~/wordpress# kubectl get svc -o wide
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE   SELECTOR
kubernetes        ClusterIP   10.96.0.1       <none>        443/TCP        41h   <none>
wordpress         NodePort    10.101.115.93   <none>        80:30502/TCP   12m   app=wordpress,tier=frontend
wordpress-mysql   ClusterIP   None            <none>        3306/TCP       12m   app=wordpress,tier=mysql

So how to "access" to this pod, I already setup security group for port 22 and 80 but connection refuse

The simplest way is to use a public-facing neutron network instead of a tenant network (i.e. 192.168.122.*). As a result, each pod has a public IP that is accessible from the Kubernetes service.

Alternatively, you can create a nova VM in the tenant network and install the Kubernetes control plane there. As a result, it can access pods in the same tenant.
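
For the second option, a rough sketch of booting such a VM (image, flavor, key and network names are just examples):

# Boot a VM on the same tenant network as the pods; a k8s control plane
# installed inside it can then reach the capsule IPs directly.
openstack server create \
  --image ubuntu-18.04 \
  --flavor m1.medium \
  --key-name mykey \
  --network selfservice \
  k8s-master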

@lmq1999
Author

lmq1999 commented Jul 29, 2020

I use VMs, so 192.168.122.* works as the provider network.

For example, to make it easier:

I create an nginx deployment via VK:

+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| cdd0ccf4-cc7b-41b4-aa1b-7ffc636e9b42 | default-nginx-deployment-d75b7d9ff-8bbm2 | Running | 192.168.122.195 |
| e5ca07f7-2ffd-472e-9a12-64f99fdfd588 | default-nginx-deployment-d75b7d9ff-sdmkb | Running | 192.168.122.231 |
+--------------------------------------+------------------------------------------+---------+-----------------+
root@k8s-master:~/wordpress# kubectl get pod
NAME                               READY   STATUS        RESTARTS   AGE
nginx-deployment-d75b7d9ff-8bbm2   1/1     Running       0          20m
nginx-deployment-d75b7d9ff-sdmkb   1/1     Running       0          19m

I can ping and curl them like this:

root@controller:~# curl 192.168.122.231
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

But when I try wordpress:

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| ce359aff-33b8-4b8b-a212-5aa41b30addd | default-wordpress-mysql-69c9d5854b-wk26m | Running | 192.168.122.168 |
| 6dde67b5-3439-4e9e-9409-222bbb581228 | default-wordpress-75dfd8796-tctjs        | Running | 192.168.122.132 |
+--------------------------------------+------------------------------------------+---------+-----------------+

root@k8s-master:~/wordpress# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes        NodePort    10.96.0.1      <none>        443:30947/TCP   42h
wordpress         ClusterIP   10.98.65.138   <none>        80/TCP          28m
wordpress-mysql   ClusterIP   None           <none>        3306/TCP        28m

I can't curl the wordpress pod (192.168.122.132):

root@controller:~# curl 192.168.122.132
curl: (7) Failed to connect to 192.168.122.132 port 80: Connection refused

service:

root@k8s-master:~/wordpress# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes        NodePort    10.96.0.1      <none>        443:30947/TCP   42h
wordpress         NodePort    10.98.65.138   <none>        80:32119/TCP    30m
wordpress-mysql   ClusterIP   None           <none>        3306/TCP        30m

netstat at k8s:

root@k8s-master:~/wordpress# netstat -tupln | grep 32119
tcp        0      0 0.0.0.0:32119           0.0.0.0:*               LISTEN      9776/kube-proxy  

security group rules:

root@controller:~# openstack security group rule list 87190ce9-2d67-43cb-b552-0e67ec9bd2f0
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+
| ID                                   | IP Protocol | Ethertype | IP Range  | Port Range  | Remote Security Group                |
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+
| 3575e46b-0738-403b-a5f7-5134008f3628 | udp         | IPv4      | 0.0.0.0/0 | 33434:33434 | None                                 |
| 552cdad4-a73b-4c56-ac50-be7463f56435 | None        | IPv4      | 0.0.0.0/0 |             | None                                 |
| 5b96e032-c7e8-480f-bde4-c0be1ed8189c | None        | IPv6      | ::/0      |             | 87190ce9-2d67-43cb-b552-0e67ec9bd2f0 |
| 777c1499-6e42-4cee-8fc0-088229496bdc | tcp         | IPv4      | 0.0.0.0/0 | 80:80       | None                                 |
| 92d93823-2722-49e3-9836-0f2a72673b30 | None        | IPv6      | ::/0      |             | None                                 |
| aec7dc63-1756-4291-90ea-e0ff4529e5b0 | icmp        | IPv4      | 0.0.0.0/0 |             | None                                 |
| eb2121df-94b1-4a2c-bd82-542fc949486c | tcp         | IPv4      | 0.0.0.0/0 | 22:22       | None                                 |
| fcc06a0e-8385-4e64-9b8f-44a82313c9d1 | None        | IPv4      | 0.0.0.0/0 |             | 87190ce9-2d67-43cb-b552-0e67ec9bd2f0 |
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+

So what should I do in this situation?

@lmq1999
Author

lmq1999 commented Jul 29, 2020

CNI daemon logs:

2020-07-28 07:13:08.837 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmp_pnf0hwm/privsep.sock']
2020-07-28 07:13:09.623 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                              
2020-07-28 07:13:09.508 23272 INFO oslo.privsep.daemon [-] privsep daemon starting 
2020-07-28 07:13:09.520 23272 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                                                     
2020-07-28 07:13:09.523 23272 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN/CAP_NET_ADMIN/none
2020-07-28 07:13:09.524 23272 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23272                                                           
2020-07-28 07:13:10.154 23254 INFO os_vif [-] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:02:1a:bb,bridge_name='qbr8cd87d70-5c',has_traffic_filtering=True,id=8cd
87d70-5cb7-4397-a5a1-7478a676de91,network=Network(a4322dbc-4ffa-4a38-895d-16a32ee1deb9),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap8cd8
7d70-5c')                                                                                                   
2020-07-28 07:13:10.157 23254 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun
.conf', '--privsep_context', 'zun.common.privileged.cni', '--privsep_sock_path', '/tmp/tmpxuvn729u/privsep.sock']
2020-07-28 07:13:10.938 23254 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap                                 
2020-07-28 07:13:10.820 23382 INFO oslo.privsep.daemon [-] privsep daemon starting
2020-07-28 07:13:10.823 23382 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0                  
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/CAP_NET_ADMIN|CAP_SYS_$DMIN|CAP_SYS_PTRACE/none                                                                                                                                 
2020-07-28 07:13:10.826 23382 INFO oslo.privsep.daemon [-] privsep daemon running as pid 23382
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service [-] Error when processing addNetwork request. CNI Params: {'CNI_COMMAND': 'ADD', 'CNI_CONTAINERID': '995dfcab5c850280ddb01f
c407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_NETNS': '/var/run/netns/cni-0e5ba6ca-74a9-0709-3973-db740eb0634e', 'CNI_ARGS': 'IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NA
ME=5973904c-be15-4c30-9df0-b0d48e8f98f2;K8S_POD_INFRA_CONTAINER_ID=995dfcab5c850280ddb01fc407fefda4cf2116b37f130e9071df61de5eb6b5a4', 'CNI_IFNAME': 'eth0', 'CNI_PATH': '/opt/cni/bin'
}: pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service Traceback (most recent call last):                                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/daemon/service.py", line 66, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vif = self.plugin.add(params)                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 46, in add
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     vifs = self._do_work(params, b_base.connect)                                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/plugins/zun_cni_registry.py", line 146, in _do_work
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     container_id=params.CNI_CONTAINERID)                                             
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/base.py", line 132, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     driver.connect(vif, ifname, netns, container_id)                                          
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/local/lib/python3.6/dist-packages/zun/cni/binding/bridge.py", line 88, in connect
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     h_br.add_port(host_ifname)                                                                
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/transactional.py", line 209, in __exit__
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     self.commit()                                                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 1078, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                                                       
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 769, in commit
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     run(nl.link, 'update', index=i, master=self['index'])               
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 504, in _run
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise error                                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/ipdb/interfaces.py", line 499, in _run                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return cmd(*argv, **kwarg)                            
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/iproute/linux.py", line 1332, in link
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     msg_flags=msg_flags)
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_request(*argv, **kwarg))                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     callback=callback):                                                                                                    
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 376, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     return tuple(self._genlm_get(*argv, **kwarg))
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service   File "/usr/lib/python3/dist-packages/pyroute2/netlink/nlsocket.py", line 701, in get                                     
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service     raise msg['header']['error']                      
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
2020-07-28 07:13:12.330 23254 ERROR zun.cni.daemon.service                        
2020-07-28 07:13:12.333 23254 INFO werkzeug [-] 127.0.0.1 - - [28/Jul/2020 07:13:12] "POST /addNetwork HTTP/1.1" 500 -

It looks like there is a bug in the neutron hybrid plug, which I have to fix on the Zun side.

The easiest workaround is to use openvswitch as the firewall driver on the neutron side:

[securitygroup]
firewall_driver = openvswitch

Otherwise, you can wait for my fix for the OVS hybrid driver.

I found out that firewall_driver = iptables_hybrid can still be used when this service:

/etc/systemd/system/zun-cni-daemon.service

is run as root instead of zun (even though I already put the zun user in sudoers).
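
For reference, a sketch of that change done as a systemd drop-in instead of editing the unit file directly (standard systemd override; the need to run as root should go away once the hybrid-plug fix lands):

# Equivalent of `systemctl edit zun-cni-daemon`: override the service user.
sudo mkdir -p /etc/systemd/system/zun-cni-daemon.service.d
sudo tee /etc/systemd/system/zun-cni-daemon.service.d/override.conf <<'EOF'
[Service]
User=root
Group=root
EOF
sudo systemctl daemon-reload
sudo systemctl restart zun-cni-daemon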

@hongbin
Collaborator

hongbin commented Jul 30, 2020

I use VMs, so 192.168.122.* works as the provider network.

For example, to make it easier:

I create an nginx deployment via VK:

+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| cdd0ccf4-cc7b-41b4-aa1b-7ffc636e9b42 | default-nginx-deployment-d75b7d9ff-8bbm2 | Running | 192.168.122.195 |
| e5ca07f7-2ffd-472e-9a12-64f99fdfd588 | default-nginx-deployment-d75b7d9ff-sdmkb | Running | 192.168.122.231 |
+--------------------------------------+------------------------------------------+---------+-----------------+
root@k8s-master:~/wordpress# kubectl get pod
NAME                               READY   STATUS        RESTARTS   AGE
nginx-deployment-d75b7d9ff-8bbm2   1/1     Running       0          20m
nginx-deployment-d75b7d9ff-sdmkb   1/1     Running       0          19m

I can ping and curl them like this:

root@controller:~# curl 192.168.122.231
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

But when I try wordpress:

root@controller:~# openstack capsule list
+--------------------------------------+------------------------------------------+---------+-----------------+
| uuid                                 | name                                     | status  | addresses       |
+--------------------------------------+------------------------------------------+---------+-----------------+
| 215e84a2-2f03-43ec-9c33-410b19410b20 | default-myapp-pod                        | Error   | 192.168.122.221 |
| 25462856-3ec3-417c-a3c9-12148f533e1b | default-myapp-pod                        | Running | 192.168.122.138 |
| ce359aff-33b8-4b8b-a212-5aa41b30addd | default-wordpress-mysql-69c9d5854b-wk26m | Running | 192.168.122.168 |
| 6dde67b5-3439-4e9e-9409-222bbb581228 | default-wordpress-75dfd8796-tctjs        | Running | 192.168.122.132 |
+--------------------------------------+------------------------------------------+---------+-----------------+
root@k8s-master:~/wordpress# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes        NodePort    10.96.0.1      <none>        443:30947/TCP   42h
wordpress         ClusterIP   10.98.65.138   <none>        80/TCP          28m
wordpress-mysql   ClusterIP   None           <none>        3306/TCP        28m

I can't curl the wordpress pod (192.168.122.132):

root@controller:~# curl 192.168.122.132
curl: (7) Failed to connect to 192.168.122.132 port 80: Connection refused

service:

root@k8s-master:~/wordpress# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes        NodePort    10.96.0.1      <none>        443:30947/TCP   42h
wordpress         NodePort    10.98.65.138   <none>        80:32119/TCP    30m
wordpress-mysql   ClusterIP   None           <none>        3306/TCP        30m

netstat at k8s:

root@k8s-master:~/wordpress# netstat -tupln | grep 32119
tcp        0      0 0.0.0.0:32119           0.0.0.0:*               LISTEN      9776/kube-proxy  

security group rules:

root@controller:~# openstack security group rule list 87190ce9-2d67-43cb-b552-0e67ec9bd2f0
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+
| ID                                   | IP Protocol | Ethertype | IP Range  | Port Range  | Remote Security Group                |
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+
| 3575e46b-0738-403b-a5f7-5134008f3628 | udp         | IPv4      | 0.0.0.0/0 | 33434:33434 | None                                 |
| 552cdad4-a73b-4c56-ac50-be7463f56435 | None        | IPv4      | 0.0.0.0/0 |             | None                                 |
| 5b96e032-c7e8-480f-bde4-c0be1ed8189c | None        | IPv6      | ::/0      |             | 87190ce9-2d67-43cb-b552-0e67ec9bd2f0 |
| 777c1499-6e42-4cee-8fc0-088229496bdc | tcp         | IPv4      | 0.0.0.0/0 | 80:80       | None                                 |
| 92d93823-2722-49e3-9836-0f2a72673b30 | None        | IPv6      | ::/0      |             | None                                 |
| aec7dc63-1756-4291-90ea-e0ff4529e5b0 | icmp        | IPv4      | 0.0.0.0/0 |             | None                                 |
| eb2121df-94b1-4a2c-bd82-542fc949486c | tcp         | IPv4      | 0.0.0.0/0 | 22:22       | None                                 |
| fcc06a0e-8385-4e64-9b8f-44a82313c9d1 | None        | IPv4      | 0.0.0.0/0 |             | 87190ce9-2d67-43cb-b552-0e67ec9bd2f0 |
+--------------------------------------+-------------+-----------+-----------+-------------+--------------------------------------+

So what should I do in this situation?

You might want to check the logs of the wordpress container to confirm why the pod is not up. The container runs under CRI, so you will want to download the crictl tool to do that.
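
A rough sketch of doing that with crictl on the compute node where the capsule runs, assuming the containerd socket is at its default path (the container id is a placeholder):

# Point crictl at containerd and read the wordpress container logs.
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock
crictl ps -a                 # list containers, find the wordpress one
crictl logs <container-id>   # see why nothing is listening on port 80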

@lmq1999
Author

lmq1999 commented Jul 30, 2020

Hmm, OK, it was my fault; the example was old and buggy. I tried a new one and it works now.

So if VK pods and k8s pods are in separate networks, that causes problems and a hybrid k8s (half VK, half normal nodes) can't be used, right?

I was searching around and found this: https://docs.openstack.org/kuryr-kubernetes/latest/readme.html

It says: "With Kuryr-Kubernetes it's now possible to choose to run both OpenStack VMs and Kubernetes Pods on the same Neutron network if your workloads require it or to use different segments and, for example, route between them."

So is it possible to put both VK pods and node pods in the same OpenStack network?

@hongbin
Collaborator

hongbin commented Jul 31, 2020

Hmm, OK, it was my fault; the example was old and buggy. I tried a new one and it works now.

So if VK pods and k8s pods are in separate networks, that causes problems and a hybrid k8s (half VK, half normal nodes) can't be used, right?

I was searching around and found this: https://docs.openstack.org/kuryr-kubernetes/latest/readme.html

It says: "With Kuryr-Kubernetes it's now possible to choose to run both OpenStack VMs and Kubernetes Pods on the same Neutron network if your workloads require it or to use different segments and, for example, route between them."

So is it possible to put both VK pods and node pods in the same OpenStack network?

Given my limited knowledge of kuryr-kubernetes, it sounds possible. VK with the OpenStack provider allows you to create pods in a neutron tenant or provider network. If you have other tools (e.g. kuryr-kubernetes, calico) that can connect normal pods to neutron, it should be possible to achieve what you described.

@lmq1999
Author

lmq1999 commented Aug 11, 2020

I am having quite a funny situation:

I have these networks:

[root@controller ~(kubernetes)]$ openstack network list
+--------------------------------------+--------------------+--------------------------------------+
| ID                                   | Name               | Subnets                              |
+--------------------------------------+--------------------+--------------------------------------+
| 01b93608-1172-46db-86fb-9f355ce5a04a | services           | 1b6dd175-5f90-4766-87e4-9595f6c37b0a |
| 050df0e8-5aec-477a-9365-90a6ed151159 | kuryr-net-f2e53628 | 845ba74f-8882-4567-ad64-00a950d8c3f6 |
| 95e7931a-d774-45b6-afaa-c382ef4a0a40 | pod                | 125ac37f-bfaa-4c79-8050-552647bbe9ba |
| a154417c-7770-48bc-a024-6528d3b03aa6 | provider           | 0e7ceb5c-2744-454c-88ed-2a9a4563a167 |
| a4322dbc-4ffa-4a38-895d-16a32ee1deb9 | selfservice        | 70375905-5132-4060-9f8b-717646a278a0 |
| a47c405a-b7e3-43dd-a8a1-e3eef99c95a7 | LB-Manage-Net      | 35ad39c4-4e19-4465-9488-5e8bee993f8c |
+--------------------------------------+--------------------+--------------------------------------+

In short:

provider = external network
selfservice = internal network 1 for VMs, subnet 10.10.10.0/24
pod = internal network for Kubernetes pods (normal Kubernetes), subnet 10.1.0.0/16
services = internal network for Kubernetes services (normal Kubernetes), subnet 10.2.0.0/16

The pod and services networks are set to dhcp=no (I installed kuryr-kubernetes and it works for normal pods).

But when I create a VK pod in that network, it is running but I can't ping its IP address, although I can still ping normal pod IP addresses.

Any advice?

Here are some examples:

NAME                                        READY   STATUS    RESTARTS   AGE   IP                NODE              NOMINATED NODE   READINESS GATES
nginx-deployment-virtual-77cf5845f5-4rfpb   1/1     Running   0          46s   192.168.122.123   virtual-kubelet   <none>           <none>
nginx-deployment-worker-57d9684bf8-zcr48    1/1     Running   0          16m   10.1.3.140        k8s-worker        <none>           <none>

When I curl these, it's normal:

[root@controller ~(kubernetes)]$ curl 192.168.122.123
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@controller ~(kubernetes)]$ curl 10.1.3.140
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

But when I use the selfservice network to create the VK pod:

root@k8s-master:~# kubectl get pod -o wide
NAME                                        READY   STATUS    RESTARTS   AGE   IP            NODE              NOMINATED NODE   READINESS GATES
nginx-deployment-virtual-77cf5845f5-862g8   1/1     Running   0          70s   10.10.10.48   virtual-kubelet   <none>           <none>
nginx-deployment-worker-57d9684bf8-zcr48    1/1     Running   0          19m   10.1.3.140    k8s-worker        <none>           <none>


[root@controller ~(kubernetes)]$ openstack capsule list
+--------------------------------------+---------------------------------------------------+---------+-------------+
| uuid                                 | name                                              | status  | addresses   |
+--------------------------------------+---------------------------------------------------+---------+-------------+
| b79348be-2c10-4e2a-a31c-c11bd1cd9ebd | default-nginx-deployment-virtual-77cf5845f5-862g8 | Running | 10.10.10.48 |
+--------------------------------------+---------------------------------------------------+---------+-------------+


[root@controller ~(kubernetes)]$ curl 10.10.10.48
curl: (56) Recv failure: Connection reset by peer
[root@controller ~(kubernetes)]$ ping 10.10.10.48
PING 10.10.10.48 (10.10.10.48) 56(84) bytes of data.
64 bytes from 10.10.10.48: icmp_seq=1 ttl=63 time=11.8 ms
64 bytes from 10.10.10.48: icmp_seq=2 ttl=63 time=0.898 ms
64 bytes from 10.10.10.48: icmp_seq=3 ttl=63 time=1.19 ms
64 bytes from 10.10.10.48: icmp_seq=4 ttl=63 time=1.17 ms

I can only ping it (curl fails).

But when it comes to the pod network:

root@k8s-master:~# kubectl get pod -o wide
NAME                                        READY   STATUS        RESTARTS   AGE     IP            NODE              NOMINATED NODE   READINESS GATES
nginx-deployment-virtual-77cf5845f5-45stx   1/1     Running       0          53s     10.1.3.255    virtual-kubelet   <none>           <none>
nginx-deployment-worker-57d9684bf8-zcr48    1/1     Running       0          23m     10.1.3.140    k8s-worker        <none>           <none>

[root@controller ~(kubernetes)]$ openstack capsule list
+--------------------------------------+---------------------------------------------------+---------+------------+
| uuid                                 | name                                              | status  | addresses  |
+--------------------------------------+---------------------------------------------------+---------+------------+
| 7bbe5e34-1327-45dc-9a41-7d9620840344 | default-nginx-deployment-virtual-77cf5845f5-45stx | Running | 10.1.3.255 |
+--------------------------------------+---------------------------------------------------+---------+------------+
[root@controller ~(kubernetes)]$ ping 10.1.3.255
PING 10.1.3.255 (10.1.3.255) 56(84) bytes of data.

^C
--- 10.1.3.255 ping statistics ---
44 packets transmitted, 0 received, 100% packet loss, time 44030ms

I can't even ping it, but I can still curl the (worker) pod on the same network:

[root@controller ~(kubernetes)]$ curl 10.1.3.140
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

@hongbin
Collaborator

hongbin commented Aug 11, 2020

My best guess is the security group blocking the traffic. I would trace down the security group/port/subnet of the "normal" pod and the VK pod and check if there are any differences.
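
A rough sketch of that comparison with the OpenStack CLI (pod IPs taken from the outputs above) could be:

# find the Neutron port behind each pod IP
openstack port list --fixed-ip ip-address=10.1.3.140   # normal (kuryr) pod
openstack port list --fixed-ip ip-address=10.1.3.255   # VK/Zun capsule pod
# inspect each port's security groups and subnet
openstack port show <port-id> -c security_group_ids -c fixed_ips
# compare the rules of the attached security groups
openstack security group rule list <security-group-id>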

@lmq1999
Author

lmq1999 commented Aug 12, 2020

Well, after tracing it down, they use the same security group and subnet.

Btw, do you have any overview model of the virtual-kubelet openstack-zun provider? Is it possible to run two virtual-kubelets with the same provider?

And what exactly does the openstack-zun provider do in VK? Where does the pod actually run (on the virtual-kubelet (k8s-master) node or on the controller node)?

@hongbin
Collaborator

hongbin commented Aug 13, 2020

Well, after tracing it down, they use the same security group and subnet.

Could you describe how to reproduce the issue?

Btw, do you have any overview model of the virtual-kubelet openstack-zun provider? Is it possible to run two virtual-kubelets with the same provider?

In theory, it is possible to run two VK instances with the same provider, but I haven't tried that.
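
If you do try it, a sketch (assuming the standard upstream virtual-kubelet --nodename and --provider flags) would be to run two instances that register as distinct nodes:

# terminal 1
virtual-kubelet --provider openstack --nodename virtual-kubelet-1
# terminal 2
virtual-kubelet --provider openstack --nodename virtual-kubelet-2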

And what exactly does the openstack-zun provider do in VK? Where does the pod actually run (on the virtual-kubelet (k8s-master) node or on the controller node)?

The Zun provider will call Zun's API to create the pod (capsule). Eventually, the pod will be scheduled onto a compute node.
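
A quick way to see that mapping end to end, using only commands already shown in this thread (the capsule name is the one from the output above; pass the UUID instead if your client does not accept names):

kubectl get pod -o wide          # the pod as Kubernetes sees it, scheduled on the virtual-kubelet node
openstack capsule list           # the same workload as a Zun capsule
openstack capsule show default-nginx-deployment-virtual-77cf5845f5-45stx   # capsule details (status, addresses)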

@lmq1999
Copy link
Author

lmq1999 commented Aug 13, 2020

About the first problem:

I have 4 nodes:

k8s-master
k8s-worker
openstack-controller
openstack-compute

The k8s-master and k8s-worker use kuryr-kubernetes, so they use Neutron networks.
I provide them with 2 internal networks:

1st: 10.1.0.0/16 for the cluster (pod) network
2nd: 10.2.0.0/16 for the service network

Those networks belong to the (k8s) user.

The virtual-kubelet on the k8s-master node also uses the (k8s) user. Its pods are also scheduled onto the 10.1.0.0/16 network. They show Running and have an IP in both of these:

kubectl get pod -o wide & openstack capsule list (k8s user)

I can't ping them even though I have already allowed ICMP in the security group rules for the 10.1.0.0/16 and 10.2.0.0/16 networks.
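
For completeness, a rule like the one described could be created roughly as follows (a sketch; <sg-id> is the security group attached to the pods' ports):

openstack security group rule create --protocol icmp --remote-ip 10.1.0.0/16 <sg-id>
openstack security group rule create --protocol icmp --remote-ip 10.2.0.0/16 <sg-id>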

But because it's a virtual-kubelet pod, I can't view its logs or console.

But when I create a normal pod (without the toleration), it is scheduled onto the k8s-worker node. It uses the 10.1.0.0/16 network as I said, and I have no problems pinging or curling it.

I followed the kuryr-kubernetes install guide here, combined with the OpenStack docs: https://github.com/zufardhiyaulhaq/kuryr-kubernetes/blob/master/Installation/kuryr.md

Second question:

Yesterday I tried connecting 2 VKs to the same provider, but there was only 1 compute node; when the 2nd connected, it pushed the 1st out if it was the same user. I will add more compute nodes and test today.

3rd question:

If the pod is scheduled onto a compute node, is there any way to know which compute node the pod is on when there are multiple compute nodes? I didn't find any Docker containers on those compute nodes.

With the new model:

k8s-master
k8s-worker
openstack-controller
openstack-compute1
openstack-compute2

I'm trying to achieve a WordPress-on-Kubernetes setup in which:
openstack-compute1 runs mysql
openstack-compute2 runs wordpress

But when I curl the IP of the WordPress pod it returns nothing, not even something like a database connection error, so is there any possible way to view VK-pod logs?

@hongbin
Collaborator

hongbin commented Aug 14, 2020

There is a 'host' field in the capsule. Normal users cannot see this field, but users with admin privilege can see it, so they know which compute host runs the capsule.
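
For example (a sketch, run with admin credentials; the UUID is the one from the capsule listing above, and the exact column name may vary by release):

openstack capsule show b79348be-2c10-4e2a-a31c-c11bd1cd9ebd -c host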

The capsule is not created in Docker; it is created in containerd. You can download the tool 'crictl' (https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md) to list those pods.
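
On the compute node that hosts the capsule, that typically looks something like this (a sketch; the containerd socket path may differ on your install):

# point crictl at containerd's CRI socket
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock
crictl pods                       # list pod sandboxes (capsules)
crictl ps                         # list containers
crictl logs <container-id>        # view a container's logs
crictl exec -it <container-id> sh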

@lmq1999
Author

lmq1999 commented Aug 18, 2020

OK, thank you. I can access the pod from the compute node now.

Is there any way to set up VK to use Docker instead of containerd?

@hongbin
Collaborator

hongbin commented Aug 19, 2020

Docker doesn't support the concept of a "pod", so VK won't work with Docker very well. I would highly recommend containerd for VK.
