All podman functions hang after attempting to stop container #24487

Open
pavinjosdev opened this issue Nov 7, 2024 · 2 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
network: Networking related issue or feature


pavinjosdev commented Nov 7, 2024

Issue Description

Once the containers are started, podman works normally for a few hours.
After that, an attempt to stop one of the containers hangs, and every subsequent podman command, such as podman ps, hangs as well.

Deleting the empty files named netns-{UUID} in the /run/user/1000/netns/ directory resolves the hang.
Podman then works normally for another few hours before the problem recurs.
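
For reference, the workaround above can be sketched as a small script. This is a minimal sketch, not a fix: the helper names empty_netns_files and remove_files are made up for illustration, the UUID pattern is inferred from the file name in the debug log, and the script does not check whether a namespace is still in use. It defaults to a dry run.

```python
import re
from pathlib import Path

# Name pattern inferred from the report: netns-{UUID} (8-4-4-4-12 hex digits).
NETNS_NAME = re.compile(r"netns-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def empty_netns_files(netns_dir):
    """Return the empty regular files named netns-{UUID} in netns_dir, sorted."""
    root = Path(netns_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if NETNS_NAME.fullmatch(p.name) and p.is_file() and p.stat().st_size == 0
    )


def remove_files(paths, dry_run=True):
    """Delete the given files unless dry_run is set; return the paths handled."""
    handled = []
    for p in paths:
        if not dry_run:
            p.unlink()
        handled.append(p)
    return handled


if __name__ == "__main__":
    # /run/user/1000/netns is the directory from this report (UID 1000).
    for path in remove_files(empty_netns_files("/run/user/1000/netns")):
        print("would remove", path)
```

Passing dry_run=False to remove_files performs the actual deletion.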

Steps to reproduce the issue

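  1. Start a container (here: podman run -it --name trading --entrypoint=/bin/bash 6c0d42546081)
  2. Wait a few hours
  3. Try to stop the container (podman stop trading)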
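
The hang can be detected from a script by putting a deadline on the stop command. A minimal sketch, assuming Python 3 is available; finishes_within is a hypothetical helper, and the 60-second deadline in the usage note is an arbitrary choice (the container's stop timeout here is 10 seconds).

```python
import subprocess


def finishes_within(cmd, deadline_s):
    """Run cmd and return True if it exits within deadline_s seconds."""
    try:
        subprocess.run(
            cmd,
            timeout=deadline_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False
    return True
```

For example, finishes_within(["podman", "stop", "trading"], 60) returning False would indicate the hang described here.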

Describe the results you received

pavin@suse-pc:~> podman --log-level debug stop trading
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called stop.PersistentPreRunE(podman --log-level debug stop trading) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
INFO[0000] Using sqlite as database backend             
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/pavin/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /home/pavin/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /home/pavin/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that metacopy is not being used 
DEBU[0000] Cached value indicated that native-diff is usable 
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 37             
DEBU[0000] Starting parallel job on container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0000] Stopping ctr 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f (timeout 10) 
DEBU[0000] Stopping container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f (PID 7109) 
DEBU[0000] Sending signal 15 to container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Timed out stopping container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f with SIGTERM, resorting to SIGKILL: given PID did not die within timeout 
WARN[0010] StopSignal SIGTERM failed to stop container trading in 10 seconds, resorting to SIGKILL 
DEBU[0010] Sending signal 9 to container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Container "8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f" state changed from "stopping" to "stopped" while waiting for it to be stopped: discontinuing stop procedure as another process interfered 
DEBU[0010] Cleaning up container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Tearing down network namespace at /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 for container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Netns /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 still busy, try removing it again in 10ms 
DEBU[0010] Netns /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 still busy, try removing it again in 10ms 
... (repeated many tens of thousands of times)
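
Since the retry message repeats so often, counting it per namespace path gives a compact view of the loop. A minimal sketch, assuming the debug output was saved to a file; count_busy_retries is a hypothetical helper, and the regular expression is written against the two "still busy" lines shown above.

```python
import re
from collections import Counter

# Matches the debug line: "Netns <path> still busy, try removing it again in 10ms"
BUSY = re.compile(r"Netns (\S+) still busy, try removing it again in")


def count_busy_retries(lines):
    """Count 'still busy' retry messages per network namespace path."""
    counts = Counter()
    for line in lines:
        match = BUSY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling count_busy_retries(open("podman_stop_debug.log")) on a saved log (the file name is only an example) returns a Counter keyed by netns path.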

Describe the results you expected

Podman stops the container without hanging, and subsequent podman commands keep working.

podman info output

pavin@suse-pc:~> podman info
host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.1.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 96.34
    systemPercent: 0.95
    userPercent: 2.72
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: opensuse-slowroll
    version: "20241002"
  eventLogger: journald
  freeLocks: 2002
  hostname: suse-pc
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.11.5-1-default
  linkmode: dynamic
  logDriver: journald
  memFree: 460140544
  memTotal: 13972418560
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-1.1.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.1.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.1.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-20240906.6b38f07-2.1.x86_64
    version: |
      pasta 20240906.6b38f07-2.1
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.1.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: unknown
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 7324561408
  swapTotal: 8589930496
  uptime: 51h 16m 3.00s (Approximately 2.12 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.opensuse.org
  - registry.suse.com
  - docker.io
store:
  configFile: /home/pavin/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/pavin/.local/share/containers/storage
  graphRootAllocated: 498681774080
  graphRootUsed: 88692174848
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/pavin/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.5
  Built: 1729756620
  BuiltTime: Thu Oct 24 13:27:00 2024
  GitCommit: ""
  GoVersion: go1.23.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.5

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

pavinjosdev added the kind/bug label Nov 7, 2024

Luap99 (Member) commented Nov 7, 2024

Can you provide a full strace -f from the podman stop process and upload the output? Also, is there anything special about the containers? Are they using a userns?

Also, can you reboot and see whether it happens again after that?

Luap99 added the network label Nov 7, 2024

pavinjosdev (Author) commented

> Can you provide a full strace -f from the podman stop process and upload the output.

@Luap99 Attached the strace log: podman_stop.log

> Also anything special about the container, are they using a userns?

It's a vanilla container running the Debian 12 toolbx image.
A distrobox container and a jellyfin container on this host show the same issue.

pavin@suse-pc:~> podman inspect trading 
[
     {
          "Id": "8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f",
          "Created": "2024-08-13T08:53:06.022591283+05:30",
          "Path": "/bin/bash",
          "Args": [
               "/bin/bash"
          ],
          "State": {
               "OciVersion": "1.2.0",
               "Status": "exited",
               "Running": false,
               "Paused": false,
               "Restarting": false,
               "OOMKilled": false,
               "Dead": false,
               "Pid": 0,
               "ExitCode": 137,
               "Error": "",
               "StartedAt": "2024-11-07T19:56:35.129723307+05:30",
               "FinishedAt": "2024-11-08T01:05:32.502584379+05:30",
               "CheckpointedAt": "0001-01-01T00:00:00Z",
               "RestoredAt": "0001-01-01T00:00:00Z",
               "StoppedByUser": true
          },
          "Image": "6c0d42546081348d4c4a1799421af6406105c21bc8f4afa5987624fb0a18a92a",
          "ImageDigest": "sha256:7ba014b872d0d779a851868ef737d0f916ca292130d50f4dc956e1f721389a76",
          "ImageName": "quay.io/toolbx-images/debian-toolbox:12",
          "Rootfs": "",
          "Pod": "",
          "ResolvConfPath": "/run/user/1000/containers/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/resolv.conf",
          "HostnamePath": "/run/user/1000/containers/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/hostname",
          "HostsPath": "/run/user/1000/containers/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/hosts",
          "StaticDir": "/home/pavin/.local/share/containers/storage/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata",
          "OCIConfigPath": "/home/pavin/.local/share/containers/storage/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/config.json",
          "OCIRuntime": "crun",
          "ConmonPidFile": "/run/user/1000/containers/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/conmon.pid",
          "PidFile": "/run/user/1000/containers/overlay-containers/8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f/userdata/pidfile",
          "Name": "trading",
          "RestartCount": 0,
          "Driver": "overlay",
          "MountLabel": "",
          "ProcessLabel": "",
          "AppArmorProfile": "",
          "EffectiveCaps": [
               "CAP_CHOWN",
               "CAP_DAC_OVERRIDE",
               "CAP_FOWNER",
               "CAP_FSETID",
               "CAP_KILL",
               "CAP_NET_BIND_SERVICE",
               "CAP_SETFCAP",
               "CAP_SETGID",
               "CAP_SETPCAP",
               "CAP_SETUID",
               "CAP_SYS_CHROOT"
          ],
          "BoundingCaps": [
               "CAP_CHOWN",
               "CAP_DAC_OVERRIDE",
               "CAP_FOWNER",
               "CAP_FSETID",
               "CAP_KILL",
               "CAP_NET_BIND_SERVICE",
               "CAP_SETFCAP",
               "CAP_SETGID",
               "CAP_SETPCAP",
               "CAP_SETUID",
               "CAP_SYS_CHROOT"
          ],
          "ExecIDs": [],
          "GraphDriver": {
               "Name": "overlay",
               "Data": {
                    "LowerDir": "/home/pavin/.local/share/containers/storage/overlay/8fe8abec48af325f1921b44063b3f990d8ccd6a41f38f59e6c5a3fbc8e31e763/diff:/home/pavin/.local/share/containers/storage/overlay/f6faf32734e0870d82ea890737958fe33ce9ddfed27b3b157576d2aadbab3322/diff",
                    "UpperDir": "/home/pavin/.local/share/containers/storage/overlay/b53d9dd9aa22f62f2aaa3d54bb7678d7e0e3ee4b496b61ebf11760a7a768715d/diff",
                    "WorkDir": "/home/pavin/.local/share/containers/storage/overlay/b53d9dd9aa22f62f2aaa3d54bb7678d7e0e3ee4b496b61ebf11760a7a768715d/work"
               }
          },
          "Mounts": [],
          "Dependencies": [],
          "NetworkSettings": {
               "EndpointID": "",
               "Gateway": "",
               "IPAddress": "",
               "IPPrefixLen": 0,
               "IPv6Gateway": "",
               "GlobalIPv6Address": "",
               "GlobalIPv6PrefixLen": 0,
               "MacAddress": "",
               "Bridge": "",
               "SandboxID": "",
               "HairpinMode": false,
               "LinkLocalIPv6Address": "",
               "LinkLocalIPv6PrefixLen": 0,
               "Ports": {},
               "SandboxKey": "",
               "Networks": {
                    "pasta": {
                         "EndpointID": "",
                         "Gateway": "",
                         "IPAddress": "",
                         "IPPrefixLen": 0,
                         "IPv6Gateway": "",
                         "GlobalIPv6Address": "",
                         "GlobalIPv6PrefixLen": 0,
                         "MacAddress": "",
                         "NetworkID": "pasta",
                         "DriverOpts": null,
                         "IPAMConfig": null,
                         "Links": null
                    }
               }
          },
          "Namespace": "",
          "IsInfra": false,
          "IsService": false,
          "KubeExitCodePropagation": "invalid",
          "lockNumber": 39,
          "Config": {
               "Hostname": "8ae550f0ed4c",
               "Domainname": "",
               "User": "",
               "AttachStdin": false,
               "AttachStdout": false,
               "AttachStderr": false,
               "Tty": true,
               "OpenStdin": true,
               "StdinOnce": false,
               "Env": [
                    "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                    "container=podman",
                    "TERM=xterm",
                    "HOME=/root",
                    "HOSTNAME=8ae550f0ed4c"
               ],
               "Cmd": null,
               "Image": "quay.io/toolbx-images/debian-toolbox:12",
               "Volumes": null,
               "WorkingDir": "/",
               "Entrypoint": [
                    "/bin/bash"
               ],
               "OnBuild": null,
               "Labels": {
                    "com.github.containers.toolbox": "true",
                    "io.buildah.version": "1.23.1",
                    "maintainer": "",
                    "name": "debian-toolbox",
                    "summary": "Base image for creating Debian toolbox containers",
                    "usage": "This image is meant to be used with the toolbox command",
                    "version": "12"
               },
               "Annotations": {
                    "io.container.manager": "libpod",
                    "org.opencontainers.image.stopSignal": "15",
                    "org.systemd.property.KillSignal": "15",
                    "org.systemd.property.TimeoutStopUSec": "uint64 10000000"
               },
               "StopSignal": "SIGTERM",
               "HealthcheckOnFailureAction": "none",
               "CreateCommand": [
                    "podman",
                    "run",
                    "-it",
                    "--name",
                    "trading",
                    "--entrypoint=/bin/bash",
                    "6c0d42546081"
               ],
               "Umask": "0022",
               "Timeout": 0,
               "StopTimeout": 10,
               "Passwd": true,
               "sdNotifyMode": "container"
          },
          "HostConfig": {
               "Binds": [],
               "CgroupManager": "systemd",
               "CgroupMode": "private",
               "ContainerIDFile": "",
               "LogConfig": {
                    "Type": "journald",
                    "Config": null,
                    "Path": "",
                    "Tag": "",
                    "Size": "0B"
               },
               "NetworkMode": "pasta",
               "PortBindings": {},
               "RestartPolicy": {
                    "Name": "no",
                    "MaximumRetryCount": 0
               },
               "AutoRemove": false,
               "Annotations": {
                    "io.container.manager": "libpod",
                    "org.opencontainers.image.stopSignal": "15",
                    "org.systemd.property.KillSignal": "15",
                    "org.systemd.property.TimeoutStopUSec": "uint64 10000000"
               },
               "VolumeDriver": "",
               "VolumesFrom": null,
               "CapAdd": [],
               "CapDrop": [],
               "Dns": [],
               "DnsOptions": [],
               "DnsSearch": [],
               "ExtraHosts": [],
               "GroupAdd": [],
               "IpcMode": "shareable",
               "Cgroup": "",
               "Cgroups": "default",
               "Links": null,
               "OomScoreAdj": 0,
               "PidMode": "private",
               "Privileged": false,
               "PublishAllPorts": false,
               "ReadonlyRootfs": false,
               "SecurityOpt": [],
               "Tmpfs": {},
               "UTSMode": "private",
               "UsernsMode": "",
               "ShmSize": 65536000,
               "Runtime": "oci",
               "ConsoleSize": [
                    0,
                    0
               ],
               "Isolation": "",
               "CpuShares": 0,
               "Memory": 0,
               "NanoCpus": 0,
               "CgroupParent": "user.slice",
               "BlkioWeight": 0,
               "BlkioWeightDevice": null,
               "BlkioDeviceReadBps": null,
               "BlkioDeviceWriteBps": null,
               "BlkioDeviceReadIOps": null,
               "BlkioDeviceWriteIOps": null,
               "CpuPeriod": 0,
               "CpuQuota": 0,
               "CpuRealtimePeriod": 0,
               "CpuRealtimeRuntime": 0,
               "CpusetCpus": "",
               "CpusetMems": "",
               "Devices": [],
               "DiskQuota": 0,
               "KernelMemory": 0,
               "MemoryReservation": 0,
               "MemorySwap": 0,
               "MemorySwappiness": 0,
               "OomKillDisable": false,
               "PidsLimit": 2048,
               "Ulimits": [
                    {
                         "Name": "RLIMIT_NOFILE",
                         "Soft": 524288,
                         "Hard": 524288
                    },
                    {
                         "Name": "RLIMIT_NPROC",
                         "Soft": 52715,
                         "Hard": 52715
                    }
               ],
               "CpuCount": 0,
               "CpuPercent": 0,
               "IOMaximumIOps": 0,
               "IOMaximumBandwidth": 0,
               "CgroupConf": null
          }
     }
]
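
The fields relevant to the userns question can be pulled out of the inspect output programmatically. A minimal sketch, assuming the JSON layout shown above (a list of container objects with Name and HostConfig keys); namespace_summary is a hypothetical helper.

```python
import json


def namespace_summary(inspect_json):
    """Return name, user namespace mode and network mode for each container."""
    summary = []
    for ctr in json.loads(inspect_json):
        host = ctr.get("HostConfig", {})
        summary.append({
            "name": ctr.get("Name"),
            "userns": host.get("UsernsMode", ""),
            "network": host.get("NetworkMode", ""),
        })
    return summary
```

For the output above this yields an empty UsernsMode and a NetworkMode of "pasta" for the trading container.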

> Also can you reboot and see if it happens again after that?

Yep, it happens after a reboot too. It has been happening for a few weeks now; I just rebooted, waited a few hours, tried to stop the container, and hit the same hang.

Thanks!

2 participants