Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is the shim running on it's own network ns? #476

Open
Mossaka opened this issue Feb 7, 2024 · 12 comments
Open

Why is the shim running on it's own network ns? #476

Mossaka opened this issue Feb 7, 2024 · 12 comments
Assignees
Labels
question Further information is requested

Comments

@Mossaka
Copy link
Member

Mossaka commented Feb 7, 2024

Ref:

setup_namespaces(&spec)
.map_err(|e| shim::Error::Other(format!("failed to setup namespaces: {}", e)))?;

It is not clear to me the reason why the shim start function sets up it's own network ns. I checked out both the Go and Rust implementation of runc-shim, they aren't setting the same ns at shim start.

@jsturtevant pointed me to this youki-dev/youki#2473 from youki but it looks like youki was setting up all the namespaces for containers but somehow the network ns was ignored. I am raising the issue here for asking more investigation.

FYI @jprendes

@cpuguy83
Copy link
Member

cpuguy83 commented Feb 7, 2024

Before youki we did everything ourselves.
It may not be necessary to do this anymore.

@jprendes
Copy link
Collaborator

jprendes commented Feb 7, 2024

From #327 (comment), removing that setup_namespaces call means that the network ns is not set up at all.
At the time I thought that maybe youki wasn't setting up the network ns, but it does.
It would be great if anyone has further insight.

@jsturtevant
Copy link
Contributor

We removed it awhile back and broke everything 🤦 #364

@cpuguy83
Copy link
Member

cpuguy83 commented Feb 7, 2024

youki is only setting the Pid and User namespaces

That doesn't seem right :(

@0xE282B0
Copy link
Contributor

0xE282B0 commented Feb 7, 2024

I removed it once and @jsturtevant needed to add it again here

@jprendes
Copy link
Collaborator

jprendes commented Feb 7, 2024

youki is only setting the Pid and User namespaces

That doesn't seem right :(

That was me misreading the code. Youki sets Pid and User ns at one point, and the rest on a different place.
IIRC, Pid and User ns need to be set up in the intermediate process, while the rest are set up in the init process.

@Mossaka Mossaka added the question Further information is requested label Feb 7, 2024
@Mossaka
Copy link
Member Author

Mossaka commented Feb 7, 2024

Youki sets Pid and User ns at one point, and the rest on a different place.

Okay, if this is true, then the shim doesn't need to set up its own network ns. There must be some root causes that we haven't yet discovered.

@jprendes
Copy link
Collaborator

jprendes commented Feb 7, 2024

I've added some logging to runwasi and youki when setting namespaces, and got:

level=info msg=">>>>> runwasi > unshare(CloneFlags(CLONE_NEWNET))"
level=info msg="server listen started"
level=info msg="server started"
level=info msg="Shim successfully started, waiting for exit signal..."
level=debug msg="Got new client"
...
level=warn msg="Controller rdma is not yet implemented."
level=warn msg="Controller misc is not yet implemented."
level=debug msg="unshare or setns: LinuxNamespace { typ: Pid, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWPID))"
level=debug msg="sending init pid (Pid(1883648))"
level=debug msg="unshare or setns: LinuxNamespace { typ: Uts, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWUTS))"
level=debug msg="unshare or setns: LinuxNamespace { typ: Ipc, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWIPC))"
level=debug msg="unshare or setns: LinuxNamespace { typ: Network, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWNET))"
level=debug msg="unshare or setns: LinuxNamespace { typ: Cgroup, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWCGROUP))"
level=debug msg="unshare or setns: LinuxNamespace { typ: Mount, path: None }"
level=info msg=">>>>> youki > unshare(CloneFlags(CLONE_NEWNS))"
level=debug msg="prepare rootfs rootfs="/var/lib/docker/rootfs/overlayfs/7c52b78c3a22d2a00cb67cdab9fd090241ef9cdf8f47bb0d7f20daba465662a2""
level=debug msg="mount root fs "/var/lib/docker/rootfs/overlayfs/7c52b78c3a22d2a00cb67cdab9fd090241ef9cdf8f47bb0d7f20daba465662a2""
level=debug msg="mounting Mount { destination: "/proc", typ: Some("proc"), source: Some("proc"), options: Some(["nosuid", "noexec", "nodev"]) }"
...

The lines with >>>>> is what I've added.
The network namespce is indeed being set twice.
By runwasi when the shim starts, and by youki on the init process.

@Mossaka
Copy link
Member Author

Mossaka commented Mar 5, 2024

It looks like after removing the network ns from shim start, the nginx pod had issues to start:
https://github.com/containerd/runwasi/actions/runs/8148869974/job/22272757552?pr=503#step:8:794

@yihuaf
Copy link
Contributor

yihuaf commented Jun 1, 2024

Pid and User ns need to be set up in the intermediate process

Just a drive by comment. The reason for this is that the pid and user namespace are a bit special and required special treatment. For the pid namespace, one can't simply join the pid namespace. Only the child process can join the new pid namespace, so we have to fork. In youki's case, we do this in the intermediate process to fork the init process.

user namespace is also special. In order to setup user namespace correctly, we need a parent process to setup the uid/gid mapping. This can't be set once inside the user namespace. So we join the user namespace when we fork the intermediate process from the main process. The intermediate process will join the user namespace and the main process will setup the uid/gid mapping.

@jsturtevant
Copy link
Contributor

@yihuaf Thanks for giving some context. Is what we are doing now, correct or do we need to make some changes in runwasi?

@Mossaka Mossaka self-assigned this Jun 4, 2024
@jsturtevant
Copy link
Contributor

@yihuaf do you know if runwasi requires changes like #503 or something else? I am still unclear if the network ns is youki's or runwasi's responsibility

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
Status: No status
Development

No branches or pull requests

6 participants