This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

optional caps and cgroup #23

Open
aep opened this issue Jul 20, 2017 · 3 comments


@aep

aep commented Jul 20, 2017

we're trying railcar on a whole range of hardware, and two things came up:

  • cgroups might be broken on mips
  • caps are broken on android

generally, would you be ok with implementing these as feature flags?

Are cgroups necessary for cleanup of a pid namespace, or will killing pid1 clean up all the other processes in the pid ns anyway?

@vishvananda
Contributor

Caps probably just need the proper syscall numbers in order to work on android. The init model is a bit broken without pid namespaces, but it should be ok without cgroups. There is one place where cgroups are used to find the actual pid of the great-grandchild so it can be waited on:

// get the actual pid of the process from cgroup

A different method will need to be devised to track the pid of the grandchild. If we had that, disabling cgroups would be fairly easy, and we could add an option for it (--disable-cgroups) that skips the various calls into the cgroup module.
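
A minimal sketch of how such an option could gate the cgroup calls, assuming a runtime flag rather than a compile-time feature. Everything here (the `Options` struct, `parse_options`, `setup_steps`, and the step names) is hypothetical and for illustration only; it is not railcar's actual option handling.

```rust
/// Hypothetical runtime options; railcar's real option handling may differ.
#[derive(Debug, Default, PartialEq)]
pub struct Options {
    /// When true, every call into the cgroup module is skipped.
    pub disable_cgroups: bool,
}

/// Scan command-line arguments for the proposed `--disable-cgroups` flag.
/// All other arguments are ignored in this sketch.
pub fn parse_options<I, S>(args: I) -> Options
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut opts = Options::default();
    for arg in args {
        if arg.as_ref() == "--disable-cgroups" {
            opts.disable_cgroups = true;
        }
    }
    opts
}

/// Return the setup steps to run, leaving out the cgroup step when disabled.
/// The step names are placeholders, not railcar functions.
pub fn setup_steps(opts: &Options) -> Vec<&'static str> {
    let mut steps = vec!["namespaces", "mounts"];
    if !opts.disable_cgroups {
        steps.push("cgroups");
    }
    steps
}
```

With the flag set, the cgroup step simply never appears in the list, so no cgroup code path is reached. The pid lookup mentioned above would still need a replacement before this is usable.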

@aep
Author

aep commented Jul 21, 2017

what's the idea behind getting that pid? shouldn't railcar wait for pid1 only?
it should be easy to implement using trap or ebpf or whatever on fork, but i'm unsure if that's actually necessary. pid1 should not daemonize.
Could we implement it so that when there are no cgroups and pid1 daemonizes, it'll simply not wait, and clean up the container?

pid namespaces should exist i think, i was just wondering if they're sufficient, i.e. what cgroups are needed for in addition.

as for caps, i'm still working on evaluating why they're broken. (it's not just missing numbers)

@vishvananda
Contributor

vishvananda commented Jul 21, 2017

The parent process needs the pid of the process as viewed from the outside so that it can either wait on it or write it to disk. The issue is that there are quite a few forks and exits to deal with namespaces, and there are some complicated issues that make it impossible to pass the correct pid back via a socket. Specifically, when daemonizing inside a pid namespace, it is necessary to double fork. The second fork receives the pid of the child process inside the pid namespace, but it doesn't know what the external pid is. The cgroup lookup is there to find the outside pid.

Regarding whether cgroups are necessary, they do provide quite a bit of useful resource control as well as isolation for things like devices. It depends on your use case, but it might be best to get them working.
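
The lookup described above can be sketched as follows. This is an illustration under stated assumptions, not railcar's code: the function names are hypothetical, the container's processes are assumed to sit in a cgroup whose `cgroup.procs` file is readable from the parent's pid namespace, and the `NSpid:` field of `/proc/<pid>/status` (Linux 4.1 and later, so possibly missing on older Android or MIPS kernels) is used to match an outside pid to its pid inside the namespace.

```rust
use std::fs;
use std::io;

/// Parse the contents of a `cgroup.procs` file: one pid per line,
/// as seen from the pid namespace of the reader.
pub fn parse_cgroup_procs(contents: &str) -> Vec<i32> {
    contents
        .lines()
        .filter_map(|line| line.trim().parse::<i32>().ok())
        .collect()
}

/// Parse the `NSpid:` line of a `/proc/<pid>/status` file.
/// Returns (outermost pid, innermost pid), or None if the line is absent.
pub fn parse_nspid(status: &str) -> Option<(i32, i32)> {
    let line = status.lines().find(|l| l.starts_with("NSpid:"))?;
    let pids: Vec<i32> = line["NSpid:".len()..]
        .split_whitespace()
        .filter_map(|field| field.parse().ok())
        .collect();
    Some((*pids.first()?, *pids.last()?))
}

/// Find the outside pid of the process whose pid inside the namespace is
/// `inner_pid`, searching only the pids listed in the given cgroup.
pub fn outside_pid(cgroup_procs_path: &str, inner_pid: i32) -> io::Result<Option<i32>> {
    let candidates = parse_cgroup_procs(&fs::read_to_string(cgroup_procs_path)?);
    for pid in candidates {
        // A process may exit between the two reads; skip it in that case.
        if let Ok(status) = fs::read_to_string(format!("/proc/{}/status", pid)) {
            if let Some((outer, inner)) = parse_nspid(&status) {
                if inner == inner_pid {
                    return Ok(Some(outer));
                }
            }
        }
    }
    Ok(None)
}
```

The cgroup is what narrows the search to the container's own processes. Without cgroups, the candidate list would have to come from somewhere else, which is the open question in this thread.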


2 participants