Switch seccomp filter to whitelist #37

Merged
merged 4 commits into from
Aug 30, 2023


cd-work
Collaborator

@cd-work cd-work commented Aug 30, 2023

Previously the seccomp filter used a blacklist, denying access only to system calls known to perform network operations. The big disadvantage of this approach is that every new system call is allowed by default.

To avoid accidentally opening up the sandbox by failing to keep up with new kernel releases, the filter has been switched to a whitelist instead. This means only system calls which are explicitly present in the list are allowed.

When the network sandbox is disabled, all system calls are allowed regardless of our whitelist. This means that even new unknown system calls will be allowed without having to update Birdcage to track them.
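To illustrate the difference in default behavior, here is a minimal sketch of the decision logic. It is a plain-Rust model, not Birdcage's actual seccomp code: the `Filter` and `Mode` types, the `allows` method, and the `future_syscall` name are all hypothetical and exist only for this example.

```rust
use std::collections::HashSet;

/// How the filter treats a syscall that is NOT in its list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mode {
    /// Old behavior: listed syscalls are denied, everything else is allowed.
    Blacklist,
    /// New behavior: listed syscalls are allowed, everything else is denied.
    Whitelist,
}

/// Hypothetical model of the filter; not Birdcage's real type.
struct Filter {
    mode: Mode,
    list: HashSet<&'static str>,
    /// When false, no filtering happens at all.
    network_sandbox: bool,
}

impl Filter {
    fn allows(&self, syscall: &str) -> bool {
        // With the network sandbox disabled, every syscall passes,
        // including ones the list has never heard of.
        if !self.network_sandbox {
            return true;
        }
        match self.mode {
            Mode::Blacklist => !self.list.contains(syscall),
            Mode::Whitelist => self.list.contains(syscall),
        }
    }
}

fn main() {
    let blacklist = Filter {
        mode: Mode::Blacklist,
        list: HashSet::from(["socket", "connect"]),
        network_sandbox: true,
    };
    let whitelist = Filter {
        mode: Mode::Whitelist,
        list: HashSet::from(["read", "write"]),
        network_sandbox: true,
    };

    // "future_syscall" stands in for a syscall added by a newer kernel.
    assert!(blacklist.allows("future_syscall")); // slips through the blacklist
    assert!(!whitelist.allows("future_syscall")); // denied until explicitly listed
    assert!(whitelist.allows("read"));
}
```

The point of the model is the fallthrough case: an unknown syscall passes a blacklist but is rejected by a whitelist, unless the sandbox is disabled entirely.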

Closes #33.

@cd-work
Collaborator Author

cd-work commented Aug 30, 2023

This is only loosely related to this PR since Docker does not do network filtering, but I've decided to compare its list of system calls to the ones supported by Rust's libc crate:

Syscall Comparison
Syscalls excluded by docker:
    acct,
    add_key,
    afs_syscall,             NOTE: NOT IMPLEMENTED -> REMOVED
    arch_prctl,
    bpf,
    chroot,
    clock_settime,
    clone,
    clone3,
    create_module,           NOTE: Kernel modules require privileges, but still scary -> REMOVED
    delete_module,           NOTE: Kernel modules require privileges, but still scary -> REMOVED
    fanotify_init,
    finit_module,            NOTE: Kernel modules require privileges, but still scary -> REMOVED
    fsconfig,                NOTE: Does not exist?
    fsmount,                 NOTE: Does not exist?
    fsopen,                  NOTE: Does not exist?
    fspick,                  NOTE: Does not exist?
    get_kernel_syms,
    get_mempolicy,
    getpmsg,                 NOTE: NOT IMPLEMENTED -> REMOVED
    init_module,             NOTE: Kernel modules require privileges, but still scary -> REMOVED
    ioperm,                  NOTE: Only for i386, returns error otherwise anyway -> REMOVED
    iopl,                    NOTE: Deprecated alternative to ioperm -> REMOVED
    kcmp,
    kexec_file_load,         NOTE: Do we want to allow loading new kernels?
    kexec_load,              NOTE: Do we want to allow loading new kernels?
    keyctl,
    lookup_dcookie,
    mbind,
    migrate_pages,
    modify_ldt,
    mount,                   NOTE?: Would it be possible to mount some kind of network FS?
    mount_setattr,
    move_mount,
    move_pages,
    nfsservctl,              NOTE: Removed in 3.1 -> REMOVED
    open_by_handle_at,
    open_tree,
    perf_event_open,
    personality,
    pidfd_getfd,             NOTE: Allows cloning other processes' FDs -> REMOVED
    pivot_root,
    process_madvise,
    process_vm_readv,
    process_vm_writev,
    ptrace,
    putpmsg,                 NOTE: NOT IMPLEMENTED -> REMOVED
    query_module,            NOTE: Deprecated and removed in 2.6 -> REMOVED
    quotactl,
    quotactl_fd,
    reboot,
    request_key,
    security,                NOTE: NOT IMPLEMENTED -> REMOVED
    setdomainname,
    sethostname,
    set_mempolicy,
    set_mempolicy_home_node,
    setns,
    settimeofday,
    swapoff,
    swapon,
    _sysctl,                 NOTE: Use discouraged since 2.6.24, removed in 5.5 -> REMOVED
    sysfs,                   NOTE: Deprecated
    syslog,
    tuxcall,                 NOTE: NOT IMPLEMENTED -> REMOVED
    umount2,
    unshare,
    uselib,                  NOTE: Deprecated and removed in 3.15 -> REMOVED
    userfaultfd,
    ustat,
    vhangup,
    vserver,                 NOTE: NOT IMPLEMENTED -> REMOVED

Syscalls included by docker:
    (bunch of *32/*64 stuff)
    io_pgetevents,
    ipc,               NOTE: Doesn't exist on x86-64 and ARM
    _llseek,           NOTE: Should use lseek instead
    mmap2,             NOTE: Use mmap instead
    _newselect,        NOTE: Not even mentioned in syscall manpage
    recv,              NOTE: Unsure, but mostly same as recvfrom
    send,              NOTE: Unsure, but mostly same as sendto
    sigprocmask,       NOTE: Replaced by rt_sigprocmask
    sigreturn,         NOTE: Replaced by rt_sigreturn
    socketcall,
    socketpair,
    ugetrlimit,
    waitpid,

The only syscall included on aarch64 that is not on x86 is rseq, which
should not be necessary.
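
A comparison like the one above can be reproduced with a simple set difference. The sketch below is only an illustration: the `only_in_first` helper is hypothetical, and the two short lists are hand-picked excerpts rather than the real Docker profile or the full set of syscalls in the libc crate.

```rust
use std::collections::BTreeSet;

/// Returns the names present in `first` but missing from `second`, sorted.
/// Hypothetical helper, written only for this example.
fn only_in_first<'a>(first: &[&'a str], second: &[&'a str]) -> Vec<&'a str> {
    let second: BTreeSet<&str> = second.iter().copied().collect();
    let unique: BTreeSet<&'a str> = first
        .iter()
        .copied()
        .filter(|name| !second.contains(name))
        .collect();
    unique.into_iter().collect()
}

fn main() {
    // Illustrative excerpts; the real lists are far longer.
    let libc_syscalls = ["read", "write", "mount", "ptrace", "socket"];
    let docker_allowed = ["read", "write", "socket", "socketcall"];

    // Known to libc but not allowed by Docker ("excluded by docker").
    let excluded = only_in_first(&libc_syscalls, &docker_allowed);
    // Allowed by Docker but absent from the libc list ("included by docker").
    let included = only_in_first(&docker_allowed, &libc_syscalls);

    println!("excluded by docker: {:?}", excluded);
    println!("included by docker: {:?}", included);
}
```

Using a `BTreeSet` keeps the output sorted, which makes the two lists easy to review by hand.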
@cd-work cd-work merged commit dbfd7b9 into main Aug 30, 2023
9 checks passed
@cd-work cd-work deleted the whitelist branch August 30, 2023 22:20
Successfully merging this pull request may close these issues.

Switch seccomp network filter from blacklist to whitelist
2 participants