Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add an option to register allowed ranges of memory #110

Merged
merged 2 commits into from
Sep 13, 2024

Conversation

wdullaer
Copy link
Contributor

@wdullaer wdullaer commented Sep 5, 2024

Certain kernel helpers return pointers to kernel managed memory, which the ebpf program is allowed to load and read.
In order to implement stubs for these using rbpf, we need a way to mark these ranges of memory as safe for the check_mem function.

The API specifically deals with adresses, because helpers need to be function types and not closures. This means the pointers to objects returned from them need to be static, and dealing with references to static objects gets clunky. So I have chosen to push the obtaining of the addresses into calling code.

I did do other experiments, such as trying to allow closures, but I couldn't figure out how to implement that in the jit. (I'm sure it's doable, but exhausted my yak shaving time budget for this). It also didn't end up getting rid of the 'static' requirement on references returned from helper stubs, due to the Box you have to put the closures in to store them in a hashmap.

The only real way to make this easier is to have actual maps built-in to the VM, so we can transfer the ownership of the objects to the VM. How you make that generic with regards to the map definitions in the ebpf programs is still an open question for me though.

At least with this PR I can run/test my programs and I don't have to disable the memory checks.

My apologies for the format changes: rust-analyzers autoformat seems to have different opinions than the style here.
I can try to minimize the diff if that is important to you.

@qmonnet
Copy link
Owner

qmonnet commented Sep 5, 2024

Thanks a lot!

I can try to minimize the diff if that is important to you.

Yes please, that would really help for the review

Copy link
Owner

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this work!

I'm not sure how much I want to extend the API to add verifier-specific features, as long as the project doesn't have a robust verifier. But allowing some memory address ranges seems like a fairly basic requirement to add some helpers, so let's go for it. Your API changes, and the PR overall, look good.

I have three comments, however:

  • Could you please update the README to mention the new function (probably below register_helper())?
  • I'd like to have a test (under tests, or maybe even an end-to-end test under examples) with a simple helper that uses a range of allowed memory, to avoid any kind of regression on that feature, before I merge.
  • And then one nit, could you please rename the attribute into something more descriptive than allowed? For example, allowed_memory, more verbose but easier to understand. I'd typically expect a boolean flag rather than an address map for something called allowed.

If I may ask, what's the use case behind the PR? Do you have your own implementation of maps? May I also ask what you're doing with rbpf, if you're free to tell?

@wdullaer
Copy link
Contributor Author

wdullaer commented Sep 9, 2024

I will add an end-to-end test in examples. Asserting that we can add u64s to a HashSet doesn't feel like it adds much value.
I will have to cook up a minimal ebpf program first.

My usecase is to have an easy to use way to test ebpf programs. We are building a variety of network programs, which enforce some policy and need to do a bit of packet parsing.
With rbpf I can very easily control the state of the world and inputs, and very easily test corner cases.
Since we write the programs using aya, I can also reuse all my type definitions in the tests.
I'm sure there are different ways to do this, but I liked that with rbpf we just needed a working rust toolchain to run tests.
The map interactions are the hardest bit. Currently this requires me to create static objects, which are then emitted by the helper mocks.
The rust compiler doesn't quite like this pattern, but it works.
It'd be better if there were internal map implementations, but the way the kernel api's are structured makes that harder than it should be (as I'm sure you're aware).
The main benefit of having the maps in rbpf would be that we can transfer ownership of the objects to rbpf, which removes the need for statics and would remove unsafe from caller code.
I haven't really figured out what the correct type setup would be internally though.

@qmonnet
Copy link
Owner

qmonnet commented Sep 9, 2024

Thanks for the details, that's an interesting use case.

It'd be better if there were internal map implementations

No doubt about that. I'd like to have maps in rbpf, but I don't have the time to really work on the project and I don't expect to come up with an implementation myself in the foreseeable future (I understand you're not asking - I'm just clarifying). If anyone can and wants to contribute a PR for that they're welcome, obviously.

@wdullaer
Copy link
Contributor Author

wdullaer commented Sep 9, 2024

I've added an end to end example

I know what it is to be short on time.
For now, this API allows me to move forward. I might eventually give implementing maps a shot, but I also doubt I'll get to it in the coming weeks.

@qmonnet
Copy link
Owner

qmonnet commented Sep 9, 2024

<Rebased with the hope that the Appveyor workflow will pass>

Copy link
Owner

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice, thanks!

Besides some more nits on allowed_memory and the case on eBPF, Linux, Rust, my main concern is the license for your example. I'd like to avoid GPL if we can, and I'd like SPDX headers for the new files, please.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
examples/ebpf-allowed-memory.rs Outdated Show resolved Hide resolved
examples/allowed_memory.rs Outdated Show resolved Hide resolved
src/interpreter.rs Outdated Show resolved Hide resolved
src/interpreter.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
Copy link
Owner

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good, thank you!

I'd just like to find a moment to run the examples myself before merging, I'll try to do it tonight.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@qmonnet
Copy link
Owner

qmonnet commented Sep 13, 2024

OK so I tried to compile the program myself, but I got mixed results.

File structure

One thing that I observe is that after your changes, we get a new warning when building the crate:

$ cargo test
warning: An explicit [[example]] section is specified in Cargo.toml which currently
disables Cargo from automatically inferring other example targets.
This inference behavior will change in the Rust 2018 edition and the following
files will be included as a example target:

* examples/ebpf-allowed-memory.rs

This is likely to break cargo build or cargo test as these files may not be
ready to be compiled as a example target today. You can future-proof yourself
and disable this warning by adding `autoexamples = false` to your [package]
section. You may also move the files to a location where Cargo would not
automatically infer them to be a target, such as in subfolders.

For more information on this warning you can consult
https://github.com/rust-lang/cargo/issues/5330

I understand this is because we don't set an edition in the Cargo.toml. I don't think editions were a thing when I created the Cargo.toml. We may want to address that in the future, but for now, it would be nice to avoid that warning. Looking at the files you added:

$ git show --pretty="" --name-only --stat -- examples/
examples/allowed-memory.o
examples/allowed_memory.rs
examples/ebpf-allowed-memory.rs

We could maybe pick up the suggestion from the warning and move the auxiliary files into a subdir? It would make it clear that the files are related to the allowed_memory.rs example, but are not example programs calling the lib themselves, and it should tame the warning?

Compiling - Part 1

So I tried to compile the eBPF program by installing the bpf-linker, and then typing the command you provided, cargo build --target=bpfel-unknown-none -Z build-std=core. But I was in rbpf's repository, and cargo tried to compile rbpf itself with this target and option, and of course it complained loudly. How exactly do you compile the program? You need a separate Cargo.toml, right? I eventually managed to run the command by adding one with:

[package]
name = "ebpf-allowed-memory"
version = "0.1.0"
edition = "2021"

[dependencies]
aya-ebpf = "0.1.0"

Coming back to my first item on the file structure, moving the example auxiliary files under their own subdirectory would allow us to add this Cargo.toml to example/allowed_memory and make it easier to recompile the example?

Compiling - Part 2

So once I had my Cargo.toml I tried the command again and hit aya-rs/bpf-linker#220, not sure if you hit that one, too. Anyway, I managed to compile by adding the --release flag.

In the end, I produced an object file that contains more eBPF functions that yours, for some reason. I have ingress_tc, but also for some reason bcmp, memcmp, and memmove in section .text, no idea why. Maybe because of the --release?

But the good thing is that ingress_tc is strictly identical to the one in your .o! 🎉

Summary

So my suggestion would be to make it easier to recompile the eBPF program, by making sure we can just run the compile command (even better, shrink it to cargo build --release, or without the --release if it's not necessary and just me hitting that bug). My changes are here: feel free to add them to your commit (or to adjust, or you can also disagree with the changes 🙂)

And, nice work!

@wdullaer
Copy link
Contributor Author

I'm sorry for the compilation troubles. I didn't get the warning when running locally, or I would have taken care of it.

You are correct that in order to compile the ebpf program you need to create a small cargo project. It can't be compiled as is. I included the source "for reference" more than anything else.

I think your suggestion is good. It makes it easier to recreate the binary.
I would still commit the binary to the repo though. The way you have done it would require you to have a functional aya environment to run the examples test, since you have to compile the ebpf program first. It's sufficiently different from the build environment of rbpf that I wouldn't go there.

I didn't bump into the bpf-linker bug, the program was compiled with --release already: I created it by making an extra entrypoint in my main project and reused its build target.

Certain kernel helpers return pointers to kernel managed memory, which
the ebpf program is allowed to load and read.
In order to implement stubs for these using rbpf, we need a way to mark
these ranges of memory as safe for the check_mem function.

The API specifically deals with adresses, because helpers need to be
function types and not closures. This means the pointers to objects
returned from them need to be static, and dealing with references to
static objects gets clunky. So I have chosen to push the obtaining of
the addresses into calling code.

Signed-off-by: Wouter Dullaert <[email protected]>
@qmonnet
Copy link
Owner

qmonnet commented Sep 13, 2024

I would still commit the binary to the repo though. The way you have done it would require you to have a functional aya environment to run the examples test, since you have to compile the ebpf program first.

I agree we should commit it, I did keep it in my commit but moved it:

image

I'm also fine with your approach of copying it to a more visible path.

I didn't know of Earthly before. Is it intended to commit that file, with the pointer to the Exoscale Docker registry (containing internal in its name)? It's some new components to install if a user wants to run it, but then it's not often people will want to recompile the object file, nor is the EarthFile strictly necessary to compile it, so no objection, but I'm just checking this is your intent.

Can you please sign-off the second commit?

@wdullaer
Copy link
Contributor Author

Sorry, I did not intend to commit that.
I use it as an easy way to get a linux build env on my mac.
Will amend the PR.

Moving the source code for the ebpf program into a subfolder will prevent
it from accidentally being run as an example.
We are still committing the resulting object file, to prevent creating
a hard dependency on the aya build toolchain to run the example tests
in this repository.

Signed-off-by: Wouter Dullaert <[email protected]>
Copy link
Owner

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good this time, thanks a lot!

@qmonnet qmonnet merged commit e4d0542 into qmonnet:main Sep 13, 2024
7 checks passed
@wdullaer wdullaer deleted the allowed-addresses branch September 17, 2024 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants