-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core/remio): add Remote I/O infrastructure #176
base: main
Are you sure you want to change the base?
Conversation
0afe0bf
to
38d7db2
Compare
38d7db2
to
01019b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generic comment: the namespace for this module can be shortened. I suggest rmio, remio, rm_io, rem_io
some of the comments may be a bit subjective. @josecm feel free to wage in
01019b2
to
b999298
Compare
This new version includes a refactor of the code following @DavidMCerdeira's suggestions.
This refactoring improves the structure and maintainability of the code. |
@josecm What are your thoughts on this? From @DavidMCerdeira's list of suggestions, |
My suggestion would be either |
Update: 01c2cbe |
The namespace change should be reflected in the function, variable names, enums, etc |
Done 00b5acd |
93e33be
to
0c544f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still like to test this before approving, but for now it looks good! 👍
You can squash the commits into a single one now
Thanks, @DavidMCerdeira ! Also, please take a look at this commit, where I temporarily resolved the issue by replacing the |
I think for now it is fine. Actually we are aware of at least two platforms that require this behavior to work with bao currently. We should eventually allow platforms to have specific code to handle this sort of platform specific quirks |
9c31e3e
to
7c277bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the very (!) good work, I think we still need a bit of work on this PR. This is my first pass on it. I think we will need one or two more.
src/core/remio.c
Outdated
*/ | ||
static struct remio_request_event* remio_create_event(void) | ||
{ | ||
struct remio_request_event* event = objpool_alloc(&remio_request_event_pool); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is at most one event per vcpu at any given time. Couldn't we have that event in the vcpu struct instead of allocating it "dynamically"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this approach is feasible. Even if we modify the remio_device
struct to include pointers to the frontend and backend VMs (replacing the existing remio_device_config
struct), a remote I/O device can run across multiple vCPUs. So, my question is: how would we retrieve the first event in this scenario? Given that the VM struct contains a pointer to its vCPUs, the only solution I’m aware of is iterating over all vCPUs to check for events. However, this approach would prioritize vCPUs with lower IDs over those with higher IDs. Is there a more efficient method that could avoid this bias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A given event/request is associated with a single vcpu that carries out the access. Neverthless, that wouldn't be possible because the frontend vcpu is not mapped by the hypervisor in the backend cpu.
However, I'd suggest restructuring the remio_request pool in a single array with VCPU_NUM size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 3e9940e
src/core/remio.c
Outdated
struct list remio_device_list; | ||
|
||
/** Array of Remote I/O requests */ | ||
struct remio_request remio_requests[REMIO_CPU_NUM][REMIO_VCPU_NUM]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be wasting memory since for a given CPU there is at most one vcpu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More, I don't really like the way were are allocating requests. I would suggest changing this structure. Since each request is associated with an event, we could merge the two concepts and only free the object from the pool once the request was completed.
Then instead of indexing requests with the cpu_id/vcpu_id combination we would have a dynamically allocated id for events/requests that would be passed to the backend for later identification on hypercalls. Also associated with it, we would have a "backend owner VM id" that the hypercall handler would use to check that that event/request is indeed being hanlded by that VM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be wasting memory since for a given CPU there is at most one vcpu.
Yes, you're right. However, I implemented it this way to allow for potential future enhancements in Bao, such as exploring more advanced configurations. These could include 1-to-many mappings between physical and virtual CPUs, or sharing the same vCPU across multiple VMs. In the former case, the vCPU ID would correspond to the actual vCPU, while in the latter, it would be associated with the VM ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More, I don't really like the way were are allocating requests. I would suggest changing this structure. Since each request is associated with an event, we could merge the two concepts and only free the object from the pool once the request was completed.
From my tests, I attempted to merge the two concepts, but based on my experience, this introduces higher costs in terms of both code complexity and performance. First, since the request event is associated with a specific remote I/O device, when the backend requests a new event, it doesn’t need to iterate through all vCPUs to retrieve it. Instead, it can simply pop the oldest element from the list.
Additionally, the issue of vCPU prioritization that I mentioned earlier still persists when merging the two concepts. Furthermore, as the list is already protected by a spinlock, there’s no need for additional synchronization primitives, which results in simpler code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then instead of indexing requests with the cpu_id/vcpu_id combination we would have a dynamically allocated id for events/requests that would be passed to the backend for later identification on hypercalls.
But how do you locate the request? Are you suggesting iterating over all vCPUs associated with a frontend VM? The same issue I mentioned earlier would persist in this approach.
I do agree that we could dynamically create a unique identifier to replace the CPU and vCPU IDs. Perhaps we could apply a basic mask and concatenate the two numbers into a single unsigned long variable, generating a unique identifier that could then be used to index the request.
Also associated with it, we would have a "backend owner VM id" that the hypercall handler would use to check that that event/request is indeed being hanlded by that VM.
I think I may have misunderstood what you referred to as the "backend owner VM ID." However, the current implementation already verifies that the backend VM performing the hypercall is the owner of the remote I/O backend device. Since the event request list is associated with each remote I/O device, we ensure that only the corresponding backend interacts with the event request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic of handling the request queue would be exactly the same. The only thing that would fundamentally change in my suggestion would be the way you allocate events/requests and the way the backend identifies the specific event/request on an hypercall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 3e9940e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we dont need the event array in the device. We can directly access it via the objpool. We just need some check to make sure that event belongs to the VM calling the ask hypercall and that that VM is the backend for the device associated with that event.
src/core/remio.c
Outdated
unsigned long value = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(3)); | ||
unsigned long cpu_id = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(4)); | ||
unsigned long vcpu_id = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(5)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking we should change the hypercall API to allow more arguments. What do you think @DavidMCerdeira?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you agree, I can add a commit to modify the hypercall
function to retrieve additional arguments (e.g., three in this case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would come up with a way of registering hypercall handlers with different numbers of arguments...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding an API for reading arguments is a good idea
This way each handler parses the arguments as it sees fit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea too! Maybe also an analogous function for setting the return values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joaopeixoto13 did you address this?
src/core/remio.c
Outdated
switch (op) { | ||
case REMIO_HYP_WRITE: | ||
case REMIO_HYP_READ: | ||
if (!remio_handle_rw(value, cpu_id, vcpu_id)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a major security breach here? Any VM could issue an hypercall with one of these ops and any cpu_id/vcpu_id indexing which and consequently be able to alter the state of that event/request and consequently the response to the backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not true. We first verify if the VM that performed the hypercall owns the remote I/O device using remio_find_vm_dev_by_id
, and only after confirming ownership do we proceed to call the associated methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this is not checked on the remio_handle_rw
call. The cpu_id
/vcpu_id
are used to access the request without any check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 3e9940e
After validating and extracting the associated Remote I/O device from the backend VM, we now pass the device pointer to the remio_handle_rw
. Since the requests are tied to the specific Remote I/O device, this ensures that only the correct device on the correct backend VM can modify its state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still an issue here. Calling remio_handle_rw
will end up in calling remio_get_request
that performs the access to the request array without checking if the id is valid!
src/core/remio.c
Outdated
unsigned long value = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(3)); | ||
unsigned long cpu_id = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(4)); | ||
unsigned long vcpu_id = vcpu_readreg(cpu()->vcpu, HYPCALL_IN_ARG_REG(5)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea too! Maybe also an analogous function for setting the return values?
f0a2ce4
to
93e0e01
Compare
93106f2
to
78b08a2
Compare
@joaopeixoto13, to keep a clean history, try as much as possible remove the "fix" commits and introduce the change correctly in the first commit that code is introduced. |
Yes, I understand. I split these smaller fixes to make it easier for you to review the changes. At the end of the PR, I’ll rebase all the fix commits into the main "parent" commit that introduced the feature. |
29988f8
to
26a8a46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only nitpicks
Otherwise looks good to me!
@joaopeixoto13 rebase on main and resolve conflicts please |
Signed-off-by: João Peixoto <[email protected]>
This commit renames the first argument of the hypercall to make the code independent of the specific hypercall type. Signed-off-by: João Peixoto <[email protected]>
This commit introduces a separation of input and output conventions for hypercalls, addressing architecture-specific differences between ARM and RISC-V. On ARM, the x0 register is used to pass both the extid, fid, and the return value, while additional parameters (hypercalls arguments) are passed via registers x1 to xn. On RISC-V, while the return value is passed in a0, the extid and fid are handled by the a7 and a6 registers, respectively. This means that the hypercall input arguments on RISC-V should be passed through registers a0 to a5, while output hypercall arguments are passed via a2 and onward (because all registers except a0 and a1 must be preserved across an SBI call by the callee). To accommodate these differences and provide a uniform interface, two macros were introduced to abstract the handling of input and output hypercall arguments across architectures. Signed-off-by: João Peixoto <[email protected]>
Refactored the vCPU program counter incrementation by introducing the architecture-agnostic method vcpu_writepc, aligning the implementation with the approach used in the ARM architecture. Signed-off-by: João Peixoto <[email protected]>
Implemented support for abort and exception handlers to suspend the current vCPU based on the vCPU active flag. Signed-off-by: João Peixoto <[email protected]>
26a8a46
to
6a55d0e
Compare
Done. If everything looks good, I can proceed to merge all the fix commits into the main parent commit that introduced the feature. |
Please go ahead with that. |
This commit introduces two new methods to the objpool API. The first method, objpool_alloc_with_id, enables the allocation of an object within a pool and returns the index where the object was allocated. The second method, objpool_get_by_id, allows retrieving a previously allocated object using the index of the pool where it was initially allocated. Signed-off-by: João Peixoto <[email protected]>
This commit introduces the foundational support for VirtIO by implementing a mechanism to forward VirtIO requests bidirectionally between the guest VM (or frontend VM) and its associated backend VM. Signed-off-by: João Peixoto <[email protected]>
6a55d0e
to
d56ef75
Compare
Done ✅ |
This PR introduces the foundational support for VirtIO by implementing a mechanism to forward VirtIO requests bidirectionally between the guest VM (or frontend VM) and its associated backend VM.
Validation