Feature: LPMP support to break the enclave number limitation #445
In the current Keystone implementation, the number of enclaves is limited to fewer than 16 because hardware PMP resources are limited. This PR lifts that limit with an efficient PMP virtualization mechanism. As a proof of concept, our implementation enables Keystone to host 32 enclaves simultaneously. In theory it could support many more enclaves (1,000+) with some extra engineering effort (enlarging or replacing the bitmap data structure that Keystone uses for enclave memory management), which is left as future work.
Theory
We introduce the concept of LPMP (Logical PMP) for PMP virtualization. LPMP is implemented in the Security Monitor. Essentially, it keeps track of a list of memory regions for each enclave (the host OS is treated as a special enclave). On context switches to an enclave, it loads as many regions owned by the enclave as possible into real PMP CSRs. Then when the enclave accesses the regions that are not loaded, the system traps into the Security Monitor. If the access is legal, LPMP rearranges the region list according to the most-recently-used policy.
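The region list and its most-recently-used reordering can be sketched as below. This is a minimal illustration under stated assumptions, not the code in lpmp.c: the names (`lpmp_list`, `lpmp_handle_fault`, and so on), the slot count `HW_PMP_SLOTS`, and the cap `MAX_REGIONS` are all hypothetical, and writing the real PMP CSRs is left out. The first `HW_PMP_SLOTS` entries of the list stand for the regions currently loaded into hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HW_PMP_SLOTS 4   /* assumed number of real PMP entries usable per owner */
#define MAX_REGIONS  16  /* assumed cap on logical regions per owner */

struct lpmp_region {
    uintptr_t base;
    size_t    size;
};

/* regions[0] is the most recently used region. The first HW_PMP_SLOTS
 * entries are treated as "loaded" into real PMP CSRs. */
struct lpmp_list {
    struct lpmp_region regions[MAX_REGIONS];
    size_t count;
};

/* Append a region at the tail of the list; fails if the list is full. */
static bool lpmp_add(struct lpmp_list *l, uintptr_t base, size_t size)
{
    assert(l != NULL);
    if (l->count == MAX_REGIONS || size == 0)
        return false;
    l->regions[l->count].base = base;
    l->regions[l->count].size = size;
    l->count++;
    return true;
}

static bool region_contains(const struct lpmp_region *r, uintptr_t addr)
{
    return addr >= r->base && addr - r->base < r->size;
}

/* True if addr falls in one of the regions that would be in hardware. */
static bool lpmp_is_loaded(const struct lpmp_list *l, uintptr_t addr)
{
    size_t loaded = l->count < HW_PMP_SLOTS ? l->count : HW_PMP_SLOTS;
    for (size_t i = 0; i < loaded; i++)
        if (region_contains(&l->regions[i], addr))
            return true;
    return false;
}

/* Trap path: if the owner has a region covering addr, the access is legal
 * and that region moves to the front of the list (most recently used).
 * Returns false when no region covers addr, i.e. the access is illegal. */
static bool lpmp_handle_fault(struct lpmp_list *l, uintptr_t addr)
{
    for (size_t i = 0; i < l->count; i++) {
        if (region_contains(&l->regions[i], addr)) {
            struct lpmp_region hit = l->regions[i];
            for (size_t j = i; j > 0; j--)   /* shift entries [0, i) down by one */
                l->regions[j] = l->regions[j - 1];
            l->regions[0] = hit;
            return true;
        }
    }
    return false;
}
```

With six regions and four slots, a fault in the sixth region moves it to the front and pushes the region that was fourth out of the loaded set.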
We also introduce two additional tricks for performance improvement.
First, on many RISC-V platforms, the PMP checking results are cached in the TLB. (We have verified this on SiFive Unmatched; Rocket-Chip also supports this feature.) On these platforms, we can use TLB capacity as effective PMP entries if we carefully maintain the TLB state. On LPMP traps, we flush only the TLB entry that triggered the trap and leave the other entries untouched. To ensure this scheme cannot be used to bypass PMP, two principles must be enforced: (1) on switching from or to an enclave, all TLB entries must be flushed (this ensures that all entries in the TLB belong to the same entity at any moment); and (2) whenever memory ownership changes, for example on enclave memory allocation and deallocation, an IPI must be sent to all cores to flush all TLBs. Unfortunately, this TLB-caching trick cannot be combined with features like ASIDs at the moment.
The second trick is simpler and applies to all platforms: Instruction-Data split. We reserve one or two real PMP CSRs for instruction fetches and use the remaining PMP CSRs for data accesses. This resembles many modern cache designs. For now, the second trick is not implemented in this PR for Keystone and will be done in the future.
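The two tricks can be summarized as two small decision functions. The sketch below is illustrative only: the enum and function names, `HW_PMP_SLOTS`, and `INSN_SLOTS` are hypothetical and do not come from this PR, and the flush itself is not performed here (on RISC-V a single-entry flush would typically be an `sfence.vma` with an address operand).

```c
#include <assert.h>
#include <stdbool.h>

#define HW_PMP_SLOTS 8  /* assumed number of real PMP entries */
#define INSN_SLOTS   2  /* assumed; the text says "one or two" */

_Static_assert(INSN_SLOTS < HW_PMP_SLOTS, "data accesses need at least one slot");

enum lpmp_event {
    LPMP_FAULT,            /* access to a region that is not loaded */
    LPMP_CONTEXT_SWITCH,   /* switching from or to an enclave */
    LPMP_OWNERSHIP_CHANGE  /* enclave memory allocated or deallocated */
};

enum tlb_flush {
    TLB_FLUSH_ONE_ENTRY,   /* only the entry that triggered the trap */
    TLB_FLUSH_LOCAL_ALL,   /* every entry on the current hart */
    TLB_FLUSH_ALL_HARTS    /* IPI to every hart, each flushes its whole TLB */
};

/* Trick 1: how much of the TLB must be flushed for each event. */
static enum tlb_flush lpmp_flush_scope(enum lpmp_event ev)
{
    switch (ev) {
    case LPMP_FAULT:            return TLB_FLUSH_ONE_ENTRY;
    case LPMP_CONTEXT_SWITCH:   return TLB_FLUSH_LOCAL_ALL;
    case LPMP_OWNERSHIP_CHANGE: return TLB_FLUSH_ALL_HARTS;
    }
    assert(0 && "unknown LPMP event");  /* debug builds: flag the bug */
    return TLB_FLUSH_ALL_HARTS;         /* otherwise: most conservative choice */
}

struct slot_range {
    unsigned first;
    unsigned count;
};

/* Trick 2: instruction fetches and data accesses replace entries only
 * within their own partition of the real PMP slots. */
static struct slot_range lpmp_slots_for(bool is_insn_fetch)
{
    if (is_insn_fetch)
        return (struct slot_range){ 0, INSN_SLOTS };
    return (struct slot_range){ INSN_SLOTS, HW_PMP_SLOTS - INSN_SLOTS };
}
```

Keeping the instruction partition separate means a burst of data faults cannot evict the region the enclave is executing from.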
In our own implementation (an independent one, not built on top of Keystone), LPMP with both performance tricks can support 2,000+ enclaves simultaneously, and the number of memory segments per enclave is not limited. The performance overhead under heavy physical memory fragmentation is less than 5%.
Brief summary of implementation
We implement a prototype of LPMP for Keystone. We keep this prototype as minimal as possible so that it is easy to understand. More engineering effort is needed to fully unleash the potential of LPMP.
In our prototype, we mainly change the part relevant to the host OS in the Security Monitor. In Keystone's original design, the memory of the host OS is fragmented by enclaves, which consumes the PMP resources. We add a region list to enable LPMP support for the host OS. This proof-of-concept can now support 32 enclaves.
Detailed modifications to Keystone
Added files:
keystone/sm/src/lpmp.h
keystone/sm/src/lpmp.c
keystone/overlays/keystone/patches/opensbi/opensbi-lpmp.patch: pmp_fault_handler() for LPMP.
Modified files (list only essential modifications):
keystone/sm/src/enclave.c: maintain LPMP region list for the host OS.
keystone/sm/src/ipi.c: IPI to flush TLB for all harts.
What's more to do?
This prototype is still pretty limited since:
The first limitation is due to the 64-bit bitmap data structure used in Keystone for memory management. Each enclave consumes 2 bits in the bitmap, so only 64 / 2 = 32 enclaves are allowed. Using another data structure or extending the bitmap should raise this limit significantly.
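One way to extend the bitmap is to spread the 2-bit fields over several 64-bit words. The sketch below only illustrates the arithmetic; `wide_bitmap`, `wb_get`, `wb_set`, and `BITMAP_WORDS` are hypothetical names, and the layout is an assumption that does not mirror Keystone's actual data structure.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_ENCLAVE 2u
#define WORD_BITS        64u
#define BITMAP_WORDS     4u   /* hypothetical: four words instead of one */
#define MAX_ENCLAVES     (BITMAP_WORDS * WORD_BITS / BITS_PER_ENCLAVE)

struct wide_bitmap {
    uint64_t words[BITMAP_WORDS];
};

/* Read the 2-bit field of enclave eid. Because 64 is a multiple of 2,
 * a field never straddles two words. */
static unsigned wb_get(const struct wide_bitmap *b, unsigned eid)
{
    assert(eid < MAX_ENCLAVES);
    unsigned bit = eid * BITS_PER_ENCLAVE;
    return (unsigned)((b->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 3u);
}

/* Overwrite the 2-bit field of enclave eid with val (0..3). */
static void wb_set(struct wide_bitmap *b, unsigned eid, unsigned val)
{
    assert(eid < MAX_ENCLAVES && val < 4u);
    unsigned bit = eid * BITS_PER_ENCLAVE;
    uint64_t *w = &b->words[bit / WORD_BITS];
    unsigned shift = bit % WORD_BITS;
    *w = (*w & ~(UINT64_C(3) << shift)) | ((uint64_t)val << shift);
}
```

With a single word this gives the current limit of 32 enclaves; with four words the same layout holds 128, and the limit grows linearly with `BITMAP_WORDS`.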
Attachments
test_lpmp.sh