Caching for MMU translations #28
For performance testing
Simple MMU translation lookup cache using a ringbuffer.
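To make the idea concrete, here is a minimal sketch of a ring-buffer translation cache. It is not the code from this PR: the names (`tlb_t`, `tlb_lookup`, `tlb_insert`, `tlb_invalidate`), the entry count, and the restriction to 4 KiB pages with 32-bit physical addresses are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 8 /* hypothetical size, chosen for illustration */
#define PAGE_SHIFT 12 /* assumes 4 KiB pages only, no superpages */

typedef struct {
    uint32_t vpn; /* virtual page number, used as the tag */
    uint32_t ppn; /* physical page number (assumes 32-bit physical addresses) */
    bool valid;
} tlb_entry_t;

typedef struct {
    tlb_entry_t entries[TLB_ENTRIES];
    uint32_t next; /* ring-buffer write index: the oldest entry is replaced */
} tlb_t;

/* Drop every cached translation. */
static void tlb_invalidate(tlb_t *tlb)
{
    memset(tlb, 0, sizeof(*tlb));
}

/* Return true and fill *paddr on a hit; return false on a miss. */
static bool tlb_lookup(const tlb_t *tlb, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    for (uint32_t i = 0; i < TLB_ENTRIES; i++) {
        const tlb_entry_t *e = &tlb->entries[i];
        if (e->valid && e->vpn == vpn) {
            uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);
            *paddr = (e->ppn << PAGE_SHIFT) | offset;
            return true;
        }
    }
    return false;
}

/* Record a translation produced by the page table walker. */
static void tlb_insert(tlb_t *tlb, uint32_t vaddr, uint32_t paddr)
{
    tlb_entry_t *e = &tlb->entries[tlb->next];
    e->vpn = vaddr >> PAGE_SHIFT;
    e->ppn = paddr >> PAGE_SHIFT;
    e->valid = true;
    tlb->next = (tlb->next + 1) % TLB_ENTRIES;
}
```

In this sketch the translation path would call `tlb_lookup` first and fall back to the page table walk plus `tlb_insert` on a miss. Permission bits are deliberately not stored, so the caller has to invalidate whenever the result of a walk could change.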
Given the current, heavily simplified MMU implementation, the page table walker might never even reach such cases, so I could use a tool like stress-ng to generate memory-stressing workloads and determine whether handling them is necessary. For now, we should consider leaving some FIXME/TODO comments in the source files. Would you agree?
Check CONTRIBUTING.md and run clang-format to ensure consistent style.
Here is the proposed MMU caching scheme, as also discussed in #26.
This is likely not final yet and needs to be discussed. As said, the savings at -O3 are not great, YMMV. If you add artificial test delays to RAM accesses, you should be able to see that the cache does what it should.
Given that the MMU logic is one of the most involved parts of the code, if not the most involved, I still have a couple of questions and doubts about my own code:
semu) that nothing changes in terms of the emulator's runtime behaviour (again, I did a bit-for-bit check of the output RAM at the end of ~300 Mcycles). What gives? Is that to be expected? Is that simply what Linux assumes when running on a single 32-bit RISC-V core? If so, should this be conditionally #ifdef-ed out to optimize for single-core use of the emulator? I must admit that I do not yet have a good overview of how it all fits together with the exact Linux MMU accesses; I am a bit out of my depth here at the moment :D
Likewise, I added a couple of MMU cache invalidations wherever I saw a potential need for them. I assumed they are necessary whenever the execution mode changes between user and supervisor mode (so as not to expose kernel translations to user space, for example). Is my set of invalidations sufficient? Is it too much? Is it actually correct?
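To frame the question, here is a rough sketch of where a conservative set of invalidation points could sit. Everything in it is hypothetical (`hart_t`, `mmu_cache_flush`, `hart_set_mode`, `hart_write_satp`, `hart_sfence_vma`) and it is not taken from the semu sources; a generation counter stands in for the real cache so that the flushes can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down hart state for illustration only. */
typedef struct {
    bool s_mode;        /* true: supervisor mode, false: user mode */
    uint32_t satp;      /* address translation and protection CSR */
    uint32_t cache_gen; /* stand-in for the cache: bumped on every flush */
} hart_t;

/* In a real emulator this would clear the translation cache. */
static void mmu_cache_flush(hart_t *h)
{
    h->cache_gen++;
}

/* Privilege change: flush only if the mode actually changes. This is
 * needed when cached entries do not record permission bits such as U. */
static void hart_set_mode(hart_t *h, bool s_mode)
{
    if (h->s_mode != s_mode) {
        h->s_mode = s_mode;
        mmu_cache_flush(h);
    }
}

/* A satp write can switch the root page table, so flush conservatively. */
static void hart_write_satp(hart_t *h, uint32_t value)
{
    h->satp = value;
    mmu_cache_flush(h);
}

/* SFENCE.VMA is the guest's explicit request to drop stale translations. */
static void hart_sfence_vma(hart_t *h)
{
    mmu_cache_flush(h);
}
```

An alternative to flushing on every mode switch would be to tag each cached entry with the mode or permission bits it was created under, and to treat a mismatch as a miss. That trades a slightly larger entry for fewer flushes.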
(As an aside: having the emulator exported to Python or another scripting language would make all the high-level stuff so much simpler, such as cmdline parsing and test jigs like a regression test doing bit-for-bit RAM checks against a 'golden RAM dump' ...)
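Even without a scripting layer, the core of such a bit-for-bit check is small. The helper below is a sketch with a made-up name (`ram_first_mismatch`); it assumes both the emulator RAM and the golden dump are already in memory as byte buffers of equal length.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare emulator RAM against a golden dump, byte by byte.
 * Returns the offset of the first differing byte, or len if the
 * two buffers are identical. */
static size_t ram_first_mismatch(const uint8_t *ram,
                                 const uint8_t *golden,
                                 size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (ram[i] != golden[i])
            return i;
    }
    return len;
}
```

Returning the offset rather than a plain pass/fail makes it easier to see which region diverged when a regression run fails.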