
Lsu allow spec load exec #92

Merged
merged 64 commits into from
Dec 19, 2023
Conversation

h0lyalg0rithm
Collaborator

@h0lyalg0rithm h0lyalg0rithm commented Oct 4, 2023

This PR introduces the following:

  • Allows loads to perform virtual-to-physical address translation before older loads complete.
    This can be configured using a parameter in the LSU.
  • Allows non-blocking cache lookup requests. Depends on Non Blocking Cache Implementation #91 for the cache to support this feature.
  • Implements a ready queue, a simulator-only structure to speed up instruction lookup in the LSU.
  • Mitigates data hazards introduced by instructions executing out of order.
  • The lengths of the different stages of the LSU pipeline can be configured through parameters.
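As a sketch, the configurable pieces above might appear in an architecture YAML like the following. Only `allow_speculative_load_exec` and `replay_issue_delay` are parameter names confirmed by this PR; the node path is illustrative and may differ in the repo's config files:

```yaml
# Sketch of an LSU configuration fragment (hypothetical node path).
# Only allow_speculative_load_exec and replay_issue_delay are names
# confirmed in this PR.
top.cpu.core0.lsu.params:
  allow_speculative_load_exec: true  # translate loads before older loads complete
  replay_issue_delay: 3              # cycles before a replayed op may re-issue
```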

@h0lyalg0rithm h0lyalg0rithm force-pushed the lsu_allow_spec_load_exec branch 4 times, most recently from 7f2402b to 7e40e55 Compare October 12, 2023 01:32
@h0lyalg0rithm h0lyalg0rithm marked this pull request as ready for review October 12, 2023 08:02
Collaborator

@arupc arupc left a comment


Can we add some documentation on the purpose of the various in and out ports of the LSU, at least between the LSU & MMU and the LSU & DCache?

LSU / MMU interface has 3 input ports and 1 output:

  • output:
    • out_mmu_lookup_req
  • inputs:
    • in_mmu_lookup_req
    • in_mmu_lookup_ack
    • in_mmu_free_req

Similarly, the LSU / DCache interface has 3 input ports and 1 output port:

  • output:
    • out_cache_lookup_req
  • inputs:
    • in_cache_lookup_req
    • in_cache_lookup_ack
    • in_cache_free_req
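To document the handshake these ports implement, here is a toy C++ model of the req/ack/free pattern between the LSU and the MMU (the DCache side is symmetric). This is an illustrative sketch, not the repo's sparta-based port code; `ToyMMU`, `lookupReq`, and `serviceMiss` are invented names:

```cpp
#include <cassert>
#include <deque>

// Toy model of the LSU <-> MMU handshake. Illustrative only: the real
// units communicate through sparta ports with the names listed above.
enum class LookupResult { Hit, Miss };

struct ToyMMU
{
    std::deque<int> pending_;  // load ids whose translation missed
    bool busy_ = false;        // models the MMU being occupied by a miss

    // LSU -> MMU: models out_mmu_lookup_req. An immediate Hit stands in
    // for in_mmu_lookup_ack; a Miss parks the request in the MMU.
    LookupResult lookupReq(int load_id, bool tlb_hit)
    {
        if (tlb_hit) { return LookupResult::Hit; }
        pending_.push_back(load_id);
        busy_ = true;
        return LookupResult::Miss;
    }

    // MMU -> LSU: models in_mmu_lookup_req waking the parked load once the
    // miss is serviced, and in_mmu_free_req signalling the MMU is free.
    int serviceMiss()
    {
        const int load_id = pending_.front();
        pending_.pop_front();
        busy_ = !pending_.empty();
        return load_id;  // the LSU would bump this load's issue priority
    }
};
```
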

Resolved review threads: core/LSU.cpp, core/LSU.hpp
Collaborator

@arupc arupc left a comment


Can you please explain what these priorities / states mean and when they would occur? Not sure I understand what is meant by 'mss ack' and 'another outstanding miss finish'

                HIGHEST = 0,
                __FIRST = HIGHEST,
                CACHE_RELOAD,   // Receive mss ack, waiting for cache re-access
                CACHE_PENDING,  // Wait for another outstanding miss finish
                MMU_RELOAD,     // Receive for mss ack, waiting for mmu re-access
                MMU_PENDING,    // Wait for another outstanding miss finish
                NEW_DISP,       // Wait for new issue
                LOWEST,

Does NEW_DISP simply mean dispatched and waiting to be issued?

Collaborator

@arupc arupc left a comment


One drawback of having an enumeration of IssuePriority is that it somewhat ties us to a single issue-priority policy. If we want to allow multiple issue-priority policies as a parameter, that may be difficult to do with the IssuePriority enumerator.

For that, we should keep the CACHE/MMU RELOAD/PENDING info as states of the LoadStoreInstInfo class and allow different policies to decide which one to issue.

This can be done in a different PR after this is merged in.

ghost
ghost previously requested changes Oct 13, 2023

@ghost ghost left a comment


This is a great start. A few items here and there to tackle, obviously

Resolved review threads: core/LSU.hpp, core/MemoryAccessInfo.hpp, test/sim/lsu_arch_configs/small_core.yaml, core/LSU.cpp
@h0lyalg0rithm
Collaborator Author

h0lyalg0rithm commented Oct 17, 2023

Can you please explain what these priorities / states mean and when they would occur? Not sure I understand what is meant by 'mss ack' and 'another outstanding miss finish'

                HIGHEST = 0,
                __FIRST = HIGHEST,
                CACHE_RELOAD,   // Receive mss ack, waiting for cache re-access
                CACHE_PENDING,  // Wait for another outstanding miss finish
                MMU_RELOAD,     // Receive for mss ack, waiting for mmu re-access
                MMU_PENDING,    // Wait for another outstanding miss finish
                NEW_DISP,       // Wait for new issue
                LOWEST,

Does NEW_DISP simply mean dispatched and waiting to be issued?

The priority values are used to select the instruction that needs to be issued.
NEW_DISP is the priority of an instruction whose operands are ready (depending on spec_load_exec) and which is ready to be issued.
CACHE_RELOAD, on the other hand, is the priority placed on an instruction that previously had a cache miss; when the cache is ready again, the instruction is woken up by setting its priority to CACHE_RELOAD. Since that priority is higher, it is selected for issue.
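The selection described here can be sketched as a scan of the ready queue for the instruction with the numerically lowest (i.e. highest) priority. This is an illustrative sketch, not the repo's code: the enum values mirror the quoted snippet (with `__FIRST` omitted), while `LoadStoreInstInfo` and `pickToIssue` are simplified stand-ins:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mirrors the quoted enum: lower value = higher issue priority.
enum class IssuePriority : int {
    HIGHEST = 0,
    CACHE_RELOAD,   // mss ack received, waiting for cache re-access
    CACHE_PENDING,  // waiting for another outstanding miss to finish
    MMU_RELOAD,     // mss ack received, waiting for mmu re-access
    MMU_PENDING,    // waiting for another outstanding miss to finish
    NEW_DISP,       // newly dispatched, waiting for issue
    LOWEST
};

struct LoadStoreInstInfo { int id; IssuePriority prio; };

// Returns the id of the instruction to issue next: the entry with the
// smallest IssuePriority value wins (queue must be non-empty).
int pickToIssue(const std::vector<LoadStoreInstInfo> & ready_queue)
{
    const auto it = std::min_element(
        ready_queue.begin(), ready_queue.end(),
        [](const auto & a, const auto & b) { return a.prio < b.prio; });
    return it->id;
}
```

So a load woken by a cache refill (CACHE_RELOAD) is picked ahead of a freshly dispatched one (NEW_DISP), matching the explanation above.
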


@arupc arupc closed this Oct 18, 2023
@arupc arupc reopened this Oct 18, 2023
@arupc
Collaborator

arupc commented Oct 18, 2023

Can you please explain what these priorities / states mean and when they would occur? Not sure I understand what is meant by 'mss ack' and 'another outstanding miss finish'

                HIGHEST = 0,
                __FIRST = HIGHEST,
                CACHE_RELOAD,   // Receive mss ack, waiting for cache re-access
                CACHE_PENDING,  // Wait for another outstanding miss finish
                MMU_RELOAD,     // Receive for mss ack, waiting for mmu re-access
                MMU_PENDING,    // Wait for another outstanding miss finish
                NEW_DISP,       // Wait for new issue
                LOWEST,

Does NEW_DISP simply mean dispatched and waiting to be issued?

The priority values are used to select the instruction that needs to be issued. NEW_DISP is the priority of an instruction whose operands are ready (depending on spec_load_exec) and which is ready to be issued. CACHE_RELOAD, on the other hand, is the priority placed on an instruction that previously had a cache miss; when the cache is ready again, the instruction is woken up by setting its priority to CACHE_RELOAD. Since that priority is higher, it is selected for issue.

How would we implement an issue policy that prioritizes requests with the MMU pending over requests with the cache pending?

How about, instead of the IssuePriority enum, we keep something equivalent to bitmasks, specifying for each instance of LoadStoreInstInfo whether MMU_LOOKUP is needed and has been satisfied, and whether CACHE_LOOKUP is needed and has been satisfied.

This would enable implementing various issue policies without changing the enum. But this can be done as part of another PR.
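The bitmask idea suggested here could look something like the following sketch. This is hypothetical, not repo code: the flag names, `mmuPending`, `cachePending`, and `mmuFirstPolicy` are all invented for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of per-instruction need/done flags replacing the IssuePriority
// enum, so issue policies can order candidates however they like.
struct LoadStoreInstInfo
{
    static constexpr uint8_t MMU_NEEDED   = 1 << 0;
    static constexpr uint8_t MMU_DONE     = 1 << 1;
    static constexpr uint8_t CACHE_NEEDED = 1 << 2;
    static constexpr uint8_t CACHE_DONE   = 1 << 3;

    uint8_t state = 0;

    bool mmuPending() const   { return (state & MMU_NEEDED) && !(state & MMU_DONE); }
    bool cachePending() const { return (state & CACHE_NEEDED) && !(state & CACHE_DONE); }
};

// One possible policy: prefer instructions stalled on the MMU over ones
// stalled on the cache. A different policy could invert this ordering
// without touching LoadStoreInstInfo at all.
bool mmuFirstPolicy(const LoadStoreInstInfo & a, const LoadStoreInstInfo & b)
{
    return a.mmuPending() && !b.mmuPending();  // true: a issues before b
}
```

Because the state is just flags, swapping in a cache-first policy is a matter of providing a different comparator, which is the flexibility the comment asks for.
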

Resolved review threads: core/LSU.cpp
@ghost

ghost commented Dec 11, 2023

Did you test this with Kunal's L2Cache stuff? Does it work?

@h0lyalg0rithm
Collaborator Author

Did you test this with Kunal L2Cache stuff? Does it work?

I haven't been able to test it out; I've tried to keep the PR in sync with master.

@h0lyalg0rithm
Collaborator Author

@klingaard I rebuilt the project (cleaned the cache and the CMake build folder) and ran both CoreMark and Dhrystone; both ran successfully.
I have the logs from Valgrind; there were tons of errors, but I'm not sure if they are false positives.

ERROR SUMMARY: 10000034 errors from 41 contexts

@klingaard
Collaborator

I'll take a look this evening.

Resolved review thread: core/LSU.cpp
@ghost

ghost commented Dec 15, 2023

I forgot one more thing. 😄 Can you run clang-format on the LSU since you're the last person to make major changes to it?

@klingaard
Collaborator

@h0lyalg0rithm, I hijacked your PR to update the names of the CI regressions (and merged with master). FYI...

@h0lyalg0rithm
Collaborator Author

@klingaard I ran the formatter on most of the files I touched. I left out some, like Inst.hpp and sim/Olympia.cpp.
I found a duplicate config in the clang-format file. Could you please check that I removed the correct one?

@klingaard
Collaborator

Thanks for running the formatter. One last, last thing, promise! Can you update the Description field with what this PR did -- the changes it implemented and the new expected behaviors? That will be the final commit message.

@h0lyalg0rithm
Collaborator Author

I updated the PR description

@klingaard klingaard merged commit f4088b8 into master Dec 19, 2023
4 checks passed
@klingaard klingaard deleted the lsu_allow_spec_load_exec branch December 19, 2023 20:44
LiGaOg pushed a commit to LiGaOg/riscv-perf-model that referenced this pull request Jan 8, 2024
This PR introduces the following
- Allows loads to perform Virtual address to Physical address
translation before older loads complete
  This can be configured using a parameter in the LSU (`allow_speculative_load_exec`)
- Allow non blocking cache lookup requests. Depends on riscv-software-src#91 for the cache
to support this feature
- Implements Ready queue - simulator-only structure to speed up
instruction lookup in the LSU
- Mitigates data hazards introduced by instructions running out of order
- The length of the different stages of the LSU pipeline can be
configured through the parameters.

---------

Co-authored-by: Knute Lingaard <[email protected]>
@danbone danbone mentioned this pull request Jan 8, 2024
8 tasks
PARAMETER(uint32_t, replay_issue_delay, 3, "Replay Issue delay")
// LSU microarchitecture parameters
PARAMETER(
bool, allow_speculative_load_exec, true,
Collaborator


@h0lyalg0rithm is this parameter's boolean meaning inverted? I'm looking at the code, and if this is set to true, then the LSU will NOT allow a load to bypass an older store.

Collaborator Author


@klingaard This parameter is set to true by default, so it should allow a load to complete before an older store.
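For illustration, the intended (non-inverted) meaning of the parameter can be sketched as follows. This is a hypothetical helper, not the repo's LSU code:

```cpp
#include <cassert>

// Hypothetical sketch of what allow_speculative_load_exec should gate:
// with the parameter on, a young load may start its VA->PA translation
// even while older loads are still in flight; with it off, it must wait.
bool canStartTranslation(bool allow_speculative_load_exec,
                         bool older_loads_outstanding)
{
    return allow_speculative_load_exec || !older_loads_outstanding;
}
```
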
