Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add execute flushing #137

Merged
merged 13 commits into from
Jan 11, 2024

Conversation

danbone
Copy link
Contributor

@danbone danbone commented Jan 4, 2024

  • Fix pipeline flush mechanisms by removing inflight instructions inside functional units, recovering the rename map, and adding trace rewind functionality
  • Add FlushCriteria class for requesting flushes to and from the FlushManager
  • Update FlushManager to handle and arbitrate between multiple flush sources
  • Add flush request port from ExecutePipe to FlushManager intended to be used for mispredicted branches
  • Delay completing instructions in ExecutePipe by a cycle to fix execute flush race conditions
  • Add branch identification methods for Inst
  • Add test to send random flushes from the branch unit

@danbone danbone marked this pull request as draft January 4, 2024 17:09
@danbone
Copy link
Contributor Author

danbone commented Jan 4, 2024

@knute-sifive few questions for you

For the flush testing mode we spoke about in the call yesterday. Were you thinking to use a parameter in the ExecutePipe?

I've run into a race condition where a branch mispredicts, completes and retires in the same cycle, before the flush is seen. Meaning any younger instruction that is retired alongside should have actually been flushed

See this log, where PID: 1975 is correctly refetched following the flush but also retired just before it.

{0000004137 00004137 top.cpu.core0.execute.br0 info} completeInst_: mispredicted branch uid: 2551  COMPLETED 113fe pid: 1974 'bne	x5,x7 +0xb2'  was actually not-taken
{0000004137 00004137 top.cpu.core0.rob info} retireInstructions_: retiring uid: 2551    RETIRED 113fe pid: 1974 'bne	x5,x7 +0xb2' 
{0000004137 00004137 top.cpu.core0.rob info} retireInstructions_: retiring uid: 2552    RETIRED 11402 pid: 1975 'bne	x12,x13 +0x3c' 
{0000004138 00004138 top.cpu.core0.rename info} getAckFromROB_: Retired instruction: uid: 2551    RETIRED 113fe pid: 1974 'bne	x5,x7 +0xb2' 
{0000004138 00004138 top.cpu.core0.rename info} getAckFromROB_: Retired instruction: uid: 2552    RETIRED 11402 pid: 1975 'bne	x12,x13 +0x3c' 
{0000004138 00004138 top.cpu.core0.rob info} handleFlush_: flushing uid: 2574 DISPATCHED 1144a pid: 1997 'slli	x14,x12, SHAMT=0x20' 

I think we need to fix it so that we don't retire instructions that completed that cycle, and there's a few ways of doing this:
a. Add a sharedData object inside Inst that delays the completed status.
b. Add a completed event inside Inst and have retirement check if the event is scheduled before retiring (event should only be scheduled on the cycle it's completing)
c. Delay setting status to complete inside ExecutePipe and LSU

Any input?

@danbone
Copy link
Contributor Author

danbone commented Jan 5, 2024

As the race condition was between the flush coming from the execute and the instruction changing to completed status. I decided to separate changing the status to completed and the execution of the instruction.

@klingaard
Copy link
Collaborator

Thanks for doing this!

For the flush testing mode we spoke about in the call yesterday. Were you thinking to use a parameter in the ExecutePipe?

This is related to the "random" flushing idea (for testing), right? Here are my thoughts:

  1. Since the flushes are instigated mostly (if not always) from the ROB, add the parameter there. It's simple integer-based parameter, enable_random_flushes that takes a random key. If non-zero, it's enabled with that key.
  2. Add a few test with some random keys (maybe using the $RAMDOM variable from bash)

Glad you got the race condition rectified.

core/FlushManager.hpp Outdated Show resolved Hide resolved
core/FlushManager.hpp Outdated Show resolved Hide resolved
core/Inst.hpp Outdated Show resolved Hide resolved
core/Inst.hpp Outdated Show resolved Hide resolved
@klingaard klingaard added the enhancement New feature or request label Jan 5, 2024
@danbone
Copy link
Contributor Author

danbone commented Jan 8, 2024

I hope I've address your comments above.

I played around with the ROB flushing but I found that doing it in the branch unit found more bugs. This should be a temporary thing to plug the gap until branch prediction is implemented.

For the ROB flushing, I could still put it in. What I did based on your comments was to use a hidden parameter that acted like a UID to trigger a flush from. Obviously this would only cause 1 flush per run, so it didn't have a lot of coverage compared to the branch unit flushing.

@danbone danbone marked this pull request as ready for review January 8, 2024 18:31
@danbone
Copy link
Contributor Author

danbone commented Jan 10, 2024

Just noticed I forgot to change the FlushManager::FlushCriteria.flush method as requested

core/Rename.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@knute-mips knute-mips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- Aaron has a comment worth addressing.

// Testing mode to inject random branch misprediction to stress flushing mechanism
if (enable_random_misprediction_)
{
if (ex_inst->isBranch() && (std::rand() % 20) == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, but can a failure be reproduced? Probably should add a parameter to set a seed...

Also, can the 20 be parameterized as well? To increase the chances (for interesting experiments).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could change the parameter enable_random_misprediction to random_misprediction_rate and default it to zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to reply back to this last night.

With std::rand, if the seed isn't set then it's always seeded with 1 according to google so should be reproducible (for now)

I think the random_misprediction_rate is a good idea. Something like tuning mispredictions per 1k branches.
I don't have a good knowledge of the C++ random libraries, but I saw I could use the discrete_distribution class. Happy to do this in another PR, as I think it'll need some back and forth discussion.

Comment on lines +73 to +79
static const std::map<FlushCause, bool> inclusive_flush_map = {
{FlushCause::TRAP, true},
{FlushCause::MISFETCH, true},
{FlushCause::MISPREDICTION, false},
{FlushCause::TARGET_MISPREDICTION, false},
{FlushCause::POST_SYNC, false}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now... but I wonder if this should be the decision of the instruction causing the flush...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did think about doing it that way, but then I thought against it as I figured it would be more robust tying the cause and inclusiveness of the flush together. I don't feel too strongly about it, but perhaps we could keep it this way for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I actually like this approach, so feel free to keep it this way for now.

Comment on lines +49 to +50
is_call_(isCallInstruction(opcode_info)),
is_return_(isReturnInstruction(opcode_info)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NICE!

@klingaard
Copy link
Collaborator

Will merge this tomorrow morning CST.

else {
++it;

ILOG("Flush Instruction ID: " << inst_ptr->getUniqueID() << " from issue queue");
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue queue age ordered? If so, can't we just break when we find the first instruction that is not included in the flush?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be, I avoided inferring the ordering though. I figured it's worth the extra cycles to make flushing more flexible. i.e. if we wanted to implement multi-threading.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I didn't think of that. Makes sense.

////////////////////////////////////////////////////////////////////////////////

// Append instruction into issue queue
void ExecutePipe::appendIssueQueue_(const InstPtr & inst_ptr)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Sparta Buffer class might be good to use for the issue queue. It allows for append and erase from anywhere in the buffer. It has its own implementation of the two methods you have here (appendIssueQueue_ and popIssueQueue_) so you wouldn't need to provide your own implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be something @aarongchan can pick up in the issue queue changes


auto flush_inst = criteria.getInstPtr();

// Rewind the tracefile

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the if-statement here by passing !criteria.isInclusiveFlush() as the second parameter to reset instead.

@kathlenehurt-sifive
Copy link

Will merge this tomorrow morning CST.

None of my comments need to be addressed before merging. They can be resolved in another PR.

@klingaard
Copy link
Collaborator

Daniel, would you like more time to address Kathlene's comments or would you prefer to do that in another PR?

@danbone
Copy link
Contributor Author

danbone commented Jan 11, 2024

Another PR please

@klingaard klingaard merged commit ce29ae2 into riscv-software-src:master Jan 11, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants