Code Anatomy on CSR module #3028
DecodeTheEncoded
started this conversation in
Show and tell
The CSR module mainly handles accesses to the control and status registers. It also contains the logic that makes the functionality tied to specific CSRs actually work. The CSR module is tightly coupled with RocketCore. RocketCore feeds the interrupt requests collected from elsewhere into the CSR module (`csr.io.interrupts := io.interrupts`). The CSR module arbitrates among them (choosing the highest-priority interrupt among those firing simultaneously) and asserts the result (`csr.io.interrupt`), which kills the instruction at the `ID` stage (`ctrl_killd := !ibuf.io.inst(0).valid || ibuf.io.inst(0).bits.replay || take_pc_mem_wb || ctrl_stalld || csr.io.interrupt`). All instructions already in the pipeline ahead of the one at `ID` are guaranteed to complete, since the stages that modify microarchitectural state are `WB` and later (`scoreboard`). RocketCore therefore supports precise interrupts.

The arbitrated interrupt and all synchronous exceptions raised in the pipeline flow downwards, and `csr.io.exception := wb_xcpt` is sent to the CSR module together with auxiliary signals such as `csr.io.cause := wb_cause`, `csr.io.tval := Mux(tval_valid, encodeVirtualAddress(wb_reg_wdata, wb_reg_wdata), 0.U)`, and `csr.io.pc := wb_reg_pc` for further handling at the `WB` stage. CSR access requests are also presented to the CSR module at `WB` (`csr.io.rw.addr := wb_reg_inst(31,20)`, `csr.io.rw.cmd := CSR.maskCmd(wb_reg_valid, wb_ctrl.csr)`, `csr.io.rw.wdata := wb_reg_wdata`). A CSR access request gets its response in the same cycle, at `WB`. The response is the old value stored in the CSR, and it is written to the `rd` register (`wb_waddr`):

Also note that the instruction at the `ID` stage is sent to the CSR module for CSR-related decoding (`csr.io.decode(0).inst := id_inst(0)`). The CSR module decides whether the instruction is illegal or needs to be stalled under the current CSR settings:

If the corresponding logic in the CSR module decides that the instruction at `ID` is illegal (for example, a write to a read-only CSR, or a floating-point instruction when the system has no FP unit), `id_illegal_insn` is asserted, indicating that an exception happens at the `ID` stage (`id_xcpt`):

As mentioned above, an exception raised at any pipeline stage flows downwards to `WB`, and so does `id_xcpt`. It flows to `WB` and asserts the `exception` request to the CSR module there. If an exception happens at a specific stage, RocketCore has logic to prevent that instruction (and any instruction after it in program order) from modifying the microarchitectural state of the core. I actually feel lost in the jungle of `replay`, `kill`, `take_pc`, etc.

In the CSR module,
`read_mapping` is a collection of mappings from CSR addresses to the registers that hold the values of all supported CSRs. Part of `read_mapping` is as follows:

`decoded_addr` is a collection of mappings from each CSR address to a Bool indicating whether that address is being accessed by the core. The previous implementation of `decoded_addr` is quite straightforward: `val decoded_addr = read_mapping map { case (k, v) => k -> (io.rw.addr === k) }`. The read portion of a CSR access is easy: `io.rw.rdata := Mux1H(for ((k, v) <- read_mapping) yield decoded_addr(k) -> v)`.

In terms of CSR writes, note that a csrrs/csrrc or csrrsi/csrrci does not write the CSR if `rs1` is x0 or `uimm` is 0. This is handled in RocketCore: `val id_csr_ren = id_ctrl.csr.isOneOf(CSR.S, CSR.C) && id_expanded_inst(0).rs1 === UInt(0)` means that this CSR access only reads the old value, so the request flowing down to `WB` will be `CSR.R` (`val id_csr = Mux(id_system_insn && id_ctrl.mem, CSR.N, Mux(id_csr_ren, CSR.R, id_ctrl.csr))`) instead of `CSR.C` or `CSR.S`. `val csr_wen = io.rw.cmd.isOneOf(CSR.S, CSR.C, CSR.W)` in the CSR module is therefore a correct indication of a CSR write. `val wdata = readModifyWriteCSR(io.rw.cmd, io.rw.rdata, io.rw.wdata)` prepares the data to be written, see the code below:

Note the comment in the code section above: this is a sleek, generic way of generating the data to be written, no matter whether the command is `W`, `S`, or `C`.

The CSR write logic is straightforward after the description above. There are many CSRs in the RC system that can be written, so it is a long and tedious `when(csr_wen){when(isSpecificCSR){/*conducting write*/}}`. Below are the code sections for writes to some of the CSRs:

Note that there are register-specific rules for CSR writes. For instance, some fields of `sip` are only writable when the corresponding interrupts are delegated to S-mode. Refer to the privileged spec for specifics.

It's worth noting that privileged instructions are not handled in the RocketCore pipeline. They are sent to the CSR module because they just need to dance with some of the CSRs. These instructions share the SYSTEM opcode with the CSR access instructions, and use insn[31:20] (the field that encodes the CSR address in a normal CSR access) to encode which specific instruction it is. Therefore they can be sent to the CSR module through the same interface as CSR accesses:
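As a rough illustration (a plain-Scala software model, not the Chisel source), the shared encoding can be sketched as a lookup on insn[31:20]. The object name `PrivDecode` and its members are made up for this sketch. The funct12 values are the standard ones from the privileged and debug specs, and custom or NMI-related instructions (CEASE, mnret) are left out.

```scala
// Software model only: classify a SYSTEM-opcode instruction by insn[31:20].
// PrivDecode, known and classify are hypothetical names for this sketch.
object PrivDecode {
  val SystemOpcode = 0x73

  // funct12 values (insn[31:20]) from the RISC-V privileged and debug specs.
  val known: Map[Int, String] = Map(
    0x000 -> "ecall",
    0x001 -> "ebreak",
    0x102 -> "sret",
    0x302 -> "mret",
    0x7b2 -> "dret",
    0x105 -> "wfi"
  )

  def classify(insn: Long): Option[String] = {
    val opcode  = (insn & 0x7f).toInt
    val funct3  = ((insn >> 12) & 0x7).toInt
    val funct12 = ((insn >> 20) & 0xfff).toInt
    // funct3 == 0 separates these from CSRRW/CSRRS/CSRRC and friends,
    // where the same bit field carries the CSR address instead.
    // The rd and rs1 fields are not checked in this simplified model.
    if (opcode == SystemOpcode && funct3 == 0) known.get(funct12) else None
  }
}
```

For example, `PrivDecode.classify(0x30200073L)` yields `Some("mret")`, while a `csrrw` to `mstatus` (0x30001073) is not classified because its funct3 is nonzero.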
There are 5 privileged instructions, `insn_call :: insn_break :: insn_ret :: insn_cease :: insn_wfi`. `ecall`, `ebreak`, and `insn_ret` are exception related and will be described below. The Wait for Interrupt instruction (WFI) provides a hint to the implementation that the current hart can be stalled until an interrupt might need servicing. Refer to 3.2.3 of the priv spec for details.

In general, a `wfi` instruction stalls the pipeline (`io.csr_stall := reg_wfi || io.status.cease`) and notifies other parts of the SoC that this hart is waiting for an interrupt. Logic elsewhere may then direct an interrupt to this hart, as the spec says: "Execution of the WFI instruction can also be used to inform the hardware platform that suitable interrupts should preferentially be routed to this hart.":

Note that once a `wfi` executes, `reg_wfi := true` is asserted, indicating that this hart is waiting for an interrupt. In the RC impl, any sign of a forthcoming interrupt deasserts the wait-for-interrupt state: `when (pending_interrupts.orR || io.interrupts.debug || exception) { reg_wfi := false }` and `io.interrupts.nmi.map(nmi => when (nmi.rnmi) { reg_wfi := false } )`.
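The set/clear behaviour of `reg_wfi` can be modelled in plain Scala as a next-state function. This is a simplified sketch: `WfiModel` and its parameter names are hypothetical, the NMI wake-up is omitted, and giving the wake-up condition precedence over a simultaneous WFI is an assumption of the sketch rather than something verified against the RC source.

```scala
// Software model of the reg_wfi flag; not the Chisel implementation.
object WfiModel {
  // Computes the value of reg_wfi for the next cycle.
  def next(regWfi: Boolean,
           insnWfi: Boolean,
           pendingInterrupts: Long,
           debugInterrupt: Boolean,
           exception: Boolean): Boolean = {
    val wake = pendingInterrupts != 0 || debugInterrupt || exception
    if (wake) false          // any sign of a forthcoming interrupt clears the flag (assumed to win)
    else if (insnWfi) true   // executing WFI sets it
    else regWfi              // otherwise the flag holds its value
  }
}
```

The pipeline stall is then simply the flag itself (OR-ed with the cease state in the RC impl).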
Another privileged instruction supported in the RC impl but not described in the priv spec is the `CEASE` instruction. It is a custom instruction that also uses the SYSTEM encoding space, like `wfi`. Refer to 7.3 of the U54_MC core complex manual for details; it is only available in M-mode. `CEASE` is mainly for instigating the power-down sequence. After retiring CEASE, the hart shall not retire another instruction until reset, and debug `haltreq` will not work after a CEASE instruction has retired. CEASE eventually raises the cease_from_tile_N signal to the outside of the Core Complex, indicating that it is safe to power down. Refer to 14.4 and 14.9 for detailed descriptions of system power-down. The corresponding code for CEASE is as follows:

One thing to note is that a `CEASE` in `WB` stalls the pipeline, meaning that no new instructions will be fetched. My confusion, however, is that the instructions in `EX` and `MEM` may still retire successfully, which violates the statement in the U54 manual that after retiring CEASE, the hart shall not retire another instruction until reset. Maybe software guarantees that no instructions follow CEASE.

Besides the logic for CSR access, there is a huge amount of code for interrupt and exception handling in the CSR module. Handling exceptions and interrupts is an essential part of a processor core, and the RISC-V privileged spec has very confusing descriptions of interrupts and exceptions, so some of my clarifications may not be right.
As described before, the interrupt signals (`io.interrupts`) are collected from elsewhere and injected into the CSR module. Some form of arbitration is conducted in the CSR module, and the one with the highest priority is chosen (`io.interrupt`) as the effective interrupt. The chosen interrupt and the corresponding `io.interrupt_cause` are asserted to RocketCore, causing `id_xcpt` and `id_cause` to be asserted at the `ID` stage. All instructions before the one at the `ID` stage complete, while the one at `ID` asserts `ctrl_killd`, and a `nop` flows downwards in its place, along with the exception (interrupt) indicator signals `ex_reg_xcpt`, `mem_reg_xcpt`, `wb_reg_xcpt`. These indicator signals ensure that the instruction at the same stage and the subsequent instructions fed into the pipeline do not change the microarchitectural state.

There are lots of complications in not changing the microarchitectural state. For example, RocketCore normally initiates the `dmem` (dcache) request at the `EX` stage, so there is logic to kill this request one cycle later (`io.dmem.s1_kill`) at `MEM` if the preceding instruction (now at `WB`) hits any unexpected situation (reflected in `take_pc_wb`), or if the MEM-stage exception indicator `mem_reg_xcpt` is asserted. The `WB`-stage version of the exception indicator (`wb_xcpt`) keeps `wb_valid` from being asserted, so no register is updated at WB (indicated by `rf_wen`). These are just very superficial descriptions of the RocketCore pipeline. Maybe I will post another code anatomy for RocketCore, where the details will be clarified.

Among the many interrupts, we need to find the one with the highest priority that can be taken. According to the spec:
Also note that `mip` and `mie` have subset versions, according to section 3.1.9 of the priv spec:

The clarification above just means that if an interrupt gets delegated to a lower privilege level, whether that interrupt is pending and enabled should be decided by the fields in `xip` and `xie`, instead of the corresponding fields in `mip` and `mie`. However, according to 4.1.5 of the priv spec:

That basically means the fields in xip/xie and mip/mie are the same, if the corresponding bit exists in the lower xip/xie. Therefore, the RC impl just uses mip and mie to decide the pending interrupts.
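A minimal plain-Scala sketch of that idea, under loud simplifying assumptions: the names `InterruptPick`, `pending` and `choose` are hypothetical, and global enable bits, delegation, debug and NMI sources are deliberately ignored. Only the bit positions in `mip`/`mie` and the MEI > MSI > MTI > SEI > SSI > STI order are taken from the privileged spec.

```scala
// Software model: pick the highest-priority pending-and-enabled interrupt.
// This is not the RC chooseInterrupt logic, only the mip & mie idea.
object InterruptPick {
  // Bit positions in mip/mie (privileged spec).
  val SSI = 1; val MSI = 3; val STI = 5; val MTI = 7; val SEI = 9; val MEI = 11

  // Highest priority first.
  val priority: Seq[Int] = Seq(MEI, MSI, MTI, SEI, SSI, STI)

  // An interrupt is a candidate when it is both pending and enabled.
  def pending(mip: Long, mie: Long): Long = mip & mie

  // Returns the cause number of the interrupt to take, if any.
  def choose(mip: Long, mie: Long): Option[Int] = {
    val p = pending(mip, mie)
    priority.find(bit => ((p >> bit) & 1L) == 1L)
  }
}
```

With both MTIP and SEIP pending and everything enabled, this model picks cause 7 (the machine timer interrupt), because M-level interrupts outrank S-level ones.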
Now, let's dive into the interrupt arbitration code:
There are some clarifications in terms of code above:
`io.interrupts` only has `mtip`, `msip`, `meip`, and `seip` as pending interrupts in `mip`. These interrupts come from the PLIC (`meip` and `seip`) and the CLINT (`mtip` and `msip`), and are read-only in terms of CSR access. This complies with the statement in 3.1.9 of the spec that "Only the bits corresponding to lower-privilege software interrupts (`USIP`, `SSIP`), timer interrupts (`UTIP`, `STIP`), and external interrupts (`UEIP`, `SEIP`) in `mip` are writable through this CSR address; the remaining bits are read-only." Note that the supervisor external interrupt (`seip`) can be raised either by signals coming from the PLIC or by writing the `seip` bit in the mip CSR. The rationale is given in the spec: "The SEIP field behavior is designed to allow a higher privilege layer to mimic external interrupts cleanly, without losing any real external interrupts. The behavior of the CSR instructions is slightly modified from regular CSR accesses as a result."

`io.interrupts.buserror` is treated as an NMI if NMI is supported; otherwise it is treated as a normal interrupt (as a member of `pending_interrupts`) with the highest priority.

`mie`. There are actually two forms of NMI: `unmi` and `rnmi`. `unmi` means unresumable non-maskable interrupt, where the NMI jumps to a handler in machine mode, overwriting the current `mepc` and `mcause` register values. If the hart had been executing machine-mode code in a trap handler, the previous values in `mepc` and `mcause` would not be recoverable, so execution is not generally resumable. That is to say, a unmi is handled using the M-mode exception facility. The other type of NMI is rnmi. It has its own interrupt handling facility, and 4 extra CSRs (`mnepc`, `mncause`, `mnstatus`, and `mnscratch`) are added to the CSR space. Refer to 8.11 of the u54mc_core_complex manual and this rnmi proposal for details. What's worth noting is that an internal microarchitectural state bit `rnmie` exists to reflect whether there is an ongoing rnmi: `rnmie` is cleared to indicate that the processor is in an RNMI handler and cannot take a new RNMI interrupt. When clear, all other interrupts are disabled except debug interrupts.

Once all pending interrupts are determined, we need to choose one of them as the firing one. The overall criterion is simple: choose the pending one (decided via
`mie` and `mip`; also note that the debug interrupt and rnmi have their own pending indicator signals, `nmi_interrupts` and `d_interrupts`) with the highest priority that is not globally disabled (decided by the corresponding fields in `mstatus` and the current privilege level). See the code below:

Here comes a very important concept: exception (interrupt) delegation. Though all exceptions and interrupts are normally handled in M-mode, RISC-V provides a way of delegating specific exceptions to lower privilege levels, so that they are handled at a privilege level where no context switching is needed. Refer to 3.1.8 of the priv spec for details. In short, if the corresponding bit in `mideleg` or `medeleg` is set, the exception or interrupt that corresponds to that bit is delegated to a lower privilege level (S or U; note that if a system has M, S, and U, setting the bits in mideleg or medeleg just delegates the exception to S-mode, and there are bits in `sideleg` and `sedeleg` that can be set to delegate it further to U-mode), and `xepc`, `xtval`, `xcause`, etc. are updated instead of `mepc`, `mtval`, and `mcause`: "the `xPP` field of mstatus is written with the active privilege mode at the time of the trap; the `xPIE` field of `mstatus` is written with the value of the `xIE` field at the time of the trap; and the `xIE` field of mstatus is cleared. The mcause and mepc registers and the MPP and MPIE fields of mstatus are not written."

What's worth noting is that a delegated interrupt is masked at the delegator's privilege level. For example, if the supervisor timer interrupt (STI) is delegated to S-mode by setting mideleg[5], STIs will not be taken when executing in M-mode. Also, if a (synchronous) exception is delegated to a less privileged level (S-mode, for example), and that exception happens in M-mode, it is still handled in M-mode instead of S-mode. A more privileged interrupt shall never be delegated to a less privileged level; for instance, `msip`, `meip`, and `mtip` shall never be delegated to S-mode. Some exceptions cannot happen at less privileged levels, so the corresponding bits in `xedeleg` should be hardwired to 0.

When deciding which interrupt to fire, the RC impl first separates the interrupts that are delegated from the ones that are not. Because a more privileged interrupt shall never be delegated to a less privileged level, the interrupts in
`m_interrupts` (the ones that are not delegated) surely have higher priority than the ones in `s_interrupts` (the ones delegated to S-mode). Note that there is an overall order: debug interrupt (`d_interrupts`), rnmi interrupt (`nmi_interrupts`), `m_interrupts`, `s_interrupts`, `vs_interrupts`. That is to say, the debug interrupt has the highest priority, rnmi comes second, and so on. In terms of the implementation, `chooseInterrupt` takes `masksIn` in the reverse order and therefore reverses it (`val masks = masksIn.reverse`; 0 -> d_interrupts, 1 -> nmi_interrupts, 2 -> m_interrupts, 3 -> s_interrupts). Within each item in `masks` there is also an implicit order (for example, `MEI` > `MSI` > `MTI`; `SEI` > `SSI` > `STI`; `VSEI` > `VSSI` > `VSTI`; `UEI` > `USI` > `UTI`). Consequently, `val which = PriorityMux(masks.flatMap(m => priority.filter(_ < m.getWidth).map(i => (m(i), i.U))))` chooses the interrupt that fulfills the spec requirements. Note that `chooseInterrupt` still holds when no S-mode interrupts (`SEI`, `SSI`, `STI`) are delegated. These interrupts are then represented in `m_interrupts`, where there is still an order: `MEI` > `MSI` > `MTI` > `SEI` > `SSI` > `STI`.

After one interrupt is chosen, `io.interrupt` and the corresponding cause `io.interrupt_cause` are asserted to RocketCore:

Note that the RC impl also supports the single-step debugging facility; details can be found in section 4.4 of the debug spec. The
`step` field in `dcsr` can be modified in debug mode. After it is set and the hart leaves debug mode (`io.singleStep := reg_dcsr.step && !reg_debug`), the hart executes exactly one instruction (`io.retire(0)`) and traps to debug mode again. The mechanism that makes this work is as follows. A `dret` at `WB` asserts `take_pc_wb`, thereby flushing (killing) the in-flight instructions in the pipeline (including the ones in `ibuf`). `io.singleStep` is asserted once the `dret` finishes executing. New instructions starting at `csr.io.evec` enter the pipeline. Once one instruction has moved past `ID` successfully, the `ID` stage is stalled (`val ctrl_stalld = ...csr.io.singleStep && (ex_reg_valid || mem_reg_valid || wb_reg_valid)`), meaning that no further instruction enters the ID stage. Once that instruction retires at `WB` or raises an exception (`io.retire(0) || exception`), `reg_singleStepped` is asserted in the next cycle. The assertion of `reg_singleStepped` asserts `io.interrupt` to RocketCore, indicating that a (debug) interrupt needs to be handled. `io.interrupt` redirects the execution flow to the debug entry, entering debug mode and thereby asserting `reg_debug`. The assertion of `reg_debug` makes `io.singleStep` false, so the code sequence in the debug ROM is not stalled by `ctrl_stalld`. Note that `io.singleStep` stays asserted from the moment the hart leaves the debug mode in which dcsr.step was configured to the moment debug mode is re-entered. Deassertion of `io.singleStep` deasserts `reg_singleStepped`, completing one round of single stepping. The corresponding code is as follows:

From `io.interrupt := (anyInterrupt && !io.singleStep || reg_singleStepped) && !(reg_debug || io.status.cease)` we can also see that: 1, no interrupt is fired in debug mode, EVEN THE DEBUG INTERRUPT ITSELF; 2, all interrupts are masked (including debug requests) after a CEASE instruction has retired; 3, interrupts are masked if single step is enabled.

The
`io.interrupt` comes into RocketCore, kills the instruction at `ID` (`ctrl_killd`), and flows downwards through the pipeline. Also note that exceptions raised at any pipeline stage flow downwards as well. The exception or interrupt finally asserts `wb_xcpt` at WB, notifying the CSR module that an exception needs to be handled. Besides `wb_xcpt`, these signals are sent to the CSR module: `csr.io.cause := wb_cause`; `csr.io.pc := wb_reg_pc`; `csr.io.tval := Mux(tval_valid, encodeVirtualAddress(wb_reg_wdata, wb_reg_wdata), 0.U)`, etc. Once `csr.io.exception := wb_xcpt` is asserted, the CSR module has to decide where the exception handler is: `io.evec := tvec` is driven in the same cycle in which `wb_xcpt` is asserted, and logic in RocketCore directs the FrontEnd to begin fetching instructions from there (`evec`):

Specifically, below are the key signals that determine the properties of the incoming exception:
The very first scenario that causes a trap to debug is single stepping, which has been described above. Normally, there are 3 cases that cause a debug interrupt (`causeIsDebugInt`): 1, an explicit `haltreq` request from `dmcontrol`; 2, a hart with the halt-on-reset option configured just comes out of reset; 3, one of the harts or external triggers in the same halt group as this hart halts:

Normally, an `ebreak` causes a breakpoint exception (0x03) and traps to `BASE` of `xtvec`. But if one of `ebreakm`, `ebreaks`, `ebreaku` is set, the ebreak traps to debug when the hart runs at the corresponding privilege level:

The trigger facility (internal debugging) may also trap to debug if a configured trigger fires; details will be clarified later when describing the trigger mechanism. The last scenario that may re-trap to debug is a debug exception. Synchronous exceptions may happen in debug mode (note that all interrupts are masked in debug mode: `io.interrupt := (anyInterrupt && !io.singleStep || reg_singleStepped) && !(reg_debug || io.status.cease)`). In this case `reg_singleStepped`, `causeIsDebugInt`, `causeIsDebugTrigger`, and `causeIsDebugBreak` are all deasserted, and the final `reg_debug` term in `val trapToDebug = Bool(usingDebug) && (reg_singleStepped || causeIsDebugInt || causeIsDebugTrigger || causeIsDebugBreak || reg_debug)` takes effect, meaning that an exception happened during the execution of debug ROM code or of code constructed in ABSTRACTS and the program buffer.

With the descriptions above of the debug interrupt, where to jump is pretty straightforward:
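As a hedged plain-Scala model of that choice (not the Chisel source): the object name `DebugVector` and the function `debugTVec` are hypothetical, 0x808 is the Debug_Exception address quoted elsewhere in this post, and 0x800 for the debug ROM entry is an assumption of this sketch.

```scala
// Software model of the debug trap target selection.
object DebugVector {
  val debugEntry     = 0x800L // assumed debug ROM entry address
  val debugException = 0x808L // Debug_Exception address quoted in this post

  def debugTVec(regDebug: Boolean, insnBreak: Boolean): Long =
    if (!regDebug) debugEntry        // entering debug mode from normal execution
    else if (insnBreak) debugEntry   // ebreak inside debug mode: back to the ROM entry
    else debugException              // any other exception inside debug mode
}
```

So only an exception raised while already in debug mode (and not caused by an ebreak) ends up at the exception address in this model.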
Note that an ebreak in debug mode is used by `ABSTRACT` or the `program buffer` to move `pc` to the beginning of the debug ROM.

The corresponding logic for the NMI trap is as follows:
When an exception happens in the rnmi handler, `trapToNmiXcpt` is asserted (`val trapToNmiXcpt = usingNMI.B && !nmie`). Note that the NMI interrupt and exception handler entry points are sent into RocketCore through `rnmi_interrupt_vector` and `rnmi_exception_vector` instead of what is normally indicated in the `xtvec` CSR. Also note that there is no delegation for non-maskable interrupts and exceptions; rnmi interrupts and exceptions all trap to M-mode.

Except for `trapToDebug` and `trapToNmi`, the interrupt or exception handler normally sits at the address specified by `mtvec` or `stvec` (if the corresponding exception or interrupt is delegated; note that the RC impl does not implement user-mode interrupts, that is, the N extension is not supported). The related code is as follows:

Besides notifying the FrontEnd to fetch instructions at the handler entry point, the exception context needs to be saved so that execution can be restored once the exception handler finishes (marked by an `xret` instruction). See the code below:

In general, once an exception or interrupt is taken, the execution flow transfers to the exception (interrupt) handler, which generally runs at a more privileged level than the interrupted context (unless that interrupt or exception is delegated). Therefore we need places to hold the previous privilege level, the pc, and the exception `cause` so that the execution flow can be restored after the handler finishes. That's what the fields `mpp`, `epc`, and `cause` are for. Also, according to the spec (3.1.6.1):

I still feel confused about the rationale of the global interrupt-enable bit MIE. The spec says that "These bits (MIE, SIE and UIE) are primarily used to guarantee atomicity with respect to interrupt handlers in the current privilege mode." This just doesn't clarify it well for me. Also, the xPIE field is used to restore xIE once xret is executed. What's the rationale for this two-level xPIE and xIE stack?
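One way to make the two-level stack concrete is a small plain-Scala model of M-mode trap entry and `mret`. `MStatusStack` and `State` are hypothetical names, only the MIE/MPIE/MPP fields are modelled, and U-mode is assumed to exist. It sketches the behaviour the spec describes, not the RC code.

```scala
// Software model of the MIE/MPIE/MPP stack in mstatus.
object MStatusStack {
  // prv and mpp use the spec encoding: 0 = U, 1 = S, 3 = M.
  final case class State(prv: Int, mie: Boolean, mpie: Boolean, mpp: Int)

  // Taking a trap into M-mode pushes the current enable bit and privilege level
  // and clears MIE, so the handler starts with M-mode interrupts disabled.
  def trapToM(s: State): State =
    State(prv = 3, mie = false, mpie = s.mie, mpp = s.prv)

  // mret pops them again: MIE <- MPIE, MPIE <- 1, privilege <- MPP,
  // and MPP is reset to the least-privileged mode (U, assumed implemented).
  def mret(s: State): State =
    State(prv = s.mpp, mie = s.mpie, mpie = true, mpp = 0)
}
```

In this model MIE stays clear while the handler runs, which keeps a second M-mode interrupt from overwriting the single saved level (MPIE, MPP) before the handler has had a chance to save it. That is one plausible reading of the "atomicity" wording, not a definitive answer.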
The debug interrupt and the rnmi interrupt have their own stacks (in dcsr and the corresponding rnmi registers such as `mnepc`, `mncause`, and `mnstatus`) to hold `mpp`, `epc`, and `cause`, instead of the corresponding fields in `mstatus` or `sstatus`. Note that a debug exception has `trapToDebug` true but `!reg_debug` false, and therefore does not modify the debug interrupt stack again; it just notifies the FrontEnd to begin fetching instructions at Debug_Exception (0x808). An rnmi exception (with `nmie` being false), however, always traps to M-mode (somewhere the rnmi spec asks for this) and modifies the corresponding mstatus fields such as MPIE, MPP, and MIE. So when an rnmi happens while an M-mode interrupt (one that is not delegated) is being processed, an rnmi exception will smash the mstatus stack, and there is no way to return from the previous M-mode interrupt.

I have some instinct that `reg_debug` and `reg_rnmie` may partially act like the `xIE` fields for a specific privilege level in terms of enabling the corresponding interrupt, but this is fuzzy.

The logic for `xret` (`mret`, `sret`, `dret`, `mnret`) is very straightforward. An `xret` instruction marks the end of handler execution. It restores the interrupted execution flow and privilege level. For `mret` and `sret`, the global interrupt-enable bits for M-mode and S-mode are also restored (MPIE->MIE, true->MPIE):

Note that `io.evec` is overloaded to indicate the address of the exception handler (when `wb_xcpt` is asserted) and the address of the interrupted instruction (when `csr.io.eret` is true).

If an S-mode interrupt is being processed, `SIE` in `mstatus` is cleared, indicating that no S-mode interrupt will be taken in this scenario. However, according to the spec, an M-mode interrupt can preempt that S-mode interrupt regardless of whether the MIE bit is set, because M-mode interrupts are globally enabled when the hart is executing at a privilege level lower than M-mode. What's cool is that the M-mode interrupt does not smash the S-mode stack (`SPIE`, `SIE`, `SPP`, etc.), so the interrupted S-mode interrupt handler can be resumed. In general, an x-mode exception (interrupt) can preempt a y-mode exception if and only if x is more privileged than y. Note that Waterman confirmed in this issue that "interrupts for lower-privilege modes" in 3.1.6.1 means interrupts that are delegated to lower privilege levels.

Besides CSR access and exception configuration and handling logic, the CSR module also has logic for the hardware performance-monitoring facility. RISC-V includes a basic hardware performance-monitoring scheme; read 3.1.11, 3.1.12, and 3.1.13 of the RISC-V privileged spec for details. There are two categories of monitoring counters: fixed-function counters and event-programmable counters.
There are 2 fixed-function monitoring counters, `mcycle` and `minstret`. `mcycle` holds the number of clock cycles the hart has executed since some arbitrary time in the past, while `minstret` holds the number of instructions the hart has retired since some arbitrary time in the past. The counter registers have an arbitrary value after system reset, and can be written with a given value.

The other category is event-programmable monitoring counters. There are at most 29 additional 64-bit event-programmable counters, `mhpmcounter3–mhpmcounter31`. The event selector CSRs, `mhpmevent3–mhpmevent31`, are MXLEN-bit WARL registers that control which event causes the corresponding counter to increment. The meaning of these events is defined by the platform, but event 0 is defined to mean “no event.” All counters should be implemented, but a legal implementation is to hard-wire both the counter and its corresponding event selector to 0.

It's worth noting that these counters are 64 bits wide in both RV32 and RV64 systems. On RV32 only, reads of the `mcycle`, `minstret`, and `mhpmcountern` CSRs return the low 32 bits, while reads of the `mcycleh`, `minstreth`, and `mhpmcounternh` CSRs return bits 63–32 of the corresponding counter.

The corresponding code for accessing these CSRs is as follows:
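The RV32 view of a 64-bit counter can be modelled in a few lines of plain Scala. `Counter32` and its functions are hypothetical names for this sketch; it only illustrates the low/high split described above and is not the RC read logic.

```scala
// Software model: a 64-bit counter seen through two 32-bit CSR reads on RV32.
object Counter32 {
  // What a read of mcycle (or minstret, mhpmcountern) returns on RV32.
  def low(counter: Long): Long = counter & 0xffffffffL

  // What a read of mcycleh (or minstreth, mhpmcounternh) returns on RV32.
  def high(counter: Long): Long = (counter >>> 32) & 0xffffffffL

  // Recombine the two halves into the full 64-bit value.
  def combine(high: Long, low: Long): Long = (high << 32) | low
}
```

Splitting and recombining is lossless: `Counter32.combine(Counter32.high(c), Counter32.low(c))` gives back `c` for any 64-bit value `c`.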
The spec also defines `mcountinhibit`. The counter-inhibit register `mcountinhibit` is a 32-bit WARL register that controls which of the hardware performance-monitoring counters increment. When the `CY`, `IR`, or `HPMn` bit in the `mcountinhibit` register is clear, the `cycle`, `instret`, or `hpmcountern` register increments as usual. When the `CY`, `IR`, or `HPMn` bit is set, the corresponding counter does not increment. Also note that `mcountinhibit[1]` corresponds to the `mtime` counter. mtime is not a CSR; it is exposed as a memory-mapped machine-mode read-write register in the CLINT, refer to 3.1.10 of the priv spec for details. `mtime` is shared among the cores in the system, so the RISC-V spec decides that it cannot be inhibited through the mcountinhibit mechanism.

Now, let's delve into the programmable event counter mess. Refer to section 4.9 for a more detailed description.
Specifically, the RC impl defines `EventSets` and `EventSet`. An `EventSets` holds a series of `EventSet`s. Each `EventSet` corresponds to a set of related events. There are actually 3 `EventSet`s in RC: instruction commit events, microarchitectural events, and memory system events. The RC impl therefore interprets the event encoded in `mhpmeventX` hierarchically: the low bits (`mhpmeventX[7:0]`) select the `EventSet` the event belongs to (`mhpmeventX[7:0]` = 0x0 for instruction commit events, 0x1 for microarchitectural events, and 0x2 for memory system events), and the remaining upper bits form the event mask. One or more events can be programmed by setting the respective event mask bits for a given `EventSet`. When multiple events are selected (the RC impl treats the bits of `mhpmeventX` above `mhpmeventX[7:0]` as a bit mask with one bit per event), the counter increments any time any of the selected events occurs. The event selector mask encoding can be found in Table 18 of the U54_MC core complex manual.

Below is the implementation of `EventSet` and `EventSets`:

Note that the method `check` in `EventSet` decides whether the events indicated by mhpmeventX[31:8] of a specific event selector happen under the current state of the `events` of that `EventSet`. Note the first class constructor parameter of `EventSet`: `val gate: (UInt, UInt) => Bool`. I feel like this functional trick is overly complicated. `gate` directly decides whether specific events (represented by the first parameter of the gate function, `mask`; note this is just the event mask part of a whole event selector) happen among a series of events (the second parameter of the function, generated from `hits := events.map(_._2())`). Also note that `private def decode(counter: UInt): (UInt, UInt)` in `EventSets` decodes an event selector into two parts: the event class (`counter(eventSetIdBits-1, 0)`) and the event mask within that class (`counter >> maxEventSetIdBits`). With `check` of `EventSet` and `decode` of `EventSets`, the method `evaluate` in `EventSets` decides whether the events indicated by the value in one `mhpmeventX` actually happen.

Since the event signals are in RocketCore while `mhpmeventX` and `mhpmcounterX` are in the CSR module, the two modules are interconnected for programmable event counting:

Note that for each pair of `mhpmeventX` and `mhpmcounterX`, the `eventSel` coming out of the CSR module is driven by the event selector value held in `mhpmeventX`:

For each event selector in `mhpmeventX`, RocketCore has logic to `evaluate` whether the selected events happen and to drive `PerfCounterIO.inc` back to the CSR module. The corresponding `mhpmcounterX` is then incremented depending on that `inc`:

This post is becoming a tedious rigmarole, but there is a lot more to go:
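Before moving on, the hierarchical event-selector interpretation from this section can be summarised as a plain-Scala model. `EventSelect`, `decode`, and `evaluate` here are hypothetical stand-ins that only mirror the idea (low 8 bits pick the event set, upper bits are a mask, any selected event that fires counts); they are not the Chisel `EventSets` code, and the "any selected event" gate is an assumption taken from the description above.

```scala
// Software model of interpreting an mhpmevent value.
object EventSelect {
  val eventSetIdBits = 8

  // Split an mhpmevent value into (event set id, event mask).
  def decode(sel: Long): (Int, Long) =
    ((sel & ((1L << eventSetIdBits) - 1)).toInt, sel >>> eventSetIdBits)

  // hitsBySet(i) has bit j set when event j of set i fired this cycle.
  // Returns true when the counter should increment.
  def evaluate(sel: Long, hitsBySet: Seq[Long]): Boolean = {
    val (set, mask) = decode(sel)
    set < hitsBySet.length && (mask & hitsBySet(set)) != 0
  }
}
```

For instance, the selector 0x501 picks event set 1 with mask 0b101, so the counter in this model increments whenever event 0 or event 2 of that set fires. A selector of 0 has an empty mask and never increments, matching "event 0 means no event".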
RISC-V Trigger Implementation:
The RISC-V debug spec defined Trigger Module(TM) in Chapter 5(Sdtrig ISA Extension), also refer section 15.3 in U54 MC core complex manual for further details. In general, there may be some number of triggers implemented in a soc, a trigger is fired when an instruction in a specific location is executed, or a load or store from a specific location(the trigger can also be configured to trigger on data values, the data value loaded or stored, or the instruction executed.
select
field inmcontrol
specifies this). Each trigger defines a series of triggering condition besides the specific address or data to compare(tdata2
), for example: under which priv level will this trigger take effect(m
,s
,u
inmcontrol
), the specific comparing operation(match
field inmcontrol
), etc. If one of the triggers fires, a specificaction
is conducted; According to configuration of that trigger, for example:0
inaction
ofmcontrol
will cause a breakpoint exception happening, otherwise execution will trap to debug mode if1
inaction
ofmcontrol
. Triggers in RISC-V can be configured in debug mode or m-mode(toggled viadmode
intdata1
) through 4 csrs,tselect
,tdata1
,tdata2
,tdata3
; Write an index totselect
will cause a specific trigger that corresponds to that index being selected, here being selected means access oftdata1
tdata2
tdata3
will go to that specific trigger instead of other ones. See code below:Note that the RC impl represents a trigger as a triple:
`(control, address, textra)`. `control` is just an aggregate of `type`, `dmode` and the fields in `mcontrol` (anyone who reads this should really read the debug spec first so that these fields make sense); note that the RC only supports triggers of type 2, that is, `mcontrol` triggers. `address` is the predefined address or data value to compare against, and `textra` adds extra restrictions to the trigger (refer to `textra32` or `textra64` in the debug spec and the `contextMatch` method in the `BP` class implementation for further info). When a trigger is selected using `tselect`, access to `tdata1`, `tdata2`, `tdata3` is the same as access to `control`, `address` and `textra` of the corresponding trigger. The description in the debug spec is general, while the RC impl hardwires some fields: for instance `sizehi`, `sizelo`, `hit`, `select` (compare only the address, not data values) and `timing` (the action of the trigger happens before the instruction that triggered it is committed) are all fixed at 0. These hardwired fields greatly simplify the implementation.

Also note that the RISC-V debug spec defines a scheme for chaining triggers. A chain of triggers is a series of triggers with
`chain` set, followed by one trigger whose `chain` is `0`; that final trigger is also part of the chain. In the RC impl, only two adjacent triggers (adjacent in terms of index) can be chained (see `!(prevChain || nextChain)` in `bp.control.chain := newBPC.chain && !(prevChain || nextChain) && (dMode || !nextDMode)`). The action of the last trigger in a chain is taken if and only if all of the triggers in the chain match. A typical use of chained triggers is a breakpoint on an address range: two neighboring breakpoints are combined with the `chain` bit. The first breakpoint is set to match on an address using `match` of 2 (greater than or equal), and the second using `match` of 3 (less than). Setting the `chain` bit on the first breakpoint prevents the second breakpoint from firing unless they both match.

I don't know the rationale, but below is the debug spec's requirement on the coupling between `chain` and `dmode`:

The trigger logic in RC is located in the CSR module. Since the firing of a trigger is based on a comparison between the
`address` in the trigger triple and the address of the instruction (or the address being loaded from or stored to), the trigger-related signals are sent to the RocketCore via the following interface:

These signals, along with other auxiliary info like `csr.io.status`, `csr.io.mcontext` and `csr.io.scontext`, appear as follows in RocketCore:

Inside RocketCore there is a `BreakpointUnit`, which decides whether any trigger matches by comparing the triggering criteria (`address` in the trigger triple) with the pc or load/store address currently in the pipeline:

`io.pc` is the instruction fetch address (`bpu.io.pc := ibuf.io.pc`) at the `ID` stage, while `io.ea` is the load/store address (`mem_reg_wdata`) at `MEM` (the load/store request is initiated at `EX` and the result arrives at `WB`, so it is reasonable to detect a trigger match in `MEM`). The `BreakpointUnit` decides whether these addresses fire any of the triggers; each item in `io.bpwatch` describes the firing status of one specific trigger. `bpw.valid(0) := true.B` indicates that the trigger is firing, regardless of the reason for the match, while `bpw.w[r|i]valid(0) := true.B` specifically means that the trigger is firing on a store address (load address | instruction address) match. The signal `io.bpwatch`
is sent to the trace modules.

The `BreakpointUnit` also decides, for the current address in `io.pc` or `io.ea`, whether the trigger set as a whole has fired (whether through a standalone trigger firing or through all triggers in a chain firing) and what action is taken (RC only supports trapping to debug mode and raising a breakpoint exception). Specifically, `io.xcpt[debug]_if[ld,st]` indicates that a trigger fired because of an if[ld|st] address match, and also specifies the subsequent action (`xcpt` -> breakpoint exception, `debug` -> debug interrupt). These `xcpt[debug]_if[ld,st]` signals flow down the pipeline and are asserted into the CSR module via `csr.io.exception := wb_xcpt` for exception handling, as described earlier in the discussion of interrupt & exception handling.

I will wrap this up for now, but there is other content, like the PMP impl, that is not covered in this post. The PMP stuff is straightforward once you read the spec.
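As an appendix to the trigger discussion, the chained address-range breakpoint can be written down as a small behavioral model. This is a plain Python sketch, not Chisel and not the actual `BreakpointUnit`: the `Trigger` class, `fired_actions`, and the single `enabled` flag (standing in for the privilege and access-type checks) are hypothetical, and only `match` values 0, 2 and 3 are modelled.

```python
# Behavioral sketch only: hypothetical names, a subset of the mcontrol fields.
from dataclasses import dataclass
from typing import List

MATCH_EQ, MATCH_GE, MATCH_LT = 0, 2, 3        # `match` encodings used in the post
ACTION_BREAKPOINT, ACTION_DEBUG = 0, 1        # `action` encodings used in the post


@dataclass
class Trigger:
    address: int                      # the value held in tdata2
    match: int = MATCH_EQ
    chain: bool = False
    action: int = ACTION_BREAKPOINT
    enabled: bool = True              # assumption: stands in for m/s/u and access checks

    def hits(self, addr: int) -> bool:
        if not self.enabled:
            return False
        if self.match == MATCH_EQ:
            return addr == self.address
        if self.match == MATCH_GE:
            return addr >= self.address
        if self.match == MATCH_LT:
            return addr < self.address
        return False                  # other match modes are not modelled here


def fired_actions(triggers: List[Trigger], addr: int) -> List[int]:
    """Return the actions of every standalone trigger or complete chain that fires."""
    actions: List[int] = []
    chain_ok = True                   # True while all earlier chain members matched
    for t in triggers:
        hit = chain_ok and t.hits(addr)
        if t.chain:
            chain_ok = hit            # hand the partial result to the next trigger
        else:
            if hit:
                actions.append(t.action)
            chain_ok = True           # the chain (if any) ends at this trigger
    return actions
```

With `[Trigger(0x1000, MATCH_GE, chain=True), Trigger(0x2000, MATCH_LT, action=ACTION_DEBUG)]`, an access at `0x1800` yields `[ACTION_DEBUG]`, while `0x0800` yields nothing even though the second trigger alone would match, which is exactly the effect of setting `chain` on the first breakpoint.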