CodeAnotomy on TL Broadcast Hub #3072

DecodeTheEncoded · 2022-09-24T10:24:20Z


The broadcast hub is RocketChip's built-in cache coherence manager, used when no L2 cache (for example, the SiFive inclusive cache) is present in the system. It maintains coherence among clients by snooping rather than with a directory, although you can build a directory-based scheme on top of the ProbeFilter scaffolding by storing the directory information there.
The bird's-eye view of the broadcast hub is straightforward. When a request arrives from one client on channel A, the hub probes all other possible clients so that they give up their permissions and write back any dirty data they hold. (More precisely, the permission each probed client is asked to drop to is determined by probe_perms: normally toN, but toB for A-channel messages that do not imply a write, meaning the probed client may keep a readable copy of the block.) Once all other clients have responded with ProbeAck or ProbeAckData, the original A-channel message is good to go and flows down to the lower memory hierarchy. The A-channel transaction ends when the D-channel response has come back and been forwarded to the initiator (sent_d) and, if that response requires an E-channel acknowledgement, the ack has been received (got_e). The broadcast hub has a configurable number of tracker sub-modules. Each tracker records whether all probes have completed for the A-channel transaction it owns, so that it can forward that message downwards (io.out_a.valid of the tracker). A tracker is identified by a source (which client) and line (which cache block) pair: the source routes a D-channel response from below (d_trackerOH) to the tracker, while the line routes C-channel ProbeAcks (c_trackerOH). Some details worth noting:
Since the broadcast hub is the lowest cache-coherent layer, some TL-C messages must be converted to TL-UH ones. Specifically, the tracker converts every Acquire (val acquire = opcode === TLMessages.AcquireBlock || opcode === TLMessages.AcquirePerm) to TLMessages.Get and passes all other A-channel messages through unchanged:

o_data.ready := io.out_a.ready && probe_done
io.out_a.valid := o_data.valid && probe_done
io.out_a.bits.opcode  := Mux(acquire, TLMessages.Get, opcode)
io.out_a.bits.param   := Mux(acquire, 0.U, param)
io.out_a.bits.size    := size
io.out_a.bits.source  := Cat(Mux(acquire, transform, PASS), source)
io.out_a.bits.address := address
io.out_a.bits.mask    := o_data.bits.mask
io.out_a.bits.data    := o_data.bits.data
io.out_a.bits.corrupt := false.B
io.out_a.bits.user   :<= user
io.out_a.bits.echo   :<= echo

What is worth noting here is that the tracker also prepends extra MSBs to the A-channel source field: io.out_a.bits.source := Cat(Mux(acquire, transform, PASS), source). Because every A-channel Acquire is converted to a Get, the only messages that can arrive on out.d are AccessAck(Data) (and of course HintAck for Hint messages). We therefore need a way to tell the AccessAcks apart, so that when one is actually the response to an earlier Acquire we send Grant(Data) back to the client instead of AccessAck.
When a probed client holds dirty data and therefore sends ProbeAckData, the broadcast hub simply writes that data back to the lower memory hierarchy, which means initiating a transaction on out.a. The same applies when a client gives data away voluntarily: if a client sends ReleaseData, the hub writes the block back with a Put on out.a. This complicates the handling of D-channel responses, since some AccessAcks answer a ProbeAckData or ReleaseData write-back and need special processing; see the code below:
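The tagging idea can be illustrated outside of Chisel with a small plain-Scala model. This is only a sketch: the numeric encodings of PASS, DROP, TRANSFORM_B and TRANSFORM_T are assumed (chosen so that bit 1 marks TRANSFORM_* and bit 0 selects T over B, matching how d_what(1) and d_what(0) are used later), and sourceBits is a hypothetical width for the client-side source field.

```scala
// Plain-Scala model (no Chisel) of tagging a source ID with two extra MSBs.
object SourceTag {
  // Assumed 2-bit encodings: bit 1 marks TRANSFORM_*, bit 0 selects T over B.
  val PASS = 0
  val DROP = 1
  val TRANSFORM_B = 2
  val TRANSFORM_T = 3

  // Hypothetical width of the client-side source field.
  val sourceBits = 4

  // Equivalent of Cat(tag, source): the tag sits above the original source bits.
  def encode(tag: Int, source: Int): Int = {
    require(tag >= 0 && tag < 4, "tag must fit in two bits")
    require(source >= 0 && source < (1 << sourceBits), "source out of range")
    (tag << sourceBits) | source
  }

  // Split a returned D-channel source into (d_what, original source).
  def decode(tagged: Int): (Int, Int) =
    (tagged >> sourceBits, tagged & ((1 << sourceBits) - 1))
}
```

For example, SourceTag.decode(SourceTag.encode(SourceTag.TRANSFORM_T, 5)) recovers both the tag and the original source 5, which is what lets the hub decide how to rewrite the response while still routing it to the right client.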

/*
* hjr
* DROP means c_probeackdata.
* Scenarios in which the broadcast hub initiates an A-channel transaction:
* 1, a client initiates a non-coherent request (Get, Put, atomics, etc.); it is sent via a specific tracker.
* 2, a client initiates an Acquire. Since the broadcast hub is the lowest cache-coherent level, we need to convert the
*   Acquire to a Get. This is also handled in the tracker logic.
* 3, a client voluntarily releases a cache line with data. In this situation we need to put the data to the lower
*   memory hierarchy using PutFull. Consequently, when the D-channel response comes back, we need to change the AccessAck to a ReleaseAck.
*  Also note that if the client only releases permission, without data, no BH<->manager transaction is needed. The ReleaseAck is sent directly
*  back to the initiating client.
* 4, a client answers a probe with data, i.e. ProbeAckData (the probe may have been issued by the BH on behalf of an Acquire from
*   another client). We need to put the data back to the lower manager using PutFull. Note, however, that the response to this
*   PutFull (an AccessAck) should be dropped, since ProbeAck(Data) ends the operation.
* */

/*
* hjr
* Since the BH is the lowest cache-coherent layer, the only messages that can arrive on the out.d channel are AccessAck(Data) and HintAck.
* Consequently we need a way to tell one AccessAck from another, so that we know whether an AccessAck answers the write-back of a ProbeAckData
* from a client or answers some other request. We add two bits to the MSBs of the source field for this.
* For scenario 1 above, the bits added are PASS; the AccessAck(Data) or HintAck passes directly to the corresponding client.
* For scenario 2, the bits added are TRANSFORM_B or TRANSFORM_T, in accordance with the param of the original Acquire. The AccessAck(Data)
* is changed to Grant(Data) -- todo: it seems that in the RC impl there is no logic handling the pure Grant (without data) message.
* For scenario 3, the bits added are TRANSFORM_B, so that when the AccessAck comes back on the D channel, we know it should be replaced by a ReleaseAck
* that is sent back to the client. A Release (without data) from a client does not fire any BH<->manager A-channel transaction. A ReleaseAck is
* sent back to the client directly.
* For scenario 4, the bits added are DROP. A ProbeAckData should put the data back to lower memory, and when the AccessAck comes back, no further notification
* to the client is needed. This AccessAck message is dropped.
* */
/*
* hjr
* todo: my confusion is that, for example, both scenario 2 and scenario 3 may add TRANSFORM_B to the source MSBs; how can we tell the corresponding AccessAcks apart?
* --answer: when d_what(1) is asserted, d_hasData decides: d_normal.bits.opcode := Mux(d_hasData, TLMessages.GrantData, TLMessages.ReleaseAck)
* */
val put_what = Mux(c_releasedata, TRANSFORM_B, DROP)
//hjr todo what if it's c_release??
val put_who  = Mux(c_releasedata, in.c.bits.source, c_trackerSrc)
putfull.valid := in.c.valid && (c_probeackdata || (c_releasedata && (filter.io.release.ready || !c_first)))
putfull.bits := edgeOut.Put(Cat(put_what, put_who), in.c.bits.address, in.c.bits.size, in.c.bits.data)._2
putfull.bits.user.lift(AMBAProt).foreach { x =>
x.fetch       := false.B
x.secure      := true.B
x.privileged  := true.B
x.bufferable  := true.B
x.modifiable  := true.B
x.readalloc   := true.B
x.writealloc  := true.B
}

The bits added to putfull.bits.source are val put_what = Mux(c_releasedata, TRANSFORM_B, DROP). With this tag in place, we can tell exactly what a given D-channel AccessAck is responding to:

If the two MSBs of out.d.bits.source (d_what) are DROP, the AccessAck is responding to the Put initiated by a ProbeAckData. Since ProbeAckData needs no response, this AccessAck is simply dropped.
If the MSB of d_what is set (TRANSFORM_B or TRANSFORM_T), the D response is either for a Get converted from an original Acquire or for a Put that originated from a ReleaseData. In both cases an acknowledgement must be sent back to the client: ReleaseAck for ReleaseData, GrantData for Acquire. The two are distinguished by d_hasData, which is asserted only for the response to an Acquire.

val d_allow = Wire(Bool())
out.d.ready := (d_normal.ready && d_allow) || d_drop
d_normal.valid := out.d.valid && d_allow && !d_drop
d_normal.bits := out.d.bits // truncates source hjr todo how this one liner can truncate the source.
when (d_what(1)) { // TRANSFORM_*
  d_normal.bits.opcode := Mux(d_hasData, TLMessages.GrantData, TLMessages.ReleaseAck)//hjr this ReleaseAck echoes the ReleaseData
  d_normal.bits.param  := Mux(d_hasData, Mux(d_what(0), TLPermissions.toT, TLPermissions.toB), 0.U)
}
val d_mshr = OHToUInt(d_trackerOH)
d_normal.bits.sink := d_mshr
//hjr A ReleaseData will need to initiate a A channel putfull transactions to manager, and the  d_normal.bits.opcode
//hjr should be ReleaseAck. This ReleaseData doesn't have to be controlled by any of the trackers.
//hjr above depiction echoes the rationale for d_normal.bits.opcode === TLMessages.ReleaseAck
assert (!d_normal.valid || (d_trackerOH.orR() || d_normal.bits.opcode === TLMessages.ReleaseAck))

My confusion about the code above: if the original A-channel transaction is an AcquirePerm, the two MSBs of out.d.bits.source are TRANSFORM_B or TRANSFORM_T (in the tracker logic: val transform = Mux(shared, TRANSFORM_B, TRANSFORM_T)). If d_hasData is false.B, then from d_normal.bits.opcode := Mux(d_hasData, TLMessages.GrantData, TLMessages.ReleaseAck) it seems the acknowledgement sent back to the client would be TLMessages.ReleaseAck instead of TLMessages.Grant, which is clearly wrong. Am I missing something here?
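To make the decision table explicit, here is a minimal plain-Scala model of how d_what and d_hasData select the action on a D-channel response. It is a sketch, not the hub's code: the Action names are invented for illustration and the tag encodings (PASS=0, DROP=1, TRANSFORM_B=2, TRANSFORM_T=3) are assumed.

```scala
// Plain-Scala model of the D-channel decision made from d_what and d_hasData.
object DResponse {
  sealed trait Action
  case object Drop extends Action                    // AccessAck for a ProbeAckData write-back
  case object PassThrough extends Action             // response to a client's own Get/Put/atomic/Hint
  case class GrantData(toT: Boolean) extends Action  // response to an Acquire converted to Get
  case object ReleaseAck extends Action              // response to a ReleaseData converted to PutFull

  // dWhat is the 2-bit tag; encodings are assumed as stated above.
  def classify(dWhat: Int, hasData: Boolean): Action = {
    require(dWhat >= 0 && dWhat < 4, "d_what is two bits wide")
    if (dWhat == 1) Drop                              // d_drop
    else if ((dWhat & 2) != 0) {                      // d_what(1): TRANSFORM_*
      if (hasData) GrantData(toT = (dWhat & 1) != 0)  // d_what(0) selects toT over toB
      else ReleaseAck
    } else PassThrough
  }
}
```

In this model, DResponse.classify(2, hasData = false) yields ReleaseAck, which is exactly the case the question above worries about. One relevant fact is that in TileLink a Get is answered with AccessAckData, so a response to an Acquire that was converted to a Get would be expected to carry data; whether that fully settles the question is left open here.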
Also note that the data-free versions, ProbeAck and Release, do not initiate any out.a transaction. A ProbeAck only decrements the corresponding tracker's probe counter. A Release, however, requires a ReleaseAck to be sent back to the client, as the TileLink protocol demands. It is worth noting that a ProbeAckData decrements the probe counter only when the AccessAck for its write-back is received (when probedack is asserted), whereas a ProbeAck decrements the counter immediately (fbfa15e).
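The difference in when the probe counter is decremented can be sketched with a small plain-Scala state model. The class and method names below are hypothetical and only mirror the behaviour described above; the real tracker is a hardware counter, not an object with methods.

```scala
// Plain-Scala model of a tracker's probe counter.
final class ProbeCounter(initial: Int) {
  require(initial >= 0, "number of probes cannot be negative")
  private var outstanding = initial   // probes not yet counted as complete
  private var pendingWritebacks = 0   // ProbeAckData whose write-back AccessAck has not returned

  // A data-free ProbeAck completes its probe immediately.
  def probeAck(): Unit = {
    require(outstanding > pendingWritebacks, "no unanswered probe left")
    outstanding -= 1
  }

  // A ProbeAckData only starts a write-back; the probe is not complete yet.
  def probeAckData(): Unit = {
    require(outstanding > pendingWritebacks, "no unanswered probe left")
    pendingWritebacks += 1
  }

  // The AccessAck for the write-back (tagged DROP) finally completes the probe.
  def writebackAcked(): Unit = {
    require(pendingWritebacks > 0, "no write-back in flight")
    pendingWritebacks -= 1
    outstanding -= 1
  }

  def probeDone: Boolean = outstanding == 0
}
```

Holding the count until the write-back is acknowledged means the tracker's own A-channel message is not released until the dirty data has been acknowledged by the level below.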
Since a Release combinationally triggers a ReleaseAck on in.d, and the Release event is asynchronous with respect to the out.d response (which itself combinationally triggers a GrantData or ReleaseAck on in.d), the two must be arbitrated. The ReleaseAck for a data-free Release wins the arbitration. Arbitration is likewise needed on out.a between the transactions issued by the active trackers and the Put generated by a ProbeAckData or ReleaseData.

val releaseack = Wire(in.d)
releaseack.valid := in.c.valid && (filter.io.release.ready || !c_first) && c_release
releaseack.bits  := edgeIn.ReleaseAck(in.c.bits)
// Combine ReleaseAck or the modified D
TLArbiter.lowest(edgeOut, in.d, releaseack, d_normal)
// Combine the PutFull with the trackers
TLArbiter.lowestFromSeq(edgeOut, out.a, putfull +: trackers.map(_.out_a))
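The priority rule behind the two arbiter calls can be modelled in a few lines of plain Scala: among the requesters that are valid, the one listed first wins. This is a simplified sketch with a hypothetical helper name; it ignores details the real TLArbiter has to handle, such as keeping multi-beat bursts together.

```scala
// Plain-Scala model of lowest-index-wins arbitration.
object LowestArbiter {
  // valids(i) says whether requester i has a message this cycle.
  // Returns the index of the winner, or None if nobody is requesting.
  def pick(valids: Seq[Boolean]): Option[Int] = {
    val i = valids.indexOf(true)
    if (i >= 0) Some(i) else None
  }
}
```

With the in.d requesters ordered as Seq(releaseack, d_normal), LowestArbiter.pick(Seq(true, true)) selects index 0, i.e. the ReleaseAck for a data-free Release. On out.a the putfull is listed before the trackers, so a write-back takes priority over tracker requests.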

Also note that each Grant(Data) on in.d needs an accompanying GrantAck on in.e. The sink field in d(e).bits should be the ID of the selected tracker, so that when the tracker sees the ack come back it can assert got_e, become idle, and accept new transactions. Extra note: got_e is introduced in this commit, and the dcache was modified to support early GrantAck in this commit. Early GrantAck means the E-channel message is sent upon the very first beat of the D-channel Grant. Therefore receiving a GrantAck at the broadcast hub does not by itself mark the end of an Acquire transaction: an Acquire finishes only when all beats of the D-channel Grant have been sent (marked by sent_d). So whether a tracker is idle is decided combinationally by val idle = got_e && sent_d.
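The idle condition can be illustrated with a minimal plain-Scala model of the two flags. The class and method names are hypothetical, and the assumption that got_e starts out true for non-Acquire requests (because they expect no GrantAck) is inferred from the description above rather than quoted from the tracker code.

```scala
// Plain-Scala model of a tracker's got_e / sent_d bookkeeping.
final class TrackerState {
  private var sentD = true  // last D beat has been forwarded to the client
  private var gotE = true   // GrantAck received, or none expected

  def idle: Boolean = gotE && sentD

  // A new A-channel transaction claims the tracker.
  def start(isAcquire: Boolean): Unit = {
    require(idle, "tracker is busy")
    sentD = false
    gotE = !isAcquire       // assumed: only Acquires wait for a GrantAck on channel E
  }

  def lastDBeatSent(): Unit = { require(!sentD, "D response already sent"); sentD = true }
  def grantAckReceived(): Unit = { require(!gotE, "no GrantAck expected"); gotE = true }
}
```

With early GrantAck, grantAckReceived() may be called before lastDBeatSent(); the model stays busy until both have happened, which is why neither flag alone can be used to free the tracker.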
The broadcast hub also supports pluggable probe filters (introduced in this pr); for now the default implementation does nothing (in<->out). From the filter IO I would guess that the module is meant to filter out some of the probes sent to each client based on specific criteria, and that the filter should be updated in some way when an out.d transaction happens (a tracker becomes idle again).
This is a short anatomy, but I hope it is useful for getting a holistic understanding of the broadcast hub. Anyone who wants to dive into the details should definitely read this unmerged commit from Sequencer.
