Start atomic operation support
dpretet committed Dec 23, 2023
1 parent 5245260 commit e3cce76
Showing 3 changed files with 171 additions and 3 deletions.
98 changes: 98 additions & 0 deletions doc/atomic_ops.md
@@ -0,0 +1,98 @@
# Atomic Operations Support

## Overview

The aim of this development (started from v1.6.1) is to support the atomic operation instructions. Atomic
operations bring the synchronization primitives required by kernels. The goal for FRISCV is to be
able to boot a kernel like FreeRTOS or Linux (without MMU) and make the core a platform for
real-world use cases.

From [OS dev wiki](https://wiki.osdev.org/Atomic_operation):

An atomic operation is an operation that will always be executed without any other process being
able to read or change state that is read or changed during the operation. It is effectively
executed as a single step, and is an important quality in a number of algorithms that deal with
multiple independent processes, both in synchronization and algorithms that update shared data
without requiring synchronization.

For a single-core system:

If an operation requires multiple CPU instructions, then it may be interrupted in
the middle of executing. If this results in a context switch (or if the interrupt handler refers
to data that was being used) then atomicity could be compromised. It is possible to use any
standard locking technique (e.g. a spinlock) to prevent this, but may be inefficient. If it is
possible, disabling interrupts may be the most efficient method of ensuring atomicity (although
note that this may increase the worst-case interrupt latency, which could be problematic if it
becomes too long).

For a multi-core system:

In multiprocessor systems, ensuring atomicity exists is a little harder. It is still possible to
use a lock (e.g. a spinlock) the same as on single processor systems, but merely using a single
instruction or disabling interrupts will not guarantee atomic access. You must also ensure that
no other processor or core in the system attempts to access the data you are working with.

In summary, an atomic operation can be useful to:
- synchronize threads within a core
- synchronize cores in an SoC
- ensure a memory location can be read-then-updated in any situation, including exception handling

Atomic operations will be implemented in a dedicated processing unit and in the load/store stage
(`memfy`). The Atomic Operation unit (`AMO`) will issue read/write requests to the load/store stage
(`memfy`) with a specific and unique ID. The dCache stage will also be updated to better support
`ACACHE`, slightly change `AID` handling and put exclusive access support in place.

## Design Plan

- Document and list all AXI usage and limitations in the IP.

### AXI Ordering

The core's `memfy` and `dCache` stages will be updated regarding `AID` usage. Please refer
to the [AMBA spec](./axi_id_ordering.md) for further details on `AID` usage and the ordering model.


### Atomic Operation Execution Overview

When the `amo` unit receives an atomic operation:
- it reserves its `rs1`/`rs2`/`rd` registers in the processing scheduler
- it issues to `memfy` a read request to the memory location with:
    - a specific `AID` (e.g. `0x50`), dedicated to exclusive access
    - `ALOCK=0x1`, making the request an `exclusive access`
    - `ACACHE=0x0`, making the request `non-cachable` and `non-bufferable`
- it executes the atomic operation
- it then issues to `memfy`, with the same attributes as the read request:
    - a write request to update the memory location
    - a read request to release the memory location
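
The sketch below illustrates this exclusive read / compute / exclusive write sequence on
simplified AXI address channels. The module, port names and widths are assumptions for
illustration, not the actual FRISCV interfaces; the write data/response channels and the release
path are omitted for brevity.

```systemverilog
// Sketch only: the exclusive access sequence described above.
module amo_excl_seq
#(
    parameter ADDR_W = 32,
    parameter ID_W   = 8,
    parameter [ID_W-1:0] EXCL_ID = 8'h50   // ID dedicated to exclusive access
)(
    input  logic              aclk,
    input  logic              aresetn,
    input  logic              amo_req,     // an atomic instruction is ready to execute
    input  logic [ADDR_W-1:0] amo_addr,
    // Read address channel towards memfy
    output logic              arvalid,
    input  logic              arready,
    output logic [ADDR_W-1:0] araddr,
    output logic [ID_W-1:0]   arid,
    output logic              arlock,      // 1 = exclusive access
    output logic [3:0]        arcache,     // 0 = non-cachable / non-bufferable
    input  logic              rvalid,      // read data handshake (data itself not shown)
    output logic              rready,
    // Write address channel towards memfy
    output logic              awvalid,
    input  logic              awready,
    output logic [ADDR_W-1:0] awaddr,
    output logic [ID_W-1:0]   awid,
    output logic              awlock,
    output logic [3:0]        awcache
);

    typedef enum logic [2:0] {IDLE, RD_ADDR, RD_DATA, COMPUTE, WR_ADDR} fsm_t;
    fsm_t state;

    // Both requests share the dedicated ID and device-like attributes
    assign arid    = EXCL_ID;
    assign awid    = EXCL_ID;
    assign arlock  = 1'b1;
    assign awlock  = 1'b1;
    assign arcache = 4'b0000;
    assign awcache = 4'b0000;
    assign araddr  = amo_addr;
    assign awaddr  = amo_addr;

    assign arvalid = (state == RD_ADDR);
    assign rready  = (state == RD_DATA);
    assign awvalid = (state == WR_ADDR);

    always_ff @(posedge aclk or negedge aresetn) begin
        if (!aresetn) state <= IDLE;
        else case (state)
            IDLE:    if (amo_req) state <= RD_ADDR;
            RD_ADDR: if (arready) state <= RD_DATA;  // exclusive read address accepted
            RD_DATA: if (rvalid)  state <= COMPUTE;  // read data returned
            COMPUTE:              state <= WR_ADDR;  // AMO ALU computes the new value
            WR_ADDR: if (awready) state <= IDLE;     // exclusive write issued
            default:              state <= IDLE;
        endcase
    end

endmodule
```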


### AMO Unit

`AMO` will be able:
- to execute only one exclusive access at a time, as a `device` access
- to support all RISC-V atomic operations (a minimal decode sketch is given below)
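
To make "all RISC-V atomic operations" concrete, here is a minimal sketch of an AMO ALU decoding
the RV32A `funct5` field (`instr[31:27]`). The `funct5` encodings come from the RISC-V
specification; the package, function and argument names are only illustrative.

```systemverilog
// Sketch of an AMO ALU covering the RV32A operations.
package amo_alu_pkg;

    function automatic logic [31:0] amo_alu (
        input logic [4:0]  funct5,   // instr[31:27]
        input logic [31:0] mem_val,  // value read from memory
        input logic [31:0] rs2_val   // register operand
    );
        case (funct5)
            5'b00001: amo_alu = rs2_val;             // AMOSWAP.W
            5'b00000: amo_alu = mem_val + rs2_val;   // AMOADD.W
            5'b00100: amo_alu = mem_val ^ rs2_val;   // AMOXOR.W
            5'b01100: amo_alu = mem_val & rs2_val;   // AMOAND.W
            5'b01000: amo_alu = mem_val | rs2_val;   // AMOOR.W
            5'b10000: amo_alu = ($signed(mem_val) < $signed(rs2_val)) ? mem_val : rs2_val; // AMOMIN.W
            5'b10100: amo_alu = ($signed(mem_val) > $signed(rs2_val)) ? mem_val : rs2_val; // AMOMAX.W
            5'b11000: amo_alu = (mem_val < rs2_val) ? mem_val : rs2_val;                   // AMOMINU.W
            5'b11100: amo_alu = (mem_val > rs2_val) ? mem_val : rs2_val;                   // AMOMAXU.W
            default:  amo_alu = mem_val;             // LR.W / SC.W: no ALU operation
        endcase
    endfunction

endpackage
```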

### Processing Unit

TBD

### Memfy Unit

The `memfy` unit:
- issues exclusive accesses with a single dedicated ID (hence in-order), `non-cachable` and `non-bufferable`
- can handle exclusive accesses and normal accesses concurrently for best bandwidth
- should be able to manage completion reordering (possible enhancement)
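
Beyond issuing the requests, the exclusive path will presumably also need to observe the write
response: in AXI, a successful exclusive write completes with `EXOKAY` (`0b01`), while a plain
`OKAY` means the exclusive access did not succeed. A hypothetical sketch of that check (all names
are assumptions):

```systemverilog
// Sketch: completion check for an exclusive write issued by memfy.
module memfy_excl_resp
#(
    parameter ID_W = 8,
    parameter [ID_W-1:0] EXCL_ID = 8'h50   // dedicated exclusive-access ID
)(
    input  logic            aclk,
    input  logic            aresetn,
    input  logic            bvalid,
    input  logic            bready,
    input  logic [ID_W-1:0] bid,
    input  logic [1:0]      bresp,
    output logic            excl_success
);
    localparam logic [1:0] EXOKAY = 2'b01;  // exclusive access succeeded (AXI)

    always_ff @(posedge aclk or negedge aresetn) begin
        if (!aresetn)
            excl_success <= 1'b0;
        else if (bvalid && bready && bid == EXCL_ID)
            // OKAY (2'b00) on an exclusive write means the exclusive access failed
            excl_success <= (bresp == EXOKAY);
    end
endmodule
```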

### dCache Unit

Needs to support exclusive access:
- the OoO stage should manage exclusive accesses in a dedicated LUT (a sketch follows below)
- never substitute the ID of an exclusive access
- exclusive access = `device` access (no cache)

Outside the exclusive access scope, the cache should be able to manage different IDs without
substituting them all the time, for better performance. Reordering should only happen across
different IDs.
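
A possible shape for that bookkeeping, purely illustrative (names and sizes are assumptions):
a small table indexed by the inbound ID, flagging in-flight exclusive accesses so they bypass the
cache and keep their original ID.

```systemverilog
// Sketch of the OoO-stage bookkeeping for exclusive accesses.
module dcache_excl_lut
#(
    parameter ID_W  = 8,
    parameter NB_ID = 2**ID_W
)(
    input  logic            aclk,
    input  logic            aresetn,
    // Inbound request
    input  logic            req_valid,
    input  logic            req_lock,       // ARLOCK/AWLOCK: exclusive access
    input  logic [ID_W-1:0] req_id,
    // Completion of an outstanding request
    input  logic            cpl_valid,
    input  logic [ID_W-1:0] cpl_id,
    // Decision for the current request
    output logic            bypass_cache,   // exclusive = device access, no allocation
    output logic            keep_id         // never substitute an exclusive access ID
);
    logic [NB_ID-1:0] excl_inflight;

    assign bypass_cache = req_valid & req_lock;
    assign keep_id      = req_valid & (req_lock | excl_inflight[req_id]);

    always_ff @(posedge aclk or negedge aresetn) begin
        if (!aresetn) begin
            excl_inflight <= '0;
        end else begin
            if (req_valid && req_lock) excl_inflight[req_id] <= 1'b1;
            if (cpl_valid)             excl_inflight[cpl_id] <= 1'b0;
        end
    end
endmodule
```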

## Test Plan

- An atomic operation can't be interrupted while the control unit handles async/sync exceptions

71 changes: 71 additions & 0 deletions doc/axi_id_ordering.md
@@ -0,0 +1,71 @@
# AMBA AXI ID & Ordering

## AXI Transaction Identifier

### Overview

The AXI protocol includes AXI ID transaction identifiers. A Manager can use these to identify
separate transactions that must be returned in order. All transactions with a given AXI ID value
must remain ordered, but there is no restriction on the ordering of transactions with different ID
values.

A single physical port can support out-of-order transactions by acting as a number of logical ports,
each handling its transactions in order.

By using AXI IDs, a Manager can issue transactions without waiting for earlier transactions to
complete. This can improve system performance, because it enables parallel processing of
transactions.

There is no requirement for Subordinates or Managers to use AXI transaction IDs. Managers and
Subordinates can process one transaction at a time. Transactions are processed in the order they are
issued.

Subordinates are required to reflect the AXI ID received from a Manager on the appropriate BID or
RID response.
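
As a minimal illustration of this reflection rule, a strictly in-order subordinate could simply
capture `ARID` with the read address and return it on `RID`. Generic AXI signal names, single
outstanding transaction assumed; this is not a specific FRISCV block.

```systemverilog
// Sketch of the ID reflection rule for a strictly in-order subordinate.
module rid_reflect
#(
    parameter ID_W = 8
)(
    input  logic            aclk,
    input  logic            aresetn,
    input  logic            arvalid,
    input  logic            arready,
    input  logic [ID_W-1:0] arid,
    output logic [ID_W-1:0] rid
);
    always_ff @(posedge aclk or negedge aresetn) begin
        if (!aresetn)
            rid <= '0;
        else if (arvalid && arready)
            rid <= arid;   // reflected on every beat of the corresponding read data
    end
endmodule
```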

### Read Data Ordering

The Subordinate must ensure that the RID value of any returned data matches the ARID value of the
address that it is responding to.

The interconnect must ensure that the read data from a sequence of transactions with the same ARID
value targeting different Subordinates is received by the Manager in the order that it issued the
addresses.

The read data reordering depth is the number of addresses pending in the Subordinate that can be
reordered. A Subordinate that processes all transactions in order has a read data reordering depth
of one. The read data reordering depth is a static value that must be specified by the designer of
the Subordinate.

There is no mechanism that a Manager can use to determine the read data reordering depth of a
Subordinate.

### Write Data Ordering

A Manager must issue write data in the same order that it issues the transaction addresses.

An interconnect that combines write transactions from different Managers must ensure that it
forwards the write data in address order.


### Interconnect use of transaction identifiers

When a Manager is connected to an interconnect, the interconnect appends additional bits to the
ARID, AWID and WID identifiers that are unique to that Manager port. This has two effects:

- Managers do not have to know what ID values are used by other Managers because the interconnect
makes the ID values used by each Manager unique by appending the Manager number to the original
identifier.
- The ID identifier at a Subordinate interface is wider than the ID identifier at a Manager
interface.

For responses, the interconnect uses the additional bits of the xID identifier to determine which
Manager port the response is destined for, and removes these bits before passing the xID value
back to that Manager port.
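
This ID widening can be illustrated with a small combinational sketch. Names and widths are
illustrative, and whether the extra bits are placed as MSBs or LSBs is an implementation choice.

```systemverilog
// Sketch of the interconnect ID handling: the manager port index is added
// to the inbound ID, then stripped again to route the response back.
module axi_id_widen
#(
    parameter MST_ID_W = 4,                    // ID width at the manager port
    parameter MST_NB_W = 2,                    // bits used to number the manager ports
    parameter SLV_ID_W = MST_ID_W + MST_NB_W   // ID width at the subordinate port
)(
    input  logic [MST_NB_W-1:0] mst_ix,        // index of the requesting manager port
    input  logic [MST_ID_W-1:0] m_arid,        // ID issued by the manager
    output logic [SLV_ID_W-1:0] s_arid,        // widened ID seen by the subordinate
    input  logic [SLV_ID_W-1:0] s_rid,         // ID returned by the subordinate
    output logic [MST_NB_W-1:0] resp_ix,       // manager port the response routes to
    output logic [MST_ID_W-1:0] m_rid          // original ID handed back to the manager
);
    assign s_arid           = {mst_ix, m_arid};   // extra bits shown here as MSBs
    assign {resp_ix, m_rid} = s_rid;
endmodule
```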


#### Master

#### Slave

#### Interconnect
5 changes: 2 additions & 3 deletions doc/project_mgt_hw.md
@@ -5,9 +5,6 @@
- [X] Support U-mode
- [X] Support PMP/PMA
- [X] https://github.com/eembc/coremark
- [ ] Advanced Interrupt controller
- [ ] AXI ERR handling
- [ ] AXI EXOKAY handling
- [ ] Atomic operations
- stage to execute the instruction, controlling ldst Stages
- memfy exposes two interfaces for requests.
@@ -26,6 +23,7 @@ Any new features should be carefully study to ensure a proper exception and inte

## Memory

- [ ] Bus fault to route on exceptions https://lists.riscv.org/g/tech-privileged/topic/80351141
- [ ] Better manage ACACHE attribute
- [ ] Correct value driven from memfy
- [ ] Use it correctly across the cache
@@ -55,6 +53,7 @@ Any new features should be carefully study to ensure a proper exception and inte

## Cache Stages

- [ ] Add dedicated RAM for cache, not connected thru AXI interconnect
- [ ] AXI4 + Wrap mode for read
- [ ] Support datapath adaptation from memory controller
- [ ] Narrow transfer support?
