I was taking a look at how addition and subtraction are implemented in this CPU:
neorv32/rtl/core/neorv32_cpu_alu.vhd, line 131 in 8bf4b70
This result gets used during instruction execution:
neorv32/rtl/core/neorv32_cpu_control.vhd, lines 1035 to 1037 in 49fdd28
During the next clock cycle, we will read the result from
neorv32/rtl/core/neorv32_cpu_regfile.vhd, lines 97 to 98 in 49fdd28
and store it into the destination register specified in the instruction we executed:
neorv32/rtl/core/neorv32_cpu_regfile.vhd, lines 115 to 116 in 49fdd28
Assuming my understanding is correct: From my very limited knowledge of VHDL, the implementation of
Thanks for making this amazing project!
Replies: 2 comments
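You are right. But simple ALU operations like "add" are processed within a single cycle: on the first rising edge of the clock, the operands are output from the register file and applied to the ALU (and thus to the adder). The data propagates through the circuit and arrives at the input of the register file (rf_wdata). Since the register file's write enable (ctrl_i(ctrl_rf_wb_en_c)) is also applied on this first edge, the computation result is written back to the register file with the next rising edge.
In the next cycle, the CPU moves to the DISPATCH state to get the next instruction and to prepare the output of the operation's operands from the regis…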
This is a great question I never really thought about 😄 Each library like […]
ADD is probably one of the most common operations, so most FPGAs provide optimized logic cells that allow a small and fast (in terms of delay, i.e. short propagation time of the electrical signals) mapping of ADD-related functions (like a "carry chain" to propagate the carry from one bit position to the next). The synthesis tool is aware of those special FPGA features and can create efficient hardware for the addition.
I would see it the other way around. The synthesis tool creates a circuit that implements the addition. Let's assume the tool is "allowed to do whatever it wants" (no constraints, see below), so it will create some circuit. The longest path (= worst-case path) an electrical signal can take from the circuit's input to the circuit's output defines the critical path. This path is characterized by a delay, since electrical signals have a limited propagation speed. Hence, this critical path defines the maximum frequency the circuit can reliably operate at.
There are options to give the synthesis tool constraints, like specifying a minimum clock speed that has to be reached, or more implementation-specific options like defining how to actually build the addition circuit (i.e. which FPGA primitives to use).
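The link between critical path and maximum clock frequency described above can be written down as a rough rule of thumb. The 10 ns figure is a made-up number for illustration, not a measured value for this CPU:

$$
f_{\max} \approx \frac{1}{t_{\text{crit}}}, \qquad t_{\text{crit}} = 10\,\text{ns} \;\Rightarrow\; f_{\max} \approx \frac{1}{10 \times 10^{-9}\,\text{s}} = 100\,\text{MHz}
$$

Here $t_{\text{crit}}$ is the register-to-register delay along the slowest path, including the source register's clock-to-output delay and the destination register's setup time.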
❤️
For clarification: the synthesis tool might use a dedicated adder, it might build the adder from LUTs/logic elements, or it might produce any other result; however, the logical operation will always complete within a single clock cycle, because that is what the designer described in (V)HDL. If the synthesis tool decided to introduce a register that delays the path, that would be incorrect, because that is not the described behaviour. Therefore, in practice, the difference between using a dedicated adder and not-so-optimised resources will be the resulting maximum clock frequency, in other words, the delay between registers (as explained by @stnolting). The behaviour needs to be the same regardless.
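To make the "described behaviour" point concrete, here is a minimal VHDL sketch of a combinational adder feeding a single clocked register. It is not the NEORV32 source: the entity name and all port and signal names (add_writeback_sketch, we_i, rs1_i, rs2_i, rd_o, sum) are assumptions chosen for illustration.

```vhdl
-- Hypothetical sketch, NOT taken from the NEORV32 sources.
-- All names below are assumptions made for illustration.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add_writeback_sketch is
  port (
    clk_i : in  std_ulogic;                     -- clock
    we_i  : in  std_ulogic;                     -- write enable (hypothetical)
    rs1_i : in  std_ulogic_vector(31 downto 0); -- operand A, assumed to come from a register
    rs2_i : in  std_ulogic_vector(31 downto 0); -- operand B, assumed to come from a register
    rd_o  : out std_ulogic_vector(31 downto 0)  -- registered result
  );
end entity;

architecture rtl of add_writeback_sketch is
  signal sum : std_ulogic_vector(31 downto 0);
begin

  -- Purely combinational: no clock is involved in this assignment, so the
  -- description contains no storage element on the adder path.
  sum <= std_ulogic_vector(unsigned(rs1_i) + unsigned(rs2_i));

  -- The only storage element: the sum is captured on the next rising edge,
  -- and only if the write enable is set.
  write_back: process(clk_i)
  begin
    if rising_edge(clk_i) then
      if (we_i = '1') then
        rd_o <= sum;
      end if;
    end if;
  end process write_back;

end architecture;
```

However the tool maps the "+" (carry chain, plain LUTs, or something else), the described behaviour is the same: operands valid after one edge, result stored on the next. Only the delay of the sum path, and therefore the achievable clock frequency, differs.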