This repository contains all the information needed to build a pipelined RISC-V core that supports the base integer RV32I instruction set, written in TL-Verilog on the Makerchip platform. It was built as part of a 5-day workshop organised by VSD and Redwood EDA.
- Introduction to RISC-V ISA
- Types of Instructions
- GNU Compiler Toolchain
- ABI
- MakerChip
- Combinational Logic
- Sequential Logic
- Pipelined Logic
- Validity
- RISC-V Micro Architecture
- Fetch Cycle
- Decode Cycle
- ALU
- Register File read and write
- Control Logic
- Pipelining
- Hazards
- Load Store Instructions
- Jump Instructions
- Complete RISC-V CPU core
- Acknowledgements
ISA (Instruction Set Architecture): the ISA acts as the interface between the hardware and the software (e.g. a C program). Whenever you run an application on your computer, its high-level program is converted to the corresponding assembly language program by a compiler. That is then converted into machine code by an assembler. Finally, this machine code (a stream of bits) is fed to the chip, which performs its function accordingly.
This project aims to design a RISC-V microprocessor core and understand its various functions. We will start with a simple C program, then look at its assembly and machine language code. Before that, let us see some basic commands we need to know to do this on a Linux-based system.
- Pseudo Instructions
- Base Integer Instructions
- Multiply Extension
- Single and Double precision floating point extension
- Application Binary Interface(ABI)
- Memory Allocation and Stack Pointer
To view and edit your program file, use the following command (Leafpad is a text editor):
leafpad <filename> &
To compile your program use the following command:
gcc <filename>
To view the output of the program use the following command:
./a.out
To use the RISC-V GCC compiler, use the following command:
riscv64-unknown-elf-gcc -O1 -mabi=lp64 -march=rv64i -o sum1ton.o sum1ton.c
A more generic form of the command with different options:
riscv64-unknown-elf-gcc <optimisation option: -O1 ; -Ofast> <ABI specifier: -mabi=lp64 ; -mabi=ilp32> <architecture specifier: -march=rv64i ; -march=rv32i> -o <object filename> <C filename>
To view assembly code use the below command:
riscv64-unknown-elf-objdump -d <object filename>
To run the RISC-V object file on the Spike simulator, use the command below:
spike pk <object filename>
To use Spike as a debugger:
spike -d pk <object filename>
with the debug command
until pc 0 <pc of your choice>
Output of program to compute the sum of numbers 1 to 9
- Application Binary Interface (system call interface)
- The application program can directly access the registers of the RISC-V architecture using system calls.
- XLEN: the width of the registers.
- For RV64 and RV32, the widths are 64 bits and 32 bits, respectively.
- RISC-V uses little-endian memory addressing, which means that the least significant byte of a word is stored at the lowest memory address.
- Instructions that operate only on registers are called R type.
- Instructions that operate on an immediate and registers are called I type.
- Instructions that store the value of a source register to memory are called S type.
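To make the little-endian rule concrete, here is a small illustrative sketch in Python (not part of the workshop material; the value 0x12345678 is an arbitrary example) that lays a 32-bit word out byte by byte:

```python
# Illustrative only: how a 32-bit word is laid out in little-endian memory.
word = 0x12345678  # arbitrary example value

# byteorder="little" places the least significant byte first,
# i.e. at the lowest address.
memory = word.to_bytes(4, byteorder="little")

for offset, byte in enumerate(memory):
    print(f"address base+{offset}: 0x{byte:02x}")
```

The byte 0x78, the least significant one, ends up at the lowest address.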
- Makerchip is a free online IDE for developing in Verilog or TL-Verilog. It can be used directly in your browser. You can code, simulate and view block diagrams and waveforms.
- It also features multiple examples and templates.
- Using TL-Verilog in Makerchip saves a lot of time and code compared to other HDLs.
- A basic example of combinational logic is an inverter. In TL-Verilog its logic is written as $out = !$in;
- Other logic gates and basic designs, such as a 2:1 MUX, are implemented in the same way.
- AND gate output
- MUX output
- Basic calculator output
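As a rough behavioural model of the combinational examples above, the following Python sketch mirrors the inverter, an AND gate and a 2:1 MUX. The function names and the MUX select convention (in1 when sel is 1) are assumptions made for illustration, not taken from the repository's TL-Verilog.

```python
def inverter(in_bit: int) -> int:
    """1-bit NOT, mirroring $out = !$in."""
    return 0 if in_bit else 1


def and_gate(a: int, b: int) -> int:
    """1-bit AND of two inputs."""
    return a & b & 1


def mux2(sel: int, in0: int, in1: int) -> int:
    """2:1 multiplexer: in1 when sel is 1, otherwise in0 (assumed convention)."""
    return in1 if sel else in0


print(inverter(0), and_gate(1, 1), mux2(1, 10, 20))
```

Each function depends only on its current inputs, which is what makes the logic combinational.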
A basic example of sequential logic is the Fibonacci series with reset. In TL-Verilog its logic is written as $num[31:0] = $reset ? 1 : (>>1$num + >>2$num);
- Here >>1$num and >>2$num refer to the value of $num one cycle and two cycles earlier.
- Sequential calculator with reset and memory
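The cycle-by-cycle behaviour of the Fibonacci expression can be sketched in Python as follows. This is only a behavioural model: the function name, the reset length, and the assumption that $num reads as 0 before the first cycle are illustrative choices.

```python
MASK32 = 0xFFFFFFFF  # $num is declared as [31:0]


def fibonacci_trace(reset_cycles: int, total_cycles: int) -> list:
    """Model $num = $reset ? 1 : (>>1$num + >>2$num) one cycle at a time."""
    prev1 = 0  # >>1$num, assumed 0 before the first cycle
    prev2 = 0  # >>2$num, assumed 0 before the first cycle
    trace = []
    for cycle in range(total_cycles):
        reset = cycle < reset_cycles
        num = 1 if reset else (prev1 + prev2) & MASK32
        trace.append(num)
        prev2, prev1 = prev1, num  # shift the history by one cycle
    return trace


print(fibonacci_trace(reset_cycles=2, total_cycles=8))
```

Holding reset for two cycles seeds both history values with 1, after which each cycle produces the sum of the previous two.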
Pipelining is done to operate the circuit at a higher frequency.
- The pipeline is divided into stages, and each stage can execute its operation concurrently with the other stages.
- In TL-Verilog we declare a pipeline as
|<pipeline name>
and its stages as
@<stage>
- Timing abstraction is a powerful feature of TL-Verilog that makes pipelining and retiming much easier than in other HDLs.
- Pipelined Calculator
Validity is the notion of when values of signals are meaningful.
- It makes waveforms easier to debug and enables clock gating.
- Clock gating is a technique used in many sequential circuits to reduce power dissipation. It removes the clock signal when the circuit is not in use.
- Clock gating is not a well-defined concept in other HDLs.
- Code under |pipe can be placed in a when scope (a condition written with a leading ?, such as ?$valid); the logic in its stages is then treated as meaningful only when that condition is true.
- Calculator with memory and recall
We will implement this design using 4 stages:
- Fetch
- Decode
- Read and write
- ALU
- The PC (Program Counter) holds the address of the next instruction.
- IMEM holds the set of instructions.
- The processor fetches the instruction whose address is stored in the PC.
- Output of the implemented fetch logic
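The fetch step can be modelled in a few lines of Python. This sketch assumes a byte-addressed PC that starts at 0 and advances by 4 per instruction (32-bit instructions), and an IMEM indexed by word; the function name and the three example encodings (addi x1, x0, 5; addi x2, x0, 7; add x3, x1, x2) are illustrative and not taken from the repository.

```python
def fetch(imem: list, cycles: int) -> list:
    """Return (pc, instruction) pairs for straight-line execution."""
    pc = 0  # assumed reset value of the program counter
    fetched = []
    for _ in range(cycles):
        index = pc >> 2  # byte address to word index
        if index >= len(imem):
            break  # ran off the end of the example program
        fetched.append((pc, imem[index]))
        pc += 4  # next sequential instruction
    return fetched


imem = [0x00500093, 0x00700113, 0x002081B3]
for pc, instr in fetch(imem, cycles=3):
    print(f"pc=0x{pc:02x} instr=0x{instr:08x}")
```

Branches and jumps would replace the `pc += 4` step with a computed target; that is left out here.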
There are 6 types of Instructions:
- R-type: Register
- I-type: Immediate
- S-type: Store
- B-type: Branch (conditional jump)
- U-type: Upper immediate
- J-type: Jump (unconditional jump)
- We decode instr[6:2], the upper five bits of the opcode. The two lowest bits, instr[1:0], are ignored because they are always 11 in the base instruction set.
- We also extract the immediate value, the source register addresses, the destination register address, funct7 and funct3.
- Once we have all of the above fields, we can determine which type of instruction has been fetched.
- Final decoding operation
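A behavioural sketch of the decode step, written in Python for illustration. The opcode table covers only the RV32I base opcodes, the dictionary keys are hypothetical names, and only the I-type immediate is extracted to keep the example short; the repository's TL-Verilog may organise this differently.

```python
# instr[6:2] values for the RV32I base opcodes, mapped to instruction type.
TYPE_BY_OPCODE = {
    0b01100: "R",  # register-register ALU operations
    0b00100: "I",  # register-immediate ALU operations
    0b00000: "I",  # loads
    0b11001: "I",  # JALR
    0b01000: "S",  # stores
    0b11000: "B",  # conditional branches
    0b01101: "U",  # LUI
    0b00101: "U",  # AUIPC
    0b11011: "J",  # JAL
}


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode(instr: int) -> dict:
    """Split a 32-bit instruction into its fixed-position fields."""
    return {
        "type": TYPE_BY_OPCODE.get((instr >> 2) & 0x1F, "?"),
        "rd": (instr >> 7) & 0x1F,
        "funct3": (instr >> 12) & 0x7,
        "rs1": (instr >> 15) & 0x1F,
        "rs2": (instr >> 20) & 0x1F,
        "funct7": (instr >> 25) & 0x7F,
        "imm_i": sign_extend((instr >> 20) & 0xFFF, 12),  # I-type immediate
    }


print(decode(0x002081B3))  # add x3, x1, x2
```

Not every field is meaningful for every type (for example, an I-type instruction has no rs2), so a full decoder would also qualify each field by the instruction type.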
- After reading the values from the register file, we store $rf_rd_data1[31:0] and $rf_rd_data2[31:0] in $src1_value[31:0] and $src2_value[31:0] respectively.
- For now we will perform the arithmetic operations 'add' and 'addi' on them. We will add more operations in the final code.
- add: adds the two operands.
- addi: adds the first operand to the immediate value $imm.
- Final ALU code
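The two operations described above can be modelled as follows. This is a Python sketch under stated assumptions: results wrap to 32 bits, the flags is_add and is_addi stand in for decoded instruction signals, and returning 0 when neither flag is set is an arbitrary placeholder for a don't-care value.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def alu(is_add: bool, is_addi: bool, src1_value: int, src2_value: int, imm: int) -> int:
    """Model add (src1 + src2) and addi (src1 + imm) with 32-bit wraparound."""
    if is_addi:
        return (src1_value + imm) & MASK32
    if is_add:
        return (src1_value + src2_value) & MASK32
    return 0  # placeholder: no other operation is modelled in this sketch


print(alu(True, False, 5, 7, 0))        # add
print(hex(alu(False, True, 0, 0, -1)))  # addi with a negative immediate
```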
Inputs:
- $rf_rd_en1: enable signal for the first read operation.
- $rf_rd_en2: enable signal for the second read operation.
- $rf_rd_index1[4:0]: first address from which data is read.
- $rf_rd_index2[4:0]: second address from which data is read.
- $rf_wr_en: enable signal for the write operation.
- $rf_wr_index[4:0]: address to which data is written.
- $rf_wr_data[31:0]: data to be written to the write address.
Outputs:
- $rf_rd_data1[31:0]: data from read index 1.
- $rf_rd_data2[31:0]: data from read index 2.
- Final code for register file read and write
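The register file interface listed above (two read ports, one write port) can be sketched in Python as below. The class and method names are hypothetical; ignoring writes to index 0 reflects the RISC-V rule that register x0 always reads as zero, and returning None for a disabled read port is an illustrative choice.

```python
class RegisterFile:
    """32 registers of 32 bits, two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rd_en1, rd_index1, rd_en2, rd_index2):
        """Return (rd_data1, rd_data2); a disabled port returns None."""
        data1 = self.regs[rd_index1 & 0x1F] if rd_en1 else None
        data2 = self.regs[rd_index2 & 0x1F] if rd_en2 else None
        return data1, data2

    def write(self, wr_en, wr_index, wr_data):
        """Write wr_data when enabled; x0 stays hard-wired to zero."""
        if wr_en and (wr_index & 0x1F) != 0:
            self.regs[wr_index & 0x1F] = wr_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(True, 1, 5)
rf.write(True, 2, 7)
print(rf.read(True, 1, True, 2))
```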
In RISC-V, branches are conditional: they are taken only when certain conditions are met, whereas jumps are unconditional.
- The non-pipelined CPU is converted to a pipelined CPU using the timing abstraction feature of TL-Verilog. This allows easy retiming without risk of functional bugs.
- A waterfall logic diagram makes it easier to see the flow of logic and makes pipelining simpler.
- Pipelining is performed to increase speed and throughput, but it also introduces problems.
Because data is interdependent between cycles, issues arise; these are known as hazards. In the figure above, 2 kinds of hazards are shown:
- The first is related to the branch instruction and is known as a control flow hazard. Here the PC expects a value from the branch target two cycles before that value is actually calculated.
- The second is related to the register file write and is known as a read-after-write hazard. The first instruction does not write its value until stage 4, but the second instruction might need to read that value in stage 2, which is one cycle too early.
- Instruction cycle explanation
One simple solution is to start a new instruction only every 3 cycles, but this leads to slower performance.
- This resolves the control flow hazard.
- We will use the following code to implement this:
$start = ((>>1$reset)&&(!$reset))? 1'b1: 1'b0;
$valid = $start ? 1'b1: >>3$valid ? 1'b1 : 1'b0;
- To avoid the read-after-write hazard, use the following code:
$src1_value[31:0] = ((>>1$rd == $rs1) && (>>1$rf_wr_en))? >>1$result : $rf_rd_data1[31:0];
$src2_value[31:0] = ((>>1$rd == $rs2) && (>>1$rf_wr_en))? >>1$result : $rf_rd_data2[31:0];
- Here we take a 2-cycle penalty for branch instructions.
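To see what the $start/$valid expressions and the bypass expressions above do, here is a behavioural sketch in Python. It assumes that every signal reads as 0 before the first simulated cycle; the function names and the example reset waveform are illustrative.

```python
def valid_trace(reset_trace: list) -> list:
    """Model $start and $valid: one valid cycle in every three after reset falls."""
    valid = []
    for t, reset in enumerate(reset_trace):
        prev_reset = reset_trace[t - 1] if t >= 1 else 0  # >>1$reset
        start = 1 if (prev_reset and not reset) else 0
        valid_3_ago = valid[t - 3] if t >= 3 else 0       # >>3$valid
        valid.append(1 if (start or valid_3_ago) else 0)
    return valid


def bypass(prev_rd, prev_wr_en, prev_result, rs, rf_rd_data):
    """Model the $src*_value mux: forward the previous result on a register match."""
    return prev_result if (prev_wr_en and prev_rd == rs) else rf_rd_data


print(valid_trace([1, 1, 0, 0, 0, 0, 0, 0, 0]))
print(bypass(prev_rd=3, prev_wr_en=1, prev_result=12, rs=3, rf_rd_data=0))
```

In the first trace, $valid pulses in the cycle where reset is released and every third cycle after that. In the second call, the previous instruction wrote register 3, so the forwarded result is used instead of the stale register file value.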
- $dmem_wr_en: enable signal for the write operation.
- $dmem_rd_en: enable signal for the read operation.
- $dmem_addr[3:0]: address to read from or write to.
- $dmem_wr_data[31:0]: data to be written to the address (store).
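The data memory signals above can be modelled with a small Python class. This is a sketch under assumptions: the 4-bit address gives 16 entries, each entry is taken to be 32 bits wide, and when read and write are enabled in the same call the read returns the old contents. The class and method names are hypothetical.

```python
class DataMemory:
    """16-entry data memory addressed by a 4-bit $dmem_addr."""

    def __init__(self):
        self.words = [0] * 16

    def access(self, wr_en, rd_en, addr, wr_data=0):
        """Perform one cycle of access; return read data or None if not reading."""
        addr &= 0xF  # only 4 address bits are used
        rd_data = self.words[addr] if rd_en else None  # read old contents first
        if wr_en:
            self.words[addr] = wr_data & 0xFFFFFFFF    # store
        return rd_data


dmem = DataMemory()
dmem.access(wr_en=True, rd_en=False, addr=4, wr_data=45)  # store
print(dmem.access(wr_en=False, rd_en=True, addr=4))       # load
```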
Jumps are unconditional branches. We deal with 2 kinds of jump instructions: JAL and JALR.
Finally after adding all the functionality and instructions, the Pipelined RISC-V CPU is ready.
- Kunal Ghosh, Co-founder, VSD Corp. Pvt. Ltd.
- Steve Hoover, Founder, Redwood EDA
- Shivam Potdar, TA
- Shivani Shah, TA