Floating-Point Extension #15

stnolting · 2021-03-25T16:05:58Z

stnolting
Mar 25, 2021
Maintainer

After some struggles ~~reading~~ understanding the IEEE-754 specs and several FPU IP data sheets I have nearly finished implementing the NEORV32 floating point extension. The FPU was written from scratch without using any IP blocks at all so it can be synthesized for any platform.

I have decided to implement the RISC-V Zfinx floating-point extension rather than the standard F extension to focus on size-optimization. Zfinx uses the "normal" integer register file x instead of a dedicated floating-point register file f, which is required by the F extension. Using the x registers for floating-point operations makes context save/restore much faster since there is no need to push/pop f on the stack and also lowers hardware requirements by avoiding another full-scale 32*32-bit register file.

Software Support

The upstream RISC-V gcc port does not support the Zfinx yet. So I decided to implement an intrinsic library (see sw/example/floating_point_test) that provides access to the Zfinx floating-point operations using standard C functions (example for the FADD.S instruction):

uint32_t __attribute__ ((noinline)) riscv_intrinsic_fadds(uint32_t rs1, uint32_t rs2) {

  register uint32_t result __asm__ ("a0");
  register uint32_t tmp_a  __asm__ ("a0") = rs1;
  register uint32_t tmp_b  __asm__ ("a1") = rs2;

  // dummy instruction to prevent GCC "constprop" optimization
  asm volatile ("add x0, %[input_i], %[input_j]" : : [input_i] "r" (tmp_a), [input_j] "r" (tmp_b));

  // fadd.s a0, a0, a1
  CUSTOM_INSTR_R2_TYPE(0b0000000, a1, a0, 0b000, a0, 0b1010011);

  return result;
}

Might seem a little bit complicated, but it works fine. More details on the Zfinx intrinsic library can be found here: hackaday.io

I am verifying the Zfinx instructions against the pure-software gcc builtin floating-point library functions, which are also part of the intrinsic library:

float riscv_emulate_fadds(float rs1, float rs2) {

  float res = rs1 + rs2;
  return subnormal_flush(res);
}

Status

So far, everything looks fine: 👍

<<< Zfinx extension test >>>
SILENT_MODE enabled (only showing actual errors)
Test cases per instruction: 50000000


#0: FCVT.S.WU (unsigned integer to float)...
Errors: 0/50000000 [ok]

#1: FCVT.S.W (signed integer to float)...
Errors: 0/50000000 [ok]

#2: FCVT.WU.S (float to unsigned integer)...
Errors: 0/50000000 [ok]

#3: FCVT.W.S (float to signed integer)...
Errors: 0/50000000 [ok]

#4: FADD.S (addition)...
Errors: 0/50000000 [ok]

So far, the FPU implements the following instructions: FADD.S FSUB.S FMUL.S FSGNJ[N/X].S FCLASS.S FMIN.S FMAX.S FEQ.S FLT.S FLE.S FCVT.W.S FCVT.WU.S FCVT.S.W FCVT.S.WU

Division, square root and fused multiply-add instructions are not supported yet. Also, the FPU does not support subnormal numbers (they are "flushed to zero") a I am still trying to figure out the "round to nearest, ties to max magnitude" rounding mode...

On a Intel Cyclone IV FPGA the FPU requires 1270 LUTs, 652 FFs and 7 DSPs (1x DSP9x9, 3x DSP18x18) and does not affect the CPU's critical path (currently running at 122 MHz).

There are still some small tasks to finish before I upload the FPU VHDL code - so stay tuned.

Help Wanted 👋

Right now, I am just verifying the actual floating-point processing results - no the accrued exception flags. I have tried using the fenv.h library to add an exception environment to the software-only builtin float libraries - but it seems like this is not supported by my current toolchain...

Maybe anyone out there has a nice idea how to check the exception flags?! 🤔

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Floating-Point Extension #15

{{title}}

Replies: 0 comments

Select a reply

Floating-Point Extension #15

stnolting Mar 25, 2021 Maintainer

Software Support

Status

Help Wanted 👋

Replies: 0 comments

stnolting
Mar 25, 2021
Maintainer