RealTimeSwitch

Table of Contents

- Real-time Ethernet Switch
- Project summary
- Download
- Installation
- Project Contents
- Building and Loading the driver
- Building and installing the software utilities
- System Overview
- Network Code Language
- The Network Code Processor
- The Network Code Switch
- Memory Map
- System Usage
- Software utilities
- Examples
- Ongoing Work
- Further Information

Real-time Ethernet Switch

The Network Code Switch (NCS) is a synthesizable Ethernet switch featuring dedicated time-triggered mechanisms to guarantee bounded switching delays, enabling hard real-time communication between the nodes connected to it. The system offers two operational modes: the soft-mode, which operates the system as a generic Ethernet switch using a best-effort approach, and the hard-mode, which performs packet transactions using a time-triggered approach based on Network Code schedules. The executed schedules can change the operational mode of each port at runtime, so both types of traffic (best-effort and real-time) can coexist in the same network.

The system is based on the Reference Ethernet Switch included in the NetFPGA base package, and uses multiple instances of the Network Code Processor (NCP) to achieve the time driven communication through the execution of flexible and verifiable Network Code schedules.

Project summary

Status :

Released

Version :

2.0

Authors :

Gonzalo Carvajal, Sebastian Fischmeister and Robert Trasumuth.

Download

Download the Network Code Switch implementation tarball NCS_project.tar.gz

Installation

There are two videos summarizing the setup procedure from unpacking the board to the driver and utilities installation on a desktop machine running Ubuntu.

Installing the hardware: http://www.youtube.com/watch?v=-VuAycE-bAM
Installing the framework: http://www.youtube.com/watch?v=Ilyr4tIxPdQ&feature=related

The following guide presents a more detailed description in case you find any problem following the videos.

Copy the tarball NCS_project.tar.gz to a folder of your choice and extract it using the command:

tar xzvf NCS_project.tar.gz

Once uncompressed, you should see the following directory structure inside the targeted folder:

your_directory/
    docs/
        nc-langspec/
        netfpga/
        quickguide/
    hardware/
        bin/
        src/
    software/
        driver/
            common/
            kernel/
        utilities/
            bin/
            examples/
            src/

Project Contents

The three main directories inside the project folder are:

Doc folder: contains some documentation. It includes a quickguide for the installation and basic usage of the system, some documentation about the design of the general NetFPGA architecture, and the Network Code Language Specification document with formalities and definitions for the programming language to implement communication schedules.
Hardware folder: contains all the files related to the hardware design. You will find the following subfolders:

- bin: contains the bit file to program the Virtex2 chip on the NetFPGA board.
- src: contains the project files and HDL sources for the system. You will need the Xilinx ISE Foundation 10.1.3 tools to open and modify the project files, and regenerate the .bit file.

Software folder: high level tools necessary to set, configure and debug the system functionality. You will find the following subfolders:

- driver: a slightly modified version of the original NetFPGA driver provided in the base package which runs in Ubuntu 10.04.
- utilities: software tools for accessing the board internal registers from the workstation to program, set, and debug the system functionality.

Building and Loading the driver

To build the driver you will need standard development tools such as the GNU compiler (gcc) and GNU Make (make). These tools are included in most Linux distributions. The following procedure was tested using Linux kernel version 2.6.32, the latest version shipped with Ubuntu 10.04 at the time of writing. Compatibility with previous kernel versions is implemented but not yet tested. Please let us know if you face problems when compiling the driver using an older kernel version.

To build the driver, just go into the kernel directory and type make:

cd your_directory/software/driver/kernel/
make

Once the driver is compiled, you have to load it into memory. You need root permission for doing this. From the console, type su and enter the root password. Then, from inside the kernel directory execute the following line:

After loading, you can check the kernel output by executing:

dmesg | grep nf2

If it shows a message similar to this:

nf2: cannot remap mem region 8000000 @ f0000000

then you will need to add the following line to the kernel options at boot-loader stage: vmalloc=256M

The procedure for doing this would depend on your bootloader. If you are using the Grub bootloader in Ubuntu, you will need to edit the /boot/grub/menu.lst file as follows:

title Ubuntu 9.04, kernel 2.6.28-16-generic
uuid (BIG number here)
kernel /vmlinuz-2.6.28-16-generic root=UUID=(BIG number here) ro quiet splash vmalloc=256M

You can check the module insertion by typing

dmesg | grep nf2

If the driver was correctly loaded, you should see a message similar to this:

[414.456516]nf2 0000:02:01.0:PCI INT A->GSI 22(level,low)->IRQ 22
[414.456540]nf2: Found an NF2 device (cfg revision 0)...

You can also verify that the network interfaces have been successfully loaded. By typing

ifconfig -a | grep nf2

you should see something similar to:

nf2c0 Link encap:Ethernet HWaddr 00:4E:46:32:43:00
nf2c1 Link encap:Ethernet HWaddr 00:4E:46:32:43:01
nf2c2 Link encap:Ethernet HWaddr 00:4E:46:32:43:02
nf2c3 Link encap:Ethernet HWaddr 00:4E:46:32:43:03

To install the module permanently in the system, you will need to log in as root and copy the nf2.ko file to the /lib/modules/(KERNEL_VERSION) folder. After that, execute the command

depmod

Remember that you can get the kernel version number by executing uname -r.

Finally, you need to add a line with the text nf2 to the /etc/modules file.

Building and installing the software utilities

The software utilities package contains functions which allow the user to program, configure and control the system from the workstation by reading and writing specific registers on the Virtex2 chip. The delivered functions are based on the more basic commands included in the base package. The reprogramming utilities are unmodified versions of the ones delivered with the base package, and they are provided here in case you do not have or do not want to install the base package. If you have already installed the NetFPGA base package, you do not need to install the reprogramming utilities again.

There is a script for building and installing the utilities, making them accessible from the Linux system. Just execute make from inside the utilities subfolder, and the functions will be available from the OS to be used like any other command-line tool. If you have a previous installation of the base package, this will overwrite the tools with the same name.

System Overview

The presented system expands the Reference Ethernet switch included in the NetFPGA base package by adding a time-based arbiter mechanism to each port. The arbiter entities use the definitions of the Network Code Framework to coordinate communication between the ports through the execution of Time-Division Multiple Access (TDMA) schedules.

The Network Code framework consists of: a domain-specific language to represent state-based TDMA schedules, a compiler with a verification engine which translates the programs into checked executable schedules, and the interpreter entity which executes the schedule at runtime.

Network Code Language

Network Code is a domain-specific programming language for writing predictable, verifiable communication for distributed real-time applications. It aims to provide a powerful, expressive programming environment for communication among distributed software components. One salient aspect of Network Code programs is that they can be translated to formal specifications, which can be model-checked to verify aspects of reliability such as absence of collisions, overhead, schedulability, and integrity (e.g., sender/receiver pairing, content typing, over/underflows).

The Network Code language consists of just a few core instructions which control timing, data flow, and control flow. The create() instruction creates a message from data stored in a specific data buffer. The send() instruction encapsulates a message into a network packet and signals the physical layer to start transmitting. The receive() instruction stores the data of an incoming packet into a specific data buffer. The branch() instruction implements conditional jumps based on buffer values, counters, or other status information. The instruction mode() controls the mode of operation of the runtime system (soft or hard). The instructions future() and halt() implement temporal control using timers that resume execution at particular program labels at specified points in time.

For a more detailed description about the formalities, examples, tools and related publications, visit the project's page at the ESG website.

The Network Code Processor

The Network Code Processor (NCP) is the entity which executes the verified schedules at runtime. We prepared a customized version of the NCP specially targeted to enable easy integration to the datapath of the Reference Switch. The following figure shows a block diagram of the implemented architecture which is attached to each port of the switch:

In the memory space, the PROG_ROM stores the programmed schedule, and the CFG_ROM defines the specific communication buffers through an initial address and a length, which are mapped to locations in the Shared Data Memory. The shared data memory block stores the actual data and serves as the communication interface between the multiple ports. The original definition of the framework considers an atomic unit of 32-bit words for all the memory blocks; therefore, the program instructions are coded in 4 bytes and the length of the variables in the shared memory is a multiple of 4 bytes. In the current implementation we needed to redefine the atomic unit for the data buffers to 64 bits to make them compatible with the generic datapath of the Reference Switch. This change does not require any modification in the processor; however, a variable defined with length 2 in the CFG_ROM will read 128 bits from the data memory instead of 64.

The NCP features a super-scalar architecture which implements each Network Code instruction as an independent hardware unit, enabling true concurrent execution of multiple non-blocking instructions. The controller reads the instructions from the Program ROM, and processes them using a MIPS-like fetch-decode-execute approach to trigger the execution of the individual blocks. It also calculates dependencies between consecutive instructions, triggering concurrent execution whenever it is possible.

The autoreceiver block is an additional asynchronous block which parses the incoming traffic to verify the validity of the packets and stores the useful information in the receive buffers, which can later be read by a receive instruction. Network Code messages are restricted to the following packet structure:

Destination MAC:

must be broadcast (FFFFFFFFFFFF).

Source MAC:

MAC address of the node sending a message.

EtherType:

Network Code frames are tagged with type 5CE0.

Datalen:

the length of the transmitted data in 64-bit words.

Channel:

logical communication channel. There is one receiver buffer for each logical channel, and the autoreceiver uses this field to store the data in the corresponding buffer.

Var ID:

the ID of the buffer from which the transmitted data was taken. The ID is just the address in the local CFG_ROM block that generates the frame. Used for debugging purposes.

Telegram Counter:

keeps a count of the number of transmitted messages.

Unused:

Reserved for future use.

Data:

the actual data. The atomic unit for the buffers in the shared memory is a 64-bit word. Therefore, the length of the transmitted data must be a multiple of 8 bytes.

The send instruction of the NCP automatically encapsulates the data into a valid Network Code frame. If the transmitter node is not based on the NCP, the system must include some mechanism to generate the correct encapsulation. The autoreceiver block will reject any incoming message which does not comply with the previous definition.
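As an illustration of the encapsulation a non-NCP transmitter would have to reproduce, the following Python sketch checks only the standard 14-byte Ethernet header of a frame against the two fixed values listed above (broadcast destination and EtherType 5CE0). The function name and the sample frames are hypothetical, and the widths of the remaining fields (Datalen, Channel, Var ID, Telegram Counter) are not given in this page, so they are deliberately not parsed here.

```python
import struct

NC_ETHERTYPE = 0x5CE0          # EtherType quoted in the field list above
BROADCAST_MAC = b"\xff" * 6    # required destination address


def has_nc_ethernet_header(frame: bytes) -> bool:
    """Return True if the Ethernet header carries the Network Code markers.

    Only the destination MAC and the EtherType are checked; the layout of
    the fields that follow is not assumed.
    """
    if len(frame) < 14:
        return False
    dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst == BROADCAST_MAC and ethertype == NC_ETHERTYPE


# Hypothetical frames: the bytes after the header are placeholders.
good = BROADCAST_MAC + bytes.fromhex("004e46324300") + b"\x5c\xe0" + bytes(8)
bad = bytes.fromhex("004e46324301") + good[6:]   # unicast destination

print(has_nc_ethernet_header(good), has_nc_ethernet_header(bad))
```

A complete validator would also have to check the data length against the Datalen field, which requires the exact frame layout from the language specification.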

The Network Code Switch

The following figure illustrates how we used multiple instances of the Network Code Core to implement time-triggered communication in the ports of the Ethernet switch.

We followed the highly modular approach of the NetFPGA reference designs to create entities which can be attached to the Ethernet switch with minimal modifications in the generic datapath. The real-time modules just tap the interface between the MAC and the user-datapath on each port, without requiring any modification in the original modules. The flow of the incoming and outgoing traffic on each port is determined by the operational mode of the corresponding NCC entity. In soft-mode, the traffic is directed through the unmodified generic datapath, so the system operates as the generic Ethernet switch. In the hard mode, all the traffic is directed to and processed by the corresponding Network Code entity, blocking any communication with the generic datapath and providing a dedicated channel for time-driven communication.

To illustrate how the time-triggered switching process works, let us consider a typical setup consisting of a set of distributed sensors transmitting periodic readings of certain variables, and a processing node using these readings to perform a time-critical task. In the time-triggered mode, everything must be specified offline: the schedule, the buffers, and the timing. This is a necessary restriction to assure the integrity of the communication in critical tasks. The sensors must send their corresponding readings encoded as valid Network Code messages. The autoreceiver block on each port will extract the useful data from the received message and store the reading value in a receive buffer. The Network Code program specifies the time when these values are read from the buffer and stored into a specific location of the shared memory. The schedules must be checked offline to assure that only one port will access the shared memory at a specific time. In the port connected to the processing node, the programmed schedule specifies the time when the received values are read from the shared memory, turned into messages, and transmitted to the node. The customized system allows us to predict the switching delays with an accuracy of a single cycle.
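The requirement that only one port accesses the shared memory at a given time can be pictured with a small offline check. The sketch below is not the verification engine of the Network Code framework; it assumes a hypothetical representation in which each shared-memory access is a tuple (port, start_cycle, end_cycle) with an exclusive end, and it simply reports overlapping accesses from different ports.

```python
from itertools import combinations


def find_conflicts(slots):
    """Return pairs of slots from different ports whose intervals overlap.

    slots: list of (port, start_cycle, end_cycle), end_cycle exclusive.
    This tuple layout is an assumption made for the example.
    """
    conflicts = []
    for a, b in combinations(slots, 2):
        different_ports = a[0] != b[0]
        overlap = a[1] < b[2] and b[1] < a[2]
        if different_ports and overlap:
            conflicts.append((a, b))
    return conflicts


# Three sensor ports writing in consecutive, non-overlapping windows.
schedule = [(0, 0, 10), (1, 10, 20), (2, 20, 30)]
# Adding a window for port 3 that overlaps the one of port 2.
overlapping = schedule + [(3, 25, 35)]

print(find_conflicts(schedule))
print(find_conflicts(overlapping))
```

The pairwise comparison is quadratic in the number of slots, which is acceptable for the short schedules considered here.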

The mode instruction allows the switch to operate with standard Ethernet traffic. For example, if a sensor transmits a critical reading with a certain period, we could calculate the switching time for this specific reading and switch to the soft-mode for the remainder of the period, allowing a single port to carry standard traffic while still preserving the timeliness requirements for the critical variables. Also, because the time-triggered mechanisms are decoupled between ports and are independent from the generic datapath, we could have some ports dedicated to time-critical data and the rest operating with generic traffic.

Memory Map

There is an individual NC core connected to each one of the ports. Each core has its own memory space to store the programmed schedule and the map to the shared data memory, which is shared among all the instances. In the workstation, the assigned memory space for the NetFPGA board goes from address 0x400000 to 0x800000. We distributed this space as shown in the following table:

Base Address	Length (in 32-bit words)	Alias
0X00400000	1	CONTROL_REG
0X00500000	512 (*)	DATA_MEMORY
	MC_NCP_0
0X00600000	256	PROG_ROM_0
0X00640000	256	CFG_ROM_0
0X00660000	256	CFG_ROM_QUEUE_0
	MC_NCP_1
0X00680000	256	PROG_ROM_1
0X006C0000	256	CFG_ROM_1
0X006E0000	256	CFG_ROM_QUEUE_1
	MC_NCP_2
0X00700000	256	PROG_ROM_2
0X00740000	256	CFG_ROM_2
0X00760000	256	CFG_ROM_QUEUE_2
	MC_NCP_3
0X00780000	256	PROG_ROM_3
0X007C0000	256	CFG_ROM_3
0X007E0000	256	CFG_ROM_QUEUE_3

(*): length given in 64-bit words.
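The regular spacing of the table can be captured in a small lookup structure. The following Python sketch rebuilds the alias-to-address mapping from the base addresses listed above; the dictionary name and the offsets expressed as constants are choices made for this example and are not taken from the utilities' source code.

```python
# Base addresses copied from the table above.
MEMORY_MAP = {
    "CONTROL_REG": 0x00400000,
    "DATA_MEMORY": 0x00500000,
}

NCP_BASE = 0x00600000      # PROG_ROM_0
NCP_STRIDE = 0x00080000    # distance between consecutive instances
CFG_ROM_OFFSET = 0x00040000
CFG_ROM_QUEUE_OFFSET = 0x00060000

for instance in range(4):
    base = NCP_BASE + instance * NCP_STRIDE
    MEMORY_MAP[f"PROG_ROM_{instance}"] = base
    MEMORY_MAP[f"CFG_ROM_{instance}"] = base + CFG_ROM_OFFSET
    MEMORY_MAP[f"CFG_ROM_QUEUE_{instance}"] = base + CFG_ROM_QUEUE_OFFSET

for alias, address in MEMORY_MAP.items():
    print(f"0x{address:08X}  {alias}")
```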

The PROG_ROM is divided into 32-bit words, each one corresponding to a coded Network Code instruction. The CFG_ROM is also divided into 32-bit words, each one defining a data buffer mapped to the shared memory. The upper 16 bits indicate the initial position, and the lower 16 bits indicate the length of the stored data (in 64-bit words). The ID of each variable is simply defined by its position in the CFG_ROM. For example, if you want to create a message from buffer 4, the system will map the contents of the fourth word in the CFG_ROM to the shared data memory to extract the data.
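A minimal sketch of how a CFG_ROM word splits into its two fields, assuming only the bit layout described in this section. The function name and the sample word are hypothetical; the unit of the initial position is not stated here, so the value is returned as stored.

```python
def decode_cfg_word(word: int):
    """Split a 32-bit CFG_ROM word into (start, length_words, length_bits).

    start        : upper 16 bits, initial position in the shared data memory
    length_words : lower 16 bits, buffer length in 64-bit words
    length_bits  : the same length expressed in bits
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("CFG_ROM entries are 32-bit words")
    start = (word >> 16) & 0xFFFF
    length_words = word & 0xFFFF
    return start, length_words, length_words * 64


# Hypothetical entry: buffer starting at position 0x0010 with length 2.
print(decode_cfg_word(0x00100002))
```

With a length of 2 the buffer covers 128 bits, which matches the remark about the 64-bit atomic unit in the Network Code Processor section.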

The CFG_ROM_QUEUE block allows us to manage each buffer as a queue (this characteristic is still under test).

The CONTROL_REG is an additional block consisting of a 32-bit register which sets the functionality for the NC instances. Each instance has an associated hexadecimal digit (4 bits) according to the following distribution:

xxxxxxxxxxxxxxxxddddccccbbbbaaaa      a: control bits for MC_NCP_0
                                      b: control bits for MC_NCP_1 
                                      c: control bits for MC_NCP_2  
                                      d: control bits for MC_NCP_3 
                                      x: not used

The configuration bits for the instance MC_NCP_0 have the following meaning:

a3a2a1a0 	a0 : sets the running status of the core (start:1/stop:0) 
                a1 : set the value(1/0) of the user_bit1 (see NC language specification) 
                a2 : set the value(1/0) of the user_bit2 (see NC language specification)
                a3 : not used

The previous description is similar for the rest of the core instances.
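The bit layout above can be turned into a value for CONTROL_REG with a few shifts. The Python sketch below assumes exactly that layout (one 4-bit group per instance, bit 0 for start/stop, bits 1 and 2 for user_bit1 and user_bit2); the function name and argument format are hypothetical.

```python
def control_reg_value(cores):
    """Build the 32-bit CONTROL_REG value.

    cores: dict mapping instance number (0..3) to a tuple
           (run, user_bit1, user_bit2) of booleans.
    Instances that are not listed keep all their control bits at 0.
    """
    value = 0
    for instance, (run, user_bit1, user_bit2) in cores.items():
        if not 0 <= instance <= 3:
            raise ValueError("only instances 0 to 3 exist")
        nibble = int(run) | (int(user_bit1) << 1) | (int(user_bit2) << 2)
        value |= nibble << (4 * instance)
    return value


# Start the cores attached to ports 0 and 1, leaving the user bits cleared.
print(f"0x{control_reg_value({0: (True, False, False), 1: (True, False, False)}):08X}")
```

The resulting word could then be written to address 0x00400000 with regwrite; see the usage description of that command for the exact argument format.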

System Usage

With the hardware board and software utilities installed, you can use the workstation console to program the board and perform read and write operations on the internal registers to set, modify and debug the system functionality. We prepared a short video illustrating a simple procedure for programming the board and running a very basic example which transmits real-time data from one port and receives it on another.

Running basic examples: http://www.youtube.com/watch?v=E8q6JvprCBg

Software utilities

nf_download: Loads a specified hardware design (.bit file) into the Virtex2 chip included in the NetFPGA. This is an unmodified version of the one offered in the NetFPGA base package. For example, by executing:

nf_download ncm_netfpga_beta.bit

you will load the ncm_netfpga_beta.bit design contained in the local folder. Note that the file name can also be a complete path. You will need root permission to run the application (execute su beforehand). A correct execution will show something like this:

Found net device: nf2c0
Bit file built from: nf2_top.ncd;HW_TIMEOUT=FALSE
Part: 2vp50ff1152
Date: 2009/06/02
Time: 17:21:55
Error Registers: 1000000
Good,after resetting programming interface the FIFO is empty
Download completed - 2377668 bytes. (expected 2377668).
DONE went high - chip has been successfully programmed.
WARNING: NF2 device info not found. Cannot verify that the CPCI version matches.

Once the FPGA has been programmed, you will be able to perform read and write operations on the internal registers of the system.

regread: Reads a specified register by indicating its address. Execute the command without arguments to see the usage description.
regwrite: Writes a specified register by indicating its address. Execute the command without arguments to see the usage description.
readall: It is based on the regread command. It reads a set of contiguous registers according to the supplied option parameters. It allows the user to read a set of addresses by indicating either an absolute memory location or an alias. For example, the instruction

readall -a PROG_ROM_0 PROG_ROM_1

will show the contents of the Program ROM sections of instances 0 and 1. This is equivalent to

readall 0x600000 0x680000

Execute the command without arguments to see the usage description.

memfill: It is based on the regwrite command. It fills a specified portion of memory with the contents of a text file. Each line of the text file must be a 32-bit hex word ended by a semicolon (;). The exception to this definition is the Shared Data Memory, where each line of the file must be a 64-bit hex word. You can insert a comment after each ";". For example, a valid text file (let's say file.txt) to write in the PROG_ROM section will have the following format (please check the language specification in the doc folder to learn about the bytecode generation of the schedules):

10000200; create message
20000000; send using channel 0
5040FFFF; delay statements
40000000;
5060FFFF;
40000000;
5000FFFF; jump to 0 and start again
40000000;

To load the previous schedule in the Program ROM section of instance 0, we execute:

memfill 0x600000 file.txt

or equivalently

memfill -a PROG_ROM_0 file.txt

Execute the command without arguments to see the usage description.
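Input files for memfill can also be produced by a script instead of being typed by hand. The following Python sketch formats a list of words in the line format described above (hex word, semicolon, optional comment). The function name is hypothetical, and zero-padding each word to its full width (8 hex digits for 32-bit words, 16 for the Shared Data Memory) is an assumption based on the sample file, not a documented requirement of the tool.

```python
def format_memfill_lines(entries, bits=32):
    """Format (word, comment) pairs as lines for a memfill input file.

    bits: 32 for PROG_ROM/CFG_ROM blocks, 64 for the Shared Data Memory.
    """
    digits = bits // 4
    lines = []
    for word, comment in entries:
        if not 0 <= word < (1 << bits):
            raise ValueError(f"word {word:#x} does not fit in {bits} bits")
        line = f"{word:0{digits}X};"
        if comment:
            line += f" {comment}"
        lines.append(line)
    return lines


# The sample schedule shown above, expressed as (word, comment) pairs.
schedule = [
    (0x10000200, "create message"),
    (0x20000000, "send using channel 0"),
    (0x5040FFFF, "delay statements"),
    (0x40000000, ""),
    (0x5060FFFF, ""),
    (0x40000000, ""),
    (0x5000FFFF, "jump to 0 and start again"),
    (0x40000000, ""),
]

file_text = "\n".join(format_memfill_lines(schedule)) + "\n"
print(file_text, end="")
```

Writing file_text to file.txt gives a file that can be loaded with the memfill commands shown above.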

Considerations:

The workstation increments the memory addresses by bytes. Because of this, the user must take special care to avoid writing to non-aligned positions (the start address for write operations must be a multiple of 8 for the shared data memory and a multiple of 4 for the rest of the blocks). To avoid problems with this, we recommend using the block alias (option -a) instead of absolute addresses.
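When absolute addresses cannot be avoided, the alignment rule can be checked before issuing a write. This is a minimal sketch with a hypothetical helper name; it encodes only the two multiples stated above.

```python
def is_aligned(address: int, shared_data_memory: bool = False) -> bool:
    """Check the start address of a write against the alignment rule.

    Shared data memory uses 64-bit words (8 bytes); every other block
    uses 32-bit words (4 bytes).
    """
    word_bytes = 8 if shared_data_memory else 4
    return address % word_bytes == 0


print(is_aligned(0x600004))                            # PROG_ROM_0, second word
print(is_aligned(0x500004, shared_data_memory=True))   # middle of a 64-bit word
```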

Examples

The examples folder contains a set of examples to verify the system functionality. For a simple test, use a network cable to connect ports 0 and 1 on the NetFPGA. You should see steady green lights on both ports. Then, from inside the examples folder, execute the following script

./test_rx_tx_ports_0_1 N

where N is a hexadecimal digit between 0 and F. While executing, you should see a set of messages on the screen reporting the result of the configuration tasks, and the orange lights of both ports will start to blink. Now, if you perform the following read operation

readall 0x500000 | more

you should see that the block of N words starting from the base address 0x0 on the data memory is replicated starting from address 0xF.

Note: A more detailed description, automatic scripts for testing functionality and regression tests are under development.

Ongoing Work

We are currently working on improving the system usability for the end user. One important goal is to develop high-level tools to automatically generate the schedules and configuration data for the real-time nodes from high-level descriptions and QoS requirements.

Further Information

The doc folder includes a quickguide reference document with some more detailed information about the system usage. For further information about the complete Network Code framework, including publications, guides and references, or if you have any question, please visit the ESG website of the University of Waterloo. You can also contact us through e-mail.

Please feel free to check, modify and/or reuse the provided source files for both hardware and software designs.

Bug reports, comments and general feedback are always more than welcome.
