Using the RFNoC Replay Block in UHD 4

From Ettus Knowledge Base
Revision as of 09:19, 7 February 2022 by MartinBraun (Talk | contribs)

Jump to: navigation, search

Application Note Number

AN-642b

Abstract

This application note guides a user through basic use of the RFNoC Replay block in UHD 4.x and explains how to run the UHD Replay example. This example covers the USRP X410, X310/X300, N300/N310/N320 and E320 devices. For UHD 3.x, please refer to the UHD 3.x replay block Application Note.

An introduction to RFNoC with UHD 4.0 and above can be found here.

Overview

The Replay block is an RFNoC block that allows recording and playback of arbitrary data using DRAM on the USRP hardware as a buffer. To use the Replay block, it must be instantiated in the design and connected to the DRAM interface.

The replay block is a standard feature of UHD and RFNoC, and a custom compile of UHD is not required. By default, UHD ships the Replay block with the default images for the X410, X310/X300, N300/N310/N320 series of USRPs, so when using these images, the following examples can be run without any manual builds of UHD or FPGA images.

Prerequisites

To follow this application note, you need a device with a replay block instantiated, and a UHD version recent enough to have the full support for the replay block (UHD 4.2 and beyond). To test for the replay block capabilities, connect and enable your USRP device, and run the following command:

   uhd_usrp_probe --args <device args>

Depending on your device, the output could look like this (truncated):

    _____________________________________________________
   /
   |       Device: X400-Series Device
   |     _____________________________________________________
   |    /
   |   |       Mboard: ni-x4xx-<serial>
   |   |   pid: 1040
   |   |   rev: 4
   |   |   rev_compat: 4
   |   |   serial: <serial>
   |   |   MPM Version: 4.0
   |   |   FPGA Version: 7.6
   |   |   FPGA git hash: <hash>.clean
   |   |   RFNoC capable: Yes
   |   |
   |   |   Time sources:  internal, external, qsfp0, gpsdo
   |   |   Clock sources: mboard, internal, external, nsync, gpsdo
   |   |   Sensors: ...
   |     _____________________________________________________
   |    /
   |   |       RFNoC blocks on this device:
   |   |
   |   |   * 0/DDC#0
   |   |   * 0/DDC#1
   |   |   * 0/DUC#0
   |   |   * 0/DUC#1
   |   |   * 0/Radio#0
   |   |   * 0/Radio#1
   |   |   * 0/Replay#0
   |     _____________________________________________________
   |    /
   |   |       Static connections on this device:
   |   |
   |   |   * 0/SEP#0:0==>0/DUC#0:0
   |   |   * 0/DUC#0:0==>0/Radio#0:0
   |   |   * 0/Radio#0:0==>0/DDC#0:0
   |   |   * 0/DDC#0:0==>0/SEP#0:0
   |   |   * 0/SEP#1:0==>0/DUC#0:1
   |   |   * 0/DUC#0:1==>0/Radio#0:1
   |   |   * 0/Radio#0:1==>0/DDC#0:1
   |   |   * 0/DDC#0:1==>0/SEP#1:0
   |   |   * 0/SEP#2:0==>0/DUC#1:0
   |   |   * 0/DUC#1:0==>0/Radio#1:0
   |   |   * 0/Radio#1:0==>0/DDC#1:0
   |   |   * 0/DDC#1:0==>0/SEP#2:0
   |   |   * 0/SEP#3:0==>0/DUC#1:1
   |   |   * 0/DUC#1:1==>0/Radio#1:1
   |   |   * 0/Radio#1:1==>0/DDC#1:1
   |   |   * 0/DDC#1:1==>0/SEP#3:0
   |   |   * 0/SEP#4:0==>0/Replay#0:0
   |   |   * 0/Replay#0:0==>0/SEP#4:0
   |   |   * 0/SEP#5:0==>0/Replay#0:1
   |   |   * 0/Replay#0:1==>0/SEP#5:0
   |   |   * 0/SEP#6:0==>0/Replay#0:2
   |   |   * 0/Replay#0:2==>0/SEP#6:0
   |   |   * 0/SEP#7:0==>0/Replay#0:3
   |   |   * 0/Replay#0:3==>0/SEP#7:0

The output tells us that this USRP has a Replay block instantiated (0/Replay#0). It has four static connections to stream endpoints, which also tells us that this is a four-port replay block.

If your device does not report a replay block, then you need to build and load an FPGA image which includes this block. See ... for instructions on how to do this before you proceed. If it does report a block, you can move to the next section.

Running the Example

The rfnoc_replay_samples_from_file example assumes that you have a file containing the samples you wish to replay. This could be generated in advance or recorded using rx_samples_to_file or another method. For this demonstration, we'll create a simple Python program (sample_gen.py) to generate some samples to use:

   import math
   import struct
   
   SAMPLE_RATE = 200.0e6        # Sample rate in Hz
   FREQUENCY   = 500.0e3        # Frequency of sinusoid to generate, in Hz
   NUM_SAMPLES = 16000          # Number of samples to generate
   AMPLITUDE   = 0.5            # Amplitude of the signal (from 0 to 1.0)
   FILE_NAME   = 'samples.dat'
   
   file = open(FILE_NAME, 'wb')
   
   for i in range(NUM_SAMPLES):
       I = int((2**15-1) * AMPLITUDE * math.cos(i / (SAMPLE_RATE / FREQUENCY) * 2 * math.pi))
       Q = int((2**15-1) * AMPLITUDE * math.sin(i / (SAMPLE_RATE / FREQUENCY) * 2 * math.pi))
       file.write(struct.pack('<2h', I, Q))
   
   file.close()

This program generates a file named samples.dat that contains 16000 samples (40 periods) of a 500 kHz tone sampled at a rate of 200 MHz. Each sample is saved in sc16 format (signed complex with 16-bit real and 16-bit imaginary components). We can run the program by invoking python from the command line.

   $ python ./sample_gen.py

To run the UHD Replay example, enter a command like the following (the path to the examples depends on your installation method, for a normal installation via apt-get, examples will be located in /usr/lib/uhd/examples or /usr/local/lib/uhd/examples).

   $ cd /path/to/examples
   $ ./replay_samples_from_file --args <device args> --freq 915e6 --gain 10 --file samples.dat --rate 200e6

This example would stream the samples from the file to the Replay block on the FPGA, where they are recorded into the USRP's on-board DRAM. Then, the Replay block will play the samples to the radio continuously with a base frequency of 915 MHz, creating a tone at 915.5 MHz. Press Ctrl+C to stop transmitting. Alternatively, use the --nsamps command line argument to transmit a certain number of samples before returning to the command line. Use the --help argument to see a full list of arguments.

The advantage of this example compared to directly streaming the file to the device is twofold:

  • The initial upload to DRAM can happen at any link rate. Even when using 1 GbE, this example will work.
  • Once uploaded, the host computer is basically idle. The FPGA will handle the streaming to the radio front-end.

The source code for rfnoc_replay_samples_from_file may be considered an example for best practices on how to use the replay block.


Using the Replay Block

This block works like a record and playback buffer that uses DRAM on the USRP to store samples. Data can be streamed to the block, like to any other RFNoC block.

Refer the manual of the replay block controller for a comprehensive description of its features and API calls.

In the following, we shall use the C++ API to demonstrate the most important API calls. We will assume there is a replay block controller called replay_ctrl available in the current context, and it is of type uhd::rfnoc::replay_block_control::sptr.

Recording Data

Before streaming data to the replay block, it needs to be configured for recording:

   replay_ctrl->record(buffer_start_byte_address, buffer_size_in_bytes, replay_chan);

This tells the Replay block that it should start recording any data it receives on port port into the DRAM at byte offset buffer_start_byte_address and should use up to buffer_size_in_bytes bytes. Once the buffer is filled, recording automatically stops. Care should be taken to configure the memory buffers so that they do not overlap if more than one Replay block or buffer is being used simultaneously.

Call this once for every port that you expect to stream data into.

Note: Care should be taken to not transfer more data to the Replay block than the size of the record buffer. Additional data is not accepted or dropped by the replay block, but flow control will cause data to back up in the RF network on the FPGA.

Note: The amount of memory available to the Replay block is limited by the size of the DRAM on the USRP and how the memory interface is configured on the USRP. For example, the axi_intercon_2x64_128_bd IP used by the X310 is configured to give each connected device an address space of 32 MiB. As a result, the Replay block will be limited to this amount of memory. The axi_intercon_2x64_128_bd file must be modified if more than 32 MiB needs to be buffered.

To restart recording from the same offset, the following API call can be used:

   replay_ctrl->record_restart(replay_chan);

This resets the record pointer to point back to the beginning of the buffer and it resets the internal counters that track how much data has been recorded. If a previous recording has taken place then it is a good idea to ensure that stale data was not queued up in the RF network on the FPGA from a previous run. The replay_samples_from_file example does this by calling record_restart() then waiting to see if any new data shows up unexpectedly in the record buffer. If so, it restarts recording then waits again to see if data continues to appear.

You can determine when all data has been received by checking the status of the record fullness.

   // Wait for recording to complete
   while (replay_ctrl->get_record_fullness(replay_chan) < num_bytes_expected)
       std::this_thread::sleep_for(100ms);

Playing back Data

Prior to playing back recorded data, it is necessary to configure the base address and size of the playback buffer. To play back previously recorded data, set the start address to the same address that was used for the record buffer and set the size of the playback buffer to the match the amount of data that was recorded. Note that the record and playback buffers do not need to be the same, allowing a single Replay block to both record and playback to different regions of memory simultaneously.

   // Configure the Replay block to play back everything that was recorded
   num_bytes_recorded = replay_ctrl->get_record_fullness(replay_chan);
   replay_ctrl->config_play(buffer_start_byte_address, num_bytes_recorded, replay_chan);

To play back the data in the playback buffer, issue the appropriate UHD stream command. Playback automatically wraps around to the start of the buffer if more data is requested than the size of the playback buffer.

   uhd::stream_cmd_t stream_cmd(uhd::stream_cmd_t::STREAM_MODE_START_CONTINUOUS);
   stream_cmd.num_samps  = words_to_replay;
   stream_cmd.stream_now = true;
   replay_ctrl->issue_stream_cmd(stream_cmd, replay_chan);

or

   uhd::stream_cmd_t stream_cmd(uhd::stream_cmd_t::STREAM_MODE_NUM_SAMPS_AND_DONE);
   stream_cmd.num_samps  = words_to_replay;
   stream_cmd.stream_now = true;
   replay_ctrl->issue_stream_cmd(stream_cmd, replay_chan);

STREAM_MODE_START_CONTINUOUS causes playback to continue indefinitely until explicitly stopped. STREAM_MODE_NUM_SAMPS_AND_DONE causes only the specified number of samples to be played once. The num_samps parameter is a 28-bit value, limiting this mode of playback to 228 words at a time. Playback can be stopped by issuing a stop command.

   stream_cmd.stream_mode = uhd::stream_cmd_t::STREAM_MODE_STOP_CONTINUOUS;
   replay_ctrl->issue_stream_cmd(stream_cmd);

This will stop playback at the end of the next DRAM read after the command is received (DRAM reads are not aborted mid-transaction). As a result, some data will continue to stream from the Replay block after the stop command is issued while waiting for the DRAM read to complete and for all the internal buffers to empty.

When using the C++ API, the play() API call is a useful shorthand for configuring playback and submitting the stream command at the same time:

   replay_ctrl->play(buffer_start_byte_address, num_bytes_to_play, replay_chan, start_time, repeat);

In either case, the start time is optional. When given, the first sample to leave the block on playback is tagged with this timestamp.


Memory alignment and word sizes

There are two memory alignment values that need to be considered when dealing with the replay block. The first is the word size, which is the minimum number of bytes per transaction. For all default configurations, the word size is 64 bits, which means that only even numbers of samples can be recorded or played back when using 16-bit complex samples (at 4 bytes per sample). Use the get_word_size() API call to identify the correct word size.

The second alignment value is the memory's page boundaries. The start addresses for record and replay must fall onto a 4 kiB memory boundary to ensure correct alignment of data.


Building Custom FPGA Images with a Replay Block

Configure the Default Shell

Before you begin, make sure you are using the Bash shell. See Reconfigure Default Shell in AN-315 for detailed instructions.

Cloning the Repository

Note: Cloning the repository is only required when building custom FPGA images. If the replay block is already built into the USRP's bit file, this is not required. By default, UHD ships the Replay block with the default images for the X410, X310/X300, N300/N310/N320 series of USRPs.

If you do require access to the source code, e.g. to build an FPGA image with a custom replay block configuration, run the following command:

   $ git clone https://github.com/EttusResearch/uhd.git

For the rest of the Application Note, we assume the repository was cloned into the location ~/src/uhd.

Installing the FPGA Tools

In order to build the FPGA image for the intended USRP product, you will need to have the Xilinx development tools installed. The specific version required depends on the UHD version. Refer to the manual for the correct version for your UHD version, and the installation instructions for Vivado in order to install these tools. It is recommended that you use the default install location of /opt/Xilinx/Vivado.

Building the FPGA

In order to use the Replay block, it must be built into the FPGA image for the USRP you plan to use. This is currently a manual step. The instructions below are for the X310, but similar instructions apply to other RFNoC-capable devices.

To create a custom FPGA image with a replay, you need to create an image core file. The following lines are the relevant lines from the X310 default image core file:


   stream_endpoints:
     # ... all the other stream endpoints...
     ep4:                       # Stream endpoint name
       ctrl: False                     # Endpoint passes control traffic
       data: True                      # Endpoint passes data traffic
       buff_size: 4096                 # Ingress buffer size for data
     ep5:                       # Stream endpoint name
       ctrl: False                     # Endpoint passes control traffic
       data: True                      # Endpoint passes data traffic
       buff_size: 4096                 # Ingress buffer size for data
   
   noc_blocks:
     # ... all the other blocks...
     replay0:
       block_desc: 'replay.yml'
       parameters:
         NUM_PORTS: 2
   
    connections:
     # ...connections for all the other blocks...
     # ep4 to replay0(0)
     - { srcblk: ep4,     srcport: out0,  dstblk: replay0, dstport: in_0 }
     # replay0(0) to ep4
     - { srcblk: replay0, srcport: out_0, dstblk: ep4,     dstport: in0  }
     # ep5 to replay0(1)
     - { srcblk: ep5,     srcport: out0,  dstblk: replay0, dstport: in_1 }
     # replay0(1) to ep5
     - { srcblk: replay0, srcport: out_1, dstblk: ep5,     dstport: in0  }
     # BSP Connections
     - { srcblk: replay0, srcport: axi_ram, dstblk: _device_, dstport: dram }     MEM_ADDR_W: 30
   clk_domains:
       # ...all other clock domains...
       - { srcblk: _device_, srcport: dram,  dstblk: replay0, dstport: mem  }

As you can see, the replay block requires configuration in up to four sections:

  • For maximum flexibility, every port of the replay block will receive its own stream endpoint. This is not a requirement of the replay block, but allows its flexible use.
  • Of course, the block needs to be declared in the noc_blocks section.
  • It must be connected to the stream endpoints in the connections section, as well as to the DRAM banks (BSP connection).
  • Finally, the clock domain needs to be connected.

When the YAML image core file is complete, save it, e.g., as x310_replay_image_core.yml, and pass it to the image builder:

   rfnoc_image_builder -y x310_replay_image_core.yml

This will create a bitfile that contains the replay block. It can be loaded onto the device using the uhd_image_loader tool:

   uhd_image_loader --args type=x300,addr=<ip address> --fpga-path=/path/to/usrp_x310_fpga_HG.bit

When this is complete, the replay block is ready to use.

Source files

For reference, the following files implement the replay block: