

# **RFNoC 4 Workshop** Part 1

Jonathon Pendlum – Ettus Research

Neel Pandeya – Ettus Research

GRCon 2020

# Schedule



- Part 1
  - RFNoC 4 Framework Overview
  - Hands-on Demos
- Part 2
  - FPGA Architecture
  - Software Implementation
  - GNU Radio Integration
  - Hands-on RFNoC Block Development



#### PC + Flexible RF Hardware + SDR Framework





- PC + Flexible RF Hardware + SDR Framework
  - GPP: Multi-core + SIMD -- GNU Radio



### **GNU Radio**

#### Open source toolkit for developing software radios

Figure: GNU Radio Companion showing the "UHD WBFM Receive" example flowgraph (UHD: USRP Source, WBFM Receive, QT GUI Frequency Sink, QT GUI Range, Multiply Const, Audio Sink), with the block library tree on the right and X300 initialization log messages in the console.








- PC + Flexible RF Hardware + SDR Framework
  - GPP: Multi-core + SIMD -- GNU Radio
  - GPU: High performance FP -- OpenCL, gr-fosphor
  - RF HW: Wide bandwidth, large FPGA -- Rate change DSP



#### **Universal Software Radio Peripheral**

|              | Gen 1     | Gen 2     | Gen 3 (E310) | Gen 3 (X310) |
|--------------|-----------|-----------|--------------|--------------|
| FPGA         | Cyclone 1 | Spartan 3 | Zynq         | Kintex 7     |
| Logic Cells  | 12K       | 53K       | 85K          | 406K         |
| Memory       | 26KB      | 252KB     | 560KB        | 3180KB       |
| Multipliers  | NONE!     | 126       | 220          | 1540         |
| Clock Rate   | 64 MHz    | 100 MHz   | 200 MHz      | 250 MHz      |
| RF Bandwidth | 8 MHz     | 50 MHz    | 128 MHz      | 640 MHz      |
| Free Space   | NONE!     | ~50%      | ~60%         | ~75%         |










- Massive processing requirements
  - Welch's method for power spectrum estimation
  - 1024-point FFT + |X|<sup>2</sup> + moving average at 200 MSPS
- Overloaded transport
  - 200e6 samp/sec \* 32 bits/samp => 6.4 Gb/sec
- Latency and Determinism
  - Ethernet latency, OS scheduling, precise timing
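The processing chain and the transport figure in the list above can be sketched on the host with NumPy. This is only an illustration of the arithmetic, not the FPGA implementation: the function names are hypothetical, and the Hann window, non-overlapping frames, and plain mean over the most recent frames are assumptions that the slides do not specify.

```python
import numpy as np

def averaged_power_spectrum(samples, fft_size=1024, num_avg=8):
    """Windowed FFT, |X|^2, then an average over the last `num_avg` frames.

    Assumed here (not stated in the slides): Hann window, non-overlapping
    frames, and a plain mean standing in for the moving average.
    """
    num_frames = len(samples) // fft_size
    frames = np.reshape(samples[:num_frames * fft_size], (num_frames, fft_size))
    window = np.hanning(fft_size)
    # fftshift puts DC in the middle bin (index fft_size // 2)
    spectra = np.fft.fftshift(np.fft.fft(frames * window, axis=1), axes=1)
    power = np.abs(spectra) ** 2
    return power[-num_avg:].mean(axis=0)

def transport_rate_bps(sample_rate_hz, bits_per_sample):
    """Raw streaming rate needed to move every sample to the host."""
    return sample_rate_hz * bits_per_sample

# A complex tone at 25 MHz sampled at 200 MSPS, 16 frames long
fs = 200e6
n = np.arange(1024 * 16)
tone = np.exp(2j * np.pi * 25e6 * n / fs)
psd = averaged_power_spectrum(tone)
print("peak bin:", int(np.argmax(psd)))
print("transport rate (Gb/s):", transport_rate_bps(200e6, 32) / 1e9)
```

On a general-purpose processor this loop has to keep up with 200 million samples every second, and the samples first have to cross a link carrying 6.4 Gb/s, which is the motivation for moving the chain into the FPGA.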



# **Opportunity: Use the FPGA!**



- Everything USRP is open source, available online (code, firmware, schematics)
- Contains a big and expensive FPGA!
- Why do customers not use it?



#### **FPGAs are Hard**



#### **FPGA Design Process**



#### **Domain vs FPGA Experts**



- FPGA development is not a requirement of a communications engineering curriculum
- Math in FPGAs is hard
- Complicated system architecture

almost pure-noise channels. This intuition is clarified further by the following inequality. It is shown in [1] that for any B-DMC W,

$$1 - I(W) \le Z(W) \le \sqrt{1 - I(W)^2} \tag{2}$$

where I(W) is the symmetric capacity of W.

Let  $W^N$  denote the channel that results from N independent copies of W, i.e. the channel  $\langle \{0,1\}^N, \mathscr{Y}^N, W^N \rangle$  given by

$$W^{N}(y_{1}^{N}|x_{1}^{N}) \stackrel{\text{def}}{=} \prod_{i=1}^{N} W(y_{i}|x_{i}) \tag{3}$$

where  $x_1^N = (x_1, x_2, \ldots, x_N)$  and  $y_1^N = (y_1, y_2, \ldots, y_N)$ . Then the *combined* channel  $\langle \{0, 1\}^N, \mathscr{Y}^N, \widetilde{W} \rangle$  is defined with transition probabilities given by

$$\widetilde{W}(y_1^N|u_1^N) \stackrel{\text{def}}{=} W^N(y_1^N|u_1^NG_N) = W^N(y_1^N|u_1^NR_NG^{\otimes n})$$



# **RFNoC: RF Network on Chip**

- Make USRP FPGA acceleration more accessible
- Software API + FPGA infrastructure
  - Handles FPGA-to-host communication and dataflow
- Provides users with simple software and HDL interfaces
  - Infrastructure transparent to user -- reusable code
  - Trade off flexibility versus resource utilization
- Open source
- Fully supported in GNU Radio
  - Modularity and composability




#### **RFNoC Architecture**

- Users implement custom FPGA logic in "RFNoC Blocks"
- Architecturally independent of other logic
- Easy to add and remove RFNoC Blocks
- Ettus provides a library of pre-made RFNoC Blocks
- RFNoC Radio Block connects to the RF Frontend, SPI, GPIO

Figure: the host PC runs the user application (GNU Radio); the USRP FPGA contains the NoC Core, the RFNoC Radio Block, and the RFNoC Blocks.



#### **Example: Plotting Frequency Spectrum**

Figure: GNU Radio flowgraph for the example, with the blocks RFNoC RX Radio, RFNoC Rx Streamer, FFT (size 1024), Complex to Mag<sup>2</sup>, Log10, and QT GUI Vector Sink, drawn above the host PC and USRP FPGA diagram (NoC Core, RFNoC Radio Block, RFNoC Blocks).











- Block to block communication:
  - FIFO to FIFO, packetized, flow control (unless static route)
  - Transparent to user -- built into RFNoC infrastructure
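The idea of FIFO-to-FIFO transfer with flow control can be illustrated with a toy credit-based model. This is not RFNoC's actual protocol or packet format: the class names, the one-credit-per-packet accounting, and the cycle loop are all assumptions made for illustration. The sender starts with as many credits as the downstream FIFO has slots, spends one per packet, and gets one back each time the receiver drains a packet, so the FIFO can never overflow.

```python
from collections import deque

class Receiver:
    """Downstream block with a bounded input FIFO (hypothetical model)."""
    def __init__(self, fifo_depth):
        self.fifo = deque()
        self.fifo_depth = fifo_depth

    def accept(self, packet):
        # Flow control should make this condition impossible to violate
        assert len(self.fifo) < self.fifo_depth, "FIFO overflow"
        self.fifo.append(packet)

    def consume(self):
        """Drain one packet; return the number of credits freed (0 or 1)."""
        if self.fifo:
            self.fifo.popleft()
            return 1
        return 0

class Sender:
    """Upstream block that only sends while it holds credits."""
    def __init__(self, receiver):
        self.receiver = receiver
        self.credits = receiver.fifo_depth
        self.sent = 0
        self.stalled = 0

    def try_send(self, packet):
        if self.credits == 0:
            self.stalled += 1  # back-pressure: wait for credits
            return False
        self.credits -= 1
        self.receiver.accept(packet)
        self.sent += 1
        return True

    def return_credits(self, count):
        self.credits += count

def simulate(num_packets, fifo_depth, consume_every):
    """Send packets each cycle; the receiver drains one every N cycles."""
    rx = Receiver(fifo_depth)
    tx = Sender(rx)
    cycle = 0
    next_packet = 0
    while next_packet < num_packets:
        if tx.try_send(next_packet):
            next_packet += 1
        if cycle % consume_every == consume_every - 1:
            tx.return_credits(rx.consume())
        cycle += 1
    return tx.sent, tx.stalled

print(simulate(num_packets=10, fifo_depth=4, consume_every=2))
```

When the receiver drains more slowly than the sender produces, the sender stalls instead of overflowing the FIFO; when the receiver keeps up, no stalls occur.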




- User interfaces to RFNoC via AXI-Stream
  - Industry standard (ARM), easy to use
  - Large library of existing IP cores
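The core of AXI-Stream is a two-signal handshake: a word moves from source to sink only on a clock cycle where the source asserts `tvalid` and the sink asserts `tready`, and `tlast` marks the final word of a packet. The sketch below is a cycle-by-cycle software model of that rule; the function name and the per-cycle boolean patterns are hypothetical, and real designs express this in HDL rather than Python.

```python
def simulate_axi_stream(words, tvalid_per_cycle, tready_per_cycle):
    """Model an AXI-Stream link one clock cycle at a time.

    Returns the list of (tdata, tlast) pairs the sink received and the
    number of cycles that elapsed before the packet completed.
    """
    received = []
    index = 0   # next word the source is offering
    cycles = 0
    for tvalid, tready in zip(tvalid_per_cycle, tready_per_cycle):
        if index == len(words):
            break
        cycles += 1
        # A transfer happens only when both sides agree in the same cycle.
        if tvalid and tready:
            tlast = index == len(words) - 1
            received.append((words[index], tlast))
            index += 1
        # Otherwise the source keeps offering the same word.
    return received, cycles

words = [10, 20, 30]
tvalid = [True, True, False, True, True, True]
tready = [True, False, True, True, False, True]
print(simulate_axi_stream(words, tvalid, tready))
```

Because either side can pause the link without losing data, blocks built on this handshake compose without knowing each other's timing.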




- User writes their own custom HDL or drops in IP
  - VHDL, Verilog, SystemVerilog, Vivado HLS
  - Xilinx IP, Vivado Block Diagram




- Each block is in its own clock domain
  - Improves throughput
  - Easier timing closure
















- Make FPGA acceleration more accessible on USRPs
- Tightly integrated with GNU Radio
- Library of existing RFNoC Blocks
  - FFT, FIR, Signal Generator, Fosphor
- Portable between all third generation USRPs
  - X3x0, E3xx, N3xx
- Completely open source
- kb.ettus.com/RFNoC\_Getting\_Started\_Guides
- Next: FPGA & Software Development

# What's New in RFNoC 4



- Better Documentation
  - Spec: http://files.ettus.com/app\_notes/RFNoC\_Specification.pdf
- Software Enhancements
  - Stability and Testing
  - Python support for RFNoC
- FPGA Improvements
  - Scalable to faster sampling rates (250 MSPS+)
  - Instantiate far more RFNoC Blocks
  - Static routing between RFNoC blocks
    - Trade off latency and resource utilization versus flexibility
- GNU Radio 3.8 Support

# **RFNoC 4 Demo - Fosphor**



- Fosphor is a real-time GPU-accelerated or FPGA-accelerated spectrum display tool
- Running on a USRP X310 with a WBX daughterboard
- The system is running Ubuntu 20.04 with GNU Radio 3.8.2.0
- All calculations for the FFT and waterfall are being done on the FPGA, not on the CPU
- The CPU is minimally loaded, even for large bandwidths