5G OAI Neural Receiver Testbed with USRP X410
Application Note Number and Authors
AN-829
Authors
Bharat Agarwal and Neel Pandeya
Executive Summary
Overview
This Application Note presents a practical, system-level benchmarking platform leveraging NI USRP software-defined radios (SDRs) and the OpenAirInterface (OAI) 5G/NR stack for evaluating AI-enhanced wireless receivers in real time. It addresses one of the key challenges in deploying AI/ML at the physical layer: ensuring reliable system performance under real-time constraints.
Motivation and Context
AI and ML techniques hold promise for improving both wireless and non-wireless KPIs across the stack, from core-level optimization (e.g., load balancing, power savings), to tightly-timed PHY/MAC innovations such as:
- ML-based digital predistortion to improve power efficiency.
- Neural receivers for channel estimation and symbol detection with improved SNR tolerance.
- Intelligent beam and positioning prediction, even under fast channel dynamics.
Consortia such as 3GPP (Release-18/19) and O-RAN are actively defining how AI/ML can be incorporated into future cellular network standards.
Neural Receiver Model
We demonstrate a real-time implementation of a neural receiver that is based on a published model architecture called DeepRx, which replaces the traditional OFDM receiver blocks (channel estimation, interpolation, equalization, detection) with a single neural network that treats the time-frequency grid data as image-like input. Model training and validation are performed using the NVIDIA Sionna link-level simulator, and training data is stored using the open SigMF format for reproducibility.
More information about the SigMF file format can be found on the project website, on the Wikipedia page, and on the GitHub page.
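As an illustration of how a capture can be stored in this format, the sketch below writes a block of complex baseband samples as a SigMF recording pair: a .sigmf-data file of interleaved cf32_le samples and a .sigmf-meta JSON file. The sample rate, center frequency, file names, and description are placeholder values, not the actual testbed parameters.

```python
import json
import numpy as np

# Placeholder IQ block; in practice this would be the captured training data.
iq = (np.random.randn(4096) + 1j * np.random.randn(4096)).astype(np.complex64)

# SigMF data file: interleaved little-endian float32 I/Q samples (cf32_le).
iq.tofile("example_capture.sigmf-data")

# SigMF metadata file: JSON with "global", "captures", and "annotations" objects.
meta = {
    "global": {
        "core:datatype": "cf32_le",
        "core:sample_rate": 61.44e6,   # placeholder sample rate in Hz
        "core:version": "1.0.0",
        "core:description": "Uplink OFDM capture for neural receiver training",
    },
    "captures": [
        {"core:sample_start": 0, "core:frequency": 3.6e9}  # placeholder center frequency
    ],
    "annotations": [],
}

with open("example_capture.sigmf-meta", "w") as f:
    json.dump(meta, f, indent=4)
```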
Real-Time Benchmarking Platform
To validate the performance of the neural receiver in real hardware, the prototype integrates:
- The OAI real-time 5G protocol stack (complete core, RAN, and UE) running on commodity CPUs.
- NI USRP SDR hardware as the RF front-end.
- An optional O-RAN Near-RT RIC (via FlexRIC) integration.
- Neural receiver inference performed on a GPU (e.g., Nvidia A100, RTX 4070, RTX 4090), accessed via TensorRT and the TensorFlow C API for seamless integration within OAI.
This setup enables a direct comparison of the traditional receiver baseline against the neural receiver in an end-to-end real-time system.
Benchmarking Results
Initial testing focuses on uplink performance using various MCS levels (MCS-11, MCS-15, and MCS-20 are specifically highlighted in this document) and SNR ranges (5 dB to 18 dB) under a realistic fading channel profile (urban micro, 2 m/s, 45 ns delay spread). Each measurement is averaged over 300 transport blocks.
Some of the key findings are listed below.
- The neural receiver shows a clear Bit Error Rate (BER) advantage at lower MCS and lower SNR.
- At higher MCS levels, the performance gap narrows (a trade-off that merits further analysis).
- A reduced uplink bandwidth was used to meet strict real-time latency requirements (500 μs slot duration with 30 kHz SCS).
- The neural receiver model complexity was reduced by a factor of 15 (from 700K to 47K parameters) to achieve real-time GPU inference.
These results underscore the crucial balance between complexity, latency, and performance in AI-enhanced wireless physical-layer deployments.
Conclusions and Implications
The testbed demonstrates a realistic path from simulation to real-time deployment of neural receiver models. This workflow supports rapid prototyping, robust AI model validation, and exploration of architecture-performance trade-offs.
Some key takeaways are listed below.
- AI/ML models can be efficiently integrated into real-time wireless stacks using SDR hardware and GPU inference.
- Low-complexity models offer promising performance improvements while satisfying real-time constraints.
- Synchronized dataset generation and automated test workflows enable scalable ML benchmarking across scenarios.
- The framework allows researchers to investigate unexpected behaviors and robustness in AI-native wireless systems.
Ultimately, the methodology bridges AI/ML conceptual research and realistic deployment, advancing trust and utility in AI-powered future wireless systems.
Hardware Overview
The Universal Software Radio Peripheral (USRP) devices from NI (an Emerson company) are software-defined radios that are widely used for wireless research, prototyping, and education. The hardware specifications for the various USRP devices are listed elsewhere on this Knowledge Base (KB). For the Neural Receiver implementation described in this document, we use the USRP X410. The USRP X440 may also be used, with some further adjustments to the system configuration.
The resources for the USRP X410 are listed below.
The Hardware Resource page for the USRP X410 can be found here.
The product page for the USRP X410 can be found here.
The User Manual for the USRP X410 can be found here.
The resources for the USRP X440 are listed below.
The Hardware Resource page for the USRP X440 can be found here.
The product page for the USRP X440 can be found here.
The User Manual for the USRP X440 can be found here.
The USRP X410 is connected to the host computer using a single QSFP28 100 Gbps Ethernet link, or using a QSFP28-to-SFP28 breakout cable, which provides four 25 Gbps SFP28 Ethernet links. On the host computer, a 100 Gbps or 25 Gbps Ethernet network card is used to connect to the USRP.
The USRP X410 devices are synchronized with the use of a 10 MHz reference signal and a 1 PPS signal, distributed from a common source. This can be provided by the OctoClock-G (see here and here for more information).
For control and management of the USRP X410, a 1 Gbps Ethernet connection to the host computer is needed, as well as a USB serial console connection.
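As a quick sanity check of the reference and PPS distribution, a short UHD Python snippet along the lines of the sketch below can be used; the device address is a placeholder and should be replaced with the address of the actual USRP.

```python
import uhd

# Connect to the USRP X410 (placeholder address).
usrp = uhd.usrp.MultiUSRP("addr=192.168.10.2")

# Use the externally distributed 10 MHz reference and 1 PPS signals.
usrp.set_clock_source("external")
usrp.set_time_source("external")

# Verify that the device has locked to the external reference.
ref_locked = usrp.get_mboard_sensor("ref_locked", 0)
print("Reference locked:", ref_locked.to_bool())
```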
Further details of the hardware configuration will be discussed later in this document.
Software Overview
The software stack running on the computers used in this implementation is listed below.
- Ubuntu 22.04.5, running on bare metal, and not in any Virtual Machine (VM)
- UHD version 4.8.0.0
- Nvidia drivers version 535
- Nvidia CUDA version 12.2
- TensorFlow 2.14
For the OAI gNB, the OAI UE, and the FlexRIC, NI-specific versions are used, and these are obtained from an NI repository on GitHub.
Note that the Data Plane Development Kit (DPDK) is not used in this implementation. However, it may be helpful when using higher sampling rates.
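A short Python check, such as the sketch below, can confirm that the installed UHD and TensorFlow versions match the list above and that the GPU is visible to TensorFlow. It assumes the UHD Python bindings (the uhd module) are installed alongside TensorFlow.

```python
import uhd
import tensorflow as tf

# UHD version reported by the installed bindings (expected: 4.8.0.x).
print("UHD version:", uhd.get_version_string())

# TensorFlow version (expected: 2.14.x) and CUDA build support.
print("TensorFlow version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# The GPU used for neural receiver inference should appear here.
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```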
Further details of the software configuration will be discussed later in this document.
AI in 6G
The figure listed below highlights the vision for sixth-generation (6G) wireless systems. Beyond incremental improvements, 6G introduces three major advances, as listed below.
- Spectrum Expansion: Extending from traditional sub-6 GHz and mmWave bands into FR3 (7 to 24 GHz) and sub-THz (up to 300 GHz), enabling ultra-wide bandwidths and unprecedented data rates.
- New Applications: Integration of non-terrestrial networks (NTN) with terrestrial infrastructure and joint communication-and-sensing (JCAS) functionalities, supporting use cases such as connected vehicles, satellite-augmented IoT, and immersive XR.
- Network Optimization: Advancements in massive MIMO, multi-user beamforming, and Open RAN disaggregation, improving spectral efficiency, flexibility, and energy sustainability.
Across these pillars, embedded and trustworthy AI is the key enabler, providing intelligence for spectrum management, adaptive receivers, and end-to-end optimization.
These trends highlight that 6G will operate in highly challenging environments with wideband sub-THz channels, dynamic non-terrestrial links, and complex multi-user MIMO topologies. Traditional linear detection techniques such as ZF or MMSE struggle to cope with hardware non-idealities, nonlinear channel distortions, and the stringent latency and reliability targets of 6G. To address these limitations, the concept of a Neural Receiver has emerged. By embedding deep learning models directly into the receiver chain, neural receivers can learn from real measured impairments, jointly optimize channel estimation and detection, and deliver significant performance gains over classical approaches. This makes neural receivers a key building block for realizing the vision of embedded, trustworthy AI in 6G physical layer design.
5G To 6G Roadmap
The figure listed below illustrates the expected timeline from ongoing 5G research through to the first 6G deployments.
- 5G (Release-16 to Release-18): 3GPP initiated 5G specification development in Release-15 and Release-16, followed by commercial deployments from 2019 onward. Work on Release-17 and Release-18 (2021 to 2024) extends 5G capabilities in areas such as URLLC, industrial IoT, and positioning.
- 5G-Advanced (Release-18 to Release-20): Industry research and specification development converge to define 5G-Advanced features. Deployments are expected around 2025 to 2027, focusing on improved energy efficiency, AI/ML-native functions, and expanded NTN integration.
- 6G (Release-20 onward): Formal 6G technology studies will begin with Release-20 in H2 2025, marked by the first official 6G workshop in March 2025. Standardization of 6G specifications is planned for Release-21 in 2027, with early 6G deployments projected for the end of the decade (around 2030).
The figure above highlights the transition from 5G deployments to the research and standardization cycles of 5G-Advanced and 6G. This staged process ensures backward compatibility, while paving the way for disruptive innovations in spectrum use, AI-native networks, and new application domains.
As shown in the figure above, the transition from 5G to 6G is not only a matter of spectrum expansion and new use cases, but also of embedding AI-native functionalities into the air interface itself. Release-20 (2025) will mark the start of 6G technology studies, providing an opportunity to evaluate disruptive physical layer techniques such as Neural Receivers. These receivers directly integrate deep learning models into the detection chain, enabling them to cope with nonlinearities, hardware impairments, and the extreme bandwidths expected in FR3 and sub-THz bands. By Release 21 (2027), as 6G specifications are defined, neural receivers and other AI-based PHY innovations will play a crucial role in realizing the vision of AI-native 6G, where intelligence is embedded from the physical layer up to the application layer.
Three Challenges of Data for AI in Wireless
This section highlights the three main challenges hindering the seamless integration of AI into wireless communication systems. The challenges are listed with increasing levels of AI readiness.
- Data Scarcity
  - Meaning:
    - Wireless networks often lack sufficient labeled and diverse datasets.
  - Why it's a problem:
    - Collecting and labeling large wireless datasets is expensive and time-consuming.
    - Real-time data is sparse or kept proprietary by operators/vendors.
    - Rare but critical scenarios (handover failures, deep fades, interference spikes) are underrepresented.
  - Impact:
    - Models trained on limited data risk poor generalization and biased decision-making.
- Data Quality
  - Meaning:
    - Available data may not be clean, representative, or consistently labeled.
  - Why it's a problem:
    - Measurements are noisy due to sensors or network logging errors.
    - Labeling mistakes propagate errors into AI models.
    - Data is biased toward specific environments (e.g., urban, indoor) and not generalizable to others.
  - Impact:
    - Low-quality data reduces model reliability, leading to unstable or inaccurate predictions.
- Data Relevance
  - Meaning:
    - Even when data exists, it may not directly match the target AI task or deployment scenario.
  - Why it's a problem:
    - LTE datasets may not transfer well to 5G/6G systems.
    - Lab-collected data ignores mobility, blockage, or coexistence effects.
    - Training distributions drift away from real-time operational data.
  - Impact:
    - AI performs well in simulation but degrades in live networks (gap between simulation and the real world).
The takeaway is that the three challenges of Scarcity, Quality, and Relevance form the key bottleneck for wireless AI, and that addressing them requires:
- Synthetic data generation (digital twins, simulators, ray-tracing),
- Federated learning (distributed training without data centralization),
- Data curation pipelines (cleaning, validation, domain adaptation).
Overview of Neural Receivers for 5G/6G Systems
A neural receiver is a machine learning-based physical layer receiver that replaces or augments traditional signal processing blocks—such as channel estimation, equalization, and detection—with a unified, data-driven model. In contrast to conventional receivers that rely on handcrafted algorithms and strict mathematical models of the wireless channel, neural receivers learn to perform these operations jointly by training on large datasets of labeled I/Q samples or OFDM resource grids.
Comparison with Traditional Receivers
The table below explains some of the differences between components in traditional receiver architectures and components in neural receiver architectures.
| Component | Traditional Receiver | Neural Receiver |
|---|---|---|
| Channel Estimation | Least Squares (LS), MMSE estimators | Learned directly from pilot and data patterns |
| Equalization | Zero-Forcing, MMSE equalizers | Implicitly learned during training |
| Symbol Detection | QAM demodulation, hard/soft decision | Jointly learned with other tasks |
| Architecture | Modular, deterministic | End-to-end differentiable neural network |
| Input | OFDM resource grid or raw IQ samples | IQ tensors or pilot+data grid |
| Output | Estimated bits or LLRs | Bits or probabilities |
Typical Neural Receiver Architecture
A commonly used architecture, such as DeepRx, treats the resource grid as a 2D input, similar to an image, where time and frequency correspond to the two axes. This allows the use of convolutional neural networks (CNNs), recurrent neural networks (RNNs), or transformer-based models.
- Input: Complex-valued OFDM resource grid, with pilot and data symbols.
- Layers: Convolutional or attention-based layers extract spatial and temporal features.
- Output: Recovered symbols or bits with associated confidence scores.
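To make the idea concrete, the Keras sketch below builds a small convolutional model that maps a resource grid (real and imaginary parts stacked as channels) to per-bit log-likelihood ratios. The dimensions (14 OFDM symbols, 132 subcarriers, 4 bits per resource element) are illustrative placeholders, and the model is deliberately much simpler than DeepRx.

```python
import tensorflow as tf

# Illustrative dimensions: 14 OFDM symbols x 132 subcarriers, 4 bits per resource element.
NUM_SYMBOLS, NUM_SUBCARRIERS, BITS_PER_RE = 14, 132, 4

def build_toy_neural_receiver():
    # Input: resource grid with real/imag parts stacked as two "image" channels.
    grid_in = tf.keras.Input(shape=(NUM_SYMBOLS, NUM_SUBCARRIERS, 2))

    # A few convolutional layers extract joint time-frequency features,
    # implicitly performing channel estimation, equalization, and detection.
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(grid_in)
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)

    # Output: one logit (LLR) per transmitted bit of every resource element.
    llr_out = tf.keras.layers.Conv2D(BITS_PER_RE, 1, padding="same")(x)
    return tf.keras.Model(grid_in, llr_out)

model = build_toy_neural_receiver()
model.summary()
```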
Training Process
The items below describe the training process that was used in this implementation.
The use of the SigMF data format allows for the storage of raw IQ data with comprehensive metadata, which provides context.
- Dataset: Generated from link-level simulation tools, such as Sionna, under various channel models (AWGN, LOS, NLOS, 3GPP, etc.).
- Format: Datasets are stored in the SigMF format, containing raw IQ samples with metadata.
- Loss Function: Cross-entropy or binary cross-entropy; optionally soft LLR loss for reliability-aware decoding.
- Optimizer: Adam, SGD, or custom schedulers suitable for low-SNR scenarios.
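Assuming bit labels aligned with the output of the toy model sketched above, a single training step could look like the following; dataset loading from the SigMF recordings and the validation loop are omitted.

```python
import tensorflow as tf

# Binary cross-entropy on logits treats the model outputs as LLR-like values.
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(model, grid_batch, bit_labels):
    with tf.GradientTape() as tape:
        llrs = model(grid_batch, training=True)
        loss = bce(bit_labels, llrs)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```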
Deployment Aspects
Trained models are deployed in real time using TensorRT runtimes on edge hardware, as summarized below.
- GPUs: NVIDIA A100, RTX 4090, RTX 4070, RTX 4060.
- Integration: Plugged into OAI physical-layer receiver chains.
- Latency: Achieves under 500 μs processing delay for 30 kHz SCS, meeting real-time slot timing requirements.
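One possible way to prepare a trained model for low-latency GPU inference is TensorFlow's built-in TF-TRT converter, as sketched below; the SavedModel paths are placeholders, and the integration into the OAI receiver chain through the C API is a separate step not shown here.

```python
import tensorflow as tf

# Convert a trained SavedModel into a TensorRT-optimized SavedModel (TF-TRT).
converter = tf.experimental.tensorrt.Converter(
    input_saved_model_dir="saved_model/neural_rx",  # placeholder input path
    precision_mode="FP16",
)
converter.convert()
converter.save("saved_model/neural_rx_trt")          # placeholder output path
```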
Performance Summary
The table below shows a comparison of various performance metrics between traditional receiver architectures and neural receiver architectures.
| Metric | Traditional Receiver | Neural Receiver |
|---|---|---|
| Block Error Rate (BLER) under low SNR | Higher | Lower (up to 3 dB gain) |
| Complexity | Fixed, low | Tunable, moderate |
| Latency | Very low | Real-time, under 500 μs |
| Generalization | Poor to unseen channels | Better with diverse training data |
| Interpretability | High (white box) | Lower (black box) |
Use Cases
The list below highlights some common use-cases for Neural Receivers.
- Uplink Neural PHY Receiver: Real-time decoding at gNB.
- Channel Tracking: Adaptive to fast-fading and mobility scenarios.
- Joint Equalization and Detection: Reduces end-to-end BER and BLER.
- Massive MIMO: Scalable to high-dimensional antennas with deep models.
Challenges
There are several challenges to the practical realization of Neural Receivers, as listed below.
- Requires extensive datasets for generalization.
- Less interpretable compared to traditional receiver implementations.
- Hardware deployment must meet strict real-time constraints.
- Careful calibration and interface with existing stacks (such as OAI) are needed.
Summary
The neural receiver presents a promising paradigm for future 6G systems by enabling adaptive, intelligent, and performance-enhancing physical-layer decoding using machine learning. When paired with platforms like the NI USRP radios and AI accelerators, it opens the path for real-time AI-native physical-layer design.
System Validation Checklist
To ensure a consistent and reproducible set-up for our AI-enabled wireless testbed, we employ a systematic validation checklist. The table below summarizes the key checks, commands, and expected outputs that must be verified before conducting experiments. This process guarantees that both the hardware (e.g., USRPs, GPUs) and the software (e.g., operating systems, drivers, TensorFlow, UHD) are correctly installed and aligned with the requirements.
The checklist covers three broad areas:
- Operating system and hardware readiness: Includes verification of the installed OS, BIOS version, kernel, and GPU drivers.
- USRP connectivity and configuration: Ensures that USRPs are discovered, their file systems are compatible with UHD, and network parameters (e.g., MTU size, socket memory) are tuned for high-throughput streaming.
- Software stack and runtime optimization: Covers validation of TensorFlow, NVIDIA TensorRT, and TF C-API installation, as well as disabling unnecessary system services (e.g., updates, GNOME display manager) that may negatively impact performance.
This structured approach minimizes setup errors and improves reproducibility across different machines and deployments.
| Check Item | Command | Desired Output |
|---|---|---|
| Operating System | hostnamectl | Ubuntu 22.04.5 |
| BIOS version | sudo dmidecode -s bios-version | Check with the system vendor for the latest version |
| Verify GPU | lspci \| grep -i nvidia | Example: an RTX 4090 may appear as 17:00.0 VGA compatible controller: NVIDIA Corporation Device 2684 (rev a1) |
| Nvidia driver version and CUDA version | nvidia-smi | Nvidia driver version 535.183.01 and CUDA version 12.2 |
| GPU load | nvidia-smi | Load in percentage |
| Kernel version | uname -r | 6.5.0-44-generic |
| Cores operation mode | cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | All cores should show performance |
| Cores clock rate | watch -n1 "grep 'cpu MHz' /proc/cpuinfo" | Should be above the base clock rate and below the turbo clock rate |
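For repeatability, parts of the checklist can be scripted. The sketch below automates two of the checks from the table above (kernel version and CPU scaling governor); it is an illustrative example rather than a complete validation tool.

```python
import glob
import platform

# Kernel version check (compare against the version validated for the testbed).
print("Kernel:", platform.release())

# All CPU cores should report the "performance" scaling governor.
governors = set()
for path in glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"):
    with open(path) as f:
        governors.add(f.read().strip())

if governors == {"performance"}:
    print("All cores are using the performance governor")
else:
    print("Unexpected governor settings:", governors)
```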