5G OAI Neural Receiver Testbed with USRP X410
Application Note Number and Authors
AN-829
Authors
Bharat Agarwal and Neel Pandeya
Executive Summary
Overview
This Application Note presents a practical, system-level benchmarking platform leveraging NI USRP software-defined radios (SDRs) and the OpenAirInterface (OAI) 5G/NR stack for evaluating AI-enhanced wireless receivers in real time. It addresses one of the key challenges in deploying AI/ML at the physical layer: ensuring reliable system performance under real-time constraints.
Motivation and Context
AI and ML techniques hold promise for improving both wireless and non-wireless KPIs across the stack, from core-level optimization (e.g., load balancing, power savings), to tightly-timed PHY/MAC innovations such as:
- ML-based digital predistortion to improve power efficiency.
- Neural receivers for channel estimation and symbol detection with improved SNR tolerance.
- Intelligent beam and positioning prediction, even under fast channel dynamics.
Consortia such as 3GPP (Release-18/19) and O-RAN are actively defining how AI/ML can be incorporated into future cellular network standards.
Neural Receiver Model
We demonstrate a real-time implementation of a neural receiver that is based on a published model architecture called DeepRx, which replaces the traditional OFDM receiver blocks (channel estimation, interpolation, equalization, detection) with a single neural network that treats the time-frequency grid data as image-like input. Model training and validation are performed using the NVIDIA Sionna link-level simulator, and training data is stored using the open SigMF format for reproducibility.
More information about the SigMF file format can be found on the project website, on the Wikipedia page, and on the GitHub page.
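As an illustration of how a capture can be stored in this format, the sketch below writes a block of complex baseband samples as a SigMF recording pair: a .sigmf-data file of interleaved cf32_le samples and a .sigmf-meta JSON file. The sample rate, center frequency, file names, and description are placeholder values, not the actual testbed parameters.

```python
import json
import numpy as np

# Placeholder IQ block; in practice this would be the captured training data.
iq = (np.random.randn(4096) + 1j * np.random.randn(4096)).astype(np.complex64)

# SigMF data file: interleaved little-endian float32 I/Q samples (cf32_le).
iq.tofile("example_capture.sigmf-data")

# SigMF metadata file: JSON with "global", "captures", and "annotations" objects.
meta = {
    "global": {
        "core:datatype": "cf32_le",
        "core:sample_rate": 61.44e6,   # placeholder sample rate in Hz
        "core:version": "1.0.0",
        "core:description": "Uplink OFDM capture for neural receiver training",
    },
    "captures": [
        {"core:sample_start": 0, "core:frequency": 3.6e9}  # placeholder center frequency
    ],
    "annotations": [],
}

with open("example_capture.sigmf-meta", "w") as f:
    json.dump(meta, f, indent=4)
```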
Real-Time Benchmarking Platform
To validate the performance of the neural receiver in real hardware, the prototype integrates:
- The OAI real-time 5G protocol stack (complete core, RAN, and UE) running on commodity CPUs.
- NI USRP SDR hardware as the RF front-end.
- An optional O-RAN Near-RT RIC (via FlexRIC) integration.
- Neural receiver inference performed on a GPU (e.g., Nvidia A100, RTX 4070, RTX 4090), accessed via TensorRT and the TensorFlow C API for seamless integration within OAI.
This setup enables a direct comparison of the traditional receiver baseline against the neural receiver in an end-to-end real-time system.
Benchmarking Results
Initial testing focuses on uplink performance using various MCS levels (MCS-11, MCS-15, and MCS-20 are specifically highlighted in this document) and SNR ranges (5 dB to 18 dB) under a realistic fading channel profile (urban micro, 2 m/s, 45 ns delay spread). Each measurement is averaged over 300 transport blocks.
Some of the key findings are listed below.
- The neural receiver shows a clear Bit Error Rate (BER) advantage at lower MCS and lower SNR.
- At higher MCS levels, the performance gap narrows (a trade-off that merits further analysis).
- A reduced uplink bandwidth was used to meet strict real-time latency requirements (500 μs slot duration with 30 kHz SCS).
- The neural receiver model complexity was reduced by a factor of 15 (from 700K to 47K parameters) to achieve real-time GPU inference.
These results underscore the crucial balance between complexity, latency, and performance in AI-enhanced wireless physical-layer deployments.
Conclusions and Implications
The testbed demonstrates a realistic path from simulation to real-time deployment of neural receiver models. This workflow supports rapid prototyping, robust AI model validation, and exploration of architecture-performance trade-offs.
Some key takeaways are listed below.
- AI/ML models can be efficiently integrated into real-time wireless stacks using SDR hardware and GPU inference.
- Low-complexity models offer promising performance improvements while satisfying real-time constraints.
- Synchronized dataset generation and automated test workflows enable scalable ML benchmarking across scenarios.
- The framework allows researchers to investigate unexpected behaviors and robustness in AI-native wireless systems.
Ultimately, the methodology bridges AI/ML conceptual research and realistic deployment, advancing trust and utility in AI-powered future wireless systems.
Hardware Overview
The Universal Software Radio Peripheral (USRP) devices from NI (an Emerson company) are software-defined radios that are widely used for wireless research, prototyping, and education. The hardware specifications for the various USRP devices are listed elsewhere on this Knowledge Base (KB). For the Neural Receiver implementation described in this document, we use the USRP X410. The USRP X440 may also be used, with some further adjustments to the system configuration.
The resources for the USRP X410 are listed below.
The Hardware Resource page for the USRP X410 can be found here.
The product page for the USRP X410 can be found here.
The User Manual for the USRP X410 can be found here.
The resources for the USRP X440 are listed below.
The Hardware Resource page for the USRP X440 can be found here.
The product page for the USRP X440 can be found here.
The User Manual for the USRP X440 can be found here.
The USRP X410 is connected to the host computer using a single QSFP28 100 Gbps Ethernet link, or using a QSFP28-to-SFP28 breakout cable, which provides four 25 Gbps SFP28 Ethernet links. On the host computer, a 100 Gbps or 25 Gbps Ethernet network card is used to connect to the USRP.
The USRP X410 devices are synchronized with the use of a 10 MHz reference signal and a 1 PPS signal, distributed from a common source. This can be provided by the OctoClock-G (see here and here for more information).
For control and management of the USRP X410, a 1 Gbps Ethernet connection to the host computer is needed, as well as a USB serial console connection.
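As a quick sanity check of the reference and PPS distribution, a short UHD Python snippet along the lines of the sketch below can be used; the device address is a placeholder and should be replaced with the address of the actual USRP.

```python
import uhd

# Connect to the USRP X410 (placeholder address).
usrp = uhd.usrp.MultiUSRP("addr=192.168.10.2")

# Use the externally distributed 10 MHz reference and 1 PPS signals.
usrp.set_clock_source("external")
usrp.set_time_source("external")

# Verify that the device has locked to the external reference.
ref_locked = usrp.get_mboard_sensor("ref_locked", 0)
print("Reference locked:", ref_locked.to_bool())
```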
Further details of the hardware configuration will be discussed later in this document.
Software Overview
The software stack running on the computers used in this implementation is listed below.
- Ubuntu 22.04.5, running on bare metal, and not in any Virtual Machine (VM)
- UHD version 4.8.0.0
- Nvidia drivers version 535
- Nvidia CUDA version 12.2
- TensorFlow 2.14
For the OAI gNB, the OAI UE, and the FlexRIC, NI-specific versions are used, and these are obtained from an NI repository on GitHub.
Note that the Data Plane Development Kit (DPDK) is not used in this implementation. However, it may be helpful when using higher sampling rates.
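A short Python check, such as the sketch below, can confirm that the installed UHD and TensorFlow versions match the list above and that the GPU is visible to TensorFlow. It assumes the UHD Python bindings (the uhd module) are installed alongside TensorFlow.

```python
import uhd
import tensorflow as tf

# UHD version reported by the installed bindings (expected: 4.8.0.x).
print("UHD version:", uhd.get_version_string())

# TensorFlow version (expected: 2.14.x) and CUDA build support.
print("TensorFlow version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# The GPU used for neural receiver inference should appear here.
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```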
Further details of the software configuration will be discussed later in this document.
AI in 6G
The figure listed below highlights the vision for sixth-generation (6G) wireless systems. Beyond incremental improvements, 6G introduces three major advances, as listed below.
- Spectrum Expansion: Extending from traditional sub-6 GHz and mmWave bands into FR3 (7 to 24 GHz) and sub-THz (up to 300 GHz), enabling ultra-wide bandwidths and unprecedented data rates.
- New Applications: Integration of non-terrestrial networks (NTN) with terrestrial infrastructure and joint communication-and-sensing (JCAS) functionalities, supporting use cases such as connected vehicles, satellite-augmented IoT, and immersive XR.
- Network Optimization: Advancements in massive MIMO, multi-user beamforming, and Open RAN disaggregation, improving spectral efficiency, flexibility, and energy sustainability.
Across these pillars, embedded and trustworthy AI is the key enabler, providing intelligence for spectrum management, adaptive receivers, and end-to-end optimization.
These trends highlight that 6G will operate in highly challenging environments with wideband sub-THz channels, dynamic non-terrestrial links, and complex multi-user MIMO topologies. Traditional linear detection techniques such as ZF or MMSE struggle to cope with hardware non-idealities, nonlinear channel distortions, and the stringent latency and reliability targets of 6G. To address these limitations, the concept of a Neural Receiver has emerged. By embedding deep learning models directly into the receiver chain, neural receivers can learn from real measured impairments, jointly optimize channel estimation and detection, and deliver significant performance gains over classical approaches. This makes neural receivers a key building block for realizing the vision of embedded, trustworthy AI in 6G physical layer design.
5G To 6G Roadmap
The figure listed below illustrates the expected timeline from ongoing 5G research through to the first 6G deployments.
- 5G (Release-16 to Release-18): 3GPP initiated 5G specification development in Release-15 and Release-16, followed by commercial deployments from 2019 onward. Work on Release-17 and Release-18 (2021 to 2024) extends 5G capabilities in areas such as URLLC, industrial IoT, and positioning.
- 5G-Advanced (Release-18 to Release-20): Industry research and specification development converge to define 5G-Advanced features. Deployments are expected around 2025 to 2027, focusing on improved energy efficiency, AI/ML-native functions, and expanded NTN integration.
- 6G (Release-20 onward): Formal 6G technology studies will begin with Release-20 in H2 2025, marked by the first official 6G workshop in March 2025. Standardization of 6G specifications is planned for Release-21 in 2027, with early 6G deployments projected for the end of the decade (around 2030).
The figure above highlights the transition from 5G deployments to the research and standardization cycles of 5G-Advanced and 6G. This staged process ensures backward compatibility, while paving the way for disruptive innovations in spectrum use, AI-native networks, and new application domains.
As shown in the figure above, the transition from 5G to 6G is not only a matter of spectrum expansion and new use cases, but also of embedding AI-native functionalities into the air interface itself. Release-20 (2025) will mark the start of 6G technology studies, providing an opportunity to evaluate disruptive physical layer techniques such as Neural Receivers. These receivers directly integrate deep learning models into the detection chain, enabling them to cope with nonlinearities, hardware impairments, and the extreme bandwidths expected in FR3 and sub-THz bands. By Release 21 (2027), as 6G specifications are defined, neural receivers and other AI-based PHY innovations will play a crucial role in realizing the vision of AI-native 6G, where intelligence is embedded from the physical layer up to the application layer.
Three Challenges of Data for AI in Wireless
This section highlights the three main challenges hindering the seamless integration of AI into wireless communication systems. The challenges are listed with increasing levels of AI readiness.
- Data Scarcity
  - Meaning:
    - Wireless networks often lack sufficient labeled and diverse datasets.
  - Why it's a problem:
    - Collecting and labeling large wireless datasets is expensive and time-consuming.
    - Real-time data is sparse or kept proprietary by operators/vendors.
    - Rare but critical scenarios (handover failures, deep fades, interference spikes) are underrepresented.
  - Impact:
    - Models trained on limited data risk poor generalization and biased decision-making.
- Data Quality
  - Meaning:
    - Available data may not be clean, representative, or consistently labeled.
  - Why it's a problem:
    - Measurements are noisy due to sensors or network logging errors.
    - Labeling mistakes propagate errors into AI models.
    - Data is biased toward specific environments (e.g., urban, indoor) and not generalizable to others.
  - Impact:
    - Low-quality data reduces model reliability, leading to unstable or inaccurate predictions.
- Data Relevance
  - Meaning:
    - Even when data exists, it may not directly match the target AI task or deployment scenario.
  - Why it's a problem:
    - LTE datasets may not transfer well to 5G/6G systems.
    - Lab-collected data ignores mobility, blockage, or coexistence effects.
    - Training distributions drift away from real-time operational data.
  - Impact:
    - AI performs well in simulation but degrades in live networks (gap between simulation and the real world).
The takeaway is that the three challenges of Scarcity, Quality, and Relevance form the key bottleneck for wireless AI, and that addressing them requires:
- Synthetic data generation (digital twins, simulators, ray-tracing),
- Federated learning (distributed training without data centralization),
- Data curation pipelines (cleaning, validation, domain adaptation).
Overview of Neural Receivers for 5G/6G Systems
A neural receiver is a machine learning-based physical layer receiver that replaces or augments traditional signal processing blocks—such as channel estimation, equalization, and detection—with a unified, data-driven model. In contrast to conventional receivers that rely on handcrafted algorithms and strict mathematical models of the wireless channel, neural receivers learn to perform these operations jointly by training on large datasets of labeled I/Q samples or OFDM resource grids.
Comparison with Traditional Receivers
The table below explains some of the differences between components in traditional receiver architectures and components in neural receiver architectures.
| Component | Traditional Receiver | Neural Receiver |
|---|---|---|
| Channel Estimation | Least Squares (LS), MMSE estimators | Learned directly from pilot and data patterns |
| Equalization | Zero-Forcing, MMSE equalizers | Implicitly learned during training |
| Symbol Detection | QAM demodulation, hard/soft decision | Jointly learned with other tasks |
| Architecture | Modular, deterministic | End-to-end differentiable neural network |
| Input | OFDM resource grid or raw IQ samples | IQ tensors or pilot+data grid |
| Output | Estimated bits or LLRs | Bits or probabilities |
Typical Neural Receiver Architecture
A commonly used architecture, such as DeepRx, treats the resource grid as a 2D input, similar to an image, where time and frequency correspond to the two axes. This allows the use of convolutional neural networks (CNNs), recurrent neural networks (RNNs), or transformer-based models.
- Input: Complex-valued OFDM resource grid, with pilot and data symbols.
- Layers: Convolutional or attention-based layers extract spatial and temporal features.
- Output: Recovered symbols or bits with associated confidence scores.
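To make the idea concrete, the Keras sketch below builds a small convolutional model that maps a resource grid (real and imaginary parts stacked as channels) to per-bit log-likelihood ratios. The dimensions (14 OFDM symbols, 132 subcarriers, 4 bits per resource element) are illustrative placeholders, and the model is deliberately much simpler than DeepRx.

```python
import tensorflow as tf

# Illustrative dimensions: 14 OFDM symbols x 132 subcarriers, 4 bits per resource element.
NUM_SYMBOLS, NUM_SUBCARRIERS, BITS_PER_RE = 14, 132, 4

def build_toy_neural_receiver():
    # Input: resource grid with real/imag parts stacked as two "image" channels.
    grid_in = tf.keras.Input(shape=(NUM_SYMBOLS, NUM_SUBCARRIERS, 2))

    # A few convolutional layers extract joint time-frequency features,
    # implicitly performing channel estimation, equalization, and detection.
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(grid_in)
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)

    # Output: one logit (LLR) per transmitted bit of every resource element.
    llr_out = tf.keras.layers.Conv2D(BITS_PER_RE, 1, padding="same")(x)
    return tf.keras.Model(grid_in, llr_out)

model = build_toy_neural_receiver()
model.summary()
```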
Training Process
The items below describe the training process that was used in this implementation.
The use of the SigMF data format allows for the storage of raw IQ data with comprehensive metadata, which provides context.
- Dataset: Generated from link-level simulation tools, such as Sionna, under various channel models (AWGN, LOS, NLOS, 3GPP, etc.).
- Format: Datasets are stored in the SigMF format, containing raw IQ samples with metadata.
- Loss Function: Cross-entropy or binary cross-entropy; optionally soft LLR loss for reliability-aware decoding.
- Optimizer: Adam, SGD, or custom schedulers suitable for low-SNR scenarios.
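Assuming bit labels aligned with the output of the toy model sketched above, a single training step could look like the following; dataset loading from the SigMF recordings and the validation loop are omitted.

```python
import tensorflow as tf

# Binary cross-entropy on logits treats the model outputs as LLR-like values.
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(model, grid_batch, bit_labels):
    with tf.GradientTape() as tape:
        llrs = model(grid_batch, training=True)
        loss = bce(bit_labels, llrs)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```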
Deployment Aspects
Trained models are deployed in real time using TensorRT runtimes on edge hardware, as summarized below.
- GPUs: NVIDIA A100, RTX 4090, RTX 4070, RTX 4060.
- Integration: Plugged into OAI physical-layer receiver chains.
- Latency: Achieves under 500 μs processing delay for 30 kHz SCS, meeting real-time slot timing requirements.
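One possible way to prepare a trained model for low-latency GPU inference is TensorFlow's built-in TF-TRT converter, as sketched below; the SavedModel paths are placeholders, and the integration into the OAI receiver chain through the C API is a separate step not shown here.

```python
import tensorflow as tf

# Convert a trained SavedModel into a TensorRT-optimized SavedModel (TF-TRT).
converter = tf.experimental.tensorrt.Converter(
    input_saved_model_dir="saved_model/neural_rx",  # placeholder input path
    precision_mode="FP16",
)
converter.convert()
converter.save("saved_model/neural_rx_trt")          # placeholder output path
```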
Performance Summary
The table below shows a comparison of various performance metrics between traditional receiver architectures and neural receiver architectures.
| Metric | Traditional Receiver | Neural Receiver |
|---|---|---|
| Block Error Rate (BLER) under low SNR | Higher | Lower (up to 3 dB gain) |
| Complexity | Fixed, low | Tunable, moderate |
| Latency | Very low | Real-time, under 500 μs |
| Generalization | Poor to unseen channels | Better with diverse training data |
| Interpretability | High (white box) | Lower (black box) |
Use Cases
The list below highlights some common use-cases for Neural Receivers.
- Uplink Neural PHY Receiver: Real-time decoding at gNB.
- Channel Tracking: Adaptive to fast-fading and mobility scenarios.
- Joint Equalization and Detection: Reduces end-to-end BER and BLER.
- Massive MIMO: Scalable to high-dimensional antennas with deep models.
Challenges
There are several challenges to the practical realization of Neural Receivers, as listed below.
- Requires extensive datasets for generalization.
- Less interpretable compared to traditional receiver implementations.
- Hardware deployment must meet strict real-time constraints.
- Careful calibration and interface with existing stacks (such as OAI) are needed.
Summary
The neural receiver presents a promising paradigm for future 6G systems by enabling adaptive, intelligent, and performance-enhancing physical-layer decoding using machine learning. When paired with platforms like the NI USRP radios and AI accelerators, it opens the path for real-time AI-native physical-layer design.
System Validation Checklist
To ensure a consistent and reproducible set-up for our AI-enabled wireless testbed, we employ a systematic validation checklist. The table below summarizes the key checks, commands, and expected outputs that must be verified before conducting experiments. This process guarantees that both the hardware (e.g., USRPs, GPUs) and the software (e.g., operating systems, drivers, TensorFlow, UHD) are correctly installed and aligned with the requirements.
The checklist covers three broad areas:
- Operating system and hardware readiness: Includes verification of the installed OS, BIOS version, kernel, and GPU drivers.
- USRP connectivity and configuration: Ensures that USRPs are discovered, their file systems are compatible with UHD, and network parameters (e.g., MTU size, socket memory) are tuned for high-throughput streaming.
- Software stack and runtime optimization: Covers validation of TensorFlow, NVIDIA TensorRT, and TF C-API installation, as well as disabling unnecessary system services (e.g., updates, GNOME display manager) that may negatively impact performance.
This structured approach minimizes setup errors and improves reproducibility across different machines and deployments.
| Check Item | Command | Desired Output |
|---|---|---|
| Operating System | hostnamectl | Ubuntu 22.04.5 |
| BIOS version | sudo dmidecode -s bios-version | Check with the system vendor for the latest version |
| Verify GPU | lspci \| grep -i nvidia | Example: an RTX 4090 may appear as 17:00.0 VGA compatible controller: NVIDIA Corporation Device 2684 (rev a1) |
| Nvidia driver version and CUDA version | nvidia-smi | Nvidia driver version 535.183.01 and CUDA version 12.2 |
| GPU load | nvidia-smi | Load in percentage |
| Kernel version | uname -r | 6.5.0-44-generic |
| Cores operation mode | cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | All cores should show performance |
| Cores clock rate | watch -n1 "grep 'cpu MHz' /proc/cpuinfo" | Should be above the base clock rate and below the turbo clock rate |
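For repeatability, parts of the checklist can be scripted. The sketch below automates two of the checks from the table above (kernel version and CPU scaling governor); it is an illustrative example rather than a complete validation tool.

```python
import glob
import platform

# Kernel version check (compare against the version validated for the testbed).
print("Kernel:", platform.release())

# All CPU cores should report the "performance" scaling governor.
governors = set()
for path in glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"):
    with open(path) as f:
        governors.add(f.read().strip())

if governors == {"performance"}:
    print("All cores are using the performance governor")
else:
    print("Unexpected governor settings:", governors)
```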