Intelligence Report: China's Leap in High-Precision Analog Computing (40 nm RRAM) – Architectural Evasion and Dual-Use Superiority in 6G and Electronic Warfare

Gabriele Iuvinale
2 days ago
Reading time: 10 min

I. Executive Summary: The Analog Breakthrough and Sanctions Evasion

The innovation presented by the team led by Dr. Sun Zhong, affiliated with the Peking University Artificial Intelligence Research Institute, marks a fundamental turning point in computer architecture. The team successfully developed a scalable, high-precision analog matrix processing chip based on Resistive Random-Access Memory (RRAM), resolving the century-old "precision bottleneck" that had limited analog computing's applicability. The results, published (1) in the international academic journal Nature Electronics, demonstrate the feasibility of analog computation with precision comparable to high-end digital computing.

Computer chip with Chinese flag - GettyImages

This chip is not an incremental improvement but an architectural transformation with profound strategic implications. It experimentally demonstrated matrix inversion with 24-bit fixed-point precision (comparable to 32-bit floating point, FP32). Performance evaluations indicate a potential computational throughput up to 1,000 times higher, and energy efficiency over 100 times better, than current high-end digital GPUs for solving complex linear algebra problems.

Geopolitical Context and Self-Sufficiency

In the context of global technological competition and the "chip war" between the United States and China, where Washington imposes strict export restrictions on advanced semiconductors (such as Nvidia H100 and A100 GPUs) and cutting-edge fabrication equipment, this discovery holds exceptional strategic value for Beijing. The Chinese government has responded by accelerating its technological autonomy strategy, for instance by requiring public data centers to use at least 50% domestic chips by 2025.

The Value of Domestic High-End Chips:

The discovery proves that it is practicable to domestically develop and produce specialized, ultra-high-performance accelerators for critical linear algebra tasks—the core of AI and 6G communications—without relying on Western fabrication technology.

Leveraging Mature Nodes: The Peking University chip was fabricated on a commercial 40-nm CMOS platform. This node is considered legacy (mature), is widely accessible, and is not subject to the most severe US export restrictions.
Vertical Architectural Innovation: By utilizing the alternative architecture of analog in-memory computing (IMC), China can sidestep the need for advanced nanometer nodes and still gain a massive performance advantage (up to 1,000× throughput) in specific algorithmic classes. This success validates the PRC's national strategy of achieving indigenous resilience and architectural superiority while bypassing process technology restrictions.

Dual-Use Implications:

Military (MI): The primary advantage is decision speed. The atomic operational latency for matrix inversion, measured at approximately 120 nanoseconds (ns) for the Low-Precision Inversion (LP-INV) operation, combined with the near O(1) time complexity, is a critical enabler for next-generation Electronic Warfare (EW) and adaptive radar systems, where delays measured in milliseconds are unacceptable.
Commercial (CI): The 100× energy efficiency gain directly translates into a profound transformation of the economics of data centers and 6G Massive MIMO infrastructure. It forecasts a drastic reduction in the Total Cost of Ownership (TCO) for operators (OpEx), as energy and cooling costs are significantly minimized.

The successful demonstration in Massive MIMO signal detection (128×8 in just three iterations) confirms the potential of this architecture to reshape the global computing landscape.

II. The Digital Crisis and the Return of Analog Computing

A. The Von Neumann Bottleneck and the Matrix Burden

Modern computing is built around the von Neumann architecture, characterized by the physical separation between the processing unit (CPU or GPU) and memory. This separation creates the "von Neumann bottleneck," where the incessant transfer of large amounts of data limits system speed and energy efficiency.

Most computationally intensive tasks in the Big Data era, including signal processing in communication base stations, solving differential equations in scientific computing, and optimizing parameters for training large Artificial Intelligence models, essentially boil down to solving complex matrix equations (Ax=b). Matrix inversion in particular is computationally demanding. Standard digital methods for high-precision inversion scale polynomially, typically as O(N^3), where N is the matrix size. As data volumes grow rapidly, this computational burden and the associated energy requirement have become a major obstacle.
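To make the cubic scaling concrete, here is a small sketch in Python. It uses the textbook leading-order operation count for LU factorization, about 2N^3/3 floating-point operations; the function name lu_flops is hypothetical and the count ignores lower-order terms.

```python
def lu_flops(n):
    """Leading-order floating-point operation count for LU-factorizing an n x n matrix."""
    return (2 * n**3) / 3

# Doubling the matrix size multiplies the work by 2^3 = 8.
ratio = lu_flops(256) / lu_flops(128)
```

Under this model, going from N = 128 to N = 256 makes a digital solve about eight times more expensive, which is the growth the analog approach tries to avoid.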

B. Analog Computing: Intrinsic Advantages and Historical Challenges

In response to the limits of digital computing, analog technology has re-emerged as a vital alternative. Unlike digital systems that represent data in discrete increments (bits 0 and 1), analog systems use continuous physical quantities, such as voltage, current, or conductance, to represent and manipulate data.

The fundamental advantage of analog computing, particularly for linear algebra, is its ability to directly exploit physical laws (Ohm's and Kirchhoff's laws) to perform parallel calculations, eliminating the need for discrete iterations for multiplication and accumulation. Matrix-Vector Multiplication (MVM), the dominant operation, is performed in a single physical step in a resistive memory array, drastically reducing latency. For systems fitting within the array size, the time complexity of an MVM operation can be considered O(1).
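The single-step MVM can be illustrated numerically. The following is a minimal sketch in Python with NumPy, assuming an idealized array with no wire resistance, noise, or device non-linearity; the function name crossbar_mvm and the conductance and voltage values are hypothetical.

```python
import numpy as np

def crossbar_mvm(G, v):
    """Output currents of an idealized resistive crossbar.

    G[i, j] is the conductance (siemens) of the device joining input row i
    to output column j; v[i] is the voltage applied to row i.
    Ohm's law gives each device current G[i, j] * v[i]; Kirchhoff's current
    law sums those currents on each column wire.
    """
    G = np.asarray(G, dtype=float)
    v = np.asarray(v, dtype=float)
    return G.T @ v  # I[j] = sum over i of G[i, j] * v[i]

# Hypothetical 3x2 array with microsiemens-scale conductances
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])
v = np.array([0.1, 0.2, 0.3])   # input voltages in volts
I = crossbar_mvm(G, v)          # column currents in amperes
```

In hardware every product and every sum happens at once in the physics of the array, so the software loop implied by the matrix product has no counterpart in time.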

However, the historical flaw that relegated analog computing was its inherent lack of precision and scalability. Analog operations are sensitive to noise, non-linearities, and device variability. Making analog processing sufficiently precise and scalable for modern scientific computing tasks had been considered an unresolved problem for a century.

C. RRAM as a Platform for In-Memory Computing

Resistive Random-Access Memory (RRAM) is the core technology chosen by the Peking University team to implement Analog Matrix Computing (AMC). RRAM is a non-volatile memory where the conductance of each device (controllable via voltage pulses) acts as a matrix element, storing weights.

The demonstration was realized on TaOx-based RRAM chips, fabricated on a commercial 40-nm CMOS platform. The cells use a 1T1R (One-Transistor-One-Resistor) structure. A crucial element for precision was achieving eight discrete conductance levels per cell, equivalent to 3 bits of resolution. The reliability of these levels was guaranteed via a write-verify method, achieving 100% programming success across 400 tested cells.
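A write-verify loop of the kind described above can be sketched as follows. This is a toy model in Python, not the team's programming procedure: the evenly spaced levels, the conductance range, the tolerance, and the device response (each pulse closing a random 30% to 90% of the remaining gap) are all assumptions made for illustration.

```python
import numpy as np

LEVELS = 8  # eight conductance states per cell, i.e. 3 bits

def target_conductances(g_min, g_max):
    """Eight evenly spaced target conductances (even spacing is an assumption)."""
    return np.linspace(g_min, g_max, LEVELS)

def write_verify(target, tolerance, rng, max_pulses=100):
    """Pulse, read back, and repeat until the cell is within tolerance.

    Returns (final conductance, pulses used, success flag).
    """
    g = 0.0
    for pulse in range(1, max_pulses + 1):
        # Hypothetical device response: move part of the way to the target.
        g += (target - g) * rng.uniform(0.3, 0.9)
        if abs(g - target) <= tolerance:      # verify step
            return g, pulse, True
    return g, max_pulses, False

rng = np.random.default_rng(0)
levels = target_conductances(5e-6, 100e-6)    # hypothetical range in siemens
cells = rng.integers(0, LEVELS, size=400)     # 400 cells, matching the reported test size
results = [write_verify(levels[k], tolerance=1e-6, rng=rng) for k in cells]
success_rate = sum(ok for _, _, ok in results) / len(results)
```

The point of the loop is that the final state is accepted only after a read-back confirms it, which is what lets a noisy device hold one of eight distinguishable levels.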

III. The Peking University Breakthrough: Achieving Digital Precision

The true innovation of Dr. Sun Zhong’s team lies in the co-design of devices, circuits, and algorithms, culminating in the HP-INV (High-Precision INVersion) scheme, which is fully analog yet enhanced by digital-like precision techniques.

A. The Iterative HP-INV Scheme: LP-INV and Bit-Slicing for HP-MVM

The HP-INV scheme is conceptually an iterative refinement algorithm implemented in the analog domain. It combines two main operations: Low-Precision Inversion (LP-INV) and High-Precision Matrix-Vector Multiplication (HP-MVM).

1. LP-INV Operation

An 8×8 RRAM array chip is configured as a closed-loop circuit to execute inversion in a single step, providing the approximate initial solution in about 120 ns. Although the operation itself may have limited accuracy (experimentally, around −2.4 bits of precision in some cases), its role is to provide a rapid starting point, significantly reducing the total number of cycles needed for final convergence.

2. HP-MVM Operation via Bit-Slicing

To ensure that iterative refinement converges to the desired accuracy, the residual error must be calculated with high precision.

The team achieved HP-MVM using the bit-slicing technique. The high-precision matrix A (up to 24 bits) is sliced into several low-precision matrices (3 bits each). These slice matrices are mapped onto separate RRAM arrays (a 1-Mb RRAM chip was used for this). Each array performs a Low-Precision MVM (LP-MVM), and the partial results are then combined via digital shift-and-add operations to synthesize the high-fidelity HP-MVM output.
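The slicing and recombination can be sketched in a few lines of Python. This is a simplified illustration that assumes non-negative integer matrix entries and exact low-precision products; the function names bit_slice and hp_mvm are hypothetical, and sign handling and analog noise are left out.

```python
import numpy as np

BITS_PER_SLICE = 3

def bit_slice(A_int, n_slices):
    """Split a non-negative integer matrix into n_slices matrices of 3-bit values,
    least significant slice first."""
    mask = (1 << BITS_PER_SLICE) - 1
    return [(A_int >> (BITS_PER_SLICE * k)) & mask for k in range(n_slices)]

def hp_mvm(slices, x):
    """Recombine per-slice products by digital shift-and-add."""
    y = np.zeros(slices[0].shape[0], dtype=np.int64)
    for k, S in enumerate(slices):
        # S @ x stands in for one low-precision MVM on its own array.
        y += (S @ x) << (BITS_PER_SLICE * k)
    return y

rng = np.random.default_rng(1)
A = rng.integers(0, 2**24, size=(4, 4), dtype=np.int64)   # 24-bit entries
x = rng.integers(0, 8, size=4, dtype=np.int64)
slices = bit_slice(A, n_slices=8)                          # 8 slices of 3 bits = 24 bits
y = hp_mvm(slices, x)
```

Each slice holds only values from 0 to 7, so it fits the eight conductance levels of a cell, while the shifted sum reproduces the full 24-bit product.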

This combination allowed the experimental solution of a 16×16 real-valued matrix inversion problem with 24-bit fixed-point precision (FP32 equivalent). After 10 matrix equation solving iterations, the relative error was reduced to the order of 10^-7.
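The interplay of a coarse inversion and a precise residual can be sketched as classical iterative refinement. The Python below is a numerical analogy rather than the published circuit: the quantize helper, the 3-bit rounding model for the coarse inverse, and the diagonally dominant test matrix are assumptions chosen so that convergence is guaranteed.

```python
import numpy as np

def quantize(M, bits):
    """Round M to a signed fixed-point grid with the given number of bits,
    scaled to its largest entry (a crude stand-in for limited analog precision)."""
    scale = np.max(np.abs(M))
    steps = 2 ** (bits - 1) - 1
    return np.round(M / scale * steps) / steps * scale

def hp_inv_sketch(A, b, iterations=10, lp_bits=3):
    """Iterative refinement: coarse inverse for corrections, precise residuals."""
    A_inv_lp = np.linalg.inv(quantize(A, lp_bits))   # plays the LP-INV role
    x = np.zeros_like(b)
    errors = []
    for _ in range(iterations):
        r = b - A @ x            # high-precision residual (the HP-MVM role)
        x = x + A_inv_lp @ r     # low-precision correction
        errors.append(np.linalg.norm(b - A @ x) / np.linalg.norm(b))
    return x, errors

rng = np.random.default_rng(2)
A = rng.uniform(-1.0, 1.0, size=(16, 16))
np.fill_diagonal(A, 32.0)        # strongly diagonally dominant test matrix
b = rng.standard_normal(16)
x, errors = hp_inv_sketch(A, b)
```

With this test matrix the 3-bit rounding keeps only the dominant diagonal, so the coarse inverse is very rough, yet the relative residual still shrinks at every pass because the residual itself is computed at full precision. The error figures this toy produces are not comparable to the measured 10^-7 result.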

B. Scalability Achieved via BlockAMC

To extend the chip's application to larger matrices, the BlockAMC (Block Analog Matrix Computing) algorithm was integrated. Since physical RRAM arrays for the LP-INV circuit have size limitations (the demonstrated circuit was 8×8), BlockAMC partitions large matrices into smaller blocks, enabling sequential or parallel processing across multiple arrays.

The asymptotic complexity analysis shows that the number of operations required for inversion grows as O(N^1.59), and for HP-MVM as O(N^2). While MVM on a single array is O(1), scaling through BlockAMC is still drastically better than the inherent O(N^3) complexity of digital processors.
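The idea of solving a large system with only block-sized operations can be sketched with textbook 2×2 block elimination (the Schur complement). This is an illustration of partitioning in general, not necessarily the exact BlockAMC procedure; the function name block_solve and the test matrix are hypothetical.

```python
import numpy as np

def block_solve(A, b, block):
    """Solve A x = b by 2x2 block elimination, using only block-sized
    inversions and matrix products."""
    A11, A12 = A[:block, :block], A[:block, block:]
    A21, A22 = A[block:, :block], A[block:, block:]
    b1, b2 = b[:block], b[block:]
    A11_inv = np.linalg.inv(A11)               # stands in for one array-sized INV
    S = A22 - A21 @ A11_inv @ A12              # Schur complement of A11
    x2 = np.linalg.solve(S, b2 - A21 @ A11_inv @ b1)
    x1 = A11_inv @ (b1 - A12 @ x2)
    return np.concatenate([x1, x2])

rng = np.random.default_rng(4)
A = rng.uniform(-1.0, 1.0, size=(16, 16)) + 20.0 * np.eye(16)
b = rng.standard_normal(16)
x_block = block_solve(A, b, block=8)           # 16x16 problem, 8x8 blocks
```

Here a 16×16 system is handled with two 8×8 inversions plus MVM-style products, which mirrors how a fixed 8×8 inversion circuit could serve a matrix larger than itself.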

IV. Performance Benchmarking and Commercial Impact

A. Quantitative Benchmarking and Energy Advantages

Comparative benchmarking highlighted a significant performance gap between the Peking University analog architecture and standard digital processors, while maintaining parity in the final achieved precision (equivalent to FP32).

The comparison was made against the single-core performance of high-end digital GPUs, such as Nvidia H100 and AMD Vega 20, for matrix inversion workloads.

Peak Performance and Efficiency (FP32 Equivalent Precision)

Scenario	Metric	PKU Analog Chip (BlockAMC)	High-End Digital GPU (Single Core)	Advantage (Factor)
INV Resolution ()	Computing Power	Surpasses single-core GPU	1x	>1x
INV Resolution ()	Equivalent Throughput	Up to 1,000x higher	1x (GPU core)	Up to 1,000x
Energy Efficiency	Operations per Watt	Over 100x better	1x	Over 100x
Algorithmic Complexity (INV)	Scaling	O(N^1.59) (asymptotic)	O(N^3)	Structural Advantage

For high-intensity linear algebra problems, this indicates a transformation in productivity. At a throughput advantage on the order of 1,000×, work that occupies a traditional GPU for an entire day (1,440 minutes) could be completed in roughly one minute. The measured transient response time for the LP-INV operation was approximately 120 ns, and for MVM, about 60 ns.

B. Commercial Intelligence: 6G Infrastructure and Data Center TCO

1. 6G Infrastructure and Operational Costs (OpEx)

The immediate and strategically resonant application is signal detection in Massive MIMO systems, crucial for enhancing wireless communications in the 5G-A and 6G eras.

The team successfully applied HP-INV to a 128×8 Massive MIMO system using 256-QAM modulation. Results showed that after only three iterations, the Bit Error Rate (BER) was comparable to that obtained with 32-bit digital computing (FP32).
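To show where a matrix equation appears in such a detector, here is a minimal sketch of zero-forcing detection in Python. Zero-forcing is used as a standard example and is an assumption about the detector; the channel model, the noiseless transmission, and the function names are hypothetical.

```python
import numpy as np

def zero_forcing_detect(H, y):
    """Zero-forcing estimate x = (H^H H)^{-1} H^H y."""
    gram = H.conj().T @ H                     # 8x8 Hermitian matrix for a 128x8 channel
    matched = H.conj().T @ y                  # matched-filter output
    return np.linalg.solve(gram, matched)     # the step a matrix-equation solver would take over

def to_real(M):
    """Equivalent real-valued form of a complex matrix: [[Re, -Im], [Im, Re]]."""
    return np.block([[M.real, -M.imag], [M.imag, M.real]])

rng = np.random.default_rng(3)
n_rx, n_tx = 128, 8
H = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
# 256-QAM: real and imaginary parts each take one of 16 odd-integer levels
pam = np.arange(-15, 16, 2)
x = rng.choice(pam, n_tx) + 1j * rng.choice(pam, n_tx)
y = H @ x                                     # noiseless transmission, for clarity
x_hat = zero_forcing_detect(H, y)
gram_real = to_real(H.conj().T @ H)           # 16x16 real-valued form of the 8x8 system
```

However large the receive array, the system to solve is only 8×8 in complex form, or 16×16 once written with real numbers. Whether the paper maps the problem onto its arrays in exactly this way is an assumption.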

The documented 100× energy efficiency gain drives a deep economic disruption in telecommunications. 6G networks require extreme energy efficiency. The 100-fold reduction in energy needed for Massive MIMO processing translates into billions of dollars in operational cost (OpEx) savings for carriers, as energy and cooling costs are primary components of TCO.

2. Data Center Efficiency and PUE

The 100× efficiency bears directly on data center energy economics, which are usually tracked through Power Usage Effectiveness (PUE), the ratio of total facility energy to IT equipment energy. Since AI processing and scientific computing are matrix-intensive, a 100-fold reduction in energy required per IT compute unit significantly reduces the total energy consumed per workload, and with it the heat that cooling systems must remove. PUE is a ratio and does not necessarily improve, but absolute consumption falls. This operational advantage allows for data centers with significantly reduced physical footprints (CapEx), as cooling and space requirements are minimized. This capability is critical for achieving high-efficiency cloud computing centers.

V. Military and Geopolitical Intelligence: Dual-Use Superiority

A. Military Intelligence: Dominance in Low-Latency Signal Processing

The HP-INV's combination of sub-microsecond speed, ultra-high efficiency, and FP32 precision makes it a fundamental enabling technology for next-generation military systems operating in the electromagnetic spectrum.

1. Criticality of Speed in Electronic Warfare (EW)

Electronic Warfare (EW) systems have extremely strict latency requirements: EW must respond to a threat within nanoseconds, whereas radar can tolerate latencies measured in milliseconds. The operational goal is for cognitive EW systems to adapt and react to new threats at waveform speeds.

The AMC provides a decisive edge. The convergence time of the LP-INV operation, measured at approximately 120 ns, enables near-instantaneous response capability for critical functions such as adaptive filtering, covariance matrix inversion, or rapid jamming parameter calculation. This fundamental shift, from a mathematical complexity limit of O(N^3) to a physical time limit of roughly O(1) for critical operations, allows electronic countermeasure systems to react with unprecedented readiness.
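As an example of why covariance matrix inversion sits on the critical path, the sketch below computes minimum-variance distortionless-response (MVDR) beamformer weights in Python. MVDR is a standard textbook technique used here for illustration and is not taken from the source; the array geometry, jammer power, and function names are assumptions.

```python
import numpy as np

def mvdr_weights(R, a):
    """MVDR weights w = R^{-1} a / (a^H R^{-1} a)."""
    R_inv_a = np.linalg.solve(R, a)           # the covariance inversion step
    return R_inv_a / (a.conj() @ R_inv_a)

def steering_vector(n, theta):
    """Uniform linear array, half-wavelength spacing, angle theta in radians."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(theta))

n = 8
a_sig = steering_vector(n, 0.0)                            # desired signal at broadside
a_jam = steering_vector(n, np.deg2rad(20.0))               # jammer at 20 degrees
R = 100.0 * np.outer(a_jam, a_jam.conj()) + np.eye(n)      # jammer plus unit noise power
w = mvdr_weights(R, a_sig)
gain_sig = abs(w.conj() @ a_sig)   # response toward the signal, held at 1
gain_jam = abs(w.conj() @ a_jam)   # response toward the jammer, driven close to 0
```

Every time the interference environment changes, R changes and the solve must be repeated, so the latency of that one step bounds how quickly the system can adapt.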

2. Application to Adaptive Radar and Massive MIMO Radar

Advanced military radar systems, particularly active electronically scanned array (AESA) radars, require the real-time numerical inversion of large matrices. The chip's proven ability to solve a 128×8 Massive MIMO system with FP32 precision in just three iterative cycles validates its immediate utility for high-speed radar and next-generation communication applications. This acceleration ensures that latency is minimized, which is essential for tasks like precoding and decoding in large-scale phased arrays.

B. Geopolitics and the 40-nm Architectural Evasion

The technical development must be viewed through the lens of the PRC's aggressive pursuit of technological self-sufficiency and the Military-Civil Fusion (MCF) strategy.

1. Strategic Bypass of Export Controls

The US strategy focuses on blocking China’s access to advanced chips (≤12 nm) and the extreme ultraviolet (EUV) lithography equipment required for their manufacture.

The success of the HP-INV solver on a commercial 40-nm CMOS platform constitutes a clear evasion of these controls. It demonstrates that architectural innovation, specifically analog in-memory computing, can deliver crucial military capabilities (high-precision, low-latency signal processing) without dependence on advanced Western lithography.

This move validates the Chinese strategy of massive investment in R&D for non-traditional architectures to bypass process limitations.

2. Alignment with Military-Civil Fusion (MCF)

The research, originating from Peking University, aligns perfectly with the MCF strategy, which aims to leverage civilian sector research (academia and private industry) for the modernization of the People's Liberation Army (PLA). The high-efficiency, dual-use nature of this chip makes it a prime candidate for rapid spin-on into military systems, providing capabilities like reduced Size, Weight, Power, and Cost (SWaP-C) for deployable systems.

Comparative Strategic Analysis

Parameter	Analog RRAM Solver (China, PKU)	Advanced Digital GPU (USA, Nvidia H100)	Geopolitical Implication
Fabrication Node	40-nm CMOS	4-nm/5-nm/7-nm	Immunity to advanced node restrictions; domestic production guaranteed.
Vulnerability to Sanctions	Low (Legacy Node Production)	High (Dependence on advanced equipment)	Strategic resilience of the Chinese supply chain; circumvents export controls.
Computing Architecture	Analog In-Memory Computing (IMC)	Von Neumann Architecture (Digital Cores/Tensor Cores)	Exploration of a fundamentally more efficient, alternative path for matrix-intensive workloads.
Primary Strategic Application	High-Efficiency Scientific Computing, 6G Signal Detection, EW	Training Large LLM/Generative AI Models	Immediate and specific advantage in critical communication infrastructure and defense.

VI. Strategic Outlook and Recommendations

Based on the analysis, this technological leap demands a strategic re-evaluation by competing nations.

A. Long-Term Technological Trajectory

While current demonstrations are limited by array size, future research will focus on scaling the LP-INV circuit size beyond 8×8 (projected up to 32×32) and further minimizing latency by improving peripheral analog components, such as high Gain-Bandwidth Product (GBWP) operational amplifiers (OPAs), potentially reaching 20 ns INV / 10 ns MVM response times. This sustained focus aims to cement the architectural advantage over digital systems.

B. Strategic Recommendations

Revise Export Control Strategy: Export control parameters must be immediately revised and broadened. The current focus on lithography node size is insufficient to contain architectural innovation. Controls should be extended to target analog in-memory computing components, heterogeneous chiplet integration, and specialized design tools that enable analog refinement algorithms, mitigating the "architectural leap".
Mandatory Investment in Analog R&D (Military/R&D): Significantly increase funding and accelerate technology translation pathways for heterogeneous and analog computing technologies. The focus must be specifically on matching or exceeding the HP-INV latency and efficiency benchmarks (e.g., surpassing the approximately 120 ns LP-INV latency) for critical EW and adaptive radar systems. Competing on energy efficiency (Ops/W) is a national security imperative.
Redefine TCO Metrics (Commercial/Economic): Industry leaders and regulators must proactively update TCO models and efficiency standards for 6G infrastructure and AI data centers. The long-term economic superiority will be defined by energy efficiency. Aggressive investment in proprietary or allied IMC solutions is essential to maintain market competitiveness against potentially lower-cost, high-efficiency foreign infrastructures.
Rapid Vulnerability Assessment (Military): Initiate a comprehensive assessment of existing US/NATO signal processing systems (radar, EW, C4ISR) to quantify the performance gap created by the HP-INV's throughput advantage and nanosecond-scale latency. This is crucial for rapidly identifying systems most vulnerable to being tactically outpaced by AMC-enabled platforms.

(1) Zuo, P., Wang, Q., Luo, Y. et al. Precise and scalable analogue matrix equation solving using resistive random-access memory chips. Nat Electron (2025). https://doi.org/10.1038/s41928-025-01477-0