Nvidia Rubin NVL72 report points to memory cut amid supply-chain constraints

SemiAnalysis said planned per-rack memory capacity and rack cost for the AI server cluster had been reduced, while institutions later said the change affects only CPU-side pluggable memory rather than GPU-linked HBM demand.

Fact Check
Multiple independent sources confirm both elements of the claim. The BlockBeats 349554 and Odaily reports directly cite SemiAnalysis's June 4 research, which describes a cut in Rubin NVL72 per-rack memory from 55TB to 28TB and a lower rack cost, both attributed to supply-chain constraints (specifically a shortage of 192GB SOCAMM modules). The BlockBeats 349820 and 349560 reports carry institutional clarifications that the change affects only CPU-side pluggable SOCAMM memory, not GPU-linked HBM demand. A Reddit AMD_Stock discussion provides additional corroboration. The primary SemiAnalysis report is paywalled and was not fetched directly, but founder Dylan Patel's acknowledgment, reported by Odaily, confirms that the underlying report exists, although he contests some of the framing.
Summary

Nvidia’s next-generation Rubin NVL72 AI server cluster remained in focus after SemiAnalysis said planned per-rack memory capacity would be reduced to 28TB from 55TB, with most systems using 96GB SOCAMM modules instead of the planned 192GB. The report also said the change would lower rack cost to $6.8 million from $7.6 million, and its publication triggered a global pullback in storage-related stocks. Institutions later said the reduction applies only to CPU-side pluggable memory, while demand for high-bandwidth memory tied to GPU computing remains intact, softening concerns about a broader hit to AI-memory demand.
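
As a rough check on the reported figures, the short sketch below computes the percentage change in each one. It uses only the numbers cited above; the helper name `pct_drop` is illustrative and does not come from the SemiAnalysis report.

```python
def pct_drop(before: float, after: float) -> float:
    """Return the percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


# Figures as reported in the summary above.
module_drop = pct_drop(192, 96)   # SOCAMM module size, GB
memory_drop = pct_drop(55, 28)    # per-rack memory, TB
cost_drop = pct_drop(7.6, 6.8)    # per-rack cost, USD millions

print(f"SOCAMM module size: down {module_drop:.1f}%")
print(f"Per-rack memory:    down {memory_drop:.1f}%")
print(f"Rack cost:          down {cost_drop:.1f}%")
```

On these figures, per-rack memory falls by roughly half, in line with the halving of module size, while rack cost falls by only about a tenth.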

Terms & Concepts
  • Rubin NVL72: Nvidia’s next-generation AI server cluster discussed in the report.
  • SOCAMM: A pluggable memory module format used on the CPU side of Rubin server configurations.
  • High-bandwidth memory (HBM): A type of advanced memory used alongside GPUs to support intensive AI computing workloads.