INFRASTRUCTURE
Blackwell
Overview
Blackwell is NVIDIA’s flagship 2024–2026 AI GPU architecture (B100 / B200 / GB200 / GB300 product family) and the predecessor to the Rubin platform. For the AI Digest corpus, its most important 2026 feature is NVFP4, the native FP4 path for both training (mixed FP8 + FP4) and inference. NVFP4 is the substrate that the May 10 DeepSeek V4 paper validates: V4’s FP4 quantization-aware-trained MoE expert weights are built for Blackwell’s FP4 path, and NVIDIA’s own developer blog promotes the integration. The cost-curve pressure from FP4 lands on FP8-era incumbents, not on Blackwell.
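To make "FP4 quantization-aware-trained weights" concrete, here is a minimal sketch of block-scaled 4-bit fake quantization in plain Python. It is not NVIDIA's or DeepSeek's implementation. The E2M1 value grid and the 16-element block size are assumptions drawn from public descriptions of NVFP4, and the names `quantize_block` and `fake_quantize` are hypothetical. Real kernels also store each block scale in a low-precision format, which this sketch does not model.

```python
# Illustrative only: simulates block-scaled 4-bit quantization in plain Python.
# ASSUMPTIONS: the E2M1 magnitude grid and a 16-element block size, as publicly
# described for NVFP4. The per-block scale is kept as a Python float here,
# whereas real hardware stores it in a low-precision format.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # magnitudes a 4-bit E2M1 value can take
BLOCK_SIZE = 16  # assumed number of weights sharing one scale


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from the E2M1 grid."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / E2M1_GRID[-1]  # map the largest magnitude in the block onto 6.0
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def fake_quantize(weights):
    """Quantize then dequantize block by block, as a QAT forward pass would see it."""
    out = []
    for start in range(0, len(weights), BLOCK_SIZE):
        block = weights[start:start + BLOCK_SIZE]
        scale, codes = quantize_block(block)
        out.extend(code * scale for code in codes)
    return out
```

Values that already sit on the scaled grid round-trip unchanged, for example `fake_quantize([12.0, 1.0])` gives back `[12.0, 1.0]`. Anything in between snaps to the nearest grid point, and that rounding error is what quantization-aware training teaches the weights to tolerate.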
Timeline
- 2026-05-05-AI-Digest: NVIDIA opens the Rubin platform for H2 2026 cloud distribution, framed against Blackwell with headline performance claims of 3.5× training throughput, 5× inference throughput, and 8× power efficiency. Microsoft’s Fairwater data-centre sites in Wisconsin and Atlanta are already operating Vera Rubin NVL72 racks. This closes the distribution loop on the Blackwell → Rubin generational-shift narrative tracked since March 2026.
- 2026-05-08-AI-Digest: xAI leases the entirety of Colossus 1 (222,000 H100 / H200 / GB200 GPUs, 300+ MW) to Anthropic for Claude serving. This single-counterparty placement of Blackwell + Hopper inventory also reads as xAI monetising capacity that is surplus to Grok’s serving load.
- 2026-05-10-AI-Digest: DeepSeek V4’s full paper lands with FP4 quantization-aware training applied to MoE expert weights (FP8 elsewhere in the stack), with real FP4 weights used during inference and RL rollout. The Reddit framing (“FP4 end-to-end resets the cost curve and pressures NVIDIA’s Blackwell FP4 narrative”) overshoots: V4 is an FP8 + FP4 mix, not end-to-end FP4, and is built for Blackwell’s NVFP4 path. NVIDIA’s developer blog promotes the integration. Cleaner read: V4 is a Blackwell validator, not a Blackwell threat.
Key Developments
- NVFP4 as the 2026 Substrate: Blackwell’s native FP4 path is the hardware feature open-weights frontier MoE training is now targeting. The DeepSeek V4 FP4 QAT result is the first open-weights frontier MoE with FP4 expert weights and a co-released FP4 train+serve stack; it is built for, not against, the NVFP4 path.
- Generational Position vs. Rubin: Rubin’s 3.5×/5×/8× headline gains over Blackwell (training throughput, inference throughput, power efficiency) frame Blackwell as the 2024–2026 deployment workhorse. Blackwell remains the volume tier through 2026 even as Rubin enters production at Microsoft Fairwater.
- Cost-Curve Pressure Lands on FP8 Incumbents: The FP4 train+serve curve compresses cost for hardware that supports the path. The pressure lands on FP8-era H100 / earlier-generation incumbent fleets, not on NVFP4-capable Blackwell or Rubin.
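One input into the cost-curve point above is weight memory footprint, which can be sketched with simple arithmetic. The helper name `weight_bytes`, the parameter count, and the scale layout (one 8-bit scale per 16-value block) are illustrative assumptions, not figures from the DeepSeek V4 paper or from NVIDIA. Footprint is only a proxy: actual serving cost also depends on throughput, power, and utilisation.

```python
# Rough weight-storage arithmetic for a block-scaled low-precision format.
# ASSUMPTIONS: hypothetical parameter count; one 8-bit scale per 16 values
# for the FP4 case; no scale overhead for the FP8 baseline.


def weight_bytes(n_params, bits_per_value, block_size=None, scale_bits=0):
    """Bytes needed to store n_params values plus optional per-block scales."""
    total_bits = n_params * bits_per_value
    if block_size:
        n_blocks = -(-n_params // block_size)  # ceiling division
        total_bits += n_blocks * scale_bits
    return total_bits / 8


n_expert_params = 1_000_000_000  # hypothetical count of expert weights

fp8_bytes = weight_bytes(n_expert_params, bits_per_value=8)
fp4_bytes = weight_bytes(n_expert_params, bits_per_value=4,
                         block_size=16, scale_bits=8)

# Under these assumptions, FP4 costs 4.5 bits per weight against 8 for FP8.
ratio = fp4_bytes / fp8_bytes
```

With these inputs the ratio comes out to 0.5625, so the block-scaled FP4 weights take a little over half the FP8 footprint. That saving is only available to hardware that can execute the FP4 path natively, which is the asymmetry the bullet describes.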
Related
See also: NVIDIA, Rubin, DeepSeek, Colossus 1, MOC - AI Infrastructure.