Nvidia's business plans for upcoming graphics processors and central processing units, including the Rubin, Rubin Ultra, and Feynman models, as well as their silicon photonics technology.

Nvidia plans to unveil advanced GPUs, CPUs, and photonic networking in its 2025-2028 roadmap, aiming for a dramatic generational increase in AI performance. The pinnacle of this development will be the NVL576 Kyber rack-scale system, connecting 576 GPU chiplets and promising a roughly 14-fold increase in performance...

Nvidia's plans for enterprise-level GPUs and CPUs: Rubin, Rubin Ultra, Feynman, and silicon photonics technology

Nvidia's Rubin Ultra GPUs Set to Revolutionize AI and HPC Workloads

Nvidia is gearing up for the release of its highly anticipated Rubin Ultra GPUs, scheduled for the second half of 2027. These cutting-edge GPUs (VR300) are expected to offer roughly twice the performance of the standard Rubin (VR200) GPUs, marking a significant leap in AI and high-performance computing (HPC) capabilities.

The Rubin Ultra GPUs will employ four GPU chiplets instead of the previous two, delivering approximately 100 petaflops (PFLOPS) of FP4 performance. This massive parallelism is aimed at AI inference workloads, making them a formidable tool for the future of AI and HPC.

One of the key features of the Rubin Ultra GPUs is the use of 1 terabyte (1024 GB) of HBM4E memory per package, offering a bandwidth of up to 32 terabytes per second (TB/s). However, this powerhouse comes with substantial power and thermal requirements, demanding around 3,600 watts per GPU package, necessitating advanced cooling solutions such as liquid cooling.
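The thermal figures above can be put in perspective with a quick power-budget sketch. Note that the 144-package count and the air-cooled-rack comparison are illustrative assumptions, not figures from Nvidia:

```python
# Why liquid cooling: a rough GPU power budget for one NVL576 rack.
# Assumption (not from the article): 576 chiplets / 4 per package = 144
# Rubin Ultra packages per rack, each drawing ~3,600 W.
WATTS_PER_PACKAGE = 3_600
PACKAGES = 576 // 4  # 144 packages (assumption)

gpu_power_kw = WATTS_PER_PACKAGE * PACKAGES / 1000
print(f"GPU packages alone: ~{gpu_power_kw:.0f} kW")  # ~518 kW
```

Conventional air-cooled racks typically stay in the tens of kilowatts, so a GPU budget of this magnitude makes direct liquid cooling effectively mandatory.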

In the NVL576 Kyber rack-scale platform, Nvidia plans to connect 576 Rubin Ultra GPU chiplets (144 four-chiplet packages) with Vera CPUs, achieving 15 exaflops of FP4 inference performance and 5 exaflops of FP8 training performance. This is roughly 14 times faster than the previous GB300 generation, marking a significant leap in computational power.
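A back-of-the-envelope check shows how these headline numbers line up, assuming (these are assumptions, not official Nvidia figures) that the "576" counts GPU chiplets and that a GB300 NVL72 rack delivers about 1.1 exaflops of FP4 inference:

```python
# Sanity check of the NVL576 headline numbers quoted above.
# Assumptions: "576" counts GPU chiplets (144 packages of four chiplets
# each); GB300 NVL72 baseline is ~1.1 EFLOPS of FP4 inference.
PFLOPS_PER_PACKAGE = 100   # FP4 PFLOPS per Rubin Ultra package
CHIPLETS = 576
CHIPLETS_PER_PACKAGE = 4

packages = CHIPLETS // CHIPLETS_PER_PACKAGE           # 144 packages
total_eflops = packages * PFLOPS_PER_PACKAGE / 1000   # PFLOPS -> EFLOPS

baseline_eflops = 1.1      # assumed GB300 NVL72 FP4 inference
speedup = 15 / baseline_eflops

print(f"{packages} packages -> {total_eflops:.1f} EFLOPS FP4")  # 14.4, ~15
print(f"~{speedup:.0f}x the GB300 NVL72 baseline")              # ~14x
```

The 14.4 EFLOPS that falls out of the package math is consistent with the quoted "15 exaflops" after rounding, and the ~14x speedup matches the article's comparison to the GB300 generation.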

The Rubin Ultra platform will also use the seventh generation of Nvidia’s NVLink interconnect, providing dramatically increased bandwidth (up to 1.5 petabytes per second of aggregate bandwidth for NVLink7). This will enable the platform to handle the massive data transfer rates required for AI and HPC workloads.

The Rubin Ultra GPUs fit into Nvidia’s roadmap as a follow-up to the Rubin GPU (coming early 2026) and precede the next-generation "Feynman" GPUs anticipated in 2028, which may introduce newer memory types beyond HBM4E.

In summary, the Rubin Ultra GPUs are designed for extreme AI and HPC workloads, combining massive parallelism, very high memory and interconnect bandwidth, and substantial power/thermal requirements. They represent a major leap forward in AI infrastructure, scaling GPU performance from the Blackwell B200's 10 FP4 PFLOPS and 192 GB of HBM3E memory to the Rubin Ultra's 100 FP4 PFLOPS and 1 TB of HBM4E memory.
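The generational scaling in that comparison works out as follows; the spec values are taken from the article and should be treated as approximate:

```python
# Rough generational scaling implied by the summary above.
b200_pflops, b200_mem_gb = 10, 192                   # Blackwell B200 (FP4, HBM3E)
rubin_ultra_pflops, rubin_ultra_mem_gb = 100, 1024   # Rubin Ultra (FP4, HBM4E)

compute_scale = rubin_ultra_pflops / b200_pflops     # 10x FP4 compute
memory_scale = rubin_ultra_mem_gb / b200_mem_gb      # ~5.3x HBM capacity

print(f"FP4 compute: {compute_scale:.0f}x, HBM capacity: {memory_scale:.1f}x")
```

In other words, per-package compute grows faster than memory capacity across these two generations, which is one reason the memory bandwidth jump to HBM4E matters.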

  • The Rubin Ultra GPUs from Nvidia, set to revolutionize AI and HPC workloads, will feature four GPU chiplets and 1 terabyte of HBM4E memory per package, offering unprecedented performance in the field of AI inference.
  • By employing the seventh generation of Nvidia’s NVLink interconnect and promising 15 exaflops of FP4 inference performance, the NVL576 platform delivers computational power with significant implications for demanding applications such as high-frequency trading and predictive analytics.
