NVIDIA’s Vision for the Future of AI Data Centers: Scaling Beyond Limits

At NVIDIA GTC, Jensen Huang’s keynote highlighted NVIDIA’s growing presence in the data center market, which, according to Dell’Oro Group’s forecast, is projected to surpass $1 trillion by 2028. NVIDIA is no longer just a chip vendor; it has evolved into a provider of fully integrated, rack-scale solutions that encompass compute, networking, and thermal management. During GTC, NVIDIA also announced an AI Data Platform that integrates enterprise storage with NVIDIA accelerated computing, enabling AI agents to deliver real-time business insights to enterprise customers. This transformation is redefining how AI workloads are deployed at scale.

[Image: Jensen Huang delivering the GTC 2025 keynote. Source: NVIDIA GTC 2025]

The Blackwell Platform: Optimized for AI Training and Reasoning

The emergence of NVIDIA’s Blackwell platform represents a major leap in AI acceleration. Not only does it excel at training deep learning models, but it is also optimized for inference and reasoning—two key drivers of hyperscale capital expenditure growth in 2025. Reasoning models, which generate a significant number of tokens, operate differently from conventional AI models. Unlike traditional AI models that answer queries directly, reasoning models use “thinking tokens” to process and refine their responses, mimicking cognitive reasoning. This process significantly increases computational demands.
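
To make the scale of that increase concrete, here is a minimal back-of-envelope sketch. It uses the common approximation that decoding one token costs roughly 2 × N FLOPs for an N-parameter model; the model size and token counts below are illustrative assumptions, not NVIDIA figures.

```python
# Back-of-envelope: inference compute for a direct answer vs. a reasoning
# model that emits "thinking tokens" before its final response.
# Assumes decoding one token costs ~2 * N FLOPs for an N-parameter model
# (a standard rough approximation that ignores attention/KV-cache overheads).

PARAMS = 70e9                 # illustrative 70B-parameter model (assumption)
FLOPS_PER_TOKEN = 2 * PARAMS

direct_tokens = 250           # a typical direct answer (assumption)
thinking_tokens = 8_000       # internal "thinking" tokens (assumption)
final_tokens = 250

direct_flops = direct_tokens * FLOPS_PER_TOKEN
reasoning_flops = (thinking_tokens + final_tokens) * FLOPS_PER_TOKEN

print(f"Direct answer:   {direct_flops:.2e} FLOPs")
print(f"Reasoning model: {reasoning_flops:.2e} FLOPs")
print(f"Increase:        {reasoning_flops / direct_flops:.0f}x")  # ~33x here
```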

The Evolution of Accelerated Computing

The unit of accelerated computing is evolving rapidly. It started with single accelerators, progressed to integrated servers like the NVIDIA DGX, and has now reached rack-scale solutions like the NVIDIA GB200 NVL72. Looking ahead, NVIDIA aims to scale even further with the upcoming Vera Rubin Ultra platform, featuring 576 GPUs interconnected in a rack. Scaling up AI clusters introduces new challenges in interconnects and power density, and as compute nodes scale into the hundreds of thousands (and beyond), the industry needs to address several key challenges:

1) Increasing Rack Density

AI data centers aim to pack GPUs as closely as possible to create a coherent compute fabric for large language model (LLM) training and real-time inference. The NVL72 already features extremely high density, necessitating liquid cooling for heat dissipation. With further scaling, interconnect distances will increase. The question arises: will copper cabling remain viable, or will the industry need to transition to optical interconnects, despite their higher cost and power consumption?

2) The Shift to Multi-Die GPUs

One approach to boosting computational capacity has been to increase GPU die size. However, with the Vera Rubin platform, GPUs have already reached the reticle limit, necessitating a shift to multi-die architectures. This will increase the physical footprint and interconnect distances, posing further engineering challenges.

3) Surging Rack Power Density

As GPU size and node count increase, rack power density is skyrocketing. NVIDIA’s GB200 NVL72 racks already consume 132 kW, and the upcoming Rubin Ultra NVL576 is projected to require 600 kW per rack. Given that AI data centers typically operate within a 50 MW power envelope, fewer than 100 such racks can be housed in a single facility. This constraint demands a new approach to scaling AI infrastructure.
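
The constraint is simple arithmetic. Here is a minimal sketch using the rack figures above; it ignores cooling and power-conversion overheads, which would shrink the usable budget further.

```python
# How many racks fit within a facility power budget? Rack figures are from
# the text; cooling and power-conversion overheads are ignored for simplicity.

def racks_per_facility(rack_kw: float, facility_mw: float = 50) -> int:
    """Number of racks that fit within a facility's power budget."""
    return int(facility_mw * 1000 // rack_kw)

print(racks_per_facility(132))   # GB200 NVL72:        378 racks
print(racks_per_facility(600))   # Rubin Ultra NVL576:  83 racks
```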

4) Disaggregating AI Compute Across Data Centers

As power limitations become a bottleneck, AI clusters may need to be strategically distributed across multiple data centers based on power availability. This introduces the challenge of interconnecting these geographically dispersed clusters into a single virtual AI compute fabric. Coherent optics and photonics-based networking may be necessary to enable low-latency interconnects between data centers separated by miles. NVIDIA’s recently introduced silicon photonics switch may be part of this solution, at least from the standpoint of lowering power consumption, but additional innovations in data center interconnect architectures will likely be required to meet the demands of large-scale distributed AI workloads.

The Future of AI Data Centers

As NVIDIA continues to innovate, the next generation of AI data centers will need to embrace new networking technologies, reimagine power distribution, and pioneer novel solutions for high-density, high-performance computing. The future of AI isn’t just about more GPUs—it’s about building the infrastructure to support them at scale.

 

Related blog: Insights from GTC25: Networking Could Tip the Balance in the AI Race

With a wave of announcements coming out of GTC, countless articles and blogs have already covered the biggest highlights. Rather than simply rehashing the news, I want to take a different approach—analyzing what stood out to me from a networking perspective. As someone who closely tracks the market, it’s clear to me that AI workloads are driving a profound disruption in networking infrastructure. While a number of announcements at GTC25 were compute-related, NVIDIA made it clear that implementations of next-generation GPUs and accelerators wouldn’t be possible without major innovations on the networking side.

1) The New Age of AI Reasoning Is Driving 100X More Compute Than Anticipated a Year Ago

Jensen highlighted how the new era of AI reasoning is driving the evolution of scaling laws, transitioning from pre-training to post-training and test-time scaling. This shift demands an enormous increase in compute power to process data efficiently. At GTC 2025, he emphasized that the required compute capacity is now estimated to be 100 times greater than what was anticipated just a year ago.

2) The Network Defines the AI Data Center

The way AI compute nodes are connected will have profound implications on efficiency, cost, and performance. Scaling up, rather than scaling out, offers the lowest latency, cost, and power consumption when connecting accelerated nodes in the same compute fabric. At GTC 2025, NVIDIA unveiled plans for its upcoming NVLink 6/7 and NVSwitch 6/7, key components of its next-generation Rubin platform, reinforcing the critical role of NVLink switches in its strategy. Additionally, the Spectrum-X switch platform, designed for scaling out, represents another major pillar of NVIDIA’s vision (see the chart below). NVIDIA is committed to a “one-year rhythm,” with networking keeping pace with GPU requirements. Other key details from NVIDIA’s roadmap announcement also caught our attention, and we are excited to share these with our clients.

[Chart: NVIDIA networking roadmap. Source: NVIDIA GTC 2025]

 

3) Power Is the New Currency

The industry is more power-constrained than ever. NVIDIA’s next-generation Rubin Ultra is designed to accommodate 576 dies in a single rack, consuming 600 kW—a significant jump from the current Blackwell rack, which already requires liquid cooling and consumes between 60 kW and 120 kW. Additionally, as we approach 1 million GPUs per cluster, power constraints are forcing these clusters to become highly distributed. This shift is driving an explosion in the number of optical interconnects, both intra- and inter-data center, which will exacerbate the power challenge. NVIDIA is tackling these power challenges on multiple fronts, as explained below.
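
To see why million-GPU clusters end up highly distributed, consider a rough sketch extrapolating from the rack figures above; the per-site power cap is an illustrative assumption, not a disclosed figure.

```python
# Rough power math for a ~1M-GPU cluster built from Rubin Ultra-class racks.
# Rack figures come from the text; the per-site power cap is an assumption.
import math

GPUS = 1_000_000
GPUS_PER_RACK = 576        # Rubin Ultra rack
KW_PER_RACK = 600
SITE_CAP_MW = 150          # assumed usable power per data center site

racks = math.ceil(GPUS / GPUS_PER_RACK)        # 1,737 racks
total_mw = racks * KW_PER_RACK / 1000          # ~1,042 MW of rack power alone
sites = math.ceil(total_mw / SITE_CAP_MW)      # 7 sites at 150 MW each

print(f"{racks:,} racks, ~{total_mw:,.0f} MW total, ~{sites} sites needed")
```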

4) Liquid-Cooled Switches Will Become a Necessity, Not a Choice

After liquid cooling racks and servers, switches are next. NVIDIA’s latest 51.2 Tbps Spectrum-X switches offer both liquid-cooled and air-cooled options. However, all future 102.4 Tbps Spectrum-X switches will be liquid-cooled by default.

5) Co-packaged Optics (CPO) in Networking Chips Before GPUs

Another key reason for liquid cooling racks is to maximize the number of GPUs within a single rack while leveraging copper for short-distance connectivity—”Copper when you can, optics when you must.” When optics are necessary, NVIDIA has found a way to save power with Co-Packaged Optics (CPO). NVIDIA plans to make CPO available on its InfiniBand Quantum switches in 2H25 and on its Spectrum-X switches in 2H26. However, NVIDIA will continue to support pluggable optics across different SKUs, reinforcing our view that data centers will adopt a hybrid approach to balance performance, efficiency, and flexibility.

[Chart: NVIDIA co-packaged optics roadmap. Source: NVIDIA GTC 2025]
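
Where do the CPO savings come from? A rough sketch follows; the per-port wattages are illustrative assumptions broadly in line with commonly cited industry figures, not NVIDIA specifications.

```python
# Estimated cluster-wide optics power: pluggable transceivers vs. CPO.
# Per-port wattages are illustrative assumptions, not NVIDIA specifications.

PORTS = 100_000        # optical ports in a large AI cluster (assumption)
PLUGGABLE_W = 17.0     # ~800 Gbps pluggable transceiver (assumption)
CPO_W = 5.0            # co-packaged optical engine, per port (assumption)

pluggable_mw = PORTS * PLUGGABLE_W / 1e6
cpo_mw = PORTS * CPO_W / 1e6

print(f"Pluggables: {pluggable_mw:.1f} MW, CPO: {cpo_mw:.1f} MW "
      f"({PLUGGABLE_W / CPO_W:.1f}x reduction)")
```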

 

6) Impact on Ethernet Switch Vendor Landscape

According to our AI Networks for AI Workloads report, three major vendors dominated the Ethernet portion of the AI Network market in 2024.

However, over the next few years, we anticipate greater vendor diversity at both the chip and system levels. Photonic integration in switches will introduce a new dimension, potentially reshaping the dynamics of an already vibrant vendor landscape. We foresee a rapid pace of innovation in the coming years—not just in technology, but at the business model level as well.

Networking could be the key factor that shifts the balance of power in the AI race, and customers’ appetite for innovation and cutting-edge technologies is at an unprecedented level. As one hyperscaler put it during a panel at GTC 2025: “AI infrastructure is not for the faint of heart.”

For more detailed views and insights on the AI Networks for AI Workloads report, please contact us at dgsales@delloro.com.


Significant Share Shifts Expected in 2025 as Ethernet Gains Momentum in AI Back-end Networks

The networking industry is experiencing a dramatic shift, driven by the rise of AI workloads and the need for new AI back-end networks to connect an ever-increasing number of accelerators in large AI clusters. While investments in AI back-end networks are reaching unprecedented levels, traditional front-end networks needed to connect general-purpose servers remain essential.

At Dell’Oro Group, we’ve just updated our five-year forecast reports for both the front-end and the back-end markets, and we’re still bullish on both. Below are some key takeaways:

 

AI Back-End Network Spending Set to Surpass $100B through 2029 with Ethernet Gaining Momentum

Despite growing concerns about the sustainability of spending on accelerated infrastructure—especially in light of DeepSeek’s recent open-source model, which requires significantly fewer resources than its U.S. counterparts—we remain optimistic. Recent data center capex announcements by Google, Amazon, Microsoft, and Meta in their January/February earnings calls showed an ongoing commitment to a sustained high level of AI infrastructure spending, supporting that view.

With our January 2025 report, we have again raised our forecast for data center switch sales in AI back-end networks. However, not all technologies are benefiting equally.

Ethernet is experiencing significant momentum, propelled by both supply- and demand-side factors. More large-scale AI clusters are now adopting Ethernet as their primary networking fabric. One of the most striking examples is xAI’s Colossus, a massive NVIDIA GPU-based cluster that has opted for an Ethernet deployment.

We therefore revised our projections, moving up the anticipated crossover point where Ethernet surpasses InfiniBand to 2027.

Major Share Shifts Anticipated for Ethernet AI Back-end Networks in 2025

While Celestica, Huawei, and NVIDIA dominated the Ethernet segment in 2024, the competitive landscape is set to evolve in 2025, with Accton, Arista, Cisco, Juniper, Nokia, and other vendors expected to gain ground. We expect the vendor landscape in AI Back-end networks to remain very dynamic as Cloud SPs hedge their bets by diversifying their supply on both the compute side and the networking that goes with it.

 

Strong Rebound in Front-end Network Spending in 2025 and Beyond

Despite the challenges in 2024, we expect growth in the front-end market to resume in 2025 and beyond, driven by several factors. These include the need to build additional capacity in front-end networks to support back-end deployments, especially for greenfield projects. These front-end deployments are expected to feature high speeds (>100 Gbps), commanding a price premium. Sales growth will be further stimulated by inferencing applications that may not require accelerated servers and will instead run on front-end networks, whether at centralized locations or edge sites.

 

The Road Ahead

As AI workloads expand and diversify, the networking infrastructure that supports them, in both the front-end and the back-end, must evolve accordingly. The transition to higher-speed Ethernet and the shifting competitive landscape among vendors suggest that 2025 could be a pivotal year for the Ethernet data center switch market.

For more detailed views and insights on the Ethernet Switch—Data Center report or the AI Networks for AI Workloads report, please contact us at dgsales@delloro.com.


The 2024 OCP Global Summit theme was “From Ideas to Impact,” but it could have been “AI Ideas to AI Impact.” Accelerated computing infrastructure was front and center, starting with the keynote and continuing on the exhibition hall floor and in the breakout sessions. Hyperscalers and the ecosystem of suppliers that support them were eager to share what they’ve been working on to bring accelerated computing infrastructure and AI workloads to market, at scale. As you might expect with anything AI-related, it drew a crowd: over 7,000 attendees participated in the 2024 event, a significant increase from roughly 4,500 the year before. Throughout the crowds, sessions, and expo hall, three key themes stood out to me: power and cooling designs for NVIDIA GB200 NVL racks, an explosion of interest in liquid cooling, and sustainability’s presence amid the AI backdrop.

 

Powering and Cooling NVIDIA GB200 NVL Racks

It’s well known that accelerated computing infrastructure significantly increases rack power densities. This has posed a significant challenge for traditional data center designs, where compute and physical infrastructure are developed and deployed in relative isolation. Deploying accelerated computing infrastructure has forced a rethink, where these boundaries are removed to create an optimized end-to-end system to support next-generation “AI factories” at scale. The data center industry is acutely aware this applies to power and cooling, with notable announcements and OCP contributions from industry leaders addressing these challenges:

  • Meta kicked off the keynote by announcing Catalina, a rack-scale infrastructure design based on NVIDIA GB200 compute nodes. This design increased the power requirements from 12–18 kW per rack to 140 kW per system. Unsurprisingly, Catalina utilizes liquid cooling.
  • NVIDIA contributed (open-sourced) elements of its GB200 NVL72 design, including a powerful 1400-amp bus bar for distributing power in the rack, and many liquid cooling contributions related to the manifold, blind mating, and flow rates. Lastly, NVIDIA recognized a new ecosystem of partners focused on the power and cooling infrastructure, highlighting Vertiv’s GB200 NVL72 reference architecture, which enables faster time to deployment, utilizes less space, and increases cooling energy efficiency.
  • Microsoft emphasized the need for liquid cooling for AI accelerators, noting retrofitting challenges in facilities without a chilled water loop. In response, they designed and contributed a custom liquid cooling heat exchanger, which leverages legacy air-based data center heat rejection. This is what I would refer to as air-assisted liquid cooling (AALC), more specifically, an air-assisted coolant distribution unit (CDU), which is becoming increasingly common in retrofitted accelerated computing deployments.
  • Microsoft also announced a collaborative power architecture effort with Meta, named Mt. Diablo, based on a 400 Vdc disaggregated power rack, which will be contributed to OCP soon. Google also highlighted the potential use of 400 Vdc for future accelerated computing infrastructure; the sketch after this list illustrates why higher distribution voltage matters.
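
Why does distribution voltage matter at these densities? A minimal sketch, using Meta’s Catalina system power from the keynote; the 48 V baseline reflects today’s typical OCP rack busbar, and resistive losses are left aside.

```python
# Why move rack power distribution to 400 Vdc? For a fixed power P = V * I,
# higher voltage means proportionally less current, so smaller busbars and
# lower resistive (I^2 * R) losses. Rack power is Meta's Catalina figure;
# 48 V represents today's typical OCP rack busbar (assumption).

RACK_KW = 140

for volts in (48, 400):
    amps = RACK_KW * 1000 / volts
    print(f"{RACK_KW} kW at {volts:>3} V -> {amps:,.0f} A of busbar current")

# Output:
# 140 kW at  48 V -> 2,917 A of busbar current
# 140 kW at 400 V -> 350 A of busbar current
```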

 

Data Center Liquid Cooling Takes Center Stage

Liquid cooling was among the most discussed topics at the summit, mentioned by nearly every keynote speaker, in addition to dozens of breakout sessions dedicated to its growing use in compute, networking, and facility designs. This is justified from my perspective, as Dell’Oro Group previously highlighted liquid cooling as a technology going mainstream, creating a $15 B market opportunity over the next five years. Furthermore, the ecosystem understands that liquid cooling is not only a growing market opportunity but also a critical technology for enabling accelerated computing and the growth of AI workloads at scale.

There was not just liquid cooling talk; partnerships and acquisitions leading up to and during the global summit further cemented the critical role data center liquid cooling will play in the industry’s future. This was highlighted in the following announcements:

  • Jabil acquired Mikros Technologies: Kicking off two weeks of big announcements, Jabil’s acquisition of Mikros brings together Mikros’s expertise in liquid cooling cold plate technology, engineering, and design with Jabil’s manufacturing scale. This appears to position Mikros’s technology as a high-volume option for hyperscale end users and the greater data center industry in the near future.
  • JetCool announced facility CDU, Flex partnership: JetCool, best known for its air-assisted liquid cooling infrastructure packaged in single servers, introduced a facility CDU (liquid-to-liquid) to keep pace with the market’s evolution toward purpose-built AI factories. The partnership pairs a technology specialist with a contract manufacturer to deliver the scale needed to support hyperscale end users and the broader data center industry’s liquid cooling needs.
  • Schneider Electric acquired Motivair: On the Summit’s final day, Schneider Electric announced its $1.13B acquisition of Motivair. This move, following prior partnerships and organic CDU developments, expands Schneider’s high-density cooling portfolio. This now gives Schneider a holistic power and cooling portfolio to support large-scale accelerated computing deployments, a capability previously exclusive to Vertiv, albeit at a high cost for Schneider.

 

Sustainability Takes a Back Seat but Is Still Very Much Part of the Conversation

While sustainability did not dominate the headlines, it remained a recurring theme throughout the summit. As AI growth drives massive infrastructure expansion, sustainability has become a critical consideration in data center designs. OCP’s CEO George Tchaparian characterized sustainability’s role alongside AI capex investments best: “Without sustainability, it’s not going to sustain.” Other highlights include:

  • OCP announced a new alliance with Net Zero Innovation Hub, an organization focused on net-zero data center innovation in Europe. Details on the alliance were sparse, but more is expected to emerge on this partnership at the 2025 OCP EMEA Regional Summit.
  • Google shared a collaboration with Meta, Microsoft, and Amazon on green concrete. Most impressively, this collaboration began with a roadmap around the time of last year’s OCP Summit, which resulted in a proof-of-concept deployment in August 2024, reducing concrete emissions by ~40%.
  • A wide range of other sustainability topics was discussed. Improvements in cooling efficiency, water consumption, heat reuse, clean power, lifecycle assessment, and metrics to measure and track progress related to data center efficiency and sustainability were all prevalent.

 

Conclusion: Data Center Power and Cooling is Central to the Future of the Data Center Industry

The 2024 OCP Global Summit left me as confident as ever in the growing role power and cooling infrastructure will play in the data center industry. It’s not only improvements to existing technologies but also the adoption of new technologies and facility architectures that have emerged. The event’s theme, “From Ideas to Impact,” serves as a fitting reminder of how AI is reshaping the industry, with significant implications for the future. As we look ahead, the question isn’t just how data centers will power and cool AI workloads, but how they’ll do so sustainably, efficiently, and at an unprecedented scale.


In 1H24, worldwide data center capital expenditure (capex) surged by 38 percent year-over-year (Y/Y). This rapid increase was primarily driven by the rise of accelerated servers, which are critical for generative AI applications. This marks the fourth consecutive quarter of triple-digit Y/Y growth in accelerated server revenues. Notably, servers powered by NVIDIA Hopper GPUs and custom accelerators, such as Google’s TPU and Amazon’s Trainium, gained traction among hyperscale cloud service providers. Enterprises and Tier 2 cloud providers also contributed to this strong demand, highlighting the broadening adoption of AI technologies across industries.

Following a correction phase in 2023, the general-purpose server market is steadily recovering, with two consecutive quarters of Y/Y revenue growth. Higher commodity prices played a role in this rebound, but the market also saw positive momentum in unit sales. Server upgrades, particularly to 4th and 5th generation CPU platforms, have been long overdue, and despite ongoing global economic uncertainties, demand for these systems is expected to rise.

Data center switch sales, which account for a significant portion of overall data center network infrastructure revenues, remained flat Y/Y in 1H24, despite a nice recovery in 2Q24. Heightened AI-related investments, particularly among cloud service providers building networks based on 200, 400, and 800 Gbps port speeds, were not able to offset the contraction from the rest of the market, which is still undergoing a digestion cycle.

The data center physical infrastructure (DCPI) market outperformed expectations in 1H24. After a brief digestion in 1Q24, revenue increased by double-digits in 2Q24. Growth was largely attributed to new data center construction with AI-related design modifications to support increasing rack power densities. North America led the way with the fastest growth rate, while revenues in the Asia-Pacific region, excluding China, also saw double-digit growth.

 

Record-High Server and Storage System Component Revenues

Server and storage system component revenues reached record highs in the first two quarters of the year. The rapid growth of accelerators, which include GPUs and custom accelerators, as well as memory and storage drives, was a key factor behind this revenue increase. Generative AI applications were the primary drivers of accelerated server demand, but higher commodity prices, particularly for memory and storage drives, also contributed to the revenue surge.

NVIDIA emerged as the leader in data center IT component revenues in the first half, accounting for nearly half of the reported revenue, as Hopper GPU supplies improved. Samsung also saw growth in its market share, driven by higher memory prices and increased contributions from high-bandwidth memory (HBM). Intel, on the other hand, experienced a decline in market share due to the slow recovery of the server CPU market, competition from AMD, and slower adoption of its accelerator products.

[Chart: Dell’Oro Data Center Revenue by Technology Area]

 

AI Servers and Infrastructure Fuel Future Growth

Looking ahead to full-year 2024, data center capex is projected to increase by 35 percent to over $400 B, with spending on AI servers and infrastructure leading the way. Hyperscale cloud service providers are racing to expand their AI offerings, creating strong demand for these specialized systems. The recovery of the general-purpose server market will also contribute to growth, especially as server component prices continue to rise.

Component revenues are forecasted to double in 2024, fueled by the increased deployment of specialized processors such as accelerators and Smart NICs. Commodity component prices, such as memory and storage drives, are expected to rise substantially throughout the year, further boosting revenue growth. Additionally, unit growth for components is projected to improve steadily as demand for general-purpose servers recovers in the second half of 2024.

We project the recovery in the Ethernet switch market to be led by the hyperscale Cloud SPs, across both networks for general-purpose computing and those for AI clusters. However, the broader market is still facing a significant inventory correction that could persist for several more quarters.

DCPI revenue growth is forecast to maintain momentum in 2024, with growth rates accelerating in the second half of the year. Vendors’ backlog mix has shifted from infrastructure supporting general-purpose computing to accelerated computing workloads, which have longer lead times due to higher-ampacity power distribution and thermal management requirements. However, order growth is expected to moderate in the second half of the year as the recent infrastructure expansion cycle normalizes.