[wp_tech_share]

Significant Share Shifts Expected in 2025 as Ethernet Gains Momentum in AI Back-end Networks

The networking industry is experiencing a dramatic shift, driven by the rise of AI workloads and the need for new AI back-end networks to connect an ever-increasing number of accelerators in large AI clusters. While investments in AI back-end networks are reaching unprecedented levels, traditional front-end networks needed to connect general-purpose servers remain essential.

At Dell’Oro Group, we’ve just updated our five-year forecast reports for both the front-end and the back-end, and we’re still bullish on both. Below are some key takeaways:

 

AI Back-End Network Spending Set to Surpass $100B through 2029 with Ethernet Gaining Momentum

Despite growing concerns about the sustainability of spending on accelerated infrastructure—especially in light of DeepSeek’s recent open-source model, which requires significantly fewer resources than its U.S. counterparts—we remain optimistic. Recent data center capex announcements from Google, Amazon, Microsoft, and Meta in their January/February earnings calls signaled an ongoing commitment to sustained high levels of AI infrastructure spending, supporting that view.

We have again raised our forecast for data center switch sales in AI back-end networks in our January 2025 report. However, not all technologies are benefiting equally.

Ethernet is experiencing significant momentum, propelled by supply and demand factors. More large-scale AI clusters are now adopting Ethernet as their primary networking fabric. One of the most striking examples is xAI’s Colossus, a massive NVIDIA GPU-based cluster that has opted for Ethernet deployment.

We therefore revised our projections, moving up the anticipated crossover point where Ethernet surpasses InfiniBand to 2027.

Major share shifts anticipated for Ethernet AI Back-end Networks in 2025

While Celestica, Huawei, and NVIDIA dominated the Ethernet segment in 2024, the competitive landscape is set to evolve in 2025, with Accton, Arista, Cisco, Juniper, Nokia, and other vendors expected to gain ground. We expect the vendor landscape in AI Back-end networks to remain very dynamic as Cloud SPs hedge their bets by diversifying their supply on both the compute side and the networking that goes with it.

 

Strong Rebound in Front-end Networks Spending in 2025 and Beyond

Despite the challenges in 2024, we expect growth in the front-end market to resume in 2025 and beyond, driven by several factors. These include the need to build additional capacity in front-end networks to support back-end deployments, especially for greenfield projects. These additional front-end network connectivity deployments are expected to include high speeds (>100 Gbps), driving a price premium. Sales growth will be further stimulated by inferencing applications that may not require accelerated servers and will instead operate in front-end networks, whether at centralized locations or edge sites.

 

The Road Ahead

As AI workloads expand and diversify, the networking infrastructure that supports them—in both the front-end and the back-end—must evolve accordingly. The transition to higher-speed Ethernet and the shifting competitive landscape among vendors suggest that 2025 could be a pivotal year for the Ethernet data center switching market.

For more detailed views and insights on the Ethernet Switch—Data Center report or the AI Networks for AI Workloads report, please contact us at dgsales@delloro.com.

[wp_tech_share]

The 2024 OCP Global Summit theme was “From Ideas to Impact,” but it could have been “AI Ideas to AI Impact.” Accelerated computing infrastructure was front and center, starting with the keynote, on the exhibition hall floor, and in the breakout sessions. Hyperscalers and the ecosystem of suppliers that support them were eager to share what they’ve been working on to bring accelerated computing infrastructure and AI workloads to market, at scale. As you might expect with anything AI-related, it drew a crowd: over 7,000 attendees participated in the event in 2024, a significant increase from ~4,500 last year. Throughout the crowds, sessions, and expo hall, three key themes stood out to me: power and cooling designs for NVIDIA GB200 NVL racks, an explosion of interest in liquid cooling, and sustainability’s presence amid the AI backdrop.

 

Powering and Cooling NVIDIA GB200 NVL Racks

It’s well known that accelerated computing infrastructure significantly increases rack power densities. This has posed a significant challenge for traditional data center designs, where compute and physical infrastructure are developed and deployed in relative isolation. Deploying accelerated computing infrastructure has forced a rethink, where these boundaries are removed to create an optimized end-to-end system to support next generation “AI factories” at scale. The data center industry is acutely aware this applies to power and cooling, with notable announcements and OCP contributions from industry leaders in how they are addressing these challenges:

  • Meta kicked off the keynote by announcing Catalina, a rack-scale infrastructure design based on NVIDIA GB200 compute nodes. This design increased the power requirements from 12–18 kW/rack to 140 kW/system. Unsurprisingly, Catalina utilizes liquid cooling.
  • NVIDIA contributed (open-sourced) elements of its GB200 NVL72 design, including a powerful 1400-amp bus bar for distributing power in the rack, and many liquid cooling contributions related to the manifold, blind mating, and flow rates. Lastly, NVIDIA recognized a new ecosystem of partners focused on the power and cooling infrastructure, highlighting Vertiv’s GB200 NVL72 reference architecture, which enables faster time to deployment, utilizes less space, and increases cooling energy efficiency.
  • Microsoft emphasized the need for liquid cooling for AI accelerators, noting retrofitting challenges in facilities without a chilled water loop. In response, they designed and contributed a custom liquid cooling heat exchanger, which leverages legacy air-based data center heat rejection. This is what I would refer to as air-assisted liquid cooling (AALC), more specifically, an air-assisted coolant distribution unit (CDU), which is becoming increasingly common in retrofitted accelerated computing deployments.
  • Microsoft also announced a collaborative power architecture effort with Meta, named Mt. Diablo, based on a 400 Vdc disaggregated power rack that will be contributed to OCP soon. Google also highlighted the potential use of 400 Vdc for future accelerated computing infrastructure.
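The magnitude of the power-density jump Meta described is easy to quantify. The sketch below is purely illustrative arithmetic based on the figures quoted above (12–18 kW for a traditional rack, ~140 kW for the Catalina system); the helper name is my own, not from any OCP contribution.

```python
# Illustrative arithmetic only: compares the legacy rack power range cited in
# Meta's keynote (12-18 kW/rack) with the ~140 kW Catalina rack-scale system.

LEGACY_RACK_KW = (12, 18)   # traditional rack power range, per the keynote
CATALINA_KW = 140           # NVIDIA GB200-based Catalina system, per the keynote

def density_multiplier(new_kw: float, old_range: tuple) -> tuple:
    """Return the (min, max) multiple of legacy rack power the new system draws."""
    lo, hi = old_range
    return (new_kw / hi, new_kw / lo)

low, high = density_multiplier(CATALINA_KW, LEGACY_RACK_KW)
print(f"Catalina draws roughly {low:.1f}x to {high:.1f}x the power of a legacy rack")
# roughly 7.8x to 11.7x
```

An order-of-magnitude jump in per-rack power is why liquid cooling follows almost automatically from these designs: air cooling becomes impractical well below this density.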

 

Data Center Liquid Cooling Takes Center Stage

Liquid cooling was among the most discussed topics at the summit, mentioned by nearly every keynote speaker in addition to dozens of breakout sessions dedicated to its growing use in compute, networking, and facility designs. This focus is justified from my perspective, as Dell’Oro Group previously highlighted liquid cooling as a technology going mainstream, creating a $15 B market opportunity over the next five years. Furthermore, the ecosystem understands that liquid cooling is not only a growing market opportunity but also a critical technology for enabling accelerated computing and the growth of AI workloads at scale.

There was not just liquid cooling talk; partnerships and acquisitions leading up to and during the Global Summit further cemented the critical role data center liquid cooling will play in the industry’s future. This was highlighted in the following announcements:

  • Jabil acquired Mikros Technologies: Kicking off two weeks of big announcements, Jabil’s acquisition of Mikros brings together Mikros’s expertise in liquid cooling cold plate technology, engineering and design with Jabil’s manufacturing scale. This appears to position Mikros’s technology as a high-volume option for hyperscale end-users and the greater data center industry in the near future.
  • Jetcool announced facility CDU, Flex partnership: Jetcool, best known for its air-assisted liquid cooling infrastructure packaged in single servers, introduced a facility CDU (liquid-to-liquid) to keep pace with the market’s evolution toward purpose-built AI factories. The partnership brings together a technology specialist and a contract manufacturer to enable the scale needed to support hyperscale end-users and the greater data center industry’s liquid cooling needs.
  • Schneider Electric acquired Motivair: On the Summit’s final day, Schneider Electric announced its $1.13B acquisition of Motivair. This move, following prior partnerships and organic CDU developments, expands Schneider’s high-density cooling portfolio. This now gives Schneider a holistic power and cooling portfolio to support large-scale accelerated computing deployments, a capability previously exclusive to Vertiv, albeit at a high cost for Schneider.

 

Sustainability Takes a Back Seat but Is Still Very Much Part of the Conversation

While sustainability did not dominate the headlines, it remained a recurring theme throughout the summit. As AI growth drives massive infrastructure expansion, sustainability has become a critical consideration in data center designs. OCP’s CEO George Tchaparian characterized sustainability’s role alongside AI capex investments best, “Without sustainability, it’s not going to sustain.” Other highlights include:

  • OCP announced a new alliance with Net Zero Innovation Hub, an organization focused on net-zero data center innovation in Europe. Details on the alliance were sparse, but more details are expected to emerge on this partnership at the 2025 OCP EMEA Regional Summit.
  • Google shared a collaboration with Meta, Microsoft, and Amazon on green concrete. Most impressively, this collaboration began with a roadmap around the time of last year’s OCP Summit, which resulted in a proof-of-concept deployment in August 2024, reducing concrete emissions by ~40%.
  • A wide range of other sustainability topics was discussed. Improvements in cooling efficiency, water consumption, heat reuse, clean power, lifecycle assessment, and metrics to measure and track progress on data center efficiency and sustainability were all prevalent.

 

Conclusion: Data Center Power and Cooling is Central to the Future of the Data Center Industry

The 2024 OCP Global Summit left me as confident as ever in the growing role data center power and cooling infrastructure has in the data center industry. It’s not only improvements to existing technologies but the adoption of new technologies and facility architectures that have emerged. The event’s theme, “From Ideas to Impact,” serves as a fitting reminder of how AI is reshaping the industry, with significant implications for the future. As we look ahead, the question isn’t just how data centers will power and cool AI workloads, but how they’ll do so sustainably, efficiently, and at an unprecedented scale.

[wp_tech_share]

In 1H24, worldwide data center capital expenditure (capex) surged by 38 percent year-over-year (Y/Y). This rapid increase was primarily driven by the rise of accelerated servers, which are critical for generative AI applications. This marks the fourth consecutive quarter of triple-digit Y/Y revenue growth in accelerated server shipments. Notably, servers powered by NVIDIA Hopper GPUs and custom accelerators, such as Google’s TPU and Amazon’s Trainium, gained traction among hyperscale cloud service providers. Enterprises and Tier 2 cloud providers also contributed to this strong demand, highlighting the broadening adoption of AI technologies across industries.

Following a correction phase in 2023, the general-purpose server market is steadily recovering, with two consecutive quarters of Y/Y revenue growth. Higher commodity prices played a role in this rebound, but the market also saw positive momentum in unit sales. Server upgrades, particularly to 4th and 5th generation CPU platforms, have been long overdue, and despite ongoing global economic uncertainties, demand for these systems is expected to rise.

Data center switch sales, which account for a significant portion of overall data center network infrastructure revenues, remained flat Y/Y in 1H24, despite a solid recovery in 2Q24. Heightened AI-related investments, particularly among cloud service providers for networks based on 200, 400, and 800 Gbps port speeds, were not able to offset the contraction from the rest of the market, which is still undergoing a digestion cycle.

The data center physical infrastructure (DCPI) market outperformed expectations in 1H24. After a brief digestion in 1Q24, revenue increased by double-digits in 2Q24. Growth was largely attributed to new data center construction with AI-related design modifications to support increasing rack power densities. North America led the way with the fastest growth rate, while revenues in the Asia-Pacific region, excluding China, also saw double-digit growth.

 

Record-High Server and Storage System Component Revenues

Server and storage system component revenues reached record highs in the first two quarters of the year. The rapid growth of accelerators, which include GPUs and custom accelerators, as well as memory and storage drives, was a key factor behind this revenue increase. Generative AI applications were the primary drivers of accelerated server demand, but higher commodity prices, particularly for memory and storage drives, also contributed to the revenue surge.

NVIDIA emerged as the leader in data center IT component revenues in the first half, accounting for nearly half of the reported revenue, as Hopper GPU supplies improved. Samsung also saw growth in its market share, driven by higher memory prices and increased contributions from high-bandwidth memory (HBM). Intel, on the other hand, experienced a decline in market share due to the slow recovery of the server CPU market and competition from AMD, and slower adoption of its accelerator products.

Dell'Oro Data Center Revenue by Technology Area

 

AI Servers and Infrastructure Fuel Future Growth

Looking ahead to full-year 2024, data center capex is projected to increase by 35 percent to over $400 B, with spending on AI servers and infrastructure leading the way. Hyperscale cloud service providers are racing to expand their AI offerings, creating strong demand for these specialized systems. The recovery of the general-purpose server market will also contribute to growth, especially as server component prices continue to rise.

Component revenues are forecasted to double in 2024, fueled by the increased deployment of specialized processors such as accelerators and Smart NICs. Commodity component prices, such as memory and storage drives, are expected to rise substantially throughout the year, further boosting revenue growth. Additionally, unit growth for components is projected to improve steadily as demand for general-purpose servers recovers in the second half of 2024.

We project the recovery in the Ethernet switch market to be led by the hyperscale Cloud SPs, on both networks for general-purpose computing, and AI clusters. However, the broader market is still facing significant inventory correction that could persist for several more quarters.

DCPI revenue growth is forecast to maintain momentum in 2024, with growth rates accelerating in the second half of the year. Vendors’ backlog mix has shifted from infrastructure supporting general-purpose computing to accelerated computing workloads, which carry longer lead times due to higher-ampacity power distribution and thermal management requirements. However, order growth is expected to moderate in the second half of the year as the recent infrastructure expansion cycle normalizes.

[wp_tech_share]

Last month was incredibly exciting, to say the least! We had the opportunity to attend two of the most impactful and prominent events in the industry: NVIDIA’s GTC, followed by OFC.

As previously discussed in my pre-OFC show blog, we have been anticipating that AI networks will be in the spotlight at OFC 2024 and will accelerate the development of innovative optical connectivity solutions. These solutions are tailored to address the explosive growth in bandwidth within AI clusters while tackling cost and power consumption challenges. GTC 2024 further intensified this focus. During GTC 2024, NVIDIA announced the latest Blackwell B200 Tensor Core GPU, designed to power trillion-parameter AI large language models. The Blackwell B200 demands advanced 800 Gbps networking, aligning perfectly with the predictions outlined in our AI Networks for AI Workloads report. With an anticipated 10X traffic growth in AI workloads every two years, these workloads are expected to outpace traditional front-end networks by at least two speed upgrade cycles.
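To put the “10X every two years” figure in perspective, here is a back-of-the-envelope model (my own illustrative sketch, not from the report) comparing that traffic trajectory with link speeds that roughly double with each upgrade generation:

```python
# Illustrative sketch: AI traffic growing 10x every two years vs. link speeds
# that roughly double per upgrade generation (e.g., 400G -> 800G -> 1.6T).
# The growth assumption comes from the text; the framing is my own.

AI_TRAFFIC_GROWTH = 10 ** (1 / 2)   # per-year factor for 10x every 2 years (~3.16x/yr)

def traffic_after(years: float) -> float:
    """Cumulative AI traffic multiple after `years`, relative to today."""
    return AI_TRAFFIC_GROWTH ** years

def speed_after(upgrade_cycles: int) -> float:
    """Link-speed multiple after a number of 2x upgrade generations."""
    return 2.0 ** upgrade_cycles

# Over four years, traffic grows 100x while two upgrade cycles yield only 4x
# link speed, so fabrics must also scale out (more links, radix, tiers).
print(f"traffic multiple over 4 years: {traffic_after(4):.0f}x")
print(f"speed multiple over 2 upgrade cycles: {speed_after(2):.0f}x")
```

The gap between those two curves is one way to see why back-end networks are pulled through speed transitions so much faster than front-end networks.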

While a multitude of topics and innovative solutions were discussed at OFC regarding inter-data center applications as well as compute interconnects for scaling up the number of accelerators within the same domain, this blog will primarily focus on intra-data center applications. Specifically, it will focus on scaling out the network needed to connect the various accelerated nodes in large AI clusters with thousands of accelerators. This network is commonly referred to in the industry as the ‘AI back-end network’ (also referred to by some vendors as the network for East-West traffic). Some of the topics and solutions explored at the show are as follows:

1) Linear Drive Pluggable Optics vs. Linear Receive Optics vs. Co-Packaged Optics

Pluggable optics are expected to account for an increasingly significant portion of power consumption at the system level, an issue that will be further amplified as Cloud SPs build their next-generation AI networks featuring a proliferation of high-speed optics.

At OFC 2023, the introduction of Linear Drive Pluggable Optics (LPOs), promising significant cost and power savings through the removal of the DSP, initiated a flurry of testing activities. Fast forward to OFC 2024: we witnessed nearly 20 demonstrations, featuring key players including Amphenol, Eoptolink, HiSense, Innolight, and others. Conversations during the event revealed industry-wide enthusiasm for the high-quality 100G SerDes integrated into the latest 51.2 Tbps network switch chips, with many eager to capitalize on this advancement to remove the DSP from optical pluggable modules.

However, despite the excitement, the hesitancy from hyperscalers — with the exception of ByteDance and Tencent, who have announced plans to test the technology by end of this year— suggests that LPOs may not be poised for mass adoption just yet. Interviews highlighted hyperscalers’ reluctance to shoulder the responsibility of qualification and potential failure of LPOs. Instead, they express a preference for switch suppliers to handle those responsibilities.

In the interim, early deployments of 51.2 Tbps network chips are expected to continue leveraging pluggable optics, at least through the middle of next year. However, if LPOs can demonstrate safe deployment at mass scale while offering significant power savings for hyperscalers — enabling them to deploy more accelerators per rack — the temptation to adopt may prove irresistible. Ultimately, the decision hinges on whether LPOs can deliver on these promises.

Furthermore, Half-Retimed Linear Optics (HALO), also known as Linear Receive Optics (LROs), were discussed at the show. LRO integrates the DSP chip only on the transmitter side (as opposed to removing it completely, as in LPOs). Our interviews revealed that while LPOs may prove feasible at 100G-PAM4 SerDes, they may become challenging at 200G-PAM4 SerDes, and that is when LROs may be needed.

Meanwhile, Co-Packaged Optics (CPOs) remain in development, with large industry players such as Broadcom showcasing ongoing development and progress in the technology. While we believe current LPO and LRO solutions will certainly have a faster time to market with similar promises as CPOs, the latter may eventually become the sole solution capable of enabling higher speeds at some point in the future.

Before closing this section, let’s not forget that, when possible, copper is a much better alternative to all of the optical connectivity options discussed above. Put simply: use copper when you can, use optics when you must. Interestingly, liquid cooling may facilitate the densification of accelerators within the rack, enabling increased use of copper for connecting the various accelerator nodes within the same rack. The recent announcement of the NVIDIA GB200 NVL72 at GTC perfectly illustrates this trend.

2) Optical Circuit Switches

OFC 2024 brought some interesting Optical Circuit Switch (OCS) related announcements. OCS can bring many benefits, including high bandwidth and low network latency, as well as significant capex savings. That is because OCS switches can significantly reduce the number of electrical switches required in the network, eliminating the expensive optical-to-electrical-to-optical conversions associated with them. Additionally, unlike electrical switches, OCS switches are speed-agnostic and don’t necessarily need to be upgraded when servers adopt next-generation optical transceivers.

However, OCS is a novel technology, and so far only Google, after many years of development, has been able to deploy it at scale in its data center networks. Additionally, OCS switches may require a change in the installed base of fiber. For that reason, we are still watching to see whether any other Cloud SP, besides Google, has plans to follow suit and adopt OCS switches in the network.

3) The Path to 3.2 Tbps

At OFC 2023, numerous 1.6 Tbps optical components and transceivers based on 200G per lambda were introduced. At OFC 2024, we witnessed further technology demonstrations of such 1.6 Tbps optics. While we don’t anticipate volume shipment of 1.6 Tbps until 2025/2026, the industry has already begun efforts in exploring various paths and options towards achieving 3.2 Tbps.

Given the complexity encountered in transitioning from 100G-PAM4 electrical lane speeds to 200G-PAM4, initial 3.2 Tbps solutions may utilize 16 lanes of 200G-PAM4 within an OSFP-XD form factor, instead of 8 lanes of 400G-PAMx. It’s worth noting that OSFP-XD, which was initially explored and demonstrated two years ago at OFC 2022, may be brought back into action due to the urgency stemming from AI cluster deployments. 3.2 Tbps solutions in the OSFP-XD form factor offer superior faceplate density and cost savings compared to 1.6 Tbps. Ultimately, the industry is expected to find a way to enable 3.2 Tbps based on 8 lanes of 400G-PAMx SerDes, though it may take some time to reach that target.
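The lane arithmetic behind these options is straightforward: aggregate module bandwidth is simply the electrical lane count times the per-lane rate. The sketch below (function and label names are my own) reproduces the configurations discussed above:

```python
# Aggregate transceiver bandwidth = number of electrical lanes x per-lane rate.
# The configurations below are the 1.6 Tbps and 3.2 Tbps options discussed in
# the text; labels are illustrative shorthand, not formal MSA names.

def aggregate_gbps(lanes: int, gbps_per_lane: int) -> int:
    """Total module bandwidth in Gbps for a given lane count and lane rate."""
    return lanes * gbps_per_lane

configs = {
    "1.6T (8 x 200G-PAM4)":           (8, 200),
    "3.2T (16 x 200G-PAM4, OSFP-XD)": (16, 200),
    "3.2T (8 x 400G-PAMx, future)":   (8, 400),
}

for name, (lanes, rate) in configs.items():
    print(f"{name}: {aggregate_gbps(lanes, rate)} Gbps")
# 1600, 3200, and 3200 Gbps respectively
```

The two 3.2T rows reach the same aggregate bandwidth by different routes, which is exactly the trade-off in play: doubling the lane count (feasible now, in a wider OSFP-XD module) versus doubling the lane rate (simpler module, but dependent on 400G-PAMx SerDes maturing).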

In summary, OFC 2024 showcased numerous potential solutions aimed at addressing common challenges: cost, power, and speed. We anticipate that different hyperscalers will make distinct choices, leading to market diversification. However, one of the key considerations will be time to market. It’s important to note that the refresh cycle in the AI back-end network is typically around 18 to 24 months, significantly shorter than the 5 to 6 years seen in the traditional front-end networks used to connect general-purpose servers.

For more detailed views and insights on the Ethernet Switch—Data Center report or the AI Networks for AI Workloads report, please contact us at dgsales@delloro.com.

[wp_tech_share]

Market Overview

Worldwide data center capital expenditure (capex) grew by 4% in 2023, reaching $260 billion, with servers leading all technology areas in revenue (Figure 1). However, this growth rate marked a slowdown from the double-digit growth observed in the previous year. Despite lingering economic uncertainties, the market is poised for growth, driven by advancements in accelerated computing for AI applications and an expanding data center footprint.

The growth varied across different categories of data center technology areas.

  • IT infrastructure experienced a decline due to reduced investments in general-purpose servers and storage systems. This decline was attributed to supply issues that occurred in 2022, prompting enterprise customers and resellers to place excess orders, which led to inventory surges and subsequent corrections. Consequently, server shipments declined by 8% in 2023. The demand for general-purpose server and storage system components such as CPUs, memory, storage drives, and NICs, saw a sharp decline in 2023, as the major Cloud Service Providers (SPs) and server and storage system OEMs reduced component purchases in anticipation of weak system demand.
  • In contrast, there was a shift in capex towards accelerated computing. Spending on accelerators, such as GPUs and other custom accelerators, more than tripled in 2023, as the major Cloud SPs raced to deploy accelerated computing infrastructure that is optimized for AI use cases ranging from recommenders to generative AI. Accelerated servers, although comprising a small share of total server volume, command a significant average selling price (ASP) premium, contributing significantly to revenue.
  • Revenues for network infrastructure, consisting mostly of Ethernet switches, decelerated throughout 2023 as vendors fulfilled backlogged orders. Modest growth rates in the fourth quarter of 2023 reflected a digestion cycle affecting various vendors and product segments.
  • While the data center physical infrastructure (DCPI) revenues experienced robust double-digit growth in 2023, the market also decelerated in the fourth quarter of 2023. This slowdown was attributed to the diminishing impact of pandemic-induced digitalization and limited price realization from price increases implemented in 2022. However, emerging deployments associated with AI workloads, particularly in retrofitting power distribution and thermal management in existing facilities, provided a marginal contribution to growth.

Data center capex growth varied among customer segments, with Colocation SPs leading in growth due to ongoing momentum in DCPI and global data center footprint expansion. In the Top 4 US Cloud SP segment, Microsoft and Google increased data center investments, particularly in AI infrastructure, while Amazon underwent a digestion cycle following pandemic-driven expansion. In contrast, the major Chinese Cloud SPs experienced declines in data center capex due to economic, regulatory, and demand challenges. Enterprise data center spending also declined modestly in 2023, reflecting weakening demand amid economic uncertainties and digestion.

 

Vendor Landscape

Below are some vendor highlights in the key technology areas we track:

  • In the Server market, Dell led in revenue share, followed by HPE and IEIT Systems. Excluding white box server vendors, revenue for original equipment manufacturers (OEMs) declined by 10% in 2023, with lower server unit volumes attributed to economic uncertainties and excess channel inventory. However, some vendors experienced revenue growth through shifts in product mix towards accelerated platforms or general-purpose servers with the latest CPUs from Intel and AMD.
  • The Storage System market witnessed a 7% decline in revenue in 2023, with Dell leading in revenue share, followed by Huawei and NetApp. Huawei was the only major vendor to achieve growth, driven by success in adopting the latest all-flash arrays among enterprise customers.
  • In the Ethernet Data Center Switch market, Arista surpassed Cisco in the fourth quarter, although Cisco maintained its position as the market leader for the entirety of 2023. Cisco’s sales were boosted by substantial backlogged shipments earlier in the year. However, demand tapered off later as both cloud service providers and enterprise customers underwent a period of digestion. Meanwhile, Arista experienced remarkable revenue growth, outpacing the market due to its robust presence at Meta and Microsoft, both of which demonstrated significant network spending throughout 2023.
  • In the DCPI market, Schneider Electric held onto the top market share ranking in 2023. Vertiv maintained the number two market position, but gained meaningful share and is now challenging Schneider Electric for the top market share position. Eaton rounds out the top three DCPI vendors. All three companies experienced double-digit revenue growth for the full year.

 

2024 Outlook

Looking ahead to 2024, the Dell’Oro Group forecasts a double-digit increase in worldwide data center capex, driven by increased server demand and average selling prices (Figure 2). Accelerated computing adoption is expected to continue, supported by new GPU platform releases from NVIDIA, AMD, and Intel. Growth in network infrastructure and DCPI revenues will depend on organic investments rather than supply chain-induced backlog or price increases. Recent recovery in server and storage component markets for CPUs, memory and storage drives is signaling the potential for increased system demand later this year. Dell’Oro Group projects moderate growth for the Top 4 US Cloud SPs in data center capex, while the Top 4 China-based cloud SPs are expected to undergo a cautious recovery. Additionally, enterprise and rest-of-cloud segments may be sensitive to macroeconomic conditions, with potential upside opportunities in AI-related investments.