[wp_tech_share]

As we enter the new year, it’s a great opportunity to reflect on 2023 and assess what’s in store for 2024.

Looking back at our 2023 DCPI predictions, we anticipated that macroeconomic uncertainty would not lead to a DCPI recession in 2023. We also foresaw that power availability would challenge data centers to rethink energy storage and on-site power generation. Both proved to be true.

Through 3Q23, DCPI revenues have grown at double-digit rates, surpassing our expectation for 2023. Power availability also became a widespread topic of conversation, with battery energy storage systems (BESS), fuel cells, and small modular reactors (SMRs) all increasingly viewed as options to address future power availability challenges.

We also predicted a 10MW immersion cooling deployment from a top cloud service provider; however, this did not happen. Smaller scale deployments and proof of concepts occurred, but larger scale deployments require continued growth in ecosystem support, new environmentally friendly immersion fluids, and increased end-user operational readiness.

Yet, the most impactful and exciting development of 2023 in the data center industry should come as no surprise at this point: the proliferation of generative AI. This has set the stage for a profound transformation in the DCPI market. The impact will be felt for years to come, and we expect to see the following three trends this year:

  1. Normalizing order cycle to lead to slow start for DCPI market in 2024

After back-to-back years of double-digit growth, which has not been the norm over the past decade, DCPI revenue growth is forecast to moderate in 2024, most pronounced in the first half of the year. This moderation is attributed to supply chain constraints that delayed unit shipments in 2022, creating unseasonably strong growth in the first half of 2023. Not only does this create tough year-over-year comparisons for 1H24, but abated supply chain constraints also mean end-users’ ordering patterns are normalizing, shifting toward the second half of the year.

Additionally, while DCPI vendor backlogs haven’t meaningfully declined, the contents of those backlogs have changed. Orders associated with traditional computing workloads have returned to more normal levels, while backlogs for AI-related DCPI deployments are growing. However, these AI-related deployments need additional time to materialize.

  2. Purpose-built AI facilities will begin to materialize in 2H24

After a slow first half of the year, growth is forecast to accelerate during the second half of 2024. We anticipate that this growth will be driven by new facilities purpose-built for AI workloads beginning to materialize at the top cloud service providers. These facilities are expected to demand 100s of MWs each, pushing rack power densities from 10 – 15 kW/rack today to 80 – 100 kW/rack to support power-hungry accelerated servers.

This requires significant investments in higher-ampacity power distribution and thermal management, specifically liquid cooling. We expect the majority of this liquid cooling to materialize in the form of Direct Liquid Cooling and air-assisted Rear Door Heat Exchangers (RDHx). This is due to end-users’ familiarity with deploying IT infrastructure in the vertical rack form factor and to existing ecosystem support, alongside the performance and sustainability benefits. We plan to provide more detail on liquid cooling in our upcoming Advanced Research Report, ‘Data Center Liquid Cooling,’ scheduled to publish in 2Q24.
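To put that density shift in perspective, below is a rough, purely illustrative calculation; the 100 MW facility size and the midpoint densities are our own assumptions for the sketch, not report data. Concentrating the same IT load into far fewer, far denser racks is precisely what drives the power distribution and cooling investments described above.

```python
# Back-of-the-envelope sketch (illustrative assumptions only):
# the same hypothetical 100 MW of IT load spread across racks at a
# traditional density vs. an AI-era density.
FACILITY_IT_LOAD_MW = 100                      # assumed facility IT load
DENSITIES_KW_PER_RACK = {
    "traditional (~12.5 kW/rack)": 12.5,       # midpoint of 10-15 kW/rack
    "accelerated (~90 kW/rack)": 90.0,         # midpoint of 80-100 kW/rack
}

for label, kw_per_rack in DENSITIES_KW_PER_RACK.items():
    racks = FACILITY_IT_LOAD_MW * 1_000 / kw_per_rack
    print(f"{label}: ~{racks:,.0f} racks")
```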

  3. Changes in GHG Protocol accounting will add pressure to data center sustainability

The data center industry is on a rapid growth trajectory, a trend further accelerated by the growth of AI workloads. However, this surge has raised concerns about potentially alarming growth in greenhouse gas (GHG) emissions. The resulting scrutiny has prompted the industry to respond with commitments to grow sustainably.

To help measure and assess progress here, many within the data center ecosystem report on carbon emissions following the GHG Protocol Corporate Accounting and Reporting Standard. GHG Protocol recently began working on updates to the standards that may significantly impact data center Scope 2 emissions, or indirect emissions generated from the purchase of electricity. Historically, data center owners and operators have been able to limit these emissions through power purchase agreements (PPAs) and renewable energy certificate (REC) offsets. However, these offsets no longer carry the appeal they once did, because the burden a data center places on its local power grid and community may not align with the benefits those offsets provide.

We expect the updates from GHG Protocol to address this issue and introduce more granularity and stringency in Scope 2 emissions accounting. This may make sustainability claims related to Scope 2 emissions more difficult to make, but much more meaningful. A draft version of these updates is expected in 2024, with the final standards slated for release in 2025. These changes will set the stage for the sustainability claims the data center industry can make in the second half of this decade.
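For readers less familiar with Scope 2 accounting, the sketch below shows, in simplified form, the dual-reporting approach in today’s GHG Protocol Scope 2 guidance: a location-based figure using the grid-average emission factor, and a market-based figure in which contractual instruments such as RECs and PPAs zero out the electricity they cover. All of the numbers are hypothetical; the gap between the two figures is exactly what more granular and stringent accounting could narrow.

```python
# Simplified, illustrative Scope 2 calculation (all figures are hypothetical).
consumption_mwh = 500_000        # annual electricity use of a data center
grid_factor = 0.40               # location-based grid-average factor, tCO2e/MWh
residual_mix_factor = 0.45       # market-based factor for unmatched MWh
rec_covered_mwh = 450_000        # MWh matched by RECs/PPAs (treated as ~0 emissions)

# Location-based method: grid-average factor applied to all consumption.
location_based = consumption_mwh * grid_factor

# Market-based method: matched MWh are zeroed out; the rest uses the residual mix.
market_based = (consumption_mwh - rec_covered_mwh) * residual_mix_factor

print(f"Location-based Scope 2: {location_based:,.0f} tCO2e")
print(f"Market-based Scope 2:   {market_based:,.0f} tCO2e")
```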

[wp_tech_share]

Happy New Year! As usual, we’re excited to start the year by reflecting on the developments in the Ethernet data center switch market throughout 2023 and exploring the anticipated trends for 2024.

First, looking back at 2023, the market performed largely in line with our expectations as outlined in our 2023 prediction blog published in January of last year. As of January 2024, data center switch sales are set to achieve double-digit growth in 2023, based on the data collected through 3Q23. Shipments of 200/400 Gbps ports nearly doubled in 2023. While Google, Amazon, Microsoft, and Meta continue to dominate deployments, we observed a notable increase in 200/400 Gbps port shipments destined for Tier 2/3 Cloud Service Providers (SPs) and large enterprises. Meanwhile, 800 Gbps deployments remained sluggish throughout 2023, with expectations for acceleration in 2024. Unforgettably, 2023 marked a transformative moment in the history of AI with the emergence of generative AI applications, driving meaningful impact and change in modern data center networks.

Now as we look into 2024, below are our top 3 predictions for the year:

1. The Data Center Switch market to slow down in 2024

Following three consecutive years of double-digit growth, the Ethernet data center switch market is expected to slow down in 2024 and grow at less than half the rate of 2023. We expect 2024 sales performance to be suppressed by the normalization of backlog, digestion of existing capacity, and optimization of spending, driven either by macroeconomic conditions or by a shift in focus to AI that diverts budgets away from traditional front-end networks used to connect general-purpose servers.

2. 800 Gbps adoption to accelerate significantly in 2024

We predict 2024 to be a tremendous year for 800 Gbps deployments, as we expect swift adoption of a second wave of 800 Gbps (based on 51.2 Tbps chips) from a couple of large Cloud SPs. The first wave of 800 Gbps (based on 25.6 Tbps chips) started back in 2022/2023 but has been slow, as it has been adopted by only one Cloud SP. Meanwhile, we expect 400 Gbps port shipments to continue to grow, as 51.2 Tbps chips will also enable another wave of 400 Gbps adoption. We expect 400 Gbps/800 Gbps speeds to achieve more than 40% penetration by 2027 in terms of port volume.
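The port arithmetic behind that dual effect is straightforward; a quick sketch is below (breakout options and actual product configurations vary, so treat this as a simplification):

```python
# Maximum non-blocking port counts per switch chip: capacity divided by port speed.
chips_tbps = {
    "25.6 Tbps (first 800 Gbps wave)": 25.6,
    "51.2 Tbps (second 800 Gbps wave)": 51.2,
}

for chip, capacity_tbps in chips_tbps.items():
    for port_speed_gbps in (400, 800):
        ports = capacity_tbps * 1_000 / port_speed_gbps
        print(f"{chip}: up to {ports:.0f} x {port_speed_gbps} Gbps ports")
```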

3. AI workloads to drive new network requirements and to expand the market opportunity for both Ethernet and InfiniBand

The enormous appetite for AI is reshaping the data center switch market. Emerging generative AI applications deal with trillions of parameters that drive the need for thousands or even hundreds of thousands of accelerated nodes. To connect these accelerated nodes, a new fabric is needed, called the AI back-end network, which is different from the traditional front-end network mostly used to connect general-purpose servers. Currently, InfiniBand dominates AI back-end networks, but Ethernet is expected to gain significant share over the next five years. We provide more details about the AI back-end network market in our recently published Advanced Research Report, ‘AI Networks for AI Workloads.’ Among many other requirements, AI back-end networks will accelerate the migration to higher speeds. As noted in the chart below, the majority of switch ports in AI back-end networks are expected to be 800 Gbps by 2025 and 1600 Gbps by 2027.

Migration to High-speeds in AI Clusters (AI Back-end Networks)

For more detailed views and insights on the Ethernet Switch—Data Center report or the AI Networks for AI Workloads report, please contact us at dgsales@delloro.com.

[wp_tech_share]

I would like to share some initial thoughts about the groundbreaking announcement that HPE has entered into a definitive agreement to acquire Juniper for $14 B. My thoughts focus mostly on the switch businesses of the two firms. The WLAN and security aspects of the acquisition are covered by our WLAN analyst Sian Morgan and our security analyst Mauricio Sanchez.

My initial key takeaways and thoughts on the potential upside and downside impact of the acquisition are:

Pros:

  • In the combined data center and campus switch market, Cisco has consistently dominated as the major incumbent vendor, with a 46% revenue share in 2022. HPE held the fourth position with approximately 5%, and Juniper secured the fifth spot with around 3%. A consolidated HPE/Juniper entity would hold the fourth position with a combined 8% market share, trailing closely behind Huawei and Arista.
  • Juniper’s standout performer is undeniably their Mist portfolio, recognized as the most cutting-edge AI-driven platform in the industry. As AI capabilities increasingly define the competitive landscape for networking vendors, HPE stands to gain significantly from its access to the Mist platform. We believe that Mist played a pivotal role in motivating HPE to offer a premium of about 30% for the acquisition of Juniper. In other words, Juniper brings better “AI technology for networking” to the table.
  • In the data center space, HPE has predominantly focused on the compute side, with a relatively modest presence in the Data Center switch business (HPE Data Center switch sales amounted to approximately $150 M in 2022, in contrast to Juniper’s sales that exceeded $650 M). Consequently, we anticipate that HPE stands to gain significantly from Juniper’s data center portfolio. Nonetheless, a notable contribution from HPE lies in their Slingshot Fabric, which serves as a compelling alternative to InfiniBand for connecting large GPU clusters. In other words, HPE brings better “Networking technology for AI” to the table.
  • Juniper would definitely benefit from HPE’s extensive channels and go-to-market strategy (about 95% of HPE’s business goes through channels). Additionally, HPE has made great progress driving their as-a-service GreenLake solution. However, GreenLake has been so far mostly dominated by compute. With the Juniper acquisition, we expect to see more networking components pushed through GreenLake.
  • In campus, and with the Mist acquisition in particular, Juniper has been focusing mostly on high-end enterprises, whereas HPE has been playing mainly in the commercial and mid-market segments. From that standpoint, there should be little overlap in the customer base and more cross-selling opportunity.

Cons:

  • Undoubtedly, a significant challenge arises from the substantial product overlap, evident across various domains such as data center switching, campus switching, WLAN, and security. Observing how HPE navigates the convergence of these diverse product lines will be intriguing. Ideally, the merged product portfolio should synergize to bolster the market share of the consolidated entity. Regrettably, history has shown that not all product integrations and consolidations achieve that desired outcome.
[wp_tech_share]

We’ve been participating in the Open Compute Project (OCP) Global Summit for many years, and while each year has brought pleasant surprises and announcements, as described in previous OCP blogs from 2022 and 2021, this year stands out in a league of its own. 2023 marks a significant turning point, notably with the advent of AI, which many speakers referred to as a tectonic shift in the industry and a once-in-a-generation inflection point in computing and in the broader market. This transformation has unfolded within just the past few months, sparking a remarkable level of interest at the OCP conference. In fact, this year the conference was completely sold out, demonstrating the widespread eagerness to grasp the opportunities and confront the challenges that this transformative shift presents to the market. Furthermore, OCP 2023 introduced a new track dedicated entirely to AI. This year marks the beginning of a new era in the age of AI. AI is here! The race is on!

This new era of AI is marked and defined by the emergence of new generative AI applications and large language models. Some of these applications deal with billions or even trillions of parameters, and the number of parameters appears to be growing 1000X every 2 to 3 years.

The complexity and size of these emerging AI applications dictate the number of accelerated nodes needed to run them, as well as the scale and type of infrastructure needed to support and connect those accelerated nodes. Regrettably, as illustrated in the chart below presented by Meta at the OCP conference, a growing disparity exists between the requirements for model training and the available infrastructure to facilitate it.

This predicament poses the pivotal question: How can one scale to hundreds of thousands or even millions of accelerated nodes? The answer lies in the power of AI Networks purpose-built and tuned for AI applications. So, what are the requirements that these AI Networks need to satisfy? To answer that question, let’s first look at the characteristics of AI workloads, which include but are not limited to the following:

  • Traffic patterns consist of a large portion of elephant flows
  • AI workloads require a large number of short remote memory accesses
  • The fact that all nodes transmit at the same time saturates links very quickly
  • The progression of all nodes can be held back by any delayed flow. In fact, Meta showed last year that 33% of elapsed time in AI/ML is spent waiting for the network.

Given these unique characteristics of AI workloads, AI Networks have to meet certain requirements, such as high speed, low tail latency, and lossless, scalable fabrics.
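To see why tail latency, rather than average latency, is the binding constraint, here is a small toy simulation; every parameter in it is invented purely for illustration. Because a synchronous training step completes only when its slowest flow completes, even rare tail events come to dominate step time at scale.

```python
# Toy simulation of the "held back by any delayed flow" effect (hypothetical numbers).
import random

random.seed(0)
NUM_FLOWS = 512                     # flows in one collective operation
TYPICAL_MS, TAIL_MS = 1.0, 10.0     # assumed flow completion times

def step_time_ms(p_tail):
    flows = [TAIL_MS if random.random() < p_tail else TYPICAL_MS
             for _ in range(NUM_FLOWS)]
    return max(flows)               # the synchronous step waits for the slowest flow

for p_tail in (0.0, 0.001, 0.01):
    mean = sum(step_time_ms(p_tail) for _ in range(2_000)) / 2_000
    print(f"P(tail event per flow)={p_tail:.3f} -> mean step time ~{mean:.1f} ms")
```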

In terms of high-speed performance, the chart below, which I presented at OCP, shows that by 2027, we anticipate that nearly all ports in the AI back-end network will operate at a minimum speed of 800 Gbps, with 1600 Gbps comprising half of the ports. In contrast, our forecast for the port speed mix in the front-end network reveals that only about a third of the ports will be at 800 Gbps speed by 2027, while 1600 Gbps ports will constitute just 10%. This discrepancy in port speed mix underscores the substantial disparity in requirements between the front-end network, primarily used to connect general-purpose servers, and the back-end network, which primarily supports AI workloads.

In the pursuit of low tail latency and a lossless fabric, we are witnessing numerous initiatives aimed at enhancing Ethernet and modernizing it for optimal performance with AI workloads. For instance, the Ultra Ethernet Consortium (UEC) was established in July 2023 with the objective of delivering an open, interoperable, high-performance, full-communications-stack architecture based on Ethernet. Additionally, OCP has formed a new alliance to address significant networking challenges within AI cluster infrastructure. Another groundbreaking announcement from the OCP conference came from Google, which unveiled that it is opening Falcon, its low-latency hardware transport, to the ecosystem through the Open Compute Project.

Source: Google

 

At OCP, there was a huge emphasis on adopting an open approach to address the scalability challenges of AI workloads, aligning seamlessly with the OCP 2023 theme: ‘Scaling Innovation Through Collaboration.’ Both Meta and Microsoft have consistently advocated, over the years, for community collaboration to tackle scalability issues. However, we were pleasantly surprised by the following statement from Google at OCP 2023: “A new era of AI systems design necessitates a dynamic open industry ecosystem”.

The challenges presented by AI workloads to network and infrastructure are compounded by the broad spectrum of workloads. As illustrated in the chart below showcased by Meta at OCP 2023, the diversity of workloads is evident in their varying requirements.

Source: Meta at OCP 2023

 

This diversity underscores the necessity of adopting a heterogeneous approach to building high-performance AI Networks and infrastructure capable of supporting a wide range of AI workloads. This heterogeneous approach will entail a combination of standardized as well as proprietary innovations and solutions. We anticipate that cloud service providers will make distinct and unique choices, resulting in market bifurcation. In Dell’Oro Group’s upcoming AI Networks for AI Workloads report, I delve into the various network fabric requirements based on cluster size, workload characteristics, and the distinctive choices made by cloud service providers.

Exciting years lie ahead of us! The AI journey is just 1% finished!

 


Save the date: a free OCP educational webinar on November 9 at 8 AM PT will explore AI-driven network solutions and market potential, featuring Juniper Networks and Dell’Oro Group. Register now!

[wp_tech_share]

The rise of accelerated computing for applications such as AI and ML over the last several years has led to new innovations in the areas of compute, networking, and rack infrastructure. Accelerated computing generally refers to servers that are equipped with coprocessors such as GPUs and other custom accelerators. These accelerated servers are deployed as systems consisting of a low-latency networking fabric and enhanced thermal management to accommodate the higher power envelope.

Today, data centers account for approximately 2% of global energy usage. While the latest accelerated servers can each consume up to 6 kW of power, which may seem counterintuitive from a sustainability perspective, accelerated systems are actually more energy efficient than general-purpose servers when matched to the right mix of workloads. The advent of generative AI has significantly raised the bar for compute and network demands, given that these language models consist of billions of parameters. Accelerators can help train these large language models within a practical timeframe.

Deployment of these AI language models usually consists of two distinct stages, training and inference (a brief code sketch contrasting the two follows the list below):

  • In AI training, data is fed into the model, so the model learns about the type of data to be analyzed. AI training is generally more infrastructure intensive, consisting of one to thousands of interconnected servers with multiple accelerators (such as GPUs and custom coprocessors) per server. We classify accelerators for training as “high-end” and examples include NVIDIA H100, Intel Gaudi2, AMD MI250, or custom processors such as the Google TPU.
  • In AI inference, the trained model is used to make predictions based on live data. AI inference servers may be equipped with discrete accelerators (such as GPUs, FPGAs, or custom processors) or embedded accelerators in the CPU. We classify accelerators for inference as “low-end” and examples include NVIDIA T4 or L40S. In some cases, AI inference servers are classified as general-purpose servers because of the lack of discrete accelerators.
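A deliberately tiny PyTorch sketch follows, just to make the two stages concrete. The model and data are trivial placeholders of our own choosing; production LLM training spans many interconnected accelerated servers, while inference serves predictions on live requests.

```python
# Minimal illustration of training vs. inference (placeholder model and data).
import torch
from torch import nn

model = nn.Linear(16, 2)                          # stand-in for a large model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Training: feed labeled data, compute loss, and update weights (compute intensive).
for _ in range(100):
    x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

# Inference: the trained model predicts on new data; no gradients are needed.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
print(prediction.item())
```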

 

Server Usage: Training vs. Inference?

A common question is how much of the infrastructure, typically measured by the number of servers, is deployed for training as opposed to inference, and what the adoption rate of each type of platform is. We have been investigating and debating this question, and the following factors complicate the analysis.

  • NVIDIA’s recent GPU offerings based on the A100 Ampere and H100 Hopper platforms are intended to support both training and inference. These platforms typically consist of a large array of multi-GPU servers that are interconnected and well-suited for training large language models. However, any excess capacity not used for training can be utilized towards inference workloads. While inference workloads typically do not require a large array of servers (although inference applications are increasing in size), inference applications can be deployed for multiple tenants through virtualization.
  • The latest CPUs from Intel and AMD have embedded accelerators on the CPU that are optimized for inference applications. Thus, a monolithic architecture without discrete accelerators is ideal as capacity can be shared by both traditional and inference workloads.
  • The chip vendors also sell GPUs and other accelerators not as systems but as PCI Express add-in cards. One or several of these accelerator add-in cards can be installed by the end-user after the sale of the system.

Given that different workloads (training, inference, and traditional) can be shared on one type of system, and that end-users can reconfigure systems with discrete accelerators, it becomes less meaningful to delineate the market purely by workload type. Instead, we segment the market by three distinct server platform types, as defined in Figure 1.

Figure 1: Server Platform Types (Dell’Oro)

We expect each of these platform types to grow at a different rate. Growth of general-purpose servers is slowing, with a 5-year CAGR of less than 5%, given increasing CPU core counts and the use of virtualization. On the other hand, accelerated systems are forecast to grow at a 5-year CAGR of approximately 40%. By 2027, we project accelerated systems will account for approximately 16% of all server shipments, with the mix of accelerator types shown in Figure 2.
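For reference, the compounding implied by those growth rates is shown in the simple arithmetic sketch below.

```python
# Cumulative growth implied by a constant annual growth rate (CAGR).
def total_growth(cagr, years=5):
    return (1 + cagr) ** years

print(f"5% CAGR over 5 years  -> ~{total_growth(0.05):.2f}x cumulative growth")
print(f"40% CAGR over 5 years -> ~{total_growth(0.40):.1f}x cumulative growth")
```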

Looking ahead we expect continued innovation and new architectures to support the growth of AI. More specialized systems and processors will be developed that will enable more efficient and sustainable computing. We also expect the vendor landscape to be more diversified, with compelling solutions from the vendors and cloud service providers to optimize the performance of each workload.

To get additional insights and outlook for servers and components such as accelerators and CPUs for the data center market, please check out our Data Center Capex report and Data Center IT Semiconductor and Components report.