
 

Data centers are the backbone of our digital lives, enabling the real-time processing and aggregation of data and transactions, as well as the seamless delivery of applications to both enterprises and their end customers. Data centers have been able to grow to support ever-increasing volumes of data and transaction processing thanks in large part to software-based automation and virtualization, which allow enterprises and hyperscalers alike to adapt quickly to changing workload volumes and physical infrastructure limitations.

Although data centers have already grown and innovated at a phenomenal pace, with their principles now being integrated into service provider networks, data centers of all sizes are about to undergo a further significant expansion as they are tasked with processing blockchain, Bitcoin, IoT, gigabit broadband, and 5G workloads. In our latest forecast, published earlier this month, we expect worldwide data center capex to reach $350 billion by 2026, representing a five-year projected growth rate of 10%. We also forecast that hyperscale cloud providers will double their data center spending over the next five years.

Additionally, enterprises are becoming smarter about how to balance and incorporate their private, public, and on-premises clouds for the most efficient processing of workloads and application requests. Much like highly resilient service provider networks, enterprises are realizing that distributing workload processing allows them to scale faster and with more redundancy. Despite the general trend toward migrating to the cloud, enterprises will continue to invest in on-premises infrastructure to handle workloads that involve sensitive data, as well as applications that are highly latency-sensitive.

As application requests, change orders, equipment configuration changes, and other general troubleshooting and maintenance requests continue to increase, anticipating and managing the necessary changes in multi-cloud environments becomes exceedingly difficult. Throw in the need to quickly identify and troubleshoot network faults at the physical layer and you have a recipe for a maintenance nightmare and, more importantly, substantial revenue loss due to the cascading impact of fragmented networks that are only peripherally integrated.

Although automation and machine learning tools have been available for some time, they are often designed to automate application delivery within a single cloud environment, not across multiple clouds and multiple network layers. Automating IT processes across both physical and virtual environments, and across the underlying network infrastructure, compute, and storage resources, has been a challenge for some time. Each layer has its own distinct set of issues and requirements.

New network rollouts or service changes that result in network configuration changes are typically very labor-intensive and frequently yield faults in the early stages of deployment that require significant labor to resolve.

Similarly, configuration changes sometimes result in redundant or mismatched operations due to the manual entry of these changes. Without a holistic approach to automation, there is no way to verify or prevent the introduction of conflicting network configurations.

Finally—and this is just as true of service provider networks as it is of large enterprises and hyperscale cloud providers—detecting network faults is often a time-consuming process, principally because network faults are often handled passively until they are located and resolved manually. Traditional alarm reporting followed by manual troubleshooting must give way to proactive and automatic network monitoring that quickly detects network faults and uses machine learning to rectify them without any manual intervention whatsoever.

 

Automating a Data Center’s Full Life Cycle

As the size and complexity of data centers continue to increase, and as workload and application changes accelerate, the impact on the underlying network infrastructure can be difficult to predict. Various organizations, both within and outside the enterprise, have different requirements that must all somehow be funneled into a common platform to prevent conflicting changes anywhere from the application delivery layer down to the network infrastructure. These organizations can also have drastically different timeframes for the expected completion of changes, largely due to siloed management of different portions of the data center, as well as the different diagnostic and troubleshooting tools in use by the network operations and IT infrastructure teams.

In addition to pushing on their equipment vendor and systems integrator partners to deliver platforms that solve these challenges, large enterprises also want platforms that give them the ability to automate the entire lifecycle of their networks. These platforms use AI and machine learning to build a thorough and evolving view of underlying network infrastructure to allow enterprises to:

    • Support automatic network planning and capacity upgrades by modeling how the addition of workloads will impact current and future server requirements as well as the need to add switching and routing capacity to support application delivery.
    • Implement network changes automatically, reducing the need for manual intervention and thereby reducing the possibility of errors.
    • Continuously provide detailed network monitoring at all layers, with proactive fault detection, location, and resolution that limits manual intervention.
    • Simplify the service and application provisioning process by providing a common interface that then translates requests into desired network changes.

Ultimately, one of the key goals of these platforms is to create a closed loop between network management, control, and analysis capabilities so that changes in upper-layer services and applications can automatically drive defined changes in the underlying network infrastructure. For this to become a reality in increasingly complex data center network environments, these platforms must provide some critical functions, including:

    • Providing a unified data model and data lakes across multiple cloud environments and multi-vendor ecosystems
      • This has been a long-standing goal of large enterprises and telecommunications service providers. Ending the swivel-chair approach to network management and delivering error-free network changes with minimal manual intervention are key functions of any data center automation platform.
    • Service orchestration across multiple, complex service flows
      • This function has also been highly sought after by large enterprises and service providers alike. For service providers, SDN overlays were intended to introduce these functions and capabilities into their networks. Deployments have yielded mixed, but generally favorable, results. Nevertheless, the principles of SDN continue to proliferate into other areas of the network, largely due to the desire to streamline and automate the service provisioning process. The same can be said for large enterprises and data center providers.
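To make the closed-loop concept above concrete, the following is a minimal, hypothetical sketch of the observe-analyze-act cycle such a platform might run. The telemetry fields, the 80% utilization policy, and the push_config() call are illustrative assumptions, not any specific vendor's implementation.

```python
# Hypothetical closed-loop automation cycle: observe -> analyze -> plan -> act.
# Telemetry fields, the utilization threshold, and push_config() are illustrative only.

UTILIZATION_LIMIT = 0.80  # assumed policy: act when a link exceeds 80% utilization


def collect_telemetry():
    """Stand-in for streaming telemetry gathered from switches and routers."""
    return [
        {"device": "leaf-01", "link": "uplink-1", "utilization": 0.91},
        {"device": "leaf-02", "link": "uplink-1", "utilization": 0.42},
    ]


def analyze(samples):
    """Flag links that violate the utilization policy."""
    return [s for s in samples if s["utilization"] > UTILIZATION_LIMIT]


def plan_remediation(violation):
    """Translate the intent ('keep links below 80%') into a candidate config change."""
    return {
        "device": violation["device"],
        "change": f"shift ECMP weight away from {violation['link']}",
    }


def push_config(change):
    """Stand-in for a validated, automated configuration push via the platform."""
    print(f"Applying to {change['device']}: {change['change']}")


if __name__ == "__main__":
    # One pass of the loop; a real platform would run this continuously.
    for violation in analyze(collect_telemetry()):
        push_config(plan_remediation(violation))
```

In practice, the analyze and plan steps are where AI and machine learning models replace static thresholds and hand-written rules, but the loop structure stays the same.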

Although these platforms are intended to serve as a common interface across multiple business units and network layers, their design and deployment can be modular and gradual. If a large enterprise wants to migrate to a more automated model, it can do so at a pace suited to the organization's needs. Automation can be introduced first at the network infrastructure layer and then extended to the application layer. Over time, with AI and machine learning tools aggregating performance data across both layers, correlations between application delivery changes and their impact on network infrastructure can be determined more quickly. Ultimately, service and network lifecycle management can be simplified and expanded to cover hybrid cloud or multi-vendor environments.

We believe that these holistic platforms, which bridge the worlds of telecommunications service providers and large enterprise data centers, will play a key role in helping automate data center application delivery by providing a common window into the application delivery network as well as the underlying network infrastructure. The result will be more efficient use of network resources, a reduction in the time required to make manual configuration changes to the network, a reduction in the programming load on IT departments, and strict compliance with SLA guarantees to key end customers and application provider partners.


 

As pandemic-related headwinds started to ease, we were optimistic about a return to higher growth in data center infrastructure spending in 2021. The Cloud was entering an expansion cycle and demand signals in the Enterprise were gaining momentum. While data center capex grew 9% in 2021, in line with our prior projections, growth was driven mainly by the higher cost of data center equipment rather than by unit volume. Server unit growth, which was flat for the year, was constrained by component shortages and long lead times. Deliveries of networking and physical infrastructure equipment are also facing a mounting backlog. Furthermore, higher supply chain costs, stemming from increased commodity, expediting, and logistics expenses, led to higher system prices. Our 2022 outlook is more optimistic, with projected data center capex growth of 17%, accompanied by double-digit growth in server unit shipments. We identify the following key trends that could shape the dynamics of data center capex in 2022.

Hyperscale Cloud on Expansion Cycle

The Top 4 Cloud service providers—Amazon, Google, Meta (formerly Facebook), and Microsoft—are expected to increase data center capex by over 30% in 2022. Investments will go toward the replacement of aged servers, increased deployment of accelerated computing, and servers for new data centers in more than 30 regions scheduled to launch in 2022. Furthermore, infrastructure planned last year but not deployed due to extended equipment lead times has created an additional growth tailwind as those deliveries are fulfilled in 2022.

Supply Chain Stabilizing

Generally, the major Cloud service providers have weathered this tough supply chain climate better than the rest of the market, given their strong visibility into demand, which allows them to proactively increase inventory levels of crucial components and build redundancy into their supply chains. On the other hand, data center capex growth among Tier 2 and Tier 3 Cloud service providers and the Enterprise has been supply-constrained. There is some consensus that supply chain disruptions are starting to stabilize and could ease by the second half of 2022. Lead times for servers could improve sooner than those for other data center equipment, such as networking, given servers' relatively larger scale and lower product mix.

Metaverse Could Drive Opportunities In AI Infrastructure

Some of the major Cloud service providers, such as Apple, Meta, Microsoft, and Tencent, have announced plans to enrich their metaverse offerings for both enterprise and consumer applications. This would require increased investment in new infrastructure, such as servers with accelerated co-processors, low-latency networking, and enhanced thermal management solutions. Chip manufacturers and major Cloud service providers will be developing specialized processors for AI applications, and the ecosystem would need to evolve to enable the community of AI application developers to broaden the reach of AI into enterprises. AI infrastructure is costly and will be a major capex driver. For instance, we estimate that the cost of AI infrastructure is largely responsible for Meta’s plans to increase capex by approximately 60% this year.

New Server Architectures On The Horizon

Intel is releasing a new processor platform, Sapphire Rapids, later this year. Sapphire Rapids will feature the latest in server interconnect technologies, such as PCIe 5, DDR5, and, most importantly, CXL. These new high-speed interfaces could alleviate system bandwidth constraints, enabling more processor cores and memory to be packaged into a single server. CXL would enable memory sharing between the CPU and other co-processors within the server and rack, allowing data-intensive applications such as AI to access memory more efficiently and at lower latencies. AMD and ARM will also incorporate these new interfaces into their processor platforms. We expect these enhancements could kick off a multi-year journey of new server architecture developments.
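As a rough illustration of why these interfaces matter, the back-of-the-envelope calculation below compares the theoretical per-direction bandwidth of a PCIe Gen4 x16 link with a Gen5 x16 link, the kind of link a CXL device would ride on. The figures ignore protocol and payload overhead and are for illustration only.

```python
# Back-of-the-envelope PCIe bandwidth comparison (per direction, x16 link).
# Real-world throughput is lower once protocol and payload overheads are included.

def pcie_gbps_per_direction(transfer_rate_gt_s, lanes=16, encoding=128 / 130):
    """Raw transfer rate (GT/s) x lanes x 128b/130b encoding efficiency."""
    return transfer_rate_gt_s * lanes * encoding

gen4 = pcie_gbps_per_direction(16.0)   # PCIe Gen4: 16 GT/s per lane
gen5 = pcie_gbps_per_direction(32.0)   # PCIe Gen5: 32 GT/s per lane

print(f"PCIe Gen4 x16: ~{gen4:.0f} Gb/s (~{gen4 / 8:.0f} GB/s) per direction")
print(f"PCIe Gen5 x16: ~{gen5:.0f} Gb/s (~{gen5 / 8:.0f} GB/s) per direction")
# Roughly 252 Gb/s (~32 GB/s) versus 504 Gb/s (~63 GB/s): a doubling of raw
# bandwidth that helps explain the headroom for more cores, memory, and accelerators.
```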

Let’s Not Forget About Server Connectivity

Last but not least on this list, server connectivity will also need to evolve continuously so that the link between the server and the rest of the network does not become a bottleneck. The hyperscale Cloud service providers have been deploying in production the latest generation of network interface cards (NICs) based on 56 Gbps PAM-4 SerDes, delivering up to 100 Gbps for general-purpose workloads and up to 200 Gbps for advanced workloads such as AI. The Enterprise is fully embracing 25 Gbps NICs, and we anticipate the number of 25 Gbps ports to overtake that of 10 Gbps later this year. Smart NICs, or data processing units (DPUs), are being deployed by the major Cloud service providers across their infrastructure to improve server utilization and to accelerate latency-sensitive applications such as AI. Outside of the hyperscalers, Smart NIC adoption is still in its nascent stage. However, given that most network adapter vendors have a Smart NIC solution available in the market, enterprises potentially have a wide range of choices to fit their applications and budget.
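The NIC speeds above follow directly from the SerDes lane math. The short sketch below shows how 56 Gbps PAM-4 lanes, each yielding roughly 50 Gbps of usable line rate once encoding and FEC overhead are accounted for, combine into the 100 Gbps and 200 Gbps ports mentioned here; the overhead figure is treated as a fixed approximation.

```python
# Approximate NIC port speed from SerDes lane count.
# A 56 Gbps PAM-4 SerDes yields roughly 50 Gbps of usable line rate per lane
# after encoding/FEC overhead (treated here as a fixed approximation).

USABLE_GBPS_PER_PAM4_LANE = 50

def port_speed_gbps(lanes):
    return lanes * USABLE_GBPS_PER_PAM4_LANE

print(port_speed_gbps(2))  # 100 Gbps NIC: two lanes, general-purpose Cloud workloads
print(port_speed_gbps(4))  # 200 Gbps NIC: four lanes, advanced workloads such as AI
```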

 


 

A New Year always marks a great time to look back and reflect on the previous year and to predict what it means for the coming year. It is an especially exciting time for the Data Center Physical Infrastructure research at Dell’Oro Group, with the program’s first publication of the Q3 2021 Report. While we did not make any predictions for data center physical infrastructure in 2021, we can certainly recap the year before looking at our 2022 predictions.

For the data center physical infrastructure market, 2021 can be split into two major themes. During the first half of 2021, the market rebounded strongly, growing 17.7% to $10 billion after a pandemic-induced dip in 2020. Year-over-year comparisons were favorable, but it was cloud service provider investment and rebounding enterprise spending in North America and EMEA that drove the market past 2019 levels. However, the story changed in the second half of 2021. New COVID-19 variants, Delta and Omicron, reared their ugly heads, while supply chains began to break down, leading to a lack of availability of components and products, raw material price increases, and labor and logistical issues. We forecast that this slowed data center physical infrastructure growth to 4.7%, with the market reaching $11.4 billion in revenues during the second half of 2021. Data center physical infrastructure vendors entered 2022 with record backlogs, but questions remain about how much of that backlog they will be able to deliver as demand continues to outpace supply. While these supply chain issues will likely persist throughout 2022, what else does the data center physical infrastructure market have in store for us?

1. Plans to Reach Long-Term Data Center Sustainability Goals Begin to Materialize

As the global COVID-19 pandemic accelerated digital adoption and growth throughout 2021, it also cast a large shadow on the growing climate impact of data center growth. It’s no wonder sustainability quickly became one of the most common buzzwords in the industry. The data center industry responded by aggressively expanding sustainability commitments, which were previously tied largely to 100% renewable energy offset credits. Renewable energy goals transitioned from 100% renewable energy offsets to 100% renewable energy consumption. Data center water usage also came under fire, with Microsoft notably pledging to cut water usage 95% by 2024 and become water positive by 2030. But by far the most common goal set by data center owners and operators was to become carbon neutral, or in some cases even carbon negative, by 2030. Critics were quick to point out the difficult path to achieving those goals, with details on how they would be reached remaining sparse. 2022 will bring more clarity on some of the technologies that will help enable progress toward those goals. Data center physical infrastructure, specifically, will play a big role in a number of areas:

    • Backup power connects to the grid – A large portion of data center physical infrastructure is dedicated to providing clean, uninterruptible power to IT infrastructure even during a utility power outage, through the use of UPS systems, batteries, and generators. Those systems largely sit idle when utility power is available. That is beginning to change, spurred by the adoption of lithium-ion batteries, which are creating new energy storage use cases at data center facilities. This technology, commonly referred to as grid-interactive UPS, will enable those idle assets to become revenue-generating or cost-saving through peak shaving, frequency regulation, and other grid participation activities, in addition to supporting better integration of renewable energy (a simplified peak-shaving sketch follows this list). Microsoft and Eaton have publicly collaborated on grid-interactive UPS, recently releasing a white paper on the subject. We predict major strides in grid-interactive UPS systems in 2022, with details and an ecosystem forming around early pilots to support execution of larger-scale rollouts.
    • Fuel cells replace generators – Okay, this isn’t happening in 2022. But the recent announcement that Vertiv, Equinix, and other utility, fuel cell, and research partners are working on a proof-of-concept (POC) fuel cell use case for data centers, funded by the Clean Hydrogen Partnership, sure does create some excitement. Vertiv has committed to providing a 100 kW fuel cell module with an integrated UPS by 2023. Here’s hoping we get updates throughout the year on how fuel cell technology can be applied to data centers and on what timeline.
    • Data center heat re-use bubbles up to the top of sustainability priorities – Data centers consume a lot of power and, in turn, generate a lot of heat. Today, air-based thermal management systems capture that heat and reject it into the atmosphere. However, there is a significant opportunity to re-use that heat, with district heating and urban farming as commonly cited examples. The difficulty in scaling data center heat re-use is that today’s thermal management designs and infrastructure largely don’t support it. In 2022, we predict that to change, with heat re-use capability being designed into new products and data center architectures. To take full advantage of heat re-use, data center owners and ecosystem vendors will turn to liquids, which transfer energy up to ten times more efficiently than air.
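To illustrate the peak-shaving idea mentioned in the first bullet, here is a minimal, hypothetical control rule: when facility demand exceeds a contracted threshold, the UPS battery discharges to cover the excess, and it recharges during off-peak periods while always preserving a backup reserve. The thresholds, battery sizing, and charging limit are simplifying assumptions, not a description of any vendor's product.

```python
# Hypothetical peak-shaving logic for a grid-interactive UPS battery.
# Thresholds, battery sizing, and the 100 kW charge limit are illustrative only.

PEAK_THRESHOLD_KW = 900      # assumed contracted demand limit with the utility
BATTERY_CAPACITY_KWH = 500   # assumed usable lithium-ion capacity
MIN_RESERVE_KWH = 200        # always keep enough energy for the backup role

def peak_shave(load_kw, battery_kwh, interval_h=0.25):
    """Return (grid_draw_kw, new_battery_kwh) for one 15-minute interval."""
    if load_kw > PEAK_THRESHOLD_KW and battery_kwh > MIN_RESERVE_KWH:
        # Discharge only the excess above the threshold, without dipping
        # below the reserve needed for the UPS's primary backup duty.
        discharge_kw = min(load_kw - PEAK_THRESHOLD_KW,
                           (battery_kwh - MIN_RESERVE_KWH) / interval_h)
        return load_kw - discharge_kw, battery_kwh - discharge_kw * interval_h
    # Off-peak: recharge toward full capacity, capped at an assumed 100 kW.
    charge_kw = min(100, (BATTERY_CAPACITY_KWH - battery_kwh) / interval_h)
    return load_kw + charge_kw, battery_kwh + charge_kw * interval_h

grid_kw, soc = peak_shave(load_kw=1050, battery_kwh=450)
print(f"Grid draw held to {grid_kw:.0f} kW; battery at {soc:.1f} kWh")
```

Frequency regulation and other grid services follow the same pattern, with the decision rule driven by a grid operator signal rather than the facility's own load.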

2. Liquid Cooling Adoption Momentum Continues as POC Deployments Proliferate and Early Adopters Begin Larger Rollouts

Traditionally, the data center industry has been conservative in adopting new physical infrastructure technologies. Interested in bringing liquids into my IT space, let alone into the IT rack? Absolutely not. However, as Moore’s Law has struggled to keep pace, data center rack densities have started to rise. In the high-performance computing (HPC) space, air cooling simply wasn’t an option anymore as HPC rack densities surpassed 20 kW, 50 kW, and in some cases even 100 kW. This trend formed the foundation of today’s liquid cooling market, which includes both direct liquid cooling (pumping liquid to cold plates attached directly to CPUs, GPUs, and memory) and immersion cooling (submerging an entire rack of servers in a liquid-filled tank).
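A rough heat balance shows why air runs out of headroom at these densities. Using Q = m_dot x c_p x delta_T with textbook fluid properties, the sketch below compares the coolant flow needed to remove 50 kW from a rack with air versus water, assuming a 10 °C temperature rise for both; the load and temperature rise are assumptions used only for illustration.

```python
# Rough heat-balance comparison: coolant flow needed to remove 50 kW at a 10 C rise.
# Q = m_dot * c_p * delta_T, using textbook properties; for illustration only.

HEAT_LOAD_W = 50_000     # assumed rack heat load (50 kW)
DELTA_T_C = 10           # assumed coolant temperature rise

AIR_CP = 1005            # specific heat of air, J/(kg*K)
AIR_DENSITY = 1.2        # kg/m^3
WATER_CP = 4186          # specific heat of water, J/(kg*K)
WATER_DENSITY = 1000     # kg/m^3

air_kg_s = HEAT_LOAD_W / (AIR_CP * DELTA_T_C)
water_kg_s = HEAT_LOAD_W / (WATER_CP * DELTA_T_C)

air_m3_s = air_kg_s / AIR_DENSITY
water_l_s = water_kg_s / WATER_DENSITY * 1000

print(f"Air:   ~{air_m3_s:.1f} m^3/s of airflow (~{air_m3_s * 2119:.0f} CFM)")
print(f"Water: ~{water_l_s:.1f} L/s of flow (~{water_l_s * 60:.0f} L/min)")
# Moving thousands of CFM through a single rack quickly becomes impractical,
# while the equivalent water loop is roughly a liter per second.
```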

Liquid cooling market revenue growth accelerated in 2021, rising an estimated 64.3% from 2020 to $113M. Another 25% of growth is forecast for 2022, with the market expected to reach $141M despite constrained supply chains. This growth is expected to be driven by proliferating POCs from cloud, colocation, and telco service providers, in addition to large enterprises dipping their toes in. For early adopters, larger-scale rollouts of liquid cooling technology are forecast to begin, supported by increased awareness of and comfort in operating liquid-cooled data centers. With momentum continuing to build, an inflection point for liquid cooling adoption appears near.

3. Supply Chain Resiliency and Integrated Solutions Drive Mergers, Acquisitions, and Partnerships

Supply chain discussions are creeping into nearly every conversation these days, so we can’t have 2022 predictions without assessing what impact they might have on the year. First, we do believe supply chain issues will persist throughout 2022, and potentially into 2023. However, we predict their lasting impact on the year will be from the mergers, acquisitions, and partnerships they drive.

Supply chain disruptions have become commonplace over the past three years. Since the onset of US-China trade war tensions, data center physical infrastructure vendors have been localizing supply chains in-region, for-region. The pandemic has only added more unpredictability to global supply chains, exposing further weaknesses. To address these weaknesses, we predict a flurry of mergers and acquisitions. We believe these acquisitions will be focused on supply chain resiliency, establishing and growing manufacturing footprints in select regions, while also supporting the delivery of holistic data center solutions at the rack, row, pod, or building level. Checking multiple of these boxes makes any potential acquisition quite appetizing in 2022.

At the beginning of next year, we’ll circle back and see how we did on our predictions. In the meantime, stay connected with the data center physical infrastructure program for the latest updates.


The Nvidia GTC Fall 2021 virtual event I attended last week highlighted some exciting developments in the field of AI and machine learning, most notably in new applications for the metaverse. A metaverse is a digital universe created by the convergence of the real world and a virtual world built from virtual reality, augmented reality, and other 3D visual projections.

Several leading Cloud service providers recently laid out their visions of the metaverse. Facebook, which changed its name to Meta to reflect its focus on the metaverse, envisions people working, traveling, and socializing in virtual worlds. Microsoft already offers holograms and mixed reality on its Microsoft Mesh platform and announced plans to bring holograms and virtual avatars to Microsoft Teams next year. Tencent recently shared its metaverse plan to leverage its strengths in multiplayer gaming on its social media platform.

In order to recreate an accurate virtual representation of the real world, massive amounts of AI training data would need to be acquired, captured, and processed. This would stretch the limits of the compute infrastructure. During GTC, Nvidia highlighted various solutions in three areas that could help pave the way for the proliferation of the metaverse in the near future:

  • Compute Architecture: During the Q&A session, I asked Nvidia CEO Jensen Huang how the data center would need to evolve to meet the needs of the metaverse. Jensen emphasized that computer vision, graphics, and physics simulation would need to converge in a coherent architecture and be scaled out to millions of people. In a sense, this would be a new type of computer, a fusion of various disciplines with the data center as the new unit of computing. In my view, such an architecture would be composed of a large cluster of accelerated servers with multiple GPUs within a network of tightly coupled, general-purpose servers. The servers would run applications and store massive amounts of data. Memory-coherent interfaces, such as CXL, NVLink, or their future iterations, offered on x86- and ARM-based platforms, would enable memory sharing across racks and pods. These interfaces would also improve connectivity between CPUs and GPUs, reducing system bottlenecks.
  • Network Architecture: As the unit of computing continues to scale, new network architectures will need to be developed. During GTC, Nvidia introduced Quantum-2, a networking solution composed of 400 Gbps InfiniBand and the Bluefield-3 DPU (data processing unit) Smart NIC. This combination will enable high-throughput, low-latency networking in the dense, tightly coupled clusters of up to one million nodes needed for metaverse applications. 400 Gbps is the fastest server access speed available today, and it could double to 800 Gbps within several years. The ARM processor in the Bluefield DPU can directly access the network interface, bypassing the CPU and benefiting time-sensitive AI workloads. Furthermore, we can expect these scaled-out computing clusters to be shared across multiple users. With a Smart NIC such as the Bluefield DPU, isolation can be provided among users, thereby enhancing security.
  • Omniverse: The compute and network infrastructure can only be effectively utilized with a solid software development platform and ecosystem in place. Nvidia’s Omniverse provides the platform that enables developers and enterprises to create and connect virtual worlds for various use cases. During GTC, Jensen described how the Omniverse could be used to build a digital twin of an automotive factory, with the manufacturing process simulated and optimized by AI before the twin serves as the blueprint for the physical construction. Potential applications range from education to healthcare, retail, and beyond.

We are still in the initial developmental stages of the metaverse; the technology building blocks and ecosystem are still coming together. Furthermore, as we have seen recently with certain social media platforms and the gaming industry, new regulations could emerge to reset the boundaries between the real and virtual worlds. Nevertheless, I believe that the metaverse has the potential to unlock new use cases for both consumers and enterprises and to drive investments in data center infrastructure in the Cloud and Enterprise. To access the full Data Center Capex report, please contact us at dgsales@delloro.com.


Dell’Oro Group projects that spending on accelerated compute servers targeted at artificial intelligence (AI) workloads will grow at a double-digit rate over the next five years, outpacing other data center infrastructure. An accelerated compute server, equipped with accelerators such as GPUs, FPGAs, or custom ASICs, can generally handle AI workloads with much greater efficiency than a general-purpose server without accelerators. Numerically speaking, these servers still represent only a fraction of Cloud service providers’ overall server footprint. Yet, at ten or more times the cost of a general-purpose server, accelerated compute servers are becoming a substantial portion of data center capex.

Tier 1 Cloud service providers are increasing their spending on new infrastructure tailored for AI workloads. On Facebook’s 3Q21 earnings call, the company announced plans to increase capex by more than 50% in 2022. Investments will be driven by AI and machine learning to improve ranking and recommendations across Facebook’s platform. In the longer term, as the company shifts its business model toward the metaverse, capex investments will be driven by video and compute-intensive applications such as AR and VR. At the same time, Tier 1 Cloud service providers such as Amazon, Google, and Microsoft also aim to increase spending on AI-focused infrastructure to enable their enterprise customers to deploy applications with enhanced intelligence and automation.

It has been a year since my last blog on AI data center infrastructure. Since that time, new architectures and solutions have emerged that could pave the way for the further proliferation of AI in the data center. Following are three innovations I’ll be watching closely:

New CPU Architectures

Intel is scheduled to launch its next-generation Sapphire Rapids processor next year. With its AMX (Advanced Matrix Extensions) instruction set, Sapphire Rapids is optimized for AI and ML workloads. CXL, which will be offered with Sapphire Rapids for the first time, will establish a memory-coherent, high-speed link over the PCIe Gen 5 interface between the host CPU and accelerators. This, in turn, will reduce system bottlenecks by enabling lower latencies and more efficient sharing of resources across devices. AMD will likely follow on the heels of Intel and offer CXL on EPYC Genoa. For ARM, competing coherent interfaces will also be offered, such as CCIX with Ampere’s Altra processor and NVLink on Nvidia’s upcoming Grace processor.

Faster Networks and Server Connectivity

AI applications are bandwidth-hungry. For this reason, the fastest networks available would need to be deployed to connect host servers to accelerated servers, facilitating the movement of large volumes of unstructured data and training models (a) between the host CPU and accelerators, and (b) among accelerators in a high-performance computing cluster. Some Tier 1 Cloud service providers are deploying 400 Gbps Ethernet networks and beyond. The network interface card (NIC) must also evolve to ensure that server connectivity is not a bottleneck as data sets become larger. 100 Gbps NICs have been the standard server access speed for most accelerated compute servers. Most recently, however, 200 Gbps NICs are increasingly being used for these high-end workloads, especially by Tier 1 Cloud service providers. Some vendors have added an additional layer of performance by integrating accelerated compute servers with Smart NICs, or Data Processing Units (DPUs). For instance, Nvidia’s DGX system can be configured with two Bluefield-2 DPUs to facilitate packet processing of large datasets and provide multi-tenant isolation.
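To put these speeds in perspective, the sketch below estimates how long it would take to move a hypothetical 10 TB training dataset at different link rates, ignoring protocol overhead, congestion, and storage bottlenecks; the dataset size is an arbitrary assumption for illustration.

```python
# Ideal transfer time for a hypothetical 10 TB training dataset at various NIC speeds.
# Ignores protocol overhead, congestion, and storage limits; illustration only.

DATASET_TB = 10  # assumed dataset size

def transfer_minutes(link_gbps, dataset_tb=DATASET_TB):
    bits = dataset_tb * 8 * 1000**4          # TB -> bits (decimal units)
    return bits / (link_gbps * 1e9) / 60     # seconds -> minutes

for speed_gbps in (100, 200, 400):
    print(f"{speed_gbps} Gbps link: ~{transfer_minutes(speed_gbps):.1f} minutes")
# Roughly 13.3, 6.7, and 3.3 minutes, respectively: each step up in NIC speed
# directly shortens the time accelerators spend waiting on data.
```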

Rack Infrastructure

Accelerated compute servers, generally equipped with four or more GPUs, tend to be power-hungry. For example, an Nvidia DGX system with eight A100 GPUs has a maximum system power usage rated at 6.5 kW. Extra consideration is needed to ensure efficient thermal management. Today, air-based thermal management infrastructure is predominantly used. However, as rack power densities rise to support accelerated computing hardware, the efficiency limits of air cooling are being reached. Novel liquid-based thermal management solutions, including immersion cooling, are under development to further enhance the thermal efficiency of accelerated compute servers.
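A quick calculation shows why these systems push racks past typical air-cooled design points. The per-system figure is the 6.5 kW maximum cited above; the rack counts and the roughly 15 kW air-cooled comfort zone are assumptions used only for illustration.

```python
# Rack power density from accelerated compute servers (illustrative assumptions).
# 6.5 kW is the DGX A100 maximum cited above; ~15 kW per rack is used here as a
# commonly cited air-cooled design point, not a hard limit.

SYSTEM_MAX_KW = 6.5
AIR_COOLED_DESIGN_POINT_KW = 15

for systems_per_rack in (1, 2, 4):
    rack_kw = systems_per_rack * SYSTEM_MAX_KW
    status = "within" if rack_kw <= AIR_COOLED_DESIGN_POINT_KW else "beyond"
    print(f"{systems_per_rack} system(s): {rack_kw:.1f} kW per rack "
          f"({status} a ~{AIR_COOLED_DESIGN_POINT_KW} kW air-cooled design point)")
```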

These technology trends will continue to evolve and drive the commercialization of specialized hardware for AI applications. Please stay tuned for more updates from the upcoming Data Center Capex reports.