HPE’s AI and supercomputing journey continues with new Cray and Slingshot hardware


Hewlett Packard Enterprise (HPE) is once again flexing its high performance computing muscles with three new pieces of Cray Supercomputing hardware and a new Slingshot interconnect.

The launch is being tipped as an end-to-end portfolio for high performance computing setups and includes a compute blade, accelerator blade, storage system, and interconnect.

All of this is supplemented by new software that the company says will “[improve] the user experience of running compute-intensive workloads”.

AMD vs Nvidia – why not both?

The HPE Cray Supercomputing EX4252 Gen 2 Compute Blade is, according to the company, capable of delivering over 98,000 cores in a single cabinet. This, HPE claims, makes it the most powerful single-cabinet system in supercomputing.

It also contains eight 5th Gen AMD Epyc processors, offering a high density of CPUs. HPE says this allows customers to benefit from higher-performing compute without needing more space.

While the compute blade may be AMD-powered, the HPE Cray Supercomputing EX154n Accelerator Blade is all Nvidia. It features the chipmaker’s GB200 Grace Blackwell NVL4 Superchip, with each blade holding four NVLink-connected Blackwell GPUs paired with two Grace CPUs over NVIDIA NVLink-C2C.

Storage and slingshots

On top of this, the company has lifted the lid on a new piece of networking infrastructure for the supercomputing industry, the Slingshot interconnect 400.

According to HPE, this interconnect offers twice the line speed of its predecessor, with cables, network interface controllers, and switches all running at 400 Gb/sec.

It also boasts ultra-low latency thanks to automated congestion management and adaptive routing, the company claimed, adding that this will allow customers “to run large workloads with significantly less network infrastructure”.

The final piece of supercomputing hardware in this release is the HPE Cray Supercomputing Storage Systems E2000, which the company claims offers more than double the I/O performance of the Cray ClusterStor E1000 that came before it.

According to HPE’s own metrics, the E2000 is capable of 190 GB/sec read performance and 140 GB/sec write performance. The E1000, by contrast, is capable of 85 GB/sec read and 65 GB/sec write performance.
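Taking the article's figures at face value, a quick calculation bears out the "more than double" claim for both read and write throughput. A minimal sketch (the GB/sec numbers are the ones quoted above, not independent benchmarks):

```python
# Throughput figures (GB/sec) as quoted by HPE for each storage system.
e2000 = {"read": 190, "write": 140}
e1000 = {"read": 85, "write": 65}

# Compare each operation's speed-up over the previous generation.
for op in ("read", "write"):
    ratio = e2000[op] / e1000[op]
    print(f"{op}: {ratio:.2f}x")  # read comes out around 2.24x, write around 2.15x
```

Both ratios land comfortably above 2, consistent with HPE's "more than double" wording.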

New AI-focused on-prem servers

In addition to new supercomputing products, HPE has also announced two new AI-focused ProLiant Compute servers, the XD680 and XD685. Both of these servers are aimed at managed service providers and large enterprises that want to train their own AI models rather than use publicly available resources – a key element of HPE’s recent business strategy.

While they share a target audience, their specifications are quite different.

The XD680 brings a third big-name chipmaker onto the stage, featuring eight Intel Gaudi 3 AI accelerators in one compact node. HPE has pitched this server as being “optimized with price-for-performance in mind”.

It is air cooled, rather than liquid cooled like the XD685, and is designed for intense training, tuning, and inferencing workloads, the company said.

The new XD685 is designed for high performance and energy efficiency, and features a choice of either eight NVIDIA H200 SXM Tensor Core GPUs or NVIDIA Blackwell GPUs in a five rack-unit chassis.

If the name “HPE ProLiant Compute XD685” sounds familiar, that is because the company already announced a server under the same name in October 2024. That XD685 uses AMD accelerators and CPUs, however, and the new model is not a swift replacement: both pieces of hardware will remain available to purchase, depending on what a buyer needs.

Marching to a familiar beat

This latest slew of announcements, like the XD685 launch from October, is very much in line with the company’s most recent talking points. It has been playing up its prowess in HPC for the past couple of years, pointing to the number of supercomputers in the Top500 that use its hardware.

It was no slouch in capitalizing on the AI trend, either. The firm recognized that some companies would want to train their own large language models (LLMs) and generative AI systems, and that most off-the-shelf hardware available when the AI boom started wouldn’t be capable of handling that.

HPE has also given a nod to its fanless direct water cooling technology, which is more energy efficient than air cooling. This in theory also makes it greener, although water use by supercomputers and cloud data centers is an increasing ecological concern.
