An expanded Nvidia certification program now includes servers with the company’s BlueField-2 data processing units (DPUs) and those with Arm-based processors. The graphics and data acceleration processing provider is also close to rolling out its DGX SuperPOD infrastructure as a service.
The company announced the expanded certification and DGX SuperPOD as a service last week during the Computex virtual event. Both are the latest milestones in Nvidia's effort to make it easier and more affordable to run applications with artificial intelligence. Because AI requires costly and complex high-performance computing resources, it remains out of reach for most organizations.
Nvidia’s Manuvir Das at Computex 2021
Manuvir Das, Nvidia’s head of enterprise computing, gave a keynote address at Computex, where he outlined the company’s announcements.
“It is time to democratize AI by bringing its transformative power to every company and its customers,” Das said.
To accomplish that, Nvidia recently launched its BlueField-2 DPUs, software-programmable processors with high-performance networking interfaces. Nvidia gained its DPU technology with last year’s $7 billion acquisition of Mellanox. The company is also enabling servers based on Arm’s low-power CPUs.
“We believe that the opportunity for Nvidia in the enterprise market is substantial,” said Craig Weinstein, VP of Nvidia’s Americas Partner organization. “In the domain of AI, there’s a lot of consulting work that’s required, depending on the industries that we’re serving, and many of the customers in those industries need a lot of help.”
Asus, Dell Technologies, Gigabyte, QCT and Supermicro launched servers with Nvidia’s BlueField-2 DPUs. With computing increasingly driven by large volumes of data, current servers cannot process all of it, Das said.
“The flow of data to the network becomes crucial to both the capability and the security of the data center,” Das said. “A new kind of hardware is needed that sits on the data path and intelligently optimizes, inspects and protects the data, and protects the applications from one another. Every server will need a DPU.”
Nvidia’s DPUs take on the infrastructure tasks traditionally performed on CPUs, according to the company. That frees capacity on the server CPUs to run applications, which Nvidia believes is more efficient.
Overall, Nvidia has added more than a dozen certified systems partners, bringing the total to more than 50. The latest certified partners include Dell, HPE, Nettrix and Supermicro, which launched servers based on Nvidia’s HGX accelerated computing platform. The servers include either four or eight Nvidia A100 GPUs, Nvidia NVLink GPU interconnects, Nvidia InfiniBand network interfaces and the company’s AI and HPC software stack.
Certified for New Arm Servers
Nvidia also added a certification program for Arm-based CPU servers, designed to deliver high-performance computing for AI workloads. Gigabyte and Wiwynn said they’ll launch servers with Arm’s Neoverse-based CPUs and either Nvidia Ampere architecture GPUs or BlueField-2 DPUs. The servers are expected sometime next year, once Nvidia certifies them.
Gigabyte is partnering with Nvidia to create an Arm HPC Developer Kit to let programmers build AI and scientific computing applications. The kit combines an Arm CPU server with Nvidia A100 Tensor Core GPUs and the accompanying server APIs.
While approval of Nvidia’s pending $40 billion agreement to acquire Arm remains uncertain, the outcome won’t affect the new certification program, Das said.
Base Command for SuperPOD as a Service
Nvidia’s new DGX SuperPOD supercomputing infrastructure as a service will arrive this summer. To deliver that capability, Nvidia launched Base Command, which will make DGX SuperPOD available in smaller increments.
The minimum configuration of a DGX SuperPOD is a cluster of 20 DGX systems.
“The pinnacle of AI capability is the DGX SuperPOD,” Das said. “The first step to democratization is to make this best-of-breed machine more accessible and more obtainable.”
With Base Command, customers will be able to use as few as three DGX systems for as little as a few months.
Base Command, a Kubernetes-based software stack, “allows administrators to share this powerful supercomputer across an organization of data scientists, and a mix of workloads,” Das said. Like DGX POD, SuperPOD is a supercomputer that uses NetApp’s ONTAP AI data management platform, he added.
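In stock Kubernetes, this kind of GPU sharing is typically expressed through the `nvidia.com/gpu` extended resource advertised by Nvidia’s Kubernetes device plugin; the scheduler then places each workload on a node with enough free GPUs. A minimal illustrative sketch of such a request follows — the job name and container image are hypothetical placeholders, and this is a generic Kubernetes pattern, not Nvidia’s published Base Command API:

```python
import json

def gpu_job_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Kubernetes Job manifest requesting `gpus` GPUs.

    Uses the `nvidia.com/gpu` extended resource exposed by NVIDIA's
    Kubernetes device plugin; the scheduler places the pod on a node
    with that many free GPUs and isolates them for the container.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "resources": {
                            # GPUs are requested as whole-device limits.
                            "limits": {"nvidia.com/gpu": gpus},
                        },
                    }],
                    "restartPolicy": "Never",
                }
            }
        },
    }

# Hypothetical example: an 8-GPU training job.
manifest = gpu_job_manifest("train-job", "example.com/ai-training:latest", 8)
print(json.dumps(manifest, indent=2))
```

Administrators can layer namespaces and resource quotas on top of such requests to divide a shared cluster among teams of data scientists, which is the style of multi-tenancy Das describes.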
As a hosted offering, Nvidia will manage the infrastructure, which includes all-flash storage and data management from NetApp and is hosted in Equinix data centers. Nvidia is also working with AWS and Google to offer Base Command from their respective cloud services.
“This work offers the promise of a true hybrid AI experience for customers — write once, run anywhere,” Das said.
The subscription service is available now in early access, with broader availability planned for this summer. Monthly subscription pricing starts at $90,000.