Red Hat and NVIDIA this week announced a joint effort under which AI applications can be built using containers and then deployed on instances of Red Hat OpenShift running on supercomputers powered by NVIDIA graphics processing units (GPUs).
Because the Red Hat OpenShift platform is based on an instance of Kubernetes, AI applications hosted on it can be packaged as containers, which makes them far more manageable. Instead of trying to maintain a massive monolithic AI application that is unwieldy to update, organizations will be able to update individual components of the AI application as they see fit.
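As a rough sketch of what that component-level updating looks like in practice, the fragment below shows a Kubernetes Deployment for a single, hypothetical model-serving piece of a larger AI application (the image name, registry, and labels are all illustrative, not from the announcement). Bumping the image tag triggers a rolling update of just this component, leaving the rest of the application untouched:

```yaml
# Hypothetical Deployment for one component of a larger AI application.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fraud-model
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fraud-model
  template:
    metadata:
      labels:
        app: fraud-model
    spec:
      containers:
      - name: serving
        # Changing this tag rolls out a new version of only this component
        image: registry.example.com/models/fraud-model:v2
        resources:
          limits:
            nvidia.com/gpu: 1   # request a GPU via the NVIDIA device plugin
```

The `nvidia.com/gpu` resource request is how Kubernetes (and therefore OpenShift) schedules a container onto a node with an available NVIDIA GPU.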
The announcement was made at the NVIDIA GPU Technology Conference.
Ron Pacheco, director of product management for Red Hat Enterprise Linux (RHEL), says the first step toward achieving this goal is deploying RHEL on NVIDIA DGX-1 hardware systems. After that, Red Hat and NVIDIA have pledged to make NVIDIA GPU Cloud (NGC) containers available on Red Hat OpenShift. NVIDIA already makes extensive use of Docker containers to make various libraries and applications available to developers of AI applications.
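Those NGC images are distributed as standard Docker containers. As an illustrative sketch only (the exact image tags change with each release, and pulling from NGC requires a free account whose API key serves as the registry password), fetching and running one looks roughly like this:

```shell
# Log in to the NGC registry; the literal username is '$oauthtoken'
# and the password is your NGC API key.
docker login nvcr.io -u '$oauthtoken'

# Pull a GPU-enabled framework image from NGC (tag left as a placeholder)
docker pull nvcr.io/nvidia/tensorflow:<tag>

# Run it with GPU access via the nvidia-docker runtime
nvidia-docker run -it --rm nvcr.io/nvidia/tensorflow:<tag>
```

These commands assume a host with Docker, the NVIDIA drivers and the nvidia-docker runtime already installed; under the joint effort, the same images would instead be scheduled onto GPU nodes by OpenShift.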
The two companies also reiterated a commitment to collaborate on heterogeneous memory management (HMM). The goal is to create a kernel feature that allows devices to access and mirror the contents of a system’s memory into their own, which would make GPUs more efficient and boost the performance of any application deployed on NVIDIA GPUs.
NVIDIA, of course, is not the only vendor building supercomputer hardware around its processors. Eventually, other manufacturers of supercomputers based on NVIDIA GPUs should also be able to run Red Hat OpenShift. In addition, cloud service providers such as Amazon Web Services (AWS) and Microsoft make GPUs available as a cloud service, and Red Hat already has existing relationships with both leading cloud service providers.
It’s not yet clear where and how AI applications will be deployed across the enterprise. Organizations building AI applications typically rely on GPUs to train their AI models, because the parallelism inherent in those processors provides a more efficient approach than x86 processors. When it comes to deploying the inference engines required to run AI applications, however, most of those workloads run on x86 processors. NVIDIA has launched a series of processors specifically optimized for inference engines as part of an effort to supplant Intel entirely, and it envisions that many of those processors will be deployed at the network edge in support of 5G and internet of things (IoT) applications.
While most organizations are committed to employing some combination of machine and deep learning algorithms to build AI applications, very few are anywhere near deploying them at scale in a production environment. What many of them are already discovering is that AI applications are not so much built as trained. And because each AI application is made up of a series of models that need to be updated and replaced over time, all the best DevOps practices that organizations are adopting today will carry forward to AI applications.