Rafay Systems Provides Visibility Into GPUs on K8s Clusters

Rafay Systems has added a dashboard to its platform for managing Kubernetes clusters that gives IT teams more visibility into how graphics processing unit (GPU) resources are being consumed.

Mohan Atreya, senior vice president of product and solutions at Rafay Systems, says the Rafay Kubernetes Operations Platform (KOP) now features an integrated GPU Resource Dashboard that surfaces GPU metrics.

Many organizations are far more sensitive to optimizing GPU utilization than they are with traditional processors because GPUs cost significantly more to acquire. IT leaders want to be able to reassign GPU resources to teams that are actively using them rather than letting them sit idle, notes Atreya.
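Part of the reason idle GPUs are hard to reclaim is that Kubernetes exposes them as extended resources that pods request in whole units, so a device stays bound to a workload for as long as that pod exists. A minimal sketch, assuming the NVIDIA device plugin is installed on the cluster (the pod name, image and entry point below are purely illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: model-training-job        # hypothetical workload name
spec:
  restartPolicy: Never
  containers:
  - name: trainer
    image: nvcr.io/nvidia/pytorch:24.01-py3   # illustrative CUDA-enabled image
    command: ["python", "train.py"]           # hypothetical entry point
    resources:
      limits:
        nvidia.com/gpu: 1   # requested in whole units; the GPU stays allocated until the pod terminates
```

A dashboard that correlates these allocations with actual utilization is what lets IT leaders spot GPUs that are claimed but sitting idle.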

In addition to running graphics-intensive applications, GPUs are also being widely used to train artificial intelligence (AI) models. In some instances, they are also employed to run the inference engines that apply the rules defined in those models.

KOP does not yet provide the ability to slice GPU resources in a way that would make it easier for multiple applications to share them. There may be ways to achieve that goal, but they are not generally supported by GPU providers, notes Atreya. GPUs may one day become much less expensive than they are today, but for the foreseeable future, optimizing GPU resources is likely to remain a high priority for any organization that runs them within a Kubernetes cluster.
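For context, one example of the kind of GPU sharing alluded to here is the time-slicing option in NVIDIA's Kubernetes device plugin, which advertises each physical GPU as several schedulable resources without isolating them from one another. A minimal sketch, assuming that plugin is deployed and pointed at this configuration (the ConfigMap name and replica count are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nvidia-device-plugin-config   # hypothetical name; the plugin must be configured to read it
  namespace: kube-system
data:
  config.yaml: |
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4   # each physical GPU is advertised as four shareable nvidia.com/gpu units
```

Because time-sliced workloads share memory and compute without isolation, this sort of approach tends to suit bursty inference work better than sustained training jobs.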

GPU Resource Dashboard in the Rafay Kubernetes Operations Platform.

In general, IT teams are looking for tools that allow them to manage individual clusters on a granular basis as well as fleets of Kubernetes clusters at higher levels of abstraction. One of the reasons Kubernetes has not been as widely adopted as it might be is that IT teams don't perceive clusters to be as simple to manage as virtual machines. As a result, individual DevOps teams drive the adoption of Kubernetes on a project-by-project basis, and it's only once a critical mass of clusters has accumulated that IT operations teams are asked to become more involved.

The challenge IT operations teams encounter is that they typically lack the programming expertise a DevOps team has. They require a set of graphical tools that enables them to manage and govern Kubernetes clusters without having to understand all the underlying YAML files that would otherwise be required.
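To give a sense of the kind of YAML such tools abstract away, a namespace-level quota that caps how many GPUs one team can claim might look like the following sketch (the namespace and limit are hypothetical, and the syntax assumes GPUs are exposed as the nvidia.com/gpu extended resource):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota        # hypothetical quota name
  namespace: ml-team     # hypothetical team namespace
spec:
  hard:
    requests.nvidia.com/gpu: "4"   # the namespace can request at most four GPUs in total
```

A graphical console that presents the same policy as a form is easier for an operations team to audit than a repository full of manifests like this one.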

Over time, most IT organizations will employ a mix of graphical tools and application programming interfaces (APIs) to manage Kubernetes clusters. Even the most experienced DevOps professionals will sometimes find it more expedient to use a graphical tool to accomplish a simple task.

Regardless of how fleets of Kubernetes clusters are managed, the one inescapable fact is that a lot more of them are showing up in enterprise IT environments. The challenge now is determining the best way to manage them all in a manner that enables members of IT teams to collaborate more easily, regardless of their skill level.

Mike Vizard

Mike Vizard is a seasoned IT journalist with over 25 years of experience. He also contributed to IT Business Edge, Channel Insider, Baseline and a variety of other IT titles. Previously, Vizard was the editorial director for Ziff-Davis Enterprise as well as Editor-in-Chief for CRN and InfoWorld.
