Cost, Culture and Kubernetes

Webb Brown, CEO at Kubecost, co-authored this piece.

Why isn’t everyone tuning Kubernetes costs? We surmise that it boils down to the complex intersection of culture and technology. This post does not cover the engineering details of cost tuning (there are many great articles); rather, this is an article on how to support Kubernetes cost tuning in your organization.

As teams move to the cloud or run bigger workloads on Kubernetes, one thing is readily apparent: Running Kubernetes costs money. Cost savings can be one of the benefits of migrating to Kubernetes, but companies still often spend more than they have to on their Kubernetes clusters and the cloud. With the proper strategy and culture, though, saving 20% to 30% on your current Kubernetes bill is doable.

Through anecdotal conversations with industry leaders, we can make an educated guess that 20% to 30% of all cloud expenditures are waste. To put the amount of overspending in perspective, let’s look at AWS’s financial data. AWS’s annual revenue is approximately $78 billion. If 20% to 30% of that spending is waste, AWS clients waste roughly $16 billion to $23 billion annually. That is the size of the problem we are talking about, and tuning Kubernetes is one way to start saving money.
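As a rough check of the scale, the short sketch below applies the 20% to 30% waste estimate to the $78 billion figure quoted above. The function name is ours, and treating the whole figure as customer spend is a simplifying assumption made only for illustration.

```python
def waste_range(annual_spend, low=0.20, high=0.30):
    """Return the (low, high) estimated waste for a given annual spend.

    The 20% and 30% defaults are the anecdotal estimate from the article,
    not measured values.
    """
    return annual_spend * low, annual_spend * high


# Assumption: the $78 billion figure stands in for total customer spend.
low, high = waste_range(78e9)
print(f"Estimated waste: ${low / 1e9:.1f}B to ${high / 1e9:.1f}B per year")
```

The same function works for a single team’s bill: pass in your own annual Kubernetes spend to see what a 20% to 30% saving would look like in dollars.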

Kubernetes was built to produce the usage data that cost calculations depend on. It makes usage metrics available via the Kubernetes API server, but those metrics need organization. A system like Prometheus can store and organize them, but the metrics themselves are produced at a lower level within Kubernetes: Container Advisor (cAdvisor), which is built into the kubelet, gathers the metrics from the operating system, and the kubelet exposes them so they can be reached through the API server.

The cAdvisor project documentation describes it this way:

“cAdvisor provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes and exports information about running containers. Specifically, for each container, it keeps resource isolation parameters, historical resource usage, histograms of complete historical resource usage and network statistics. This data is exported by container and machine-wide.”

The book Core Kubernetes says this about the kubelet:

“At a high level, the kubelet is a binary, started by systemd. The kubelet runs on every node and is a Pod scheduler and node agent, but only for the local node. The kubelet monitors and maintains information about the server it runs on for the node.”
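To make “those metrics need organization” concrete, here is a minimal sketch that rolls container memory usage up by namespace. It assumes the metrics arrive in the Prometheus text exposition format and uses the cAdvisor metric `container_memory_working_set_bytes`; the sample payload, the namespaces and pods in it, and the function name are made up for illustration. A real setup would let Prometheus or a cost tool do this aggregation.

```python
import re
from collections import defaultdict

# Hypothetical sample payload; the pods, namespaces and values are invented.
SAMPLE = """\
# TYPE container_memory_working_set_bytes gauge
container_memory_working_set_bytes{namespace="shop",pod="web-1",container="app"} 268435456
container_memory_working_set_bytes{namespace="shop",pod="web-2",container="app"} 134217728
container_memory_working_set_bytes{namespace="batch",pod="job-1",container="worker"} 536870912
"""

# Matches lines of the form: metric_name{label="value",...} number
LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def memory_by_namespace(text):
    """Sum working-set memory (bytes) per namespace from exposition text."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        match = LINE.match(line)
        if not match or match.group("name") != "container_memory_working_set_bytes":
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        totals[labels.get("namespace", "unknown")] += float(match.group("value"))
    return dict(totals)


usage = memory_by_namespace(SAMPLE)
for namespace, used_bytes in sorted(usage.items()):
    print(f"{namespace}: {used_bytes / 2**20:.0f} MiB")
```

Grouping by namespace is only one choice; the same labels allow grouping by pod or container, which is the kind of organization a cost tool builds on.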

We won’t walk through the technical specifics of cost-tuning Kubernetes, but we do recommend that you use a tool. There are several options for tuning your cluster, including great open source projects like OpenCost that are free to use.

But you may think, “I don’t pay the bill, so why is this even my problem?”

It’s not your problem until it becomes your problem. Nobody wants to pile onto the layoffs in the engineering field, so let’s talk about the flip side: Money you save for your company is concrete data that can go on your resume, make you more marketable and even justify asking for a raise.

Now you might ask, “Why don’t we have a cost-saving culture already?”

Engineers’ primary job is creating products and keeping those products alive: writing code and keeping Kubernetes clusters up and running. Technical teams are often not incentivized to save money, lack the tools to save money or have 10 other things to work on. Engineers are not given one of the most precious resources: Time. The company culture must actively promote cost savings, reward engineers for saving money and provide them with the time to do it. People need training and an environment that supports them in saving money, as well as incentives and accountability. On top of that, engineers often overprovision clusters and applications as an extra layer of safety. This oversizing does provide more capacity to handle unexpected load, but it also creates waste.
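The waste that overprovisioning creates can be estimated from two numbers per workload: what it requests and what it actually uses. The sketch below is one simple way to do that for CPU. The workload names, the usage figures, the price per core-hour and the 730-hour month are all hypothetical values chosen for illustration, not real pricing.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730          # assumption: average hours in a month
PRICE_PER_CORE_HOUR = 0.03     # hypothetical price, not a real quote


@dataclass
class Workload:
    name: str
    cpu_requested: float  # cores reserved for the workload
    cpu_used: float       # cores actually used on average


def idle_cores(workloads):
    """Total cores that are requested but not used."""
    return sum(max(w.cpu_requested - w.cpu_used, 0.0) for w in workloads)


def monthly_idle_cost(workloads):
    """Estimated monthly cost of the idle (requested but unused) cores."""
    return idle_cores(workloads) * PRICE_PER_CORE_HOUR * HOURS_PER_MONTH


# Hypothetical workloads: one heavily oversized, one sized fairly well.
workloads = [
    Workload("web", cpu_requested=4.0, cpu_used=1.0),
    Workload("api", cpu_requested=2.0, cpu_used=1.5),
]

requested = sum(w.cpu_requested for w in workloads)
print(f"Idle cores: {idle_cores(workloads):.1f} of {requested:.1f} requested")
print(f"Estimated idle cost: ${monthly_idle_cost(workloads):.2f} per month")
```

Some headroom is deliberate and healthy, so the goal is not to drive idle capacity to zero; a number like this simply gives the team something measurable to discuss.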

We have a mismatch within most organizations. Companies are accountable to the stock market, investors or private owners. Each of these entities cares about the bottom line and margins. Waste eats margins for a living. Engineers rarely think about margin, and those who do often don’t have the time to help improve it. Cultural change can fix this problem.

We mentioned AWS previously, which is a fantastic example of a cost-saving culture. Amazon has leadership principles that the company follows. One of them is frugality.

Now, let’s talk about changing culture. Changing culture is hard, but you can start the change if the organization is ready. So how do organizations change? Business schools talk about top-down and bottom-up change. In bottom-up change, the people on the ‘bottom’ sell the change to those controlling the budgets (usually the ‘top’). In top-down change, the ‘top’, through leadership, sells the change to the ‘bottom’. Changing culture is often like changing the direction of a large cargo tanker: It’s slow, but it can be done with enough time and effort. The bottom line is that organizations waste billions of dollars annually because they don’t have a culture that encourages and teaches frugality.

Let’s focus on bottom-up change because the engineer who operates or uses Kubernetes can start it. To do so, you need to be aware that you will encounter resistance from people who don’t want change. As engineers, we may think we must create a logical argument to win people over and help change the company. The exact opposite is the case. Jim Camp has written a highly relevant blog post on this, based on neuroscientific research, called Decisions are Emotional, Not Logical. Starting technical change is more about influence and negotiation than engineering. We negotiate first to get to do the engineering we want.

The biggest emotional blocker is the feeling that eliminating waste is too hard. So, you need to show people that Kubernetes is made to save money. Show people that the right strategy and tools can get the team tuning the cluster. Change is also about momentum. Follow the excitement and the energy. The momentum may not always lead you in the direction you want, but it can eventually get you to the place you want. Also, ask quality, open-ended questions. It is incredible how asking a question can lead to a favorable negotiation. Instead of asking, “Do you want to save money on our Kubernetes cluster?” ask, “If we can save 20% on Kubernetes operating costs, how would that help our division/team?”

Be ready to hear the word “No.” When we ask to do something, we want to hear the word “Yes,” but when trying to influence change, we will hear the word “No.” But “no” is not entirely bad; it means the person talking with you is engaged. The word “no” is an excellent start to a conversation that will lead to buy-in and change.

Gamification is often one of the best ways to influence engineers to change what they do. We all like to play games and get badges. Napoleon Bonaparte once said that “a soldier will fight long and hard for a bit of colored ribbon.” Engineers can fight waste, so give them a ribbon. Making it fun is one of the best tools to build momentum.

Let’s recap how to get buy-in on change:

  • Culture change is challenging.
  • People often make decisions based on emotions and not logic.
  • Show people that saving money is possible.
  • Follow where the momentum takes you.
  • Get to a “no”—because this is the start of the conversation, not the end.
  • Gamification is a great way to make change fun.

You may wonder: If we are talking about culture and negotiation, why did we even mention Kubernetes? You want to start at a place where cost savings are all but guaranteed, and Kubernetes provides that place. To “save” money, you must measure your savings, and you need data to do it. Kubernetes has the components to produce usage data, which allows for calculating cost savings. As we mentioned before, start with something that will work. Almost everyone spends 20% to 30% too much on their Kubernetes bill.

Chris Love

Chris Love is a Google Cloud Certified hybrid multi-cloud Fellow, co-founder of LionKube and the co-author of the book Core Kubernetes. He has over 25 years of software and IT engineering experience with companies including Google, Oracle, VMware, Cisco, Johnson & Johnson and others. Chris has contributed to many open source projects, including Kubernetes, kops (former AWS SIG lead), Bazel (contributed to Kubernetes rules) and Terraform (an early contributor to the VMware plugin). His professional interests include Kubernetes, IT culture transformation, containerization technologies, automated testing frameworks and practices and DevOps. Chris also enjoys speaking around the world about DevOps, Kubernetes and technology, and mentoring people in the IT and software industry. Outside work, Chris enjoys skiing, volleyball, playing with his dogs and other outdoor activities that come with living in Colorado. If you’re interested in having virtual coffee or have questions for Chris, you can contact him at @chrislovecnm on Twitter or LinkedIn.
