Kubernetes Misconfigurations Are Your Worst Enemy

Misconfigurations happen all the time, and they are now regarded as one of the biggest and most difficult problems in the world of Kubernetes. Security, efficiency and reliability are the three high-level concerns inextricably linked to successful Kubernetes deployments, and they are also the three areas most affected by erroneous configurations. When they are not properly addressed through configuration best practices, cost optimization, performance and user experience all suffer.

The most direct way to address these areas is to get Kubernetes configuration right. Practitioners need to run many different types of checks against their workloads to ensure they are secure, reliable, resource efficient and cost effective. This is why scanning all configurations is essential to running happy Kubernetes clusters.

Why Configuration Counts

The reality is that not all organizations have found their footing in the configuration department. In fact, many are not even halfway there. For example, a recent benchmark report tells us only 35% of organizations have correctly configured most (>90%) of their workloads with liveness and readiness probes. These probes give Kubernetes a way to understand whether an application is alive and ready to serve traffic, and to take remedial action if not. Without them, Kubernetes cannot tell when a running application has hung or is not yet ready for traffic, which can lead to serious reliability problems that trace back to nothing more than a misconfiguration.
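To make the probe check concrete, here is a minimal Python sketch of how a scanner might flag containers that lack liveness or readiness probes. It assumes the manifest has already been parsed into a dictionary (for example, from YAML). `livenessProbe` and `readinessProbe` are the Kubernetes field names; the function name `find_missing_probes` and the sample Deployment are hypothetical.

```python
# Illustrative sketch: flag containers in a Deployment-style manifest
# that do not declare liveness or readiness probes.
# Assumption: the manifest is already parsed into a plain dict.

PROBE_FIELDS = ("livenessProbe", "readinessProbe")


def find_missing_probes(manifest):
    """Return (container_name, missing_field) pairs for a workload manifest."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    findings = []
    for container in pod_spec.get("containers", []):
        for field in PROBE_FIELDS:
            if field not in container:
                findings.append((container.get("name", "<unnamed>"), field))
    return findings


# Hypothetical example workload: "web" declares both probes, "worker" neither.
example_deployment = {
    "kind": "Deployment",
    "metadata": {"name": "shop"},
    "spec": {"template": {"spec": {"containers": [
        {
            "name": "web",
            "image": "example/web:1.0",
            "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
            "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080}},
        },
        {"name": "worker", "image": "example/worker:1.0"},
    ]}}},
}

for name, field in find_missing_probes(example_deployment):
    print(f"container {name!r} is missing {field}")
```

Run against the sample, the sketch reports both probes as missing for `worker` and nothing for `web`. It only checks that a probe is declared, not that the probe is well chosen.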

Sure, Kubernetes can automatically scale resources up or down in response to varying workload demands, a capability now considered one of the platform's primary features. A less well-known fact is that the CPU and memory available to each container are governed by configuration settings of their own: resource requests and limits. The scheduler places pods based on requests, and limits cap what a container may consume, so these settings constrain what autoscaling can achieve. Values set too low leave workloads underprovisioned and cause performance issues, while values set too high lead to overprovisioning, with potentially dramatic inefficiency and cost overruns.
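A check for missing requests and limits could look something like the following Python sketch. The `resources`, `requests` and `limits` fields are part of the Kubernetes container spec. The policy that all four values must be present is an assumption made for illustration, and `find_missing_resources` is a hypothetical name.

```python
# Illustrative sketch: report which CPU/memory requests and limits a
# container leaves unset. Assumption: the policy requires all four values.

REQUIRED_RESOURCES = {
    "requests": ("cpu", "memory"),
    "limits": ("cpu", "memory"),
}


def find_missing_resources(container):
    """Return names such as 'limits.cpu' for each unset resource value."""
    resources = container.get("resources", {})
    missing = []
    for section, keys in REQUIRED_RESOURCES.items():
        declared = resources.get(section, {})
        for key in keys:
            if key not in declared:
                missing.append(f"{section}.{key}")
    return missing


# Hypothetical containers: one partially configured, one not configured at all.
partly_configured = {
    "name": "api",
    "resources": {
        "requests": {"cpu": "250m", "memory": "256Mi"},
        "limits": {"memory": "512Mi"},
    },
}
unconfigured = {"name": "batch"}

print(find_missing_resources(partly_configured))
print(find_missing_resources(unconfigured))
```

Some teams deliberately leave CPU limits unset, so a real policy may want to treat `limits.cpu` as a warning and not a failure. The sketch also says nothing about whether the declared values are sensible, which requires comparing them with actual usage.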

Moreover, even minor misconfigurations can open major security holes if not found and addressed in a timely manner. For instance, containers running with more permissions than they need, such as root-level access, have become a common vulnerability. Under certain configurations, containers may be able to escalate their own privileges. And because the settings that prevent this are not applied by default, they must be established explicitly, typically by security teams.
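As a rough illustration, the Python sketch below inspects a container's `securityContext` for the settings discussed above. `allowPrivilegeEscalation`, `runAsNonRoot` and `privileged` are Kubernetes fields; the function name `find_security_issues`, the wording of the findings and the sample containers are hypothetical.

```python
# Illustrative sketch: flag containers that have not been explicitly
# locked down. Assumption: the policy requires the restrictive values
# to be set on the container itself, since they are not the defaults.


def find_security_issues(container):
    """Return a list of human-readable findings for one container."""
    ctx = container.get("securityContext", {})
    issues = []
    if ctx.get("allowPrivilegeEscalation") is not False:
        issues.append("allowPrivilegeEscalation is not explicitly false")
    if ctx.get("runAsNonRoot") is not True:
        issues.append("runAsNonRoot is not explicitly true")
    if ctx.get("privileged") is True:
        issues.append("container runs in privileged mode")
    return issues


# Hypothetical containers: one hardened, one left at its defaults.
hardened = {
    "name": "web",
    "securityContext": {
        "allowPrivilegeEscalation": False,
        "runAsNonRoot": True,
    },
}
default_settings = {"name": "legacy"}

print(find_security_issues(hardened))
print(find_security_issues(default_settings))
```

The sketch looks only at the container-level `securityContext`. `runAsNonRoot` can also be set at the pod level, so a fuller check would need to take the pod spec into account before reporting that finding.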

Again, the benchmark report tells us only about 42% of organizations manage to lock down the majority of their workloads, while some 54% leave more than half their workloads open to privilege escalation, and thus to security holes. Configuration problems become increasingly painful over time because they consume considerable resources to fix. What at first feels like a few little issues quickly morphs into full-blown Kubernetes chaos in the form of security vulnerabilities, wasted resources and reliability concerns.

How IaC Scanning Helps

The beauty of Kubernetes is its customizability, but that same flexibility can cause risk, downtime or wasted resources. The bottom line is that proper Kubernetes configuration is vital to the success of cloud-native adoption. Without it, improving application security, reliability and efficiency is basically impossible.

Configuration validation, also known in the industry as infrastructure-as-code (IaC) scanning, may be manageable by hand for a small team with one or two Kubernetes clusters. The problem becomes increasingly challenging as organizations scale, with numerous development teams deploying to multiple clusters. DevOps teams, along with platform and security leaders, can quickly lose visibility into and control over what is actually happening. This reality points to the need for automation and policies that enforce consistency and provide appropriate Kubernetes guardrails across the organization.

Misconfigurations Minimized

Minimizing misconfigurations takes a multi-faceted approach. Large organizations will find it nearly impossible to manually check each security configuration and assess its risk. Because Kubernetes defaults tend to be open and insecure, it is important to avoid relying on them until all security implications, and their impact on the organization's overall risk tolerance, are clearly understood.

Helpful guidance and a useful framework for hardening an environment can be found in objective, consensus-driven security guidelines for Kubernetes, such as the CIS Benchmark. When these best practices are paired with risk-based policies integrated into the CI/CD pipeline, container security improves. Commits or builds that do not meet minimum security requirements can be halted, which provides guardrails for developers.
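The idea of halting a build on a policy violation can be sketched in a few lines of Python. This is a toy gate under stated assumptions, not how any particular scanner or CI system works: the two rules in `check_container` are a hypothetical minimum policy, the manifests are assumed to be already parsed into dictionaries, and the names `check_container` and `gate` are made up for illustration.

```python
# Illustrative sketch of a CI gate: scan workload manifests and return
# a non-zero exit code when any container violates the policy.
# Assumption: the rules below stand in for an organization's real policy.


def check_container(container):
    """Apply a small, hypothetical minimum policy to one container."""
    violations = []
    ctx = container.get("securityContext", {})
    if ctx.get("allowPrivilegeEscalation") is not False:
        violations.append("privilege escalation not disabled")
    if "resources" not in container:
        violations.append("no resource requests or limits")
    return violations


def gate(manifests):
    """Return 0 if every container passes the policy, 1 otherwise."""
    failed = False
    for manifest in manifests:
        workload = manifest.get("metadata", {}).get("name", "<unnamed>")
        pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
        for container in pod_spec.get("containers", []):
            for violation in check_container(container):
                failed = True
                name = container.get("name", "<unnamed>")
                print(f"{workload}/{name}: {violation}")
    return 1 if failed else 0


# Hypothetical manifest that violates both rules.
example_manifests = [
    {
        "metadata": {"name": "checkout"},
        "spec": {"template": {"spec": {"containers": [
            {"name": "app", "image": "example/app:1.0"},
        ]}}},
    },
]

exit_code = gate(example_manifests)
print("exit code:", exit_code)
```

In a pipeline step, the returned value would typically be passed to `sys.exit()` so that a non-zero code fails the build. Printing each violation with the workload and container name gives developers the feedback they need to fix the manifest before retrying.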

Protecting Kubernetes clusters and workloads at runtime to ensure security, efficiency and reliability demands a multi-pronged, defense-in-depth approach. Part of this solution comes from finding a SaaS orchestration platform that can establish effective governance, streamline development and operations and provide a better (and safer) user experience. Because misconfigurations are so common, a stable, reliable and secure cluster only happens when industry best practices are followed. And this level of governance only comes through a trusted partner, well-versed in the process of unifying teams, simplifying complexity and building on Kubernetes expertise to save time, reduce risk and configure with confidence.

Danielle Cook

Danielle Cook is the vice president of marketing at Fairwinds, a Kubernetes governance and security company.
