5 Kubernetes Predictions for 2023

Kubernetes uptake is growing and will continue to grow. Its continuous development is fueled by broad adoption: More companies, teams and individuals will embrace it as a platform for innovation, building new applications and scaling existing ones faster than ever before.

The most recent annual Cloud Native Computing Foundation (CNCF) survey supports the assertion that Kubernetes is growing, The State of Kubernetes 2022 report from VMware Tanzu confirms the adoption and Gartner and Forrester both point to the same wide uptake. There is no need to predict that Kubernetes adoption will continue to explode!

However, if we unpack this rapid adoption, what else do we know about the widespread uptake of Kubernetes? What do leaders need to watch so that the Kubernetes ecosystem flourishes in their organization? Here are five things to pay attention to if Kubernetes is important to your IT organization:

  1. Companies will create distinct DevOps and platform teams as knowledge areas become too broad for a single team; Kubernetes adds complexity and requires specific skills.
  2. Companies will find better ways to distribute site reliability engineering (SRE) knowledge across teams.
  3. Policy-as-code for Kubernetes will mature and gain traction. 
  4. The struggle around Kubernetes troubleshooting will partially be solved.
  5. SLIs and SLOs will be adopted by more teams and drive investment decisions.

Creation of DevOps and Platform Teams

Lately, there have been articles and tweets proclaiming that DevOps is dead. I entirely disagree: Close collaboration between various disciplines remains critical and the focus on automation and acceleration is vital to surviving in this digital era.

However, companies struggle to bring yet another big knowledge area into their teams. After CI/CD, shift-left testing, monitoring, observability and security, it is hard to also build extensive knowledge of Kubernetes and other cloud platforms. These technologies offer tremendous business and potential financial advantages, but they are complicated environments to learn and maintain.

Companies of all sizes should consider where they want to build a knowledge base about Kubernetes. Many companies pick a platform team to build and set up this expertise. A single platform team can support multiple DevOps teams. With this separation, DevOps teams continue to focus on developing and operating the (business) applications while the platform team takes care of a robust and reliable underlying platform. DevOps is not dead, but leaders should consider what level of responsibility and technology support is realistic to push into every team.

Better Ways to Distribute SRE Knowledge

If you are a Kubernetes expert on an SRE team, you might not recognize this challenge, but many teams lack the site reliability engineering expertise to optimize their Kubernetes usage. As more and more companies strive to extend knowledge sharing, new models are starting to arise. In 2023, more companies will develop best practices for spreading this knowledge internally.

The need for a more reliable landscape, better-performing applications and processes without too much waste will drive the growth of a knowledge-sharing culture.

A central SRE team might be suitable if you have mature engineering teams that are familiar with Kubernetes and other cloud technologies. If these engineering teams just need occasional guidance and direction or perhaps some support in tool selection, this model might fit your organization. 

A coach/squad model is a perfect fit if your teams are new to Kubernetes. A group of experts moves from one team to the next, helping to grow the practice and sharing knowledge along the way. Prioritize working with your most critical teams first; help them out for a few weeks or months, then move on to the next team.

Local distribution of SRE knowledge in every DevOps team is obviously the strongest approach. Enabling it takes time, but having site reliability knowledge embedded in every team that uses Kubernetes is the ideal model.

Policy-as-Code for Kubernetes Will Mature and Gain Traction

For several years the focus has been on enabling teams to become more autonomous in deploying applications to Kubernetes. Developing pipelines that can push out applications easily is now common practice in many enterprises. 

Although autonomy is a great advantage, balancing it with the need to maintain some manual control remains a challenge. With the shift to everything-as-code, an entire world of possibilities is opening up: policies defined as code can be easily validated and reviewed by following established engineering practices. Policy frameworks will therefore become more important. Within the CNCF, Open Policy Agent (OPA) is the most common policy framework. The project describes itself as follows:

OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.

As Kubernetes adoption and team autonomy continue to grow, practices like this will mature in parallel, allowing continued growth while maintaining, or even gaining, additional control. Adopting policy-as-code enables you to control how Kubernetes is used by a wide range of teams.
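
To make this concrete, here is a minimal sketch of evaluating a Rego policy with OPA's Go package (github.com/open-policy-agent/opa/rego). The policy flags containers that define no resource limits; the package name, rule name and the simplified Pod-like input are assumptions made for the example, not a prescribed setup.

    package main

    import (
        "context"
        "fmt"

        "github.com/open-policy-agent/opa/rego"
    )

    // Illustrative Rego policy: report containers that define no resource limits.
    // The package name and rule name are assumptions made for this sketch.
    const module = `
    package kubernetes.admission

    deny[msg] {
        container := input.spec.containers[_]
        not container.resources.limits
        msg := sprintf("container %v has no resource limits", [container.name])
    }
    `

    func main() {
        // A simplified, Pod-like manifest to evaluate against the policy.
        input := map[string]interface{}{
            "spec": map[string]interface{}{
                "containers": []interface{}{
                    map[string]interface{}{"name": "api"},
                },
            },
        }

        // Compile the policy and evaluate it against the input.
        rs, err := rego.New(
            rego.Query("data.kubernetes.admission.deny"),
            rego.Module("policy.rego", module),
            rego.Input(input),
        ).Eval(context.Background())
        if err != nil {
            panic(err)
        }
        for _, result := range rs {
            for _, expr := range result.Expressions {
                fmt.Println(expr.Value) // prints the policy violations, if any
            }
        }
    }

The same rules could just as well be enforced inside the cluster through an admission controller such as OPA Gatekeeper; the point is that the policies live in version control and go through review like any other code.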

Solving Kubernetes Troubleshooting

Troubleshooting applications running on a Kubernetes cluster at scale is hard, not just because Kubernetes itself is complex, but also because of the connections between so many moving parts; keeping knowledge of all of them remains an issue. Troubleshooting solutions that help teams remediate issues effectively will offer a competitive advantage.

To get the full picture, four elements need to be brought together; all four are needed to fully understand what’s wrong, remediate the issue and analyze what caused it so it can be prevented in the future.

  1. Events—Various troubleshooting solutions are event-driven; they show every change that has happened. Through this data, they can point you to where an issue is and what caused it.
  2. Logs—Many teams will use log analytics to spot warnings and ongoing issues and then try to determine what went wrong. Logs provide great insights, but they can be cumbersome to sift through.
  3. Telemetry data—With an influx of metrics and standards like OpenTelemetry growing in adoption, telemetry data is essential for troubleshooting Kubernetes. Detecting degraded service performance, or memory and disks reaching their limits, helps to resolve these issues.
  4. Trace data—Gathered through technologies such as eBPF, trace data is powerful for gaining insight into golden signals like error rate, throughput and traffic.

Solutions that bring these four elements together will help you determine more quickly what is wrong and, thus, how to remediate it. Vendors and open source frameworks will continue to drive this trend.
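
As a small illustration of the first element, the sketch below lists warning events from a cluster using the Kubernetes Go client (client-go). The kubeconfig location, namespace and field selector are assumptions for the example, not a recommended setup.

    package main

    import (
        "context"
        "fmt"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Load the default kubeconfig (typically ~/.kube/config).
        config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        clientset, err := kubernetes.NewForConfig(config)
        if err != nil {
            panic(err)
        }

        // List warning events in the "default" namespace (namespace is an assumption).
        events, err := clientset.CoreV1().Events("default").List(
            context.Background(),
            metav1.ListOptions{FieldSelector: "type=Warning"},
        )
        if err != nil {
            panic(err)
        }
        for _, e := range events.Items {
            fmt.Printf("%s %s/%s: %s\n",
                e.LastTimestamp.Format("2006-01-02 15:04:05"),
                e.InvolvedObject.Kind, e.InvolvedObject.Name, e.Message)
        }
    }

Even a simple feed like this, combined with logs, telemetry and traces, narrows down where to start looking.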

SLIs and SLOs Will Be Adopted and Drive Investment Decisions

For many years, service level indicators (SLIs) and service level objectives (SLOs) have been used to measure and track how organizations are progressing against their defined targets. So far, though, setting SLIs and SLOs has been an IT-focused exercise without much visibility to the line of business.

The connection between SLIs, SLOs and service level agreements (SLAs) is becoming more relevant and will be established with the help of tools that, at last, connect business and IT.

More importantly, SLIs and SLOs will not just be units of measurement; they will start sparking resource investment conversations, with questions like, “Which services and areas in my Kubernetes environment are lagging behind and not meeting their SLOs?” These are the areas that will require additional investment. “Which areas perform really well and require less investment? Which require the same level of investment but more experimentation?”

In 2023, teams and leaders will become more aware of the data that is hidden in these SLOs and will turn it into valuable insights to drive investment decisions.
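
As a back-of-the-envelope illustration of how an SLO becomes an investment signal, the sketch below computes an SLI from request counts, compares it to an SLO target and reports how much of the error budget has been consumed. All numbers are invented for the example.

    package main

    import "fmt"

    func main() {
        // Invented numbers for a 30-day window.
        const (
            totalRequests  = 1_000_000.0
            failedRequests = 1_200.0
            sloTarget      = 0.999 // 99.9% availability objective
        )

        sli := (totalRequests - failedRequests) / totalRequests // measured availability
        errorBudget := 1 - sloTarget                            // allowed failure ratio
        budgetUsed := (failedRequests / totalRequests) / errorBudget

        fmt.Printf("SLI: %.4f (target %.3f)\n", sli, sloTarget)
        fmt.Printf("Error budget consumed: %.0f%%\n", budgetUsed*100)
        if sli < sloTarget {
            fmt.Println("SLO breached: a candidate for additional investment")
        } else {
            fmt.Println("SLO met: room to experiment or shift investment elsewhere")
        }
    }

A service that repeatedly burns through its error budget is a natural candidate for additional investment; one that never touches it may leave room for experimentation or for shifting investment elsewhere.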

Conclusion

Kubernetes and the ecosystem around it are at an interesting moment in their growth toward maturity. With continued, or even accelerated, adoption of Kubernetes, companies, and engineering leaders in particular, need to start developing a broader view of the knowledge within their teams, the tools that facilitate growth and different ways of solving problems. If they do not, Kubernetes adoption might slow down, results may be less than optimal, engineers might leave or compliance rules might be violated. That in itself is already a challenge, but more importantly, it will affect business goals: stability might be compromised and customers might have a poor experience through your digital channels.

Andreas Prins

Andreas and his team strive to continuously improve StackState’s software, ensuring a rock-solid observability platform to support our customers’ observability needs. Andreas brings a strong mix of knowledge and experience in engineering management, product and strategy management, transformational leadership and Agile. Previously, Andreas held executive roles in engineering and product management for companies such as Mendix, digital.ai and XebiaLabs. He has a bachelor’s degree in kinematics and product design and uses his training as a lean black belt to drive organizational improvement.
