4 Reasons to Shift Day 2 Operations Left With Kubernetes

Rapid application growth and increased production deployments at scale have pushed many enterprises to maintain a focus not just on the application development life cycle but also on ‘Day 2’ operations and the challenges of applications and services in production. These include data management, security and observability.

Although Kubernetes’ capacity for data replication and portability can enhance a system’s reliability, it doesn’t protect developers and operators against infrastructure failures, data corruption, data loss, or worse, ransomware. In fact, if there is a coding error that accidentally leads to a deleted database, that error will be faithfully replicated along with everything around it, leading to further data loss. If a Kubernetes deployment is hit with a ransomware attack, the worst-case scenario could be a complete organizational shutdown.

Without a separate and appropriate data management system in place, an enterprise’s high-value data—and the applications and services that rely on it—can remain exposed, creating widespread potential for business risk. Further, Kubernetes-based systems have become sitting ducks for cyberattacks as exposed applications and environments become increasingly common. Overpermissioning, known software vulnerabilities and skipped patches and updates create Day 2 security gaps that leave users exposed.
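To make the overpermissioning risk concrete, here is a minimal Python sketch, not a complete audit tool, that flags wildcard grants in a Kubernetes RBAC Role or ClusterRole manifest. The manifest is represented as a plain dictionary, and the role name `ci-deployer` and the function name `find_wildcard_rules` are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from typing import Any


def find_wildcard_rules(role: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the rules of a Role/ClusterRole manifest that contain a '*' grant."""
    flagged = []
    for rule in role.get("rules") or []:
        # A '*' in any of these fields grants far more than most workloads need.
        fields = ("apiGroups", "resources", "verbs")
        if any("*" in (rule.get(field) or []) for field in fields):
            flagged.append(rule)
    return flagged


# Hypothetical manifest, loaded as a dictionary (for example, from YAML).
example_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "ci-deployer"},
    "rules": [
        {"apiGroups": ["apps"], "resources": ["deployments"],
         "verbs": ["get", "list", "update"]},
        {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
    ],
}

overly_broad = find_wildcard_rules(example_role)
```

A check like this could run in a CI pipeline before a manifest is applied, which is one small way of moving a Day 2 security concern earlier in the life cycle.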

This is why ensuring the continuity of Day 2 operations is so critical, and why shifting data management, security and observability left in the development life cycle is increasingly important. One of the most effective ways to safeguard against these Day 2 risks is to deploy backup capabilities that integrate effectively with Kubernetes environments. Systems that can automatically discover new and changed applications, without forcing developers to change workflows or tooling, are a major value-add. In Kubernetes environments, this is possible through native APIs that support security mechanisms like authentication and authorization as well as CI/CD and workflow integrations.
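The idea of automatically discovering new and changed applications can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not how any particular backup product works: it assumes the workload specs have already been fetched from the Kubernetes API and are passed in as dictionaries, and the names `fingerprint` and `diff_inventory` are hypothetical.

```python
from __future__ import annotations

import hashlib
import json


def fingerprint(spec: dict) -> str:
    """Stable hash of a workload spec, independent of key order."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_inventory(previous: dict[str, str],
                   workloads: dict[str, dict]) -> dict[str, list[str]]:
    """Compare the last known fingerprints with the current workload specs."""
    current = {name: fingerprint(spec) for name, spec in workloads.items()}
    return {
        "new": sorted(n for n in current if n not in previous),
        "changed": sorted(n for n in current
                          if n in previous and current[n] != previous[n]),
        "removed": sorted(n for n in previous if n not in current),
    }


# Hypothetical inventory from the previous discovery run.
previous = {
    "shop/cart": fingerprint({"image": "cart:1"}),
    "shop/db": fingerprint({"image": "postgres:15"}),
}
# Hypothetical current state of the cluster.
workloads = {
    "shop/cart": {"image": "cart:2"},
    "shop/db": {"image": "postgres:15"},
    "shop/search": {"image": "search:1"},
}

report = diff_inventory(previous, workloads)
```

Because the comparison runs against what is actually deployed, developers do not have to register applications with the backup system by hand.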

In the cloud-native, Kubernetes-orchestrated world, the traditional development life cycle is archaic. Now that applications are built, deployed and operated continuously, functions that used to be reserved for the operational phase must be accounted for much earlier in the process. There are several compelling reasons why.

Number One: Cloud and Microservices Need Continuous Delivery

(Figure 1 – The Continuous DevOps Cycle)

Software development has become largely atomized: an application is now assembled from a fine-grained collection of loosely coupled microservices rather than built as a monolith. The traditional development sequence is usually pictured as discrete stages running from left to right, from beginning to end, at which point the project is considered done. With the adoption of DevOps, however, the process today is more cyclical and infinite than linear and finite. Continuous delivery of updated applications, augmented by automation that helps developers keep ahead of those changes, is widely understood to drive continuous improvement in today’s digital enterprises. For that reason, the automation mechanics of the continuous delivery pipeline also need to shift left.

Cloud-native applications built using containers and Kubernetes as the application infrastructure are significantly more developer-centric. Since developers have more control across the entire application stack, as well as the underlying infrastructure, it is natural that they should have equal control over, or at least input into, traditional Day 2 operational requirements from the beginning of the development process.

Number Two: Better Integration with Modern Environments

Legacy tooling simply does not work in modern, developer-centric environments. For example, VM-based backup is not functional for Kubernetes environments, given that there is no mapping of containerized applications to servers or virtual machines. Instead, Kubernetes uses dynamic placement policies, automatic rescheduling, and autoscaling. These legacy systems also cannot scale down to where development happens (e.g., a laptop).
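The mismatch between server-based and application-based backup scoping can be illustrated with a small Python sketch. It assumes pods are represented as plain dictionaries, and the `backup: daily` label, the pod and node names, and the function names are all hypothetical. Selecting by label keeps the backup scope stable when pods are rescheduled, while selecting by node does not.

```python
from __future__ import annotations


def select_by_labels(pods: list[dict], selector: dict[str, str]) -> list[dict]:
    """Equality-based label matching, similar in spirit to a Kubernetes selector."""
    return [p for p in pods
            if all(p["labels"].get(k) == v for k, v in selector.items())]


def select_by_node(pods: list[dict], node: str) -> list[dict]:
    """What a server- or VM-centric tool would effectively do."""
    return [p for p in pods if p["node"] == node]


def apps(pods: list[dict]) -> list[str]:
    """Names of the applications covered by a selection."""
    return sorted({p["labels"]["app"] for p in pods})


# Hypothetical cluster state before and after the scheduler moves pods around.
before = [
    {"name": "cart-aaa", "node": "node-a", "labels": {"app": "cart", "backup": "daily"}},
    {"name": "orders-db-0", "node": "node-a", "labels": {"app": "orders-db", "backup": "daily"}},
    {"name": "web-xyz", "node": "node-b", "labels": {"app": "web"}},
]
after = [
    {"name": "cart-bbb", "node": "node-c", "labels": {"app": "cart", "backup": "daily"}},
    {"name": "orders-db-0", "node": "node-b", "labels": {"app": "orders-db", "backup": "daily"}},
    {"name": "web-qrs", "node": "node-a", "labels": {"app": "web"}},
]

selector = {"backup": "daily"}
by_label_before = apps(select_by_labels(before, selector))
by_label_after = apps(select_by_labels(after, selector))
by_node_before = apps(select_by_node(before, "node-a"))
by_node_after = apps(select_by_node(after, "node-a"))
```

In this example the label-based scope covers the same two applications before and after rescheduling, whereas a backup tied to `node-a` would silently switch to protecting a different workload.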

Kubernetes-native tools that auto-scale with an environment are needed to shift traditional Day 2 processes further to the left. Kubernetes-native platforms that account for secure multi-tenancy, role-based access control and self-service will be needed to provide capabilities for backup, application migration, security, compliance and other essential functions. This is particularly true in inherently complex enterprise environments where multi-cloud and hybrid cloud strategies are common.

With the advent of tightly integrated, Kubernetes-native data management platforms, integration with data management functions for cloud-native applications will require minimal development effort and improve tooling support for developer and ops teams. The same integration trend is being observed with other Day 2 functions such as security and observability. When considered together, the accelerated speed of cloud-native development alongside the expanding complexity of cloud-native systems is driving the urgency to adopt critical operational functions early in the development life cycle to ensure the protection of key enterprise resources and potentially sensitive data in production.

Number Three: Protection from Growing Security Threats and Ransomware

Security issues facing Kubernetes users are disrupting critical operations at an increasing rate, especially as ransomware attacks targeting Kubernetes environments ramp up. This is a serious issue. Preparing for it early in the development process, with data protection capabilities like backup and disaster recovery, helps application-centric organizations limit the damage.

Optimally, organizations will be prepared to prevent ransomware attacks altogether, but it is just as important to be prepared to mitigate the damage and recover from an attack. An attack on a Kubernetes cluster can stem from something as “simple” as an overlooked, unauthenticated endpoint or an unpatched vulnerability. In the worst-case scenario of a successful attack, fast restores are essential to protecting sensitive data from being exploited and allowing enterprises to return to normal business operations quickly.

Development teams need to be sure they have the proper tools for backup and recovery functions. Automation is key, and these tools should integrate easily into existing development workflows. Enabling immutability, creating backups with unique code paths, protecting backups for maximum effectiveness, and enabling seamless restores are essential parts of a Kubernetes data protection strategy that can protect organizations from cyberattacks and ransomware.
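As a rough illustration of why immutability and fast restores matter together, the Python sketch below models backups with a fixed lock window. It is a simplified model under stated assumptions, not a real backup API: the `Backup` class, the seven-day lock and the function names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Backup:
    name: str
    created_at: datetime
    lock_days: int  # assumed immutability window, in days

    @property
    def locked_until(self) -> datetime:
        return self.created_at + timedelta(days=self.lock_days)


def can_delete(backup: Backup, now: datetime) -> bool:
    """A locked backup cannot be deleted, even by a compromised account."""
    return now >= backup.locked_until


def latest_clean_backup(backups: list[Backup],
                        compromised_at: datetime) -> Optional[Backup]:
    """Most recent backup taken before the (assumed known) time of compromise."""
    clean = [b for b in backups if b.created_at < compromised_at]
    return max(clean, key=lambda b: b.created_at, default=None)


utc = timezone.utc
backups = [
    Backup("nightly-01", datetime(2024, 1, 1, tzinfo=utc), lock_days=7),
    Backup("nightly-02", datetime(2024, 1, 2, tzinfo=utc), lock_days=7),
    Backup("nightly-03", datetime(2024, 1, 3, tzinfo=utc), lock_days=7),
]

attack_time = datetime(2024, 1, 2, 12, tzinfo=utc)
restore_point = latest_clean_backup(backups, attack_time)
```

Here the restore point is the last backup taken before the attack, and the lock window means an attacker who gains access on day two cannot simply delete it.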

Number Four: Low Friction Workflows for Developers

For data management to shift left in a way that accommodates developers’ workflows, it needs to offer a low-friction path into the application development process as soon as, or even before, that process begins. This is another prime use case for automation. Automating this integration can help achieve several critical goals at once: addressing the chronic shortage of skilled personnel, increasing the company’s agility, improving product quality, elevating the customer experience, improving the developer experience (DX) and developer happiness, and creating a culture of continuous improvement. It can also nudge an organization along the path toward full digital transformation.

Holistically, the earlier in the development life cycle that functions like security, compliance and data management can be integrated into processes, the more effective they can be in ensuring the proper protection and performance of code and applications. Kubernetes has ushered in a paradigm shift wherein businesses need to rethink how they approach IT to be more application- and service-centric. But this thinking needs to extend beyond the development phase to push critical Day 2 operations further into the developers’ domain to optimize the benefit of cloud-native technologies and ward off downstream risks.

Michael Cade

Michael Cade is a community-first technologist for Kasten by Veeam Software. He is based in the UK and has over 16 years of industry experience, with a key focus on technologies such as cloud native, automation and data management. His role at Kasten is to act as a technical thought leader, community champion and project owner, engaging with the community to help influencers and customers overcome the challenges of cloud-native data management and be successful. He speaks at events to share the technical vision and corporate strategy, and provides ongoing feedback from the field to product management to shape future success.
