What Holistic Cloud-Native Security and Observability Look Like

The rise of cloud-native architectures and containerization, along with the automation of the CI/CD pipeline, introduced fundamental changes to existing application development, deployment, and security paradigms. Because cloud-native differs so much from traditional architectures, both in how workloads are developed and in how they need to be secured, we need to completely rethink our approach to security in these environments.

Security for cloud-native applications should take a holistic approach in which security is not an isolated concern, but rather a shared responsibility. Collaboration is the name of the game here. To secure cloud-native deployments, the application, DevOps and security teams need to work together so that security happens earlier in the development cycle and is more closely tied to the development process.

Since Kubernetes is the most popular container orchestrator and many in the industry tend to associate it with cloud-native, let’s demonstrate this holistic approach by breaking it down into a framework for securing Kubernetes-native environments.

Framework

At a high level, the framework for securing cloud-native environments consists of three stages: build, deploy and runtime.

Build

In the build stage, developers write code, which is compiled, run through checks (such as pre- and post-commit checks) and committed. In Kubernetes, developers also need to build container images and push them to registries.

If an organization has sufficient build-time security, it can mitigate various security issues before they become serious vulnerabilities. Here are some build-time security requirements for cloud-native environments such as containers and Kubernetes.

  • Image scanning and securing the CI/CD pipeline: In this automated world, there needs to be a way to verify that the software developers are writing is free from known vulnerabilities. Adequate security measures need to be implemented in the CI/CD pipeline to prevent threat actors from sneaking a malicious module into your pipeline or registry inside an image (because if that happens, it will be auto-deployed). Examples of security measures include image scanning, image hashes, binary authorization, digital signing of images and private registries.
  • Securing the host device (hardening the host): There are many ways to harden your host, including minimizing the number of applications running on the device to reduce the host attack surface. For example, the host can provide resources for a large number of containers to run but not allow them to enter its namespace. That way, if an application or container is compromised, the host is not compromised along with it. The major cloud providers have their own versions of hardened operating systems, so organizations have several good options to choose from.
  • Secrets management: Once you verify that your images and your host are secure, you need to think about how to manage secrets. One thing you don’t want to do is hardcode secrets in your images. Again, there are many options for secrets management. All of the major cloud providers have good secrets management services, or you could use something independent and cloud-agnostic.
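To make the idea of image hashes concrete, here is a minimal sketch of a build-time check that compares an artifact's SHA-256 digest against a value recorded when the image was built. The image name, the allowlist structure and the function names are assumptions made for illustration. In a real registry the digest is computed over the image manifest; here arbitrary bytes stand in for it.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the familiar 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def is_trusted(image_name: str, image_bytes: bytes, allowlist: dict) -> bool:
    """Accept an image only if its digest matches the one recorded at build time.

    `allowlist` is a hypothetical mapping of image name -> expected digest.
    Unknown images are rejected rather than trusted by default.
    """
    expected = allowlist.get(image_name)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, sha256_digest(image_bytes))


# Usage with stand-in data (the registry name is hypothetical).
built = b"example-image-manifest"
allowlist = {"registry.example.internal/shop/api": sha256_digest(built)}
trusted = is_trusted("registry.example.internal/shop/api", built, allowlist)
tampered = is_trusted("registry.example.internal/shop/api", b"modified", allowlist)
```

A check like this only proves the artifact has not changed since the digest was recorded; it complements image scanning and signing rather than replacing them.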

What you build needs to be secure, and this is where the collaboration between the security, DevOps and application teams needs to happen. There is no one person responsible for securing the build stage; everyone must play their part. It’s all about checks and balances.

Deploy

The DevOps and security teams are responsible for the deploy stage, during which a cluster is set up and continuous updates are made to the application. The cluster needs to communicate with the outside world, so during this stage, teams need to think about how to configure elements inside the cluster to communicate with and access elements outside the cluster, and vice versa.

Here are some ways to configure clusters for secure communication and operation.

  • Admission controllers: Once you’ve set up a cluster, one way to control communication and access is through admission controllers (a Kubernetes concept), which let you define rules about what type of activity is allowed. Once you have a cluster deployed, you’ll have pods running in your cluster that can pull down images from your registry. Using an admission controller, you can set up rules so that certain conditions must be satisfied or the deployment will fail.
  • Pod security policies: You can create pod security policies to limit what your pod can do on the host (e.g., it cannot run as privileged or mount certain host paths). Note that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25 in favor of Pod Security Admission, but the underlying idea of restricting pod privileges is still very powerful.
  • Role-based access control (RBAC): Kubernetes attaches a notion of identity, a service account, to every pod (you can use the default or set it yourself). With that, you can set granular rules about what each pod can access inside Kubernetes (e.g., a particular pod can only have read access to specific resources). If you don’t set up access properly before deploying pods, you’ll open up your environment to malicious attacks.
  • Perimeter firewalls: Perimeter firewalls still have a role to play in the world of cloud-native. These firewalls are excellent at access control, deep packet inspection, creating security profiles and more. Their weakness is that they treat the entire cluster as one entity, which doesn’t mesh well with the dynamic nature of Kubernetes and containers. It’s difficult for a perimeter firewall to provide granular access control when workloads are changing dynamically. This is why integration with a Kubernetes-native firewall is an extremely important aspect of deploy-time security.
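The admission controller idea above can be sketched as a small validation function of the kind a validating admission webhook might run. This is an illustration under assumptions, not a complete webhook: the trusted registry prefix is hypothetical, the function inspects only `spec.containers`, and serving it over HTTPS is left out. The request and response follow the `admission.k8s.io/v1` `AdmissionReview` shape.

```python
# Hypothetical registry prefix; substitute whatever registry your organization trusts.
TRUSTED_REGISTRY = "registry.example.internal/"


def review_pod(admission_review: dict) -> dict:
    """Allow a pod only if every container uses a trusted image and is not privileged."""
    request = admission_review["request"]
    pod = request["object"]
    containers = pod.get("spec", {}).get("containers", [])

    violations = []
    for container in containers:
        image = container.get("image", "")
        if not image.startswith(TRUSTED_REGISTRY):
            violations.append(f"image {image!r} is not from {TRUSTED_REGISTRY}")
        if container.get("securityContext", {}).get("privileged", False):
            violations.append(f"container {container.get('name')!r} requests privileged mode")

    # The response must echo the request uid; a denial carries a human-readable message.
    response = {"uid": request["uid"], "allowed": not violations}
    if violations:
        response["status"] = {"message": "; ".join(violations)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


# Usage: one compliant pod and one that breaks both rules.
good_review = {"request": {"uid": "1", "object": {"spec": {"containers": [
    {"name": "api", "image": "registry.example.internal/shop/api:1.0"}]}}}}
bad_review = {"request": {"uid": "2", "object": {"spec": {"containers": [
    {"name": "miner", "image": "docker.io/unknown/miner:latest",
     "securityContext": {"privileged": True}}]}}}}
good_result = review_pod(good_review)
bad_result = review_pod(bad_review)
```

Collecting every violation before responding, rather than stopping at the first one, gives the person deploying the pod a complete list of what to fix.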

A best practice for deploy-time security is to implement a continuous review cycle and approval process to ensure the DevOps and security teams are in sync.

Runtime

Runtime security is shared between the host and the network. In this stage, we determine what is running on the host (the node), and what should be allowed to run.

There is often an assumption that somebody else will secure the network. That’s not the right way to think about it; don’t assume it’s someone else’s problem. Both the host and the network need to be secured. A simple transaction has many touchpoints—you need to figure out what is running on the host, why it’s running and its impact, and you also need to think about what is going on in the network. While Kubernetes abstracts all of this for you, you still need to be aware of what’s happening and how it can be secured.

Network Policy

At this point, your cluster is operating and you need a strategy for implementing network access control and network segmentation. This is done with network policies. You’ll also want tools that can enforce application-layer policy, because Layer 3 and Layer 4 policy is sometimes not enough; you might need more granular access control based on URLs. This is important during runtime because it gives you assurance and control.
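The difference between Layer 3/4 policy and application-layer policy can be shown with a simplified model. The sketch below is not the Kubernetes NetworkPolicy API; the `Rule` class, its fields and the label values are assumptions made for illustration. It applies a default-deny stance: traffic passes only if a rule matches the source labels and port, and, when the rule names a URL path prefix, the request path as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    source_labels: dict                 # labels the client pod must carry
    port: int                           # Layer 4: destination TCP port
    path_prefix: Optional[str] = None   # Layer 7: optional URL path restriction


def is_allowed(rules, source_labels: dict, port: int, path: Optional[str] = None) -> bool:
    """Default deny: return True only if at least one rule matches the traffic."""
    for rule in rules:
        labels_match = all(
            source_labels.get(key) == value
            for key, value in rule.source_labels.items()
        )
        if not labels_match or rule.port != port:
            continue
        # A rule without a path prefix is a pure Layer 3/4 rule.
        if rule.path_prefix is None:
            return True
        if path is not None and path.startswith(rule.path_prefix):
            return True
    return False


# Usage: the frontend may reach port 8080, but only under /api/orders.
rules = [Rule(source_labels={"app": "frontend"}, port=8080, path_prefix="/api/orders")]
orders_ok = is_allowed(rules, {"app": "frontend"}, 8080, "/api/orders/42")
admin_ok = is_allowed(rules, {"app": "frontend"}, 8080, "/admin")
```

With only labels and ports, both requests above would be indistinguishable; the path check is what separates them.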

Threat Defense

There are some more advanced concepts, such as threat defense, that can be implemented during the runtime stage. Any attack that originates from the client is a risk. If a threat actor gains access, they will typically move laterally and try to conceal their activity, at which point they can use any number of classic techniques to exploit the network. The Kubernetes ecosystem has some advanced techniques to protect against this, and many tools offer integrations with threat feeds so you can block traffic to and from known malicious addresses.
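As a rough illustration of a threat feed integration, the sketch below parses a plain-text list of addresses and CIDR ranges and checks whether a destination falls inside any of them. The feed format (one entry per line, `#` comments) and the function names are assumptions; real feeds and the tools that consume them vary. The addresses come from ranges reserved for documentation.

```python
import ipaddress


def load_feed(lines):
    """Parse feed lines into network objects, skipping blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # A bare address becomes a single-host network (/32 or /128).
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(dest_ip: str, feed) -> bool:
    """Return True if the destination address falls inside any feed entry."""
    address = ipaddress.ip_address(dest_ip)
    return any(address in network for network in feed)


# Usage with a hypothetical feed.
feed = load_feed([
    "# sample feed",
    "203.0.113.0/24",
    "198.51.100.7   # single host",
    "",
])
blocked = is_blocked("203.0.113.45", feed)
clean = is_blocked("192.0.2.1", feed)
```

In a cluster, a check like this would typically be enforced by the network policy engine rather than by application code, with the feed refreshed on a schedule.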

Why Discuss Security and Observability Together?

Observability is more than just collecting logs for every entity in your environment. Yes, you can do that, but what you really need is context. The big challenge when trying to secure cloud-native workloads is the sheer amount of data inside your cluster. It’s not a prudent approach to look at every byte and every packet. This is where observability needs security. You need to have a platform like Kubernetes where you can build observability in a way that benefits you. 

You want an observability system that observes your cluster and identifies anomalies so that you can then take those anomalies and automate a process for inspecting them (this is generally achieved with techniques like simple baselining). Once your observability system pinpoints an area of the cluster that requires investigation, you can focus your security measures efficiently.
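Simple baselining can be sketched in a few lines. The example below assumes a hypothetical metric (connections per minute per workload) and flags any workload whose current value is more than a chosen number of standard deviations away from its own history. The workload names, the sample values and the threshold of 3.0 are illustrative assumptions, not recommendations.

```python
import statistics


def flag_anomalies(history: dict, current: dict, threshold: float = 3.0) -> list:
    """Return workloads whose current value deviates strongly from their baseline.

    `history` maps workload -> past samples (at least two per workload);
    `current` maps workload -> the latest observation.
    """
    flagged = []
    for workload, samples in history.items():
        value = current.get(workload)
        if value is None:
            continue  # nothing observed for this workload yet
        mean = statistics.mean(samples)
        stdev = statistics.stdev(samples)
        if stdev == 0:
            # A perfectly flat baseline: any change at all is unusual.
            is_anomaly = value != mean
        else:
            is_anomaly = abs(value - mean) / stdev > threshold
        if is_anomaly:
            flagged.append(workload)
    return flagged


# Usage: "cart" suddenly opens far more connections than usual.
history = {
    "checkout": [100, 102, 98, 101, 99],
    "cart": [50, 52, 49, 51, 48],
}
current = {"checkout": 101, "cart": 400}
flagged = flag_anomalies(history, current)
```

The flagged list is where deeper inspection, such as packet capture or tighter policy, can be focused instead of being applied to the whole cluster.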

When it comes to security and observability for cloud-native workloads, a collaborative mindset and a holistic approach are needed. I’ve offered a framework for getting started to show what this concept looks like; your organization will need to define its own unique requirements and work across teams to implement a holistic approach that works for you.

Brendan Creane

Brendan Creane is Head of Engineering at Tigera, where he is responsible for all engineering operations, including Calico Cloud, Calico Enterprise and Project Calico. Brendan has several decades of experience building enterprise security, observability, and networking products.
