The Principle of Least Privilege in Cloud-Native Applications

Modern applications require modern security. Public cloud vendors are highly motivated to ensure their platforms are not targets of security attacks that chip away at customers’ trust.

In a cloud-native application, any of many components can become compromised. If even a single component falls, the entire application is at risk unless proper security procedures are in place to contain the vulnerability.

One of the biggest challenges in cloud-native application security is the question of trust between services. Can one service of an application trust that another service won’t harm it in some way?

It is easy to fall into the trap of assuming that different services within the same application should logically be able to trust each other. But this is poor security design. Trust should never be assumed, even among services built by the same company or development team.

Instead, services should not have access to any component they do not require. Each service should have access to only the absolute minimum parts of a production system necessary to accomplish its job. And no system, application, or individual should have complete access to everything in a production system.

This policy, known as the principle of least privilege, is an important operational security requirement that all modern, cloud-native applications should adhere to.

What is the principle of least privilege? The easiest way to describe it is with an example. Imagine a simple application composed of three services: Service A, Service B, and Service C.

Two of the services communicate with each other using a queue between them. Service A inserts messages into the queue, and Service B reads and removes the messages from the queue. Service C performs some other actions unrelated to Services A and B.

In this case, Service A is responsible for pushing items onto the queue and Service B is responsible for pulling items off the queue. Therefore, there is never a time when Service A needs to read from the queue, and Service B should never need to push items onto the queue. Additionally, Service C should never need to access the queue for any reason.
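This asymmetric access pattern can be sketched in a few lines of Python. This is a hypothetical illustration, using the standard library’s in-process `queue` as a stand-in for a real managed message queue:

```python
import queue

# In-process stand-in for a managed message queue (e.g., Amazon SQS).
message_queue = queue.Queue()

def service_a_publish(q, payload):
    """Service A only ever writes to the queue."""
    q.put(payload)

def service_b_consume(q):
    """Service B only ever reads (and removes) messages from the queue."""
    return q.get()

# Service C performs unrelated work and never touches the queue at all.
service_a_publish(message_queue, {"order_id": 42})
print(service_b_consume(message_queue))  # {'order_id': 42}
```

Note that neither function needs both operations: Service A never reads, and Service B never writes.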

If this application is deployed in the cloud, then the queue could easily be implemented using a managed queuing service, such as Amazon Simple Queue Service (SQS). Even without much forethought, it makes sense to restrict the queue so only the services that make up the application can access it; you don’t want some rogue actor coming in and inserting or removing messages.

But if all services that make up the application have access to the queue, then Service C has access to the queue, even though it doesn’t require access.

This might seem perfectly acceptable. Service C is part of the application, after all. Why should we put energy into restricting Service C from accessing the queue?

Well, what happens if a bad actor gains access to Service C? They could modify the service to insert bogus messages into the queue, pretending the messages are really coming from Service A. Service B, unaware of the problem, would read the bad messages, assume they are good, and process them. This exposes Service B to the same bad actor that compromised Service C. Now Service B is also compromised, and the attack can spread.

But the problem is not just bad actors. What happens if, instead of an attack, Service C has a bug, a simple defect? If that bug caused Service C to accidentally insert bad messages into the queue, Service B could behave incorrectly and the application could fail.

The lesson here is since Service C does not need access to the queue, it should be explicitly prevented from accessing the queue. In a cloud-native application, this typically means that a security policy needs to be put on the queue that prevents Service C from accessing the queue.
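Within application code, the same restriction can be enforced by handing each service a capability object that exposes only the operation it needs. This is a hypothetical sketch, not tied to any particular broker, again using an in-process queue:

```python
import queue

class SendOnlyQueue:
    """Capability handed to Service A: can only enqueue messages."""
    def __init__(self, q):
        self._q = q
    def send(self, message):
        self._q.put(message)

class ReceiveOnlyQueue:
    """Capability handed to Service B: can only dequeue messages."""
    def __init__(self, q):
        self._q = q
    def receive(self):
        return self._q.get()

backing = queue.Queue()
a_view = SendOnlyQueue(backing)     # given to Service A
b_view = ReceiveOnlyQueue(backing)  # given to Service B
# Service C is given neither view, so it cannot touch the queue at all.

a_view.send("hello")
assert b_view.receive() == "hello"
assert not hasattr(b_view, "send")  # Service B has no way to write
```

Because each service receives only its narrow view, a compromised or buggy Service C has no handle on the queue to misuse.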

Obviously, this process can be expanded and applied to an entire application. An application with hundreds of services may have only one or two that need access to the queue. So, rather than disabling access for each service that doesn’t need it, set the security policy for the queue so that nobody can access it except those explicitly allowed. In this case, only Service A should be given write access to the queue, and only Service B should be given read access.
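In AWS terms, one way to express this default-deny stance is an SQS queue policy that grants Service A’s role send-only access and Service B’s role receive-only access. The following is a sketch; the account ID, role names, region, and queue name are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ServiceASendOnly",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:role/service-a" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:111122223333:app-queue"
    },
    {
      "Sid": "ServiceBReceiveOnly",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:role/service-b" },
      "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
      "Resource": "arn:aws:sqs:us-east-1:111122223333:app-queue"
    }
  ]
}
```

Because access not explicitly allowed is denied, any principal not matched by these statements, including Service C, gets no access to the queue at all.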

If such a security policy were in place when Service C was attacked by a bad actor, that bad actor would not be able to extend their attack into Service B, restricting the negative impact of the attack. Further, if Service C simply had a bug, that bug could not inadvertently affect Service B, limiting the impact of the bug.

The principle of least privilege states that services, systems, and even individuals should have only the minimum privileges they require to perform their tasks, and no more. The principle is a policy for reducing the scope and impact of bad actors, bugs, and other failures within a system.

Lee Atchison

Lee Atchison is an author and recognized thought leader in cloud computing and application modernization with more than three decades of experience, working at modern application organizations such as Amazon, AWS, and New Relic. Lee is widely quoted in many publications and has been a featured speaker across the globe. Lee’s most recent book is Architecting for Scale (O’Reilly Media). https://leeatchison.com
