Monitoring, Designed for Humans

Infrastructure management has come a long way. For the most part, gone are the days of manual configurations and deployments, when using SSH in a “for” loop was a perfectly reasonable way to execute server changes. Automation is now a way of life, and microservice-based architecture is the backbone for many organizations, enabling them to deploy software both safely and quickly. The advent of microservices has paved the way for container technologies such as Docker and orchestrators such as Kubernetes, and has fundamentally changed the way businesses build, deploy and monitor infrastructure.

According to Gartner, by 2020, more than 50 percent of companies will use container technology, up from less than 20 percent in 2017. While container adoption allows for continuous innovation and faster deployments, it introduces a non-trivial level of complexity when it comes to orchestration. If companies want to reap the full benefits of a multi-cloud strategy and the enhanced deployment capabilities of container technology, they must find a way to monitor their multi-cloud infrastructure easily, efficiently and, most importantly, in a way that doesn’t burn out operators.

Creating a System With Humans in Mind

Historically, monitoring tools were not designed with humans in mind. They were meant to capture all data and keep it in one place, often at the expense of the operator’s sanity, and the result was alert fatigue and burnout.

Legacy monitoring tools also tend to be extremely esoteric, which sometimes results in a single operator being relegated to managing that tool. The operator who leads the monitoring system often invests a significant amount of time into the system and truly knows the ins and outs of how to operate it. And, while the operator can solve their team’s biggest monitoring pain points, if other teams in the organization have the same problems, there’s no straightforward way to discover that the problem has already been solved or to reuse the solution.

Even worse, when that person leaves the company, there is a knowledge gap and the learning process must start all over again. Operational pains aside, the traditional model of monitoring everything and alerting on every state change in the environment doesn’t even provide an accurate indication of a problem. You’re left with unhappy operators and little understanding of what is actually wrong.

As container adoption increases, often alongside legacy systems, infrastructure grows more complex and so does the work of monitoring it. That makes it even more critical that we design monitoring tools with operators in mind.

Future-Proof Monitoring

In this software-dependent world, availability is critical, and downtime is not only expensive but also damaging to a business’s reputation. As a result, monitoring systems and applications has become a core competency crucial to business operations. The need for a future-proof, multi-cloud monitoring solution is greater than ever: modern businesses require monitoring that keeps up with dynamic environments while still allowing them to keep tabs on legacy systems. Gone are the days when businesses could accept a poor operator experience, which harms both their employees and the overall health of their IT infrastructure.

The greatest untapped resource in the tech industry is the tribal knowledge of operators. To unlock this knowledge, businesses must invest in technology that makes it easy to share monitoring solutions. If an operator can share a solution with a community of like-minded experts, those experts can, in turn, iterate on it. Users then collectively improve on solutions, leveraging the expertise of the whole community to produce the highest-quality monitoring for everyone and thereby reducing alert fatigue.

The solution for future-proof monitoring is simple: Your monitoring software should integrate with current and future systems (through documented APIs and open protocols) and draw on a wider community of experts creating and sharing solutions. This change in mindset can have a positive impact on a business and can change the way operators deal with their day-to-day challenges. By unlocking the knowledge of the tech industry and ensuring integration with legacy and multi-cloud infrastructure, monitoring technology can keep teams happy and create an environment that fosters knowledge-sharing instead of silos.

Sean Porter

Sean Porter is the creator of the Sensu project and the co-founder and CTO of Sensu Inc., a leader in open source monitoring. Sean is a seasoned systems operator and software developer with over a decade of experience in automating infrastructure. As CTO of Sensu Inc., he oversees the development of Sensu and works with users to better understand how Sensu can help them solve complex monitoring problems.
