April 29, 2017

Software development is far more streamlined today than it ever has been. Sites such as GitHub, continuous integration and delivery processes (CI/CD) and technologies such as containers have sped up the development process by orders of magnitude. Given all these changes, you might think developer productivity is going through the roof. But that’s not the case; projects often stall, go over budget and ship late. What’s a developer to do?

As any coder can tell you, developers still spend a ton of time—sometimes up to half of their day—waiting for tests to run and tracking down and reproducing bugs. While these processes are essential to building a successful application, they don’t involve much actual coding. And, as you might imagine, these are not very efficient processes.

Data Pains

The problem is that most of the tools built recently for developers, including containers, address only certain parts of application development: the lines of code and other stateless parts of the application stack. Equally important to writing an application is, of course, its associated data. After all, an application can’t run very well without data.

But the fact remains that quite often the data running through applications is what causes issues in the first place. Every developer has experienced cryptic bug reports, spent countless hours spinning up acceptance test environments and wasted time re-creating an environment to solve a problem during the process of coding a new application. The advent of microservices-based architecture and the need for rapid application deployment are further aggravating these problems. Each problem could be fixed if there were a more efficient, easier way to manage the application’s data. This is what I like to call the pain of data in development environments.

How much time could be saved if data were managed throughout the entire development cycle, from creation to acceptance testing to staging to running in production? The ability to pair the correct version of a specific data set with the correct version of code would be incredibly powerful.
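To make the idea of pairing data with code a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular snapshot product works: the function names, the manifest file and the use of a file hash as a "snapshot ID" are all hypothetical, and real snapshot tools typically work at the storage or volume level rather than on a single file.

```python
import hashlib
import json
from pathlib import Path


def snapshot_id(data_file: Path) -> str:
    """Return a content hash that identifies this exact state of the data.

    Assumption: the data set is a single file; the hash stands in for a
    real snapshot identifier.
    """
    digest = hashlib.sha256()
    with data_file.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pairing(code_version: str, data_file: Path, manifest: Path) -> dict:
    """Record which data snapshot belongs with which code version.

    `code_version` is whatever identifies the code, for example a commit hash
    (hypothetical input supplied by the caller).
    """
    pairing = {
        "code_version": code_version,
        "data_snapshot": snapshot_id(data_file),
    }
    manifest.write_text(json.dumps(pairing, indent=2))
    return pairing


def data_matches(manifest: Path, data_file: Path) -> bool:
    """Check that the data on disk is the snapshot the manifest expects."""
    expected = json.loads(manifest.read_text())["data_snapshot"]
    return snapshot_id(data_file) == expected
```

A teammate who checks out the same code version could run `data_matches` before debugging and know right away whether the data in front of him is the data that was recorded alongside that code.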

USB? You Gotta Be Kidding

Imagine a team of developers is working on an app, and one of the team members runs into a bug she can’t fix. What does she do? If she wants another developer to test for the bug, the teammate will have to meticulously set up his environment in an attempt to replicate the problems she is experiencing, including importing the contents of a database, which can take up to half an hour. In some cases, I’ve even heard of developers downloading the data onto a USB stick and literally walking it to another part of the office. How’s that for inefficient? Not to mention insecure!

Now think about how much time could be saved if this developer were able to provide snapshots of the data (a representation of exactly where and how the data was when the bug occurred) to her teammates. Instead of waiting to download an entire database (or walking a USB stick across the office), the necessary data could be shared almost instantly and securely. This would eliminate many minutes of prep time, and the time for a bug fix could drop from 30 minutes to 5 minutes, a sixfold saving.

Bye Bye, Bogus Bug Reports

Data snapshots can also be useful when an application is already running in production and (surprise!) a customer delivers a bad bug report. The report gives no context, provides zero details and leaves no clue what the bug might be. Traditionally, the report would have to be chased down and the details researched, and a lot of guesswork would ensue.

But what if we could build a better bug report—one that came with the data attached to it, with all of the information that a developer would need to reproduce the problem as the user was experiencing it? The ability to see a snapshot of data at a specific moment in time would help developers identify and understand problems much more quickly, saving vast amounts of time and money.

This same concept of data snapshots also could be applied to acceptance testing. Instead of spending time spinning up complex environments, data snapshots could be used to speed the process of testing applications for bugs by ensuring correct data is available instantly across hundreds of machines running CI/CD testing.

Snapshots Offer Escape from the Data Quagmire

There’s no doubt that software development is a whole lot faster and more efficient today than even just a few years ago. But that doesn’t mean the industry doesn’t still have a long way to go.

One big step forward is to factor data into the development equation. This means allowing developers to treat data in development just like they treat their code, tying together applications and data simultaneously during the course of development. And we’re getting closer to being able to do that.

Simple solutions, such as tying the correct version of data to the correct version of code throughout the entire DevOps flow, have the potential to save hours of time for developers, translating to more and better code being written, fewer stalled projects and, ultimately, more money saved. Snapshots are one such simple solution.

About the Author

Mohit Bhatnagar is vice president of Product for ClusterHQ. He is a tech industry veteran with a passion for driving innovation in the infrastructure, storage and enterprise software spaces, and has hands-on experience in running billion-dollar product and solutions businesses at NetApp, Symantec, Motorola and McKinsey, as well as with small startups.
