7 DevOps Mistakes That Should Be Avoided

[Micro-foreword by ‘No Bugs’]
Usually guest posts come with an implicit disclaimer of ‘opinions of the authors are their own and do not necessarily coincide with opinions of the editors’. However, this is one post which I personally 98% agree with (the remaining 2% is about disagreeing that “deployment frequency” is a virtue per se – IMO, it is merely a way to reduce those observable-in-business-space ‘mean time to recover’ and ‘change failure rate’).
[/Micro-foreword by ‘No Bugs’]

Starting something new is always difficult. When you try new things, you’ll always make newbie mistakes, even if you’ve spent plenty of time studying the subject. DevOps isn’t an exception, especially when you consult the internet and find so many definitions of what it is and what it isn’t. I used to think I was doing DevOps because I was using Jenkins and because development and operations were working together. How naive.

By no means am I trying to say that misunderstanding DevOps is the only mistake one can make. And I’ll be sharing some of the struggles I and other folks in the industry have faced. But it’s important to mention that misunderstanding DevOps is the very first mistake you need to overcome since, in order to overcome the other problems, you’ll need to have at least some basic DevOps theory down. There are some prerequisites, like having a good set of automated tests, that have to come before tackling these other common mistakes. If you don’t have these in place, you’ll only speed up disaster, and sooner rather than later your DevOps efforts will be more frustrating than productive.

Before you move on, consult a trustworthy and high-quality source about what DevOps is, like this one from The Agile Admin, to avoid the biggest mistake of all—misunderstanding it. Then, you’ll be ready to properly learn how to avoid the common DevOps mistakes I’ve seen other shops fall prey to.

Having a Dedicated Team Take the Entire Workload

This is by far the most common mistake organizations make: putting all the burden on one team. Coordinating development and operations is complicated enough without adding a new team that has to talk to both groups. DevOps started with the idea of improving collaboration between teams involved in the development of software. That means it’s not just about development and operations. It’s also about management, security, QA, you name it. When you add a dedicated team for this, you’re just making things more complicated. The secret sauce here is simplicity.

You can avoid this mistake by working more on culture. Inject a mindset of automation, quality, and stability. Here’s an example of how you might do this: involve everyone in interesting conversations about architecture or common problems found in production environments. Teams have to be aware of how their work affects others. Developers need to know what happens after they push code and how operations sometimes struggles to maintain a stable environment. Operations teams also need to find ways to avoid being a blocker by automating repeatable tasks. DevOps is not just about a cool name. Some organizations have evolved the operations team to be a DevOps team. That’s OK. But DevOps is way more than that.

Focusing More on Culture Than Providing Value

This might sound contradictory to the previous point, but DevOps is not just about culture. Yes, it needs involvement from leadership and buy-in from everyone in the organization. But people won’t see the benefit until they have their own “aha” moment—the point at which they see the value. That usually happens when you have something to compare it with. Numbers are very good for this.

In order to avoid focusing too much on culture over value, you’ll need to pay more attention to things that can be measured. If you read the State of DevOps Report, you’ll find that it tracks four metrics: deployment frequency, lead time for changes, mean time to recover, and change failure rate. You’ll want to have a higher deployment frequency because you reduce risk by releasing small changes. You’ll also want to improve the time you take to deliver value to customers after code has been pushed. In the case of a failure, you’ll want to reduce the time needed for recovery. And of course, you’ll want to reduce the failure rate.
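To make those four numbers concrete, here is a minimal sketch of how they could be computed from a list of deployment records. The `Deployment` record, its field names, and the `four_metrics` function are hypothetical and made up for illustration; they are not taken from the report or from any particular tool, and the sketch uses simple averages, which may differ from how the report aggregates its data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was pushed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    recovered_at: Optional[datetime] = None  # when service was restored, if it failed


def _mean(deltas: List[timedelta]) -> Optional[timedelta]:
    """Average a list of durations; None when there is nothing to average."""
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def four_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four metrics over an observation window of period_days."""
    count = len(deployments)
    failures = [d for d in deployments if d.failed]
    recovered = [d for d in failures if d.recovered_at is not None]
    return {
        "deployment_frequency_per_day": count / period_days,
        "mean_lead_time": _mean([d.deployed_at - d.committed_at for d in deployments]),
        "mean_time_to_recover": _mean([d.recovered_at - d.deployed_at for d in recovered]),
        "change_failure_rate": len(failures) / count if count else None,
    }


# Made-up sample data: four deployments in one week, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deployments = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               recovered_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 3 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 3 * hour),
]
metrics = four_metrics(deployments, period_days=7)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Tracking these values over time, rather than looking at a single snapshot, is what gives people something to compare against.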

Culture is not something you can measure, and in the end, your customers won’t care much about that aspect of your company. They will care about tangible and visible things.

Choosing Architectures That Make Things Hard to Change

If your software isn’t easy to change or evolve, you’ll find some interesting challenges. When you can’t deploy parts of your system independently, that system will become hard to start. If the architecture isn’t loosely coupled, it will be hard to adapt that architecture. I’ve faced this problem in the past when deploying a large system. We didn’t spend enough time thinking about how to deploy the parts independently, so we needed to deploy all parts together; if you only deployed one part, the system would break.

DevOps is way more than just automation. One of its ideas is to reduce the time you spend deploying your applications. Even if it’s automated, if deployment takes too much time, your customers won’t be seeing much value from your automation.

Avoid this mistake by spending some time on your architecture. Figure out how you can deploy the parts independently. But don’t take the approach of defining every single detail, either. Instead, make sure you can defer some of those decisions to a later time, when you know more. Architectures should adapt to change naturally.*

Not Experimenting Enough in Production

In software, we have a legacy of trying to get everything just right ahead of releasing to production. And that’s because it used to be so painful to get to production that you’d do it only once every two weeks, every month, or every quarter. But these days, through culture change and automation, getting things into production is much, much easier. There’s unprecedented consistency and speed, to the point where we can release new changes several times a day.

But wait. Isn’t this a win for the DevOps movement rather than a mistake people make? Well, it is a win, to be sure. But the mistake people make is not taking full advantage of DevOps tooling to experiment in production. Getting to production is great, but why not continue testing and experimenting in production?

Using tools like release automation, production monitoring, and feature flags, you can do some truly cool stuff. You can run split tests to see which layout for a feature works better, or you can do a gradual rollout to see how users respond to something new. All of this is possible without blocking the pipeline for changes that are still coming. Taking full advantage of DevOps means letting real production data inform your development process in a tight feedback loop.
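As a rough illustration of the gradual rollout and split test mentioned above, the sketch below assigns each user to a stable bucket by hashing the flag name together with the user ID. The function names, flag names, and user IDs are all hypothetical, and this is not the API of any specific feature flag product; real tools typically add remote configuration, targeting rules, and analytics on top of an idea like this.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Gradual rollout: the flag is on for roughly rollout_percent of users."""
    return bucket(flag_name, user_id) < rollout_percent


def variant(experiment: str, user_id: str, variants=("A", "B")) -> str:
    """Split test: deterministically assign a user to one of the variants."""
    return variants[bucket(experiment, user_id) % len(variants)]


# Hypothetical usage: ship the feature to about 10% of users, then widen to 50%.
users = [f"user-{i}" for i in range(1000)]
at_10 = {u for u in users if is_enabled("new-checkout", u, 10)}
at_50 = {u for u in users if is_enabled("new-checkout", u, 50)}
print(f"enabled at 10%: {len(at_10)} users, at 50%: {len(at_50)} users")
print("layout for user-42:", variant("checkout-layout", "user-42"))
```

Because the bucket depends only on the flag name and the user ID, a user who sees the feature at 10% keeps seeing it when the rollout widens, and raising or lowering the percentage needs no redeploy if the value is read from configuration.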

Paying Too Much Attention to the Tooling

Although there are some tools that help you practice DevOps, you can’t say you’re doing DevOps just because you’re using a specific tool. People don’t proclaim themselves doctors just because they can use a stethoscope.

New tools are emerging all the time; there are even tools that will help you integrate other tools. How crazy is that? You have tools for continuous integration, deployment, configuration management, orchestration, and version control. You’ll hear vendors claim they have the ultimate tool for your DevOps implementation. But in reality, there’s no single tool that could cover everything you need.

Take an agnostic approach to tools. Don’t fall in love blindly. There’s always going to be a better way to do things. New tools will arise, and after a certain period of stability has passed, you can start adopting them. In the end, tools should help you spend more and more time on things that are providing value to your customers.

Taking an All-or-Nothing Approach

It’s pretty common in software development to say a legacy system needs to be re-architected or recreated from scratch (greenfield projects). The problem with this approach is that new features and bugs won’t hold off just because you’re improving the innards of the application. They’re mean. You’ll need to catch up after you’re finished, and imagine what might happen after a few months. What if, for instance, the team decided to change the way they use configuration variables mid-project? So, unless you’re starting a new project where you introduce automation from the very beginning, this isn’t the recommended approach.

Avoid this mistake by starting with small batches. This is actually the recommended way to introduce a new technology or process into the organization. Take the SRE approach by reducing what Google calls “toil” little by little. Start dedicating a bit of your time to thinking about things that can be easily automated. It could be a specific time of the day, or it could be a complete day of the month. It could start with only one person focused on this, not the entire team. The important thing is that you’ve dedicated time to thinking about automation, and you’ve scheduled it so that you don’t get interrupted with daily routine tasks and can focus on developing new ways to make your work easier. Organizations will benefit from this.

Don’t Be Afraid to Do It Wrong, Though. Start Now!

You’ll always be wrong according to some people, so don’t be afraid of making mistakes. Of course, it’s less painful to learn from others’ missteps than from your own. But you’ll still make your own mistakes and learn; it’s natural. This is a continuously growing field, and you’ll hear and read countless opinions about the topic. Take what works best for you and share it.

When you have a mindset toward delivering value to your end users all the time, you won’t consider your job done when a development team says that they’ve finished coding. Your job is done when you meet your customer’s expectations after delivery. So as long as you focus on providing value to your customers, you’ll see the benefit of practicing DevOps. This is especially true when talking about the time you take to deliver value, the time to recover from unexpected behaviour, and the rate of failure. By focusing on these things, the things that matter, you’ll narrow the productivity gap that mistakes can cause.

* There’s a book that caught my attention a few days ago called Building Evolutionary Architectures. I haven’t read it, but it’s already on my list. It might help you out.

Author Bio

Erez Rusovsky is the CEO and co-founder of rollout.io, a company located in San Francisco that provides a feature flag management system enabling developers to quickly and safely build and deploy apps. He holds a B.Sc. in Information Technology and is very passionate about his job. Being a developer himself, he takes very personally the company’s mission to help developers excel in their tasks. He has written many articles for the Rollout blog, most notably about app development.