I recently finished up a graduate program in Engineering Management and one of my last courses was on “Lean and Agile Management.” This was one of my favorite courses as it focused on understanding and improving process flow by reducing friction and waste. My professor claimed that most processes contain roughly 95% waste. At first that seemed insane, but when you think about most processes from a flow perspective and actually see (and name) the waste, it doesn’t sound as crazy. What *is* waste – or muda – in this context? Anything the customer isn’t willing to pay for! So much of the DevOps movement borrows from Lean, and I thought I’d look at eight types of waste (listed below, as presented in the excellent book The Remedy) and see how a DevOps focus (and cloud computing!) can help get rid of these non-value-adding activities.
Waste #1 – Motion
Conveyance waste (outlined below) refers to moving the products themselves, while motion waste is all about the physical toll on those creating the product. This manifests itself in machine failures, ergonomic issues, and mental exhaustion resulting from (repetitive) actions inflicted by the process.
Does it take you and your team multiple hours to push software? Do your testers run through 100-page scripts to verify that the software is working as expected? Are you wiped out after trying to validate a patch on countless servers in different environments and geographies? This is where some of the principles and technologies that make up DevOps provide some relief. Continuous integration tools make testing more routine and automated, thus reducing motion waste. Deployment tools (whether continuous deployment or not) have made it simpler to push software without relying on manual packaging, publishing, and configuration activities. Configuration management tools make it possible to define a desired state for a server and avoid manual setup and verification steps. All of these reduce the human toll of deploying software.
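To make the "desired state" idea a bit more concrete, here is a minimal Python sketch. It is not how any particular configuration management tool works; the `Server` class, the `converge` function, and the package names are all hypothetical, and the "server" is just an in-memory dictionary. The point it illustrates is that you declare what the server should look like, and the tool works out (and repeats safely) whatever steps are needed.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    # Hypothetical in-memory stand-in for a real machine: package name -> version.
    packages: dict = field(default_factory=dict)


def converge(server, desired):
    """Change the server only where it differs from the desired state.

    Returns the list of actions taken, so an empty list means
    "already in the desired state, nothing to do".
    """
    actions = []
    # Install or upgrade anything that is missing or at the wrong version.
    for name, version in desired.items():
        if server.packages.get(name) != version:
            server.packages[name] = version
            actions.append(f"install {name}={version}")
    # Remove anything that is present but not part of the desired state.
    for name in list(server.packages):
        if name not in desired:
            del server.packages[name]
            actions.append(f"remove {name}")
    return actions


# Illustrative data only.
server = Server(packages={"nginx": "1.18", "telnet": "0.17"})
desired = {"nginx": "1.24", "openssl": "3.0"}

first_run = converge(server, desired)
second_run = converge(server, desired)
print(first_run)
print(second_run)
```

Running `converge` a second time does nothing, which is the property that removes the manual "did I already do this step on that box?" verification work.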
The cloud helps reduce motion waste as well. Instead of taking ANY time patching and maintaining servers, you can build immutable servers (using templating tools like Packer) that simply get torn down regularly and replaced with fresh templates running the latest software and configuration. You can also use a variety of configuration management tools to orchestrate server builds and save repetitive (and error-prone) manual activities that take hours or days to perform.
Waste #2 – Waiting
This is probably the one that most IT people can name immediately. We all know how frustrating it is to wait for one step of a process to finish. You see waiting waste in action whenever a product/service spends much of its life waiting to be worked on. What’s the result of this waste in IT? Teams sit idle while waiting for their turn to work on the product, you pay for materials (deployment tools, contract resources) that aren’t being used, and your end users give up because they can’t wait any longer for the promised service.
A DevOps mindset helps with this as it’s all about efficiency. Teams may share the same ticketing system so that a development team can start working on a defect as soon as it’s logged by a front-line support person. It’s also about empowerment, where teams can take action on items that need it versus waiting around for someone with a VP title to tell them to get on it. One place that VPs can help here is to invest in the tools (e.g. high-performing dev workstations, automated test suites) that developers need to build and deploy faster, thus reducing the waiting time between “code finished” and “application ready for use.”
Consider a just-in-time mentality where resources are acquired (e.g. perf test environments) whenever they are needed, and discarded immediately afterwards. The cloud helps make that sort of thing possible. Cloud is famous (infamous?) for helping devs get access to compute resources instantly. Instead of waiting 6-8 weeks for an Ops person to approve a server request (or upgrade request), the cloud user can simply spin up machines, resize them, and tear them down before an Ops person even starts working on the request. Ideally in a DevOps environment, those teams work together to establish gold images hardened by Ops but easily accessible (and deployable!) by devs.
Waste #3 – Conveyance / Transportation
Transportation waste occurs when material (physically or digitally) is moved around in ways that add no value to the product itself. There’s cost in moving the product itself, lost time while waiting for movement to occur, damage to the product in transit, or waiting to ship the product until there’s “enough” that makes shipping worth it.
We see this all the time, right? If you don’t have a continuous deployment mindset, you wait for major “releases” and leave valuable, working code sitting still because it’s too costly to transport it to production. Or, when you don’t have a mature source control infrastructure for both code and environment configurations, there can be “damage” to the product as it moves between developers and deployment environments. DevOps thinking helps change how the organization views “shipping” as less of an event and more of a regular occurrence. When the delivery pipeline is tuned properly, there are a minimum number of stops along the way for a product, there are few chances to introduce defects along the way, and shipment occurs immediately instead of waiting for a right-sized batch.
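To put a rough number on the cost of waiting for a right-sized batch, here is a back-of-the-envelope Python sketch. The numbers are made up, and the model is deliberately simple: it assumes one change is finished per day and that releases happen on fixed days (every `release_interval` days). The function name `average_wait` is hypothetical.

```python
import math


def average_wait(finish_days, release_interval):
    """Average number of days finished work sits idle before the next release.

    Assumes releases happen on days that are multiples of release_interval.
    """
    waits = []
    for day in finish_days:
        # The next release day on or after the day the change was finished.
        next_release = math.ceil(day / release_interval) * release_interval
        waits.append(next_release - day)
    return sum(waits) / len(waits)


# Illustrative data only: one change finished each day for ten days.
finish_days = list(range(1, 11))

big_batch = average_wait(finish_days, release_interval=10)
weekly_ish = average_wait(finish_days, release_interval=5)
continuous = average_wait(finish_days, release_interval=1)
print(big_batch, weekly_ish, continuous)
```

Under these assumptions, the less often you ship, the longer working code sits still, which is the transportation waste described above.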
Waste #4 – Correction / Defects
Defects seem to be more inevitable in software than in physical products, but they are still a painful source of waste. When a defect occurs, a team spends time figuring out the issue, updating the software, and deploying it again. All of this takes time away from working on new, valuable work items. Why does this happen? Teams don’t build quality in up front, don’t emphasize prevention and mistake-proofing, allow an “acceptable amount” of defects, or rely on spot inspections to validate quality.
You probably see this when you have distinct project management, development, QA, and operations teams that don’t work together or have shared goals. Developers crank out code to hit a date specified by a project manager and may rely on QA to catch any errors or missed requirements. Or, the project team quickly finishes and throws an application over the wall to Operations where ongoing quality issues force churn back and forth.
A DevOps approach is about collaboration and empathy for stakeholders. Quality is critical and it’s baked into the entire process. Defects may occur, but they aren’t acceptable and you don’t wait until the end to discover them. With automation around the deployment process, defects can be quickly addressed and fixes deployed without unnecessary thrashing between development and support organizations. Code quality and service uptime are metrics that the WHOLE organization should care about and no team should be rewarded for “finishing early” if the quality was subpar.
Waste #5 – Over-processing
Over-processing occurs any time that we do more work on a product than is necessary. Developers make a particular feature more complicated than required, or of a higher quality than is truly needed. This can also happen when we replicate data between systems “just in case” or engage in gold-plating something that’s “good enough.” We all love to delight our users, but there are also plenty of statistics out there that show how few features of any software platform are actually used. When we waste time on items that don’t matter, we take time away from those that do.
DevOps (and the cloud) can help with this a bit as you focus on continual, incremental progress versus massive releases. By constantly revisiting scope, collaborating across teams, and having a culture of experimentation, we can publish software in more of a minimum viable product manner that solves a need and solicits feedback. A tighter coupling between product management and development ensures that teams know what’s needed and when to stop. Instead of product managers or business analysts writing 150-page specs and throwing them over the wall to developers, those upstream teams should be using their deep understanding of a business domain to craft initial stories that developers can start running with immediately instead of waiting for a perfect spec to be completed.
Waste #6 – Over-production
This is considered one of the worst wastes as it often hides or produces all the others! Over-production waste occurs when you produce more of the product or service than required. This means building large batches because you want to keep people busy or the setup costs are high and it’s “easier” to just build more of the product than constantly switch the delivery pipeline between products. You may see this in action when you do work that no one asked for, when you produce more than needed in anticipation of defects, or when IT departments have more projects than resources to deliver them.
In Lean and DevOps, we want to deliver what the customer needs, when they need it. If you’re keeping the test team busy writing unnecessary scripts just because they’re bored waiting for some code to test, that’s indicative of another problem. There’s likely a bottleneck or constraint elsewhere that’s blocking flow and causing you to over-produce in one area to compensate. In another example, consider provisioning application environments and purposely asking for more capacity than needed, just to avoid going back and asking for more later. In a cloud environment, that sort of over-production is not needed. You can provision a server or PaaS container at one size, and then adjust as needed. Instead of producing more capacity than requested, you can acquire and pay for exactly what’s needed.
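Here is a quick bit of arithmetic, sketched in Python, comparing "ask for the peak up front" with "pay for what each hour actually needs." The demand figures and the price are invented for illustration, and the pricing model (a flat price per unit per hour, expressed in whole cents) is a simplifying assumption rather than any provider's actual billing. The function name `capacity_cost` is hypothetical.

```python
def capacity_cost(hourly_demand, unit_price_cents, fixed_capacity=None):
    """Total cost in cents over the period covered by hourly_demand.

    If fixed_capacity is given, that many units are paid for every hour
    (over-provisioning up front). Otherwise, each hour is billed for the
    units actually needed in that hour.
    """
    if fixed_capacity is not None:
        return fixed_capacity * unit_price_cents * len(hourly_demand)
    return sum(hourly_demand) * unit_price_cents


# Illustrative data only: units of capacity needed in each of six hours.
hourly_demand = [2, 2, 3, 8, 3, 2]
unit_price_cents = 10

provisioned_for_peak = capacity_cost(
    hourly_demand, unit_price_cents, fixed_capacity=max(hourly_demand)
)
right_sized = capacity_cost(hourly_demand, unit_price_cents)
print(provisioned_for_peak, right_sized)
```

With these made-up numbers, sizing for the peak costs 480 cents versus 200 cents when capacity follows demand; the gap is the over-production.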
Waste #7 – Inventory
Inventory waste happens when you’re holding on to resources that aren’t generating any revenue. In IT, this can be application code that is stuck waiting for manual QA checks or a batched release window. Customers would be seeing value from that product or service if it was just available and not stuck in the digital holding pen. Inventory can back up at multiple stages in an IT delivery pipeline. There could be a backlog of requirements that aren’t released to development teams until a gate check, infrastructure resources that are sitting idle until someone releases them for use, or code stuck in a pre-release stage waiting for a rolling deployment to occur.
DevOps again makes a difference here as we actively look for ways to reduce inventory and improve flow between teams. You’re constantly asking yourself “how do I reduce friction?” and one way is to prevent inventory backlogs that release in spurts and cause each subsequent part of the delivery chain to get overwhelmed. If you even out the flow and use communication and automation to constantly move ideas from requirements to code to production, everything becomes much more predictable.
Waste #8 – Knowledge
This “bonus” waste happens when there’s a disruption of the flow of knowledge because of physical barriers, constant reorganizations, teams that don’t communicate, non-integrated software systems, and the host of annoying things that make it difficult to share knowledge easily. Haven’t we all seen mind-numbing written procedures or complex reports that hide the relevant information? Or how about the “rock star dev” who doesn’t share their wisdom with the team? What about tribal knowledge that the first few hires at a startup know about, and each subsequent dev thrashes because they aren’t aware of it?
Those following a DevOps model focus on information sharing, collaboration, and empowering their teams with the information they need to do their job. It’s never been easier to set up Wikis of gotchas, have daily standups to disseminate useful info across teams, and simplify written procedures because of the use of automated (and auditable) platforms. If someone leaves or joins the team, that shouldn’t cause a complete meltdown. Instead, to avoid knowledge waste, make sure that developers have access to repositories and tools that make it simple to deploy a standard dev environment (using something like Vagrant), understand the application architecture, check in code, test it out, simulate the results, and understand the impact.
DevOps is about much more than just technology. Understanding and putting a name to the various wastes within a process can help you choose the right type of (continuous) improvement to make. The improvement could be with people, process, technology, or a combination of the three. The cloud by itself doesn’t do anything to streamline an organization if corresponding process (and culture) changes don’t accompany it. I’ve been learning to “see the system” more and recognize where constraints and waste exist and “name and shame” them. Agree? Disagree?