From Ops to DevOps
Because we are a central function, developers must come to us for infrastructure design and hosting. We are there to ensure that all deployments adhere to policies and standards.
Change could only happen as fast as we could deliver it. Although we increased the throughput of services, we couldn’t guarantee it. Occasional spikes in workload (such as WannaCry) caused delays to projects.
Work in Progress
We needed to understand and control the work in progress (WIP). The rate of change slows over time because of an increasing body of policy (such as security policies getting stronger to counteract increasing threats), while requests become more frequent because of the wider business’s growing agility.
To understand WIP, we require that all requests come via the team’s project coordinator. We set up a Kanban board to get visibility into the work requests we receive.
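As a rough illustration of how a board can make WIP visible and controllable, here is a minimal in-memory sketch. The class name, column names and the WIP limit are all hypothetical; they are assumptions made for the example and do not describe the tool or limits the team actually uses.

```python
from collections import defaultdict


class KanbanBoard:
    """Tiny in-memory board: columns hold request titles, with optional WIP caps."""

    def __init__(self, wip_limits=None):
        # wip_limits maps a column name to the most cards it may hold.
        # The values are illustrative assumptions, not the team's real limits.
        self.wip_limits = wip_limits or {}
        self.columns = defaultdict(list)

    def add(self, title, column="Backlog"):
        """Record a new request, as the project coordinator would."""
        self._check_limit(column)
        self.columns[column].append(title)

    def move(self, title, source, target):
        """Move a card between columns, refusing if the target is full."""
        if title not in self.columns[source]:
            raise ValueError(f"{title!r} is not in {source!r}")
        self._check_limit(target)
        self.columns[source].remove(title)
        self.columns[target].append(title)

    def _check_limit(self, column):
        limit = self.wip_limits.get(column)
        if limit is not None and len(self.columns[column]) >= limit:
            raise RuntimeError(f"WIP limit of {limit} reached for {column!r}")

    def wip(self):
        """Return the number of cards currently in each column."""
        return {name: len(cards) for name, cards in self.columns.items()}
```

With a limit of two on an "In Progress" column, a third request stays in the backlog until something finishes, which is one way of turning visibility into control.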
‘The Three Ways’ Improvement Plan
To control the flow, we needed a new way of planning. We plan the backlog on Friday afternoons for the week ahead. On Monday mornings we discuss what is expected for the week. On Friday mornings, we hold a retrospective to see what went well, what didn’t, and what tasks to carry over.
By deploying infrastructure-as-code in public clouds and creating Continuous Deployment pipelines, we have increased the speed of delivery, so the bottleneck is no longer infrastructure delivery.
By creating standard development templates, such as containerisation and autoscaling, we can maintain architectural standards. Developers are empowered to handle infrastructure within limits required by policy. We publish these standards and run workshops across the business to explain them.
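One way to picture "within limits required by policy" is a check that a pipeline could run before applying a developer's autoscaling settings. The sketch below is an assumption-laden illustration: the policy fields, the limit of 10 instances and the region names are invented for the example and are not taken from the team's published standards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscalingPolicy:
    # Hypothetical limits; real values would come from the published standards.
    max_instances: int = 10
    allowed_regions: tuple = ("region-a", "region-b")


def validate_request(min_instances, max_instances, region, policy=AutoscalingPolicy()):
    """Return a list of policy violations; an empty list means the request is allowed."""
    violations = []
    if min_instances < 1:
        violations.append("min_instances must be at least 1")
    if max_instances < min_instances:
        violations.append("max_instances must not be below min_instances")
    if max_instances > policy.max_instances:
        violations.append(
            f"max_instances exceeds policy limit of {policy.max_instances}"
        )
    if region not in policy.allowed_regions:
        violations.append(f"region {region!r} is not approved")
    return violations
```

Returning every violation at once, instead of stopping at the first, gives the developer the full picture in a single pass and avoids sending work backwards more than once.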
As the business moved to implementing Continuous Integration (CI), manual deployment of infrastructure was no longer fast enough. Automation is key to increasing flow, but that alone is not enough. We need to make sure that flow is only in one direction to avoid rework.
To reduce backwards flow of work, we workshop with product teams to discuss change collaboratively and talk regularly. Feedback is instantaneous.
We have given each individual architect and engineer responsibility for specific products and technologies. This encourages commitment and retains knowledge. We rotate support rotas, preventing individuals from being pinch-points. When someone else supports a product, they ensure that the documentation required to support it is present. Any changes to a product are peer-reviewed by the team, so feedback is immediate. This increases reliability and reduces repeat incidents.
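A rotating support rota can be as simple as a round-robin assignment. The following sketch assumes one engineer on support per week; the function name, the weekly cadence and the example names are hypothetical and are not drawn from the team's actual rota.

```python
from itertools import cycle, islice


def build_rota(engineers, weeks):
    """Assign one engineer per week in round-robin order."""
    if not engineers:
        raise ValueError("at least one engineer is required")
    # cycle() repeats the list endlessly; islice() takes the first `weeks` entries.
    return list(islice(cycle(engineers), weeks))
```

For example, `build_rota(["Asha", "Ben", "Chris"], 5)` puts each engineer on support in turn and starts again from the first name in week four.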
Even with flow controlled and moving in one direction, systems don’t improve. Arguably, over time they move backwards due to mounting technical debt. The next challenge was to improve the system.
Improvement comes from enhancements. Some will work, and others won’t. The skill is in embracing the ones that do (from fast feedback) and failing fast with the ones that don’t. This quickly cuts investment in non-productive enhancements.
We have a culture that avoids blame. We learn from failure.
We have an experimental nature within the team. Our engineers are free to innovate and experiment with technologies that deliver the strategy. Improvements are presented back to the group for feedback. This creates iterative improvement.
We have increased the rate of release from one release every 3-6 months to up to five releases per day, increasing quality as fixes are released immediately. Without monolithic releases, work isn’t flowing backwards and duplicating effort. Without Ops repeating infrastructure delivery, they are focussed on high-value work, increasing productivity.
The Pride index in the internal ‘Great Place To Work’ survey has increased by 8 points over the last three years.
Moving to DevOps has increased efficiency and productivity, and has contributed significantly to increasing the morale of the Ops team.