The DevOps Handbook describes an admirable DevOps As A Philosophy based on flow, feedback, continual learning and experimentation. However, a near-decade of naivety, confusion, and profiteering surrounding DevOps has left the IT industry with DevOps As A Cult, and the benefits of Operability are all too often overlooked.
In this article Steve Smith explains why DevOps As A Philosophy is a laudable ideal, why DevOps As A Cult is the unpleasant reality, and why organisations should instead focus on Operability as an enabler of Continuous Delivery.
When an IT organisation has separate Development and Operations departments it will inevitably suffer from a serious conflict of interest. The Development teams will be told to keep pace with the market and incentivised by features, while the Operations teams will be told to provide reliability and incentivised by uptime. This creates a troubled relationship in which one party tries to maximise production changes, and the other tries to minimise them.
This conflict of interest has a devastating impact on the stability, throughput, and quality of IT services. It produces unstable, unreliable, and insecure services vulnerable to costly outages. It ensures production changes are delayed by days, weeks, or even months due to endless coordination between teams, convoluted change approvals, and fear of failure. It results in significant amounts of functional and operational rework, and constant firefighting just to keep systems up and running. It means the organisation loses out in the marketplace, due to the high opportunity costs incurred and the high attrition rate of employees.
DevOps As A Philosophy
In 2008, Patrick Debois and Andrew Schafer discussed at Agile 2008 the application of Agile practices to infrastructure. In 2009, John Allspaw and Paul Hammond shared with Velocity 2009 their famous “10 Deploys per Day: Dev and Ops Cooperation at Flickr” story, and Patrick Debois subsequently created the first DevOpsDays conference. The DevOps philosophy of collaboration between Development and Operations had begun.
In 2016, the DevOps Handbook was published by Gene Kim, Jez Humble, Patrick Debois, and John Willis. The DevOps Handbook builds on the Phoenix Project novel by Gene Kim, Kevin Behr, and George Spafford in 2013, and it describes how the Three Ways of DevOps can help organisations to succeed:
- The First Way: The Principles of Flow – create a continuous flow of value-add from Development to Operations
- The Second Way: The Principles of Feedback – create a constant flow of feedback from Operations to Development
- The Third Way: The Principles of Continual Learning and Experimentation – create a culture of ever-increasing knowledge within Development and Operations
The DevOps Handbook advocates long-lived product teams frequently deploying changes during normal business hours, using ubiquitous monitoring to quickly resolve errors, and building a shared culture of Continuous Improvement. It is a seminal work that describes what DevOps should be – DevOps As A Philosophy.
DevOps As A Cult
Unfortunately, there were 7 years between the creation of the DevOps meme and the publication of The DevOps Handbook. In the meantime, a different kind of DevOps has emerged that is entirely distinct from DevOps As A Philosophy yet regrettably popular within the IT industry. This bastardisation of DevOps is a cult based on confusion, naivety, and profiteering.
There has been a great deal of confusion about what DevOps actually is, and many organisations have unwittingly increased their disorder by attempting to adopt DevOps without understanding it. For example, there is now the notion of a DevOps Engineer, in which DevOps is equated with a Infrastructure As Code specialist and any need for further change is ignored. Another example is the DevOps Team, in which a team of DevOps Engineers or similar is inserted between Development and Operations teams and becomes yet another delivery impediment. As Jez Humble has remarked, “creating another functional silo that sits between Dev and Ops is clearly a poor (and ironic) way to try and solve these problems“.
Many people have naively latched onto DevOps via misinformation and with little appreciation of their organisational complexity and context. In a complex, adaptive system every individual has limited information, the cause and effect of an event cannot be predicted, and the system must be probed for insights. One common error is to literally assume the conflict of interest between Development and Operations is always the key constraint, when other emerging conflicts can be equally ruinous such as between separate Development and Testing departments. Another error is to assume large enterprise organisations need some kind of Enterprise DevOps roadmap, despite the ineffectuality of blueprints in a complex system and Dave Roberts pointing out “flow and continuous improvement are equally applicable to a large enterprise as they are to an agile web startup“.
Finally, the lack of clarity on DevOps has led to unabashed profiteering from some recruitment firms and vendors. This can be seen when recruitment firms rebrand sysadmins as DevOps Engineers, or when vendors market their automation tools as DevOps tools. DevOps certification has even been launched by the DevOps Institute, which sells one interpretation of a complex cultural movement and of which Sam Newman complained “aside from perhaps three practitioners, the rest of the group are either professional trainers or sales and marketing people“.
Many organisations that have attempted to adopt DevOps still suffer from short-lived project teams infrequently deploying changes out of business hours, manual regression testing without telemetry, and an antagonistic culture with minimal knowledge sharing. The application of confusion, naivety, and profiteering to the DevOps meme has resulted in what DevOps should not be – DevOps As A Cult.
Aim for Operability, not DevOps As A Cult
The rise of DevOps coincided with the rise of Continuous Delivery, which is explicitly focussed on the improvement of IT stability and throughput to satisfy business demand. Continuous Delivery does not need DevOps As A Philosophy but they can be thought of as complementary, due to their shared emphasis on fast feedback loops, cultural change, and task automation. DevOps As A Cult has no such standing, as shown by Dave Farley stating that “DevOps rarely says enough about the goal of delivering valuable software… this is no place for cargo-cultism“.
Continuous Delivery requires operational excellence to be built into organisations. If a service is unstable, a high level of throughput is impossible to sustain as the rework incurred during periods of instability will restrict the delivery of new features. This means Operability is of critical importance to Continuous Delivery, as throughput is dependent upon the ability of the organisation to maintain safe and reliable systems according to its operational requirements.
Both Continuous Delivery and DevOps As A Philosophy advocate the following operational practices to improve Operability:
- Prioritisation of operational requirements – plan and prioritise work on configuration, infrastructure, performance, security, etc. alongside new features
- Automated infrastructure – automate production infrastructure and build a self-service provisioning capability for on-demand pre-production environments
- Deployment health checks – incorporate system health checks and functional smoke tests into pre-production and production deployments
- Pervasive telemetry – establish a logging/monitoring platform for the aggregation, visualisation, anomaly detection, and alerting of business-level, application-level, and operational-level events
- Failure injection – introduce simulated errors under controlled conditions into production systems, and rehearse incident response scenarios
- Incident swarming – encourage people to work together to identify and resolve production incidents as soon as they occur
- Blameless post-mortems – hold post-incident reviews to understand the context, cause and effect, and remediation of a production incident, and propose countermeasures for the future
- Shared on-call responsibilities – ensure all team members are on rotation for production incidents, and empowered to handle incidents when they occur
Teams need to adopt a “You Build It, You Run It” culture, in which everyone contributes to operational practices and everyone is responsible for Operability. This means teams will need guidance on how to build, deploy, and run services plus how to create the operational toolchain to support those services. Operability engineers are required, that
For this reason operability engineers should be embedded into teams, to share their expertise on the delivery of operational requirements and coach other team members on architecting for resilience, establishing a telemetry platform, adopting a mindset of operational excellence, etc. If there are more teams than available operability engineers then every team should have an operability engineer assigned in a liaison role.
In many organisations the conflict of interest between Development and Operations is enormously damaging, and DevOps As A Philosophy as described in the DevOps Handbook is an admirable model for improving organisations via fast flow, fast feedback, and a culture of learning and experimentation. However, the confusion, naivety, and profiteering surrounding DevOps has led to DevOps As A Cult within the IT industry, and unfortunately its popularity is matched only by its inability to improve organisations.
An organisation that wishes to improve its time to market should adopt Continuous Delivery and aim for Operability. That means operability engineers working on teams to teach others how to adopt an operational mindset and build the necessary tools. Continuous Delivery needs operability, and by achieving operational excellence an organisation can improve its throughput and obtain a strategic competitive advantage in the marketplace.
Thanks to Beccy Stafford, Charles Kubicek, Chris O’ Dell, Edd Grant, John Clapham, and Martin Jackson for their feedback