Antipattern

You are currently browsing articles tagged Antipattern.

End-To-End Testing is used by many organisations, but relying on extensive end-to-end tests is fundamentally incompatible with Continuous Delivery. Why is End-To-End Testing so commonplace, and yet so ineffective?

In this article Steve Smith explains why End-To-End Testing is so undesirable and offers a lower-cost, higher-value Continuous Testing strategy focussed on antifragility. This article includes material from our popular Continuous Delivery For Managers and Continuous Delivery For Practitioners training courses.

Introduction

“Good testing involves balancing the need to mitigate risk against the risk of trying to gather too much information” Jerry Weinberg

Continuous Delivery is a set of holistic principles and practices to reduce time to market, and it is predicated upon rapid and reliable test feedback. Continuous Delivery mandates any change to code, configuration, data, or infrastructure must pass a series of automated and exploratory tests in a Deployment Pipeline to evaluate production readiness, so test execution times must be low and test results must be deterministic if an organisation is to achieve shorter lead times.

For example, consider a Company Accounts service in which year end payments are submitted to a downstream Payments service.

End-To-End Testing Considered Harmful - Company Accounts

The behaviour of the Company Accounts service could be checked at build time by the following types of automated test:

  • Unit tests check intent against implementation by verifying a discrete unit of code
  • Acceptance tests check implementation against requirements by verifying a functional slice of the system
  • End-to-end tests check implementation against requirements by verifying a functional slice of the system, including unowned dependent services

While unit tests and acceptance tests vary in terms of purpose and scope, acceptance tests and end-to-end tests vary solely in scope. Acceptance tests exclude unowned dependent services, so an acceptance test of a Company Accounts user journey would use a System Under Test comprised of the latest Company Accounts code and a Payments Stub.

End-To-End Testing Considered Harmful - A Company Accounts Acceptance Test

End-to-end tests include unowned dependent services, so an end-to-end test of a Company Accounts user journey would use a System Under Test comprised of the latest Company Accounts code and a running version of Payments.

End-To-End Testing Considered Harmful - A Company Accounts End-To-End Test

If a testing strategy is to be compatible with Continuous Delivery it must have an appropriate ratio of unit tests, acceptance tests, and end-to-end tests that balances the need for information discovery against the need for fast, deterministic feedback. If testing does not yield new information then defects will go undetected, but if testing takes too long delivery will be slow and opportunity costs will be incurred.

The folly of End-To-End Testing

“Any advantage you gain by talking to the real system is overwhelmed by the need to stamp out non-determinism” Martin Fowler

End-To-End Testing is a testing practice in which a large number of automated end-to-end tests and manual regression tests are used at build time with a small number of automated unit and acceptance tests. The End-To-End Testing test ratio can be visualised as a Test Ice Cream Cone.

End-To-End Testing Considered Harmful - The Test Ice Cream Cone

End-To-End Testing often seems attractive due to the perceived benefits of an end-to-end test:

  1. An end-to-end test maximises its System Under Test, suggesting a high degree of test coverage
  2. An end-to-end test uses the system itself as a test client, suggesting a low investment in test infrastructure

Given the above it is perhaps understandable why so many organisations adopt End-To-End Testing – as observed by Don Reinertsen, “this combination of low investment and high validity creates the illusion that system tests are more economical“. However, the End-To-End Testing value proposition is fatally flawed as both assumptions are incorrect:

  1. The idea that testing a whole system will simultaneously test its constituent parts is a Decomposition Fallacy. Checking implementation against requirements is not the same as checking intent against implementation, which means an end-to-end test will check the interactions between code pathways but not the behaviours within those pathways
  2. The idea that testing a whole system will be cheaper than testing its constituent parts is a Cheap Investment Fallacy. Test execution time and non-determinism are directly proportional to System Under Test scope, which means an end-to-end test will be slow and prone to non-determinism

Martin Fowler has warned before that “non-deterministic tests can completely destroy the value of an automated regression suite“, and Stephen Covey’s Circles of Control, Influence, and Concern highlights how the multiple actors in an end-to-end test make non-determinism difficult to identify and resolve. If different teams in the same Companies R Us organisation owned the Company Accounts and Payments services the Company Accounts team would control its own service in an end-to-end test, but would only be able to influence the second-party Payments service.

End-To-End Testing Considered Harmful - A Company Accounts End-To-End Test Single Organisation

The lead time to improve an end-to-end test depends on where the change is located in the System Under Test, so the Company Accounts team could analyse and implement a change in the Company Accounts service in a relatively short lead time. However, the lead time for a change to the Payments service would be constrained by the extent to which the Company Accounts team could persuade the Payments team to take action.

Alternatively, if a separate Payments R Us organisation owned the Payments service it would be a third-party service and merely a concern of the Company Accounts team.

End-To-End Testing Considered Harmful - A Company Accounts End-To-End Test Multiple Organisations

In this situation a change to the Payments service would take much longer as the Company Accounts team would have zero control or influence over Payments R Us. Furthermore, the Payments service could be arbitrarily updated with little or no warning, which would increase non-determinism in Company Accounts end-to-end tests and make it impossible to establish a predictable test baseline.

A reliance upon End-To-End Testing is often a symptom of long-term underinvestment producing a fragile system that is resistant to change, has long lead times, and optimised for Mean Time Between Failures instead of Mean Time To Repair. Customer experience and operational performance cannot be accurately predicted in a fragile system due to variations caused by external circumstances, and focussing on failure probability instead of failure cost creates an exposure to extremely low probability, extremely high cost events known as Black Swans such as Knights Capital losing $440 million in 45 minutes. For example, if the Payments data centre suffered a catastrophic outage then all customer payments made by the Company Accounts service would fail.

End-To-End Testing Considered Harmful - Company Accounts Payments Failure

An unavailable Payments service would leave customers of the Company Accounts service with their money locked up in in-flight payments, and a slow restoration of service would encourage dissatisfied customers to take their business elsewhere. If any in-flight payments were lost and it became public knowledge it could trigger an enormous loss of customer confidence.

End-To-End Testing is an uncomprehensive, high cost testing strategy. An end-to-end test will not check behaviours, will take time to execute, and will intermittently fail, so a test suite largely composed of end-to-end tests will result in poor test coverage, slow execution times, and non-deterministic results. Defects will go undetected, feedback will be slow and unreliable, maintenance costs will escalate, and as a result testers will be forced to rely on their own manual end-to-end regression tests. End-To-End Testing cannot produce short lead times, and it is utterly incompatible with Continuous Delivery.

The value of Continuous Testing

“Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place” Dr. W. Edwards Deming

Continuous Delivery advocates Continuous Testing – a testing strategy in which a large number of automated unit and acceptance tests are complemented by a small number of automated end-to-end tests and focussed exploratory testing. The Continuous Testing test ratio can be visualised as a Test Pyramid, which might be considered the antithesis of the Test Ice Cream Cone.

End-To-End Testing Considered Harmful - The Test Pyramid

Continuous Testing is aligned with Test-Driven Development and Acceptance Test Driven Development, and by advocating cross-functional testing as part of a shared commitment to quality it embodies the Continuous Delivery principle of Build Quality In. However, Continuous Testing can seem daunting due to the perceived drawbacks of unit tests and acceptance tests:

  1. A unit test or acceptance test minimises its System Under Test, suggesting a low degree of test coverage
  2. A unit test or acceptance test uses its own test client, suggesting a high investment in test infrastructure

While the End-To-End Testing value proposition is invalidated by incorrect assumptions of high test coverage and low maintenance costs, the inverse is true of Continuous Testing – its value proposition is validated by incorrect assumptions of low test coverage and high maintenance costs:

  1. A unit test will check intent against implementation and an acceptance test will check implementation against requirements, which means both the behaviour of a code pathway and its interactions with other pathways can be checked
  2. A unit test will restrict its System Under Test scope to a single pathway and an acceptance test will restrict itself to a single service, which means both can have the shortest possible execution time and deterministic results

A non-deterministic acceptance test can be resolved in a much shorter period of time than an end-to-end test as the System Under Test has a single owner. If Companies R Us owned the Company Accounts service and Payments R Us owned the Payments service a Company Accounts acceptance test would only use services controlled by the Company Accounts team.

End-To-End Testing Considered Harmful - Acceptance Test Multiple Organisations

If the Company Accounts team attempted to identify and resolve non-determinism in an acceptance test they would be able to make the necessary changes in a short period of time. There would also be no danger of unexpected changes to the Payments service impeding an acceptance test of the latest Company Accounts code, which would allow a predictable test baseline to be established.

End-to-end tests are a part of Continuous Testing, not least because the idea that testing the constituent parts of a system will simultaneously test the whole system is a Composition Fallacy. A small number of automated end-to-end tests should be used to validate core user journeys, but not at build time when unowned dependent services are unreliable and unrepresentative. The end-to-end tests should be used for release time smoke testing and runtime production monitoring, with synthetic transactions used to simulate user activity. This approach will increase confidence in production releases and should be combined with real-time monitoring of business and operational metrics to accelerate feedback loops and understand user behaviours.

In Continuous Delivery there is a recognition that optimising for Mean Time To Repair is more valuable than optimising for Mean Time Between Failures as it enables an organisation to minimise the impact of production defects, and it is more easily achievable. Defect cost can be controlled as Little’s Law guarantees smaller production releases will shorten lead times to defect resolution, and Continuous Testing provides the necessary infrastructure to shrink feedback loops for smaller releases. The combination of Continuous Testing and Continuous Delivery practices such as Blue Green Releases and Canary Releases empower an organisation to create a robust system capable of neutralising unanticipated events, and advanced practices such as Dark Launching and Chaos Engineering can lead to antifragile systems that seek to benefit from Black Swans. For example, if Chaos Engineering surfaced concerns about the Payments service the Company Accounts team might Dark Launch its Payments Stub into production and use it in the unlikely event of a Payments data centre outage.

End-To-End Testing Considered Harmful - Company Accounts Payments Stub Failure

While the Payments data centre was offline the Company Accounts service would gracefully degrade to collecting customer payments in the Payments Stub until the Payments service was operational again. Customers would be unaffected by the production incident, and if competitors to the Company Accounts service were also dependent on the same third-party Payments service that would constitute a strategic advantage in the marketplace. Redundant operational capabilities might seem wasteful, but Continuous Testing promotes operational excellence and as Nassim Nicholas Taleb has remarked “something unusual happens – usually“.

Continuous Testing can be a comprehensive and low cost testing strategy. According to Dave Farley and Jez Humble “building quality in means writing automated tests at multiple levels“, and a test suite largely comprised of unit and acceptance tests will contain meticulously tested scenarios with a high degree of test coverage, low execution times, and predictable test results. This means end-to-end tests can be reserved for smoke testing and production monitoring, and testers can be freed up from manual regression testing for higher value activities such as exploratory testing. This will result in fewer production defects, fast and reliable feedback, shorter lead times to market, and opportunities for revenue growth.

From end-to-end testing to continuous testing

“Push tests as low as they can go for the highest return in investment and quickest feedback” Janet Gregory and Lisa Crispin

Moving from End-To-End Testing to Continuous Testing is a long-term investment, and should be based on the notion that an end-to-end test can be pushed down the Test Pyramid by decoupling its concerns as follows:

  • Connectivity – can services connect to one another
  • Conversation – can services talk with one another
  • Conduct – can services behave with one another

Assume the Company Accounts service depends on a Pay endpoint on the Payments service, which accepts a company id and payment amount before returning a confirmation code and days until payment. The Company Accounts service sends the id and amount request fields and silently depends on the code response field.

End-To-End Testing Considered Harmful - Company Accounts Pay

The connection between the services could be unit tested using Test Doubles, which would allow the Company Accounts service to test its reaction to different Payments behaviours. Company Accounts unit tests would replace the Payments connector with a Mock or Stub connector to ensure scenarios such as an unexpected Pay timeout were appropriately handled.

The conversation between the services could be unit tested using Consumer Driven Contracts, which would enable the Company Accounts service to have its interactions continually verified by the Payments service. The Payments service would issue a Provider Contract describing its Pay API at build time, the Company Accounts service would return a Consumer Contract describing its usage, and the Payments service would create a Consumer Driven Contract to be checked during every build.

End-To-End Testing Considered Harmful - Company Accounts Consumer Driven Contract

With the Company Accounts service not using the days response field it would be excluded from the Consumer Contract and Consumer Driven Contract, so a build of the Payments service that removed days or added a new comments response field would be successful. If the code response field was removed the Consumer Driven Contract would fail, and the Payments team would have to collaborate with the Company Accounts team on a different approach.

The conduct of the services could be unit tested using API Examples, which would permit the Company Accounts service to check for behavioural changes in new releases of the Payments service. Each release of the Payments service would be accompanied by a sibling artifact containing example API requests and responses for the Pay endpoint, which would be plugged into Company Accounts unit tests to act as representative test data and warn of behavioural changes.

End-To-End Testing Considered Harmful - Company Accounts API Examples

If a new version of the Payments service changed the format of the code response field from alphanumeric to numeric it would cause the Company Accounts service to fail at build time, indicating a behavioural change within the Payments service and prompting a conversation between the teams.

Conclusion

“Not only won’t system testing catch all the bugs, but it will take longer and cost more – more than you save by skipping effective acceptance testing” – Jerry Weinberg

End-To-End Testing seems attractive to organisations due to its promise of high test coverage and low maintenance costs, but the extensive use of automated end-to-end tests and manual regression tests can only produce a fragile system with slow, unreliable test feedback that inflates lead times and is incompatible with Continuous Delivery. Continuous Testing requires an upfront and ongoing investment in test automation, but a comprehensive suite of automated unit tests and acceptance tests will ensure fast, deterministic test feedback that reduces production defects, shortens lead times, and encourages the Continuous Delivery of robust or antifragile systems.

Further Reading

  1. Continuous Delivery by Dave Farley and Jez Humble
  2. Principles Of Product Development Flow by Don Reinertsen
  3. 7 Habits of Highly Effective People by Stephen Covey
  4. Test Pyramid by Martin Fowler
  5. Test Ice Cream Cone by Alister Scott
  6. Integrated Tests Are A Scam by JB Rainsberger
  7. Agile Testing and More Agile Testing by Janet Gregory and Lisa Crispin
  8. Perfect Software and Other Illusions by Jerry Weinberg
  9. Release Testing Is Risk Management Theatre by Steve Smith
  10. The Art Of Agile Development by James Shore and Shane Warden
  11. Making End-To-End Tests Work by Adrian Sutton
  12. Just Say No To More End-To-End Tests by Mike Wacker
  13. Antifragile by Nassim Nicholas Taleb
  14. On Antifragility In Systems And Organisational Architecture by Jez Humble

Acknowledgements

Thanks to Amy Phillips, Beccy Stafford, Charles Kubicek, and Chris O’Dell for their early feedback on this article.

Tags: , , , , ,

The Version Control Strategies series

  1. Organisation Antipattern – Release Feature Branching
  2. Organisation Pattern – Trunk Based Development
  3. Organisation Antipattern – Integration Feature Branching
  4. Organisation Antipattern – Build Feature Branching

Build Feature Branching is oft-incompatible with Continuous Integration

Build Feature Branching is a version control strategy where developers commit their changes to individual remote branches of a source code repository prior to the shared trunk. Build Feature Branching is possible with centralised Version Control Systems (VCSs) such as Subversion and TFS, but it is normally associated with Distributed Version Control Systems (DVCSs) such as Git and Mercurial – particularly GitHub and GitHub Flow.

In Build Feature Branching Trunk is considered a flawless representation of all previously released work, and new features are developed on short-lived feature branches cut from Trunk. A developer will commit changes to their feature branch, and upon completion those changes are either directly merged into Trunk or reviewed and merged by another developer using a process such as a GitHub Pull Request. Automated tests are then executed on Trunk, testers manually verify the changes, and the new feature is released into production. When a production defect occurs it is fixed on a release branch cut from Trunk and merged back upon production release.

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Build Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Trunk and developers commit their changes to F1 and F2.

Organisation Antipattern - Build Feature Branching - 1

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Trunk and developers committing to F3 and F4. F2 is completed and merged into Trunk by a non-F2 developer following a code review, and once testing is signed off on Trunk + F2 it is released into production. The F1 branch grows to encompass a Computations refactoring, which briefly breaks the F1 branch.

Organisation Antipattern - Build Feature Branching - 2

A production defect is found in F2, so a F2.1 fix for Write Offs is made on a release branch cut from Trunk + F2 and merged back when the fix is in production. F3 is deemed complete and merged into Trunk + F2 + F2.1 by a non-F3 developer, and after testing it is released into production. The F1 branch grows further as the Computations refactoring increases in scope, and the F4 branch is temporarily broken by an architectural change to the submissions system for Accounting Periods.

Organisation Antipattern - Build Feature Branching - 3

When F1 is completed the amount of modified code means a lengthy code review by a non-F1 developer and some rework are required before F1 can be merged into Trunk + F2 + F2.1 + F3, after which it is successfully tested and released into production. The architectural changes made in F4 also mean a time-consuming code review and merge into Trunk + F2 + F2.1 + F3 + F1 by a non-F4 developer, and after testing F4 goes into production. However, a production defect is then found in F4, and a F4.1 fix for Accounting Periods is made on a release branch and merged into Trunk + F2 + F2.1 + F3 + F1 + F4 once the defect is resolved.

Organisation Antipattern - Build Feature Branching - 4

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The emphasis upon short-lived feature branches reduces merge complexity into Trunk, and the use of code reviews lowers the probability of Trunk build failures. However, the F1 and F4 feature branches grow unchecked until they both require a complex, risky merge into Trunk.

The Company Accounts Service team might have used Promiscuous Integration to reduce the complexity of merging each feature branch into Trunk, but that does not prevent the same code deviating on different branches. For example, integrating F2 and F3 into F1 and F4 would simplify merging F1 and F4 into Trunk later on, but it would not restrain F1 and F4 from generating Semantic Conflicts if they both modified the same code.

Organisation Antipattern - Build Feature Branching - 4 Promiscuous Merge

This example shows how Build Feature Branching typically inserts a costly integration phase into software delivery. Short-lived feature branches with Promiscuous Integration should ensure minimal integration costs, but the reality is feature branch duration is limited only by developer discipline – and even with the best of intentions that discipline is all too easily lost. A feature branch might be intended to last only for a day, but all too often it will grow to include bug fixes, usability tweaks, and/or refactorings until it has lasted longer than expected and requires a complex merge into Trunk. This is why Build Feature Branching is normally incompatible with Continuous Integration, which requires every team member to integrate and test their changes on Trunk on at least a daily basis. It is highly unlikely every member of a Build Feature Branching team will merge to Trunk daily as it is too easy to go astray, and while using a build server to continuously verify branch integrity is a good step it does not equate to shared feedback on the whole system.

Build Feature Branching advocates that the developer of a feature branch should have their changes reviewed and merged into Trunk by another developer, and this process is well-managed by tools such as GitHub Pull Requests. However, each code review represents a handover period full of opportunities for delay – the developer might wait for reviewer availability, the reviewer might wait for developer context, the developer might wait for reviewer feedback, and/or the reviewer might wait for developer rework. As Allan Kelly has remarked “code reviews lose their efficacy when they are not conducted promptly“, and when a code review is slow the feature branch grows stale and Trunk merge complexity increases. A better technique to adopt would be Pair Programming, which is a form of continuous code review with minimal rework.

Asking developers working on orthogonal tasks to share responsibility for integrating a feature into Trunk dilutes responsibility. When one developer has authority for a feature branch and another is responsible for its Trunk merge both individuals will naturally feel less responsible for the overall outcome, and less motivated to obtain rapid feedback on the feature. It is for this reason Build Feature Branching often leads to what Jim Shore refers to as Asynchronous Integration, where the developer of a feature branch starts work on the next feature immediately after asking for a review, as opposed to waiting for a successful review and Trunk build. In the short-term Asynchronous Integration leads to more costly build failures, as the original developer must interrupt their new feature and context switch back to the old feature to resolve a Trunk build failure. In the long-term it results in a slower Trunk build, as a slow build is more tolerable when it is monitored asynchronously. Developers will resist running a full build locally, developers will then checkin less often, and builds will gradually slowdown until the entire team grinds to a halt. A better solution is for developers to adopt Synchronous Integration in spite of Build Feature Branching, and by waiting on Trunk builds they will be compelled to optimise it using techniques such as acceptance test parallelisation.

Build Feature Branching works well for open-source projects where a small team of experienced developers must integrate changes from a disparate group of contributors, and the need to mitigate different timezones and different levels of expertise outweighs the need for Continuous Integration. However, for commercial software development Build Feature Branching fits the Wikipedia definition of an antipattern – “a common response to a recurring problem that is usually ineffective and risks being highly counterproductive“. A small, experienced team practising Build Feature Branching could theoretically accomplish Continuous Integration given a well-structured architecture and a predictable flow of features, but it would be unusual. For the vast majority of co-located teams working on commercial software Build Feature Branching is a costly practice that discourages collaboration, inhibits refactoring, and by implicitly sacrificing Continuous Integration acts as a significant impediment to Continuous Delivery. As Paul Hammant has said, “you should not make branches for features regardless of how long they are going to take“.

Tags: , , ,

The Version Control Strategies series

  1. Organisation Antipattern – Release Feature Branching
  2. Organisation Pattern – Trunk Based Development
  3. Organisation Antipattern – Integration Feature Branching
  4. Organisation Antipattern – Build Feature Branching

Integration Feature Branching is overly-costly and unpredictable

Integration Feature Branching is a version control strategy where developers commit their changes to a shared remote branch of a source code repository prior to the shared trunk. Integration Feature Branching is applicable to both centralised Version Control Systems (VCS) and Distributed Version Control Systems (DVCS), with multiple variants of increasing complexity:

  • Type 1 – Integration branch and Trunk. This was originally used with VCSs such as Subversion and TFS
  • Type 2 – Feature branches, an Integration branch, and Trunk. This is used today with DVCSs such as Git and Mercurial
  • Type 3 – Feature release branches, feature branches, an Integration branch, and Trunk. This is advocated by Git Flow

In all Integration Feature Branching variants Trunk represents the latest production-ready state and Integration represents the latest completed changes ready for release. New features are developed on Integration (Type 1), or short-lived feature branches cut from Integration and merged back into Integration on completion (Types 2 and 3). When Integration contains a new feature it is merged into Trunk for release (Types 1 and 2), or a short-lived feature release branch cut from Integration and merged into Trunk and Integration on release (Type 3). When a production defect occurs it is fixed on a release branch cut from Trunk, then merged back to Integration (Types 1 and 2) or a feature release branch if one exists (Type 3).

Consider an organisation that provides an online Company Accounts Service, with its codebase maintained by a team practising Type 2 Integration Feature Branching. Initially two features are requested – F1 Computations and F2 Write Offs – so F1 and F2 feature branches are cut from Integration and developers commit their changes to F1 and F2.

Organisation Antipattern - Integration Feature Branching - Type 2 - 1

Two more features – F3 Bank Details and F4 Accounting Periods – then begin development, with F3 and F4 feature branches cut from Integration and developers committing to F3 and F4. F2 is completed and merged into Integration, and after testing it is merged into Trunk and regression tested before its production release. The F1 branch is briefly broken by a computations refactoring, with no impact on Integration.

Organisation Antipattern - Integration Feature Branching - Type 2 - 2

When F3 is completed it is merged into Integration + F2 and tested, but in the meantime a production defect is found in F2. A F2.1 fix is made on a F2.1 release branch cut from Trunk + F2, and after its release F2.1 is merged into and regression tested on both Integration + F2 + F3 and Trunk + F2. F3 is then merged into Trunk and regression tested, after which it is released into production. F1 continues development, and the F4 branch is temporarily broken by changes to the submissions system.

Organisation Antipattern - Integration Feature Branching - Type 2 - 3

When F1 is completed and merged into Integration + F2 + F3 + F2.1 it is ready for production release, but a business decision is made to release F4 first. F4 is completed and after being merged into and tested on both Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F3 + F2.1 it is released into production. Soon afterwards F1 is merged into and regression tested on Trunk + F2 + F2.1 + F3, then released into production. A production defect is found in F4, and a F4.1 fix is made on a release branch cut from Trunk + F2 + F2.1 + F3 + F4 + F1. Once F4.1 is released it is merged into and regression tested on both Integration + F2 + F3 + F2.1 + F1 + F4 and Trunk + F2 + F2.1 + F3 + F4 + F1.

Organisation Antipattern - Integration Feature Branching - Type 2 - 4

In this example F1, F2, F3, and F4 all enjoy uninterrupted development on their own feature branches. The use of an Integration branch reduces the complexity of each merge into Trunk, and allows the business stakeholders to re-schedule the F1 and F4 releases when circumstances change. However, the isolated development of F1, F2, F3, and F4 causes complex, time-consuming merges into Integration, and Trunk requires regression testing as it can differ from Integration – such as F4 being merged into Integration + F2 + F3 + F2.1 + F1 and Trunk + F2 + F2.1 + F3. The Company Accounts Service team might have used Promiscuous Integration on feature release to reduce the complexity of merging into Integration, but there would still be a need for regression testing on Trunk.

Organisation Antipattern - Integration Feature Branching - Type 2 - 4 Promiscuous

If the Company Accounts Service team used Type 3 Integration Feature Branching the use of feature release branches between Integration and Trunk could reduce the complexity of merging into Trunk, but regression testing would still be required on Trunk to garner confidence in a production release. Type 3 Integration Feature Branching also makes the version control strategy more convoluted for developers, as highlighted by Adam Ruka criticising Git Flow’s ability to “create more useless merge commits that make your history even less readable, and add significant complexity to the workflow“.

Organisation Antipattern - Integration Feature Branching - Type 3 - 4 Promiscuous

The above example shows how Integration Feature Branching adds a costly, unpredictable phase into software development for little gain. The use of an Integration branch in Type 1 creates wasteful activities such as Integration merges and Trunk regression testing, which insert per-feature variability into delivery schedules. The use of feature branches in Type 2 discourages collaborative design and refactoring, leading to a gradual deterioration in codebase quality. The use of feature release branches in Type 3 lengthens feedback loops, increasing rework and lead times when defects occur.

Integration Feature Branching is entirely incompatible with Continuous Integration. Continuous Integration requires every team member to integrate and test their code on Trunk at least once a day in order to minimise feedback loops, and Integration Feature Branching is the polar opposite of this. While Integration Feature Branching can involve commits to Integration on a daily basis and a build server constantly verifying both Integration and Trunk integrity, it is vastly inferior to continuously integrating changes into Trunk. As observed by Dave Farley, “you must have a single shared picture of the state of the system… there is no point having a separate integration branch“.

 

 

Tags: , , ,

A taxonomy of version control strategies for and against Continuous Integration
This series of articles describes a taxonomy for different types of Feature Branching – developers working on branches in isolation from trunk – and how Continuous Integration is impacted by Feature Branching variants.

  1. Organisation Antipattern: Release Feature Branching – the what, why, and how of long-lived feature branches
  2. Organisation Pattern: Trunk Based Development – the what, why, and how of trunk development
  3. Organisation Antipattern: Integration Feature Branching – the what, why, and how of long-lived integration branches
  4. Organisation Antipattern: Build Feature Branching – the what, why, and how of short-lived feature branches

Tags: , , , ,

The Version Control Strategies series

  1. Organisation Antipattern – Release Feature Branching
  2. Organisation Pattern – Trunk Based Development
  3. Organisation Antipattern – Integration Feature Branching
  4. Organisation Antipattern – Build Feature Branching

Release Feature Branching dramatically increases development costs and risk

Feature Branching is a version control practice in which developers commit their changes to a branch of a source code repository before merging to trunk at a later date. Popularised in the 1990s and 2000s by centralised Version Control Systems (VCS) such as ClearCase, Feature Branching has evolved over the years and is currently enjoying a resurgence in popularity thanks to Distributed Version Control Systems (DVCS) such as Git.

The traditional form of Feature Branching originally promoted by ClearCase et al might be called Release Feature Branching. The central branch known as trunk is considered a flawless representation of all previously released work, and new features for a particular release are developed on a long-lived branch. Developers commit changes to their branch, automated tests are executed, and testers manually verify the new features. Those features are then released into production from the branch, merged into trunk by the developers, and regression tested on trunk by the testers. The branch can then be earmarked for deletion and should only be used for production defect fixes.

Consider an organisation that provides an online company accounts service, with its codebase maintained by a team practicing Release Feature Branching. Two epics – E1 Corporation Tax and E2 Trading Losses – begin development on concurrent feature branches. The E1 branch is broken early on, but E2 is unaffected and carries on regardless.

In month 2, two more epics – E3 Statutory Accounts and E4 Participator Loans – begin. E3 is estimated to have a low impact but its branch is broken by a refactoring and work is rushed to meet the E3 deadline. Meanwhile the E4 branch is broken by a required architecture change and gradually stabilised.

In month 3, E3 is tested and released into production before being merged into trunk and regression tested. The E2 branch becomes broken so progress halts until it is fixed. The E1 branch is tested and released into production before the merge and regression testing of trunk + E3 + E1.

In month 4, E2 is tested and released into production but the subsequent merge and regression testing of trunk + E3 + E1 + E2 unexpectedly fails. While the E2 developers fix trunk E4 is tested and released, and once trunk is fixed the merge and regression testing of trunk + E3 + E1 + E2 + E4 is performed. Soon afterwards a critical defect is found in E4, so a E4.1 fix is also released.

At this point all 4 feature branches could theoretically be deleted, but Corporation Tax changes are requested for E1 on short notice and a trunk release is refused by management due to the perceived risk. The dormant E1 branch is resurrected so E1.1 can be released into production and merged into trunk. While the E1 merge was trunk + E3 the E1.1 merge is trunk + E3 + E2 + E4.1, resulting in a more complex merge and extensive regression testing.

In this example E1, E2, E3, and E4 enjoyed between 1 and 3 months of uninterrupted development, and E4 was even released into production while trunk was broken. However, each period of isolated development created a feedback delay on trunk integration, and this was worsened by the localisation of design activities such as the E3 refactoring and E4 architectural change. This ensured merging and regression testing each branch would be a painful, time-consuming process that prevented new features from being worked on – except E1.1, which created an even more costly and risky integration into trunk.

This situation could have been alleviated by the E1, E2, E3, and/or E4 developers directly merging the changes on other branches into their own branch prior to their production release and merge into trunk. For instance, in month 4 the E4 developers might have merged the latest E1 changes, the latest E2 changes, and the final E3 changes into the E4 branch prior to release.

Martin Fowler refers to this process of directly merging between branches as Promiscuous Integration, and promiscuously integrating E1, E2, and E3 into E4 would certainly have reduced the complexity of the eventual trunk + E3 + E1 + E2 + E4 merge. However, newer E1 and E2 changes could still introduce complexity into that merge, and regression testing E4 on trunk would still be necessary.

The above example shows how Release Feature Branching inserts an enormously costly and risky integration phase into software delivery. Developer time must be spent managing and merging feature branches into trunk, and with each branch delaying feedback for prolonged periods a complex merge process per branch is inevitable. Tester time must be spent regression testing trunk, and although some merge tools can automatically handle syntactic merge conflicts there remains potential for Semantic Conflicts and subtle errors between features originating from different branches. Promiscuous Integration between branches can reduce merge complexity, but it requires even more developer time devoted to branch management and the need for regression testing on trunk is unchanged.

Since the mid 2000s Release Feature Branching has become increasingly rare due to a greater awareness of its costs. Branching, merging, and regression testing are all non-value adding activities that reduce available time for feature development, and as branches diverge over time there will be a gradual decline in collaboration and codebase quality. This is why it is important to heed the advice of Dave Farley and Jez Humble that “you should never use long-lived, infrequently merged branches as the preferred means of managing the complexity of a large project“.

Tags: , , ,

« Older entries