Practices of Enterprise Continuous Integration
What is Continuous Integration?
Continuous Integration is a hot topic in the software development world. Although often linked with Agile processes such as Extreme Programming, the concept and (in many cases) the individual practices of Continuous Integration are really nothing new. To start with a definition, Continuous Integration is, first and foremost, a process backed by a set of tools. Martin Fowler defines it as:
… a fully automated and reproducible build, including testing, that runs many times a day. This allows each developer to integrate daily thus reducing integration problems.
To illustrate that Continuous Integration has been around for a while, Grady Booch supplied a more generic definition in the second edition of his book, Object-Oriented Analysis and Design with Applications (from way back in 1993):
The macro process of object-oriented development is one of "continuous integration"... At regular intervals, the process of "continuous integration" yields executable releases that grow in functionality at every release... It is through these milestones that management can measure progress and quality, and hence anticipate, identify, and then actively attack risks on an ongoing basis.
To me Continuous Integration is as much about adopting a “state of mind” as it is about applying individual techniques. Anybody who has worked on a complex software development project will be familiar with the issues of integrating the changes made by multiple developers. The longer integration is delayed, the more potential there is for risk and failure in the integration process. Continuous Integration tries to mitigate this risk by frequently integrating small changes into an evolving code base. On complex projects, there is always a temptation for developers to work in an isolated workspace. Only after a period of time (many days or weeks) will they attempt to integrate their changes back in - a process which can itself often take many days! It is often the fear of integration failure that motivates people to stay isolated. The irony, however, is that eventually more work might have to be carried out to integrate large, complex changes than if small changes had been integrated more frequently.
The desired result of Continuous Integration is therefore to expose integration problems as early as possible. It also aims to have a built, tested, and potentially releasable build at every stage of a project. Finally, Continuous Integration (if implemented successfully) is as much about continuous testing (and deployment) as it is about code development. After all, how can you introduce changes frequently without knowing that related system functionality (both directly and indirectly related) is still intact? I therefore find it hard to separate the implementation of Continuous Integration from that of Test Driven Development and would always recommend that both be adopted at the same time.
Continuous Integration Practices
There is more to Continuous Integration than at first appears - many people assume that "building often" is all that there is to it. However, having a regular, automated build is just one practice, and it really needs to be implemented together with other related practices to be successful. So, what exactly is Continuous Integration and how do you know if you are implementing it? To try to answer this question I have defined a set of common practices which organizations that have successfully adopted Continuous Integration typically implement. This is not an exhaustive list but should go some way towards helping you judge how "CI" you are.
1. Single Repository.
A fundamental pre-requisite for Continuous Integration is for version control to have been implemented on a project. All code and related assets (build scripts, test scripts and so on) should be versioned in a single version control repository which all developers have access to. The repository acts as the master version of the truth, and nothing should be deployed or tested which has not originated from this repository. There are many version control tools available, both open source and commercial, and you do not have to use the most feature-rich or expensive one. A typical minimum list of requirements for a version control tool to support the Continuous Integration process is listed here.
2. Active Development Line.
In many version control tools, branches can be created easily and cheaply. There is therefore often a temptation to define a more complex branching strategy than is strictly necessary. However, if developers are isolated on a branch, problematic or complex merges can occur. Projects implementing Continuous Integration should therefore create a specific branch to act as the Active Development Line, into which the developers commit and integrate their changes. Other branches might be needed at specific times (for example a Patch/Maintenance Line - for fixing bugs in delivered products - or a Release Line - for stabilizing a release prior to production deployment). For more information on Agile branching to support Continuous Integration see here.
3. Integrate and Test before Commit.
Since all developer changes will be delivered to the Active Development Line, any integration errors will fail the build and quickly propagate to all developers. Therefore, before committing their changes back to the repository, all developers should update their workspace with the latest changes in the repository (integrating them if necessary). Some version control tools such as Serena Dimensions CM can enforce this process. Developers should also execute a subset of unit tests (at least those that are related to the change) to ensure that there are no functional side effects of integrating the change. This is usually a manual process, although there are now a number of tools, such as OpenMake Meister, that can help automate it by enforcing pre-commit builds.
4. Incremental Commit.
Introducing errors into the repository should be avoided at all costs; however, this does not mean that all of the changes integrated need to be "functionally complete". With a successful implementation of Test Driven Development (TDD) there is no reason why “feature incomplete” code cannot be committed to the repository. Why is this the case? If (as TDD prescribes) unit tests are written first and code is incrementally refactored to completion, it is then acceptable (and desirable) to “save” developer changes at appropriate points by committing them to the repository. This makes all changes visible, available for integration, and testable much earlier. Since there is potential for this practice to introduce errors, it should only be practiced by teams experienced in Continuous Integration.
5. Automated Build.
An automated build process is a key factor in successful Continuous Integration. The Active Development Line and Incremental Commit practices can only work successfully if automated builds and tests are executed frequently to validate incremental changes. To support Continuous Integration, the build process typically monitors the Active Development Line for commits (usually at 10-20 minute intervals). If changes are found, an integration build is executed automatically (usually after a grace period in case changes are mid-commit) and the tests (see below) are run. There are a large number of build control frameworks that support such a Continuous Integration schedule, including CruiseControl and Hudson.
6. Fast Builds.
Since developers are encouraged to commit frequently, builds need to execute as quickly as possible to give them the feedback they need - a rule of thumb for Continuous Integration builds is no more than 10 minutes. Building any reasonably complex software application, however, can take time, from several minutes to several hours. Since developers on Agile projects will be delivering and integrating small changes frequently, they obviously cannot wait for a two-hour build to complete before getting any feedback. To avoid this situation, Agile projects typically "stage" and re-use pre-built binaries, and only rebuild the whole system when necessary (for example, nightly or weekly). This can effectively create a build pipeline where the output of one build process can trigger the start of another.
7. Automated Deployment.
When it comes to releasing your application, try to release all of your files rather than individual sets of files. Most Agile projects prefer to deploy the whole or composite application each time rather than just the individual files that have changed. Although this is not a hard and fast rule, releasing the whole application in the same way each time tends to make the process more repeatable and prevents problems with missing files. This practice is more important for Agile projects practicing Continuous Integration because the aim is to produce executable, testable, and releasable applications to be deployed at the end of every sprint or iteration.
8. Automated Testing.
As well as compiling code, the automated build process should also validate and test it: check that it conforms to pre-defined coding conventions (static analysis), and execute code level tests (unit testing) and component level tests (integration testing). Continuous Integration is often intricately linked with the practice of Test Driven Development. This is because developers need to implement unit tests for all aspects of their code to validate not only that the build has been compiled, but also that it conforms to some minimum level of functional quality. Testing web or container based applications is often problematic; in such cases the Continuous Integration process should therefore carry out Automated Deployment to an integration environment before running the automated functional tests. There are a number of tools that can help with this, such as Canoo WebTest or Selenium.
9. Status Notification.
Since Continuous Integration is about automating the incorporation and integration of changes on a continual basis, it is important that any failures are flagged as quickly as possible. If a developer has “broken” the build they should be notified immediately, as a broken Active Development Line will prevent other developers from committing back to it and will delay development. As well as notification of failure, it is also desirable for some subsets of users, particularly build engineers and testers, to be informed of successful builds. Build and test success/failure rates should also be collected, as they can serve as a useful metric.
10. Feature Drops.
No matter how much testing is carried out by development, at some stage you will need to involve professional or business testers for User Acceptance Testing (UAT) - developers do not usually make good testers and usually have limited understanding of the business domain! If Continuous Integration has been implemented correctly, the amount of UAT can often be substantially reduced; however, it is still necessary. In order to carry out a planned test run, testers often need a more controlled input - not just the latest Continuous Integration build. Such a deliverable is often called a Feature Drop and is usually the end of iteration/sprint Continuous Integration build together with some basic documentation on the features and defects contained or addressed within it.
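To make the Automated Build practice (practice 5) a little more concrete, here is a minimal Python sketch of the decision a polling build server has to make each time it checks the Active Development Line: build only if there is something new, and only once the grace period since the last commit has passed. The names LineState and should_build and the 60-second default are my own assumptions for illustration; they are not taken from CruiseControl, Hudson or any other tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineState:
    """Snapshot of the Active Development Line as seen by one poll (hypothetical type)."""
    head_revision: str                  # latest revision committed to the line
    last_commit_time: float             # time of that commit, in seconds since the epoch
    last_built_revision: Optional[str]  # revision used by the previous build, if any


def should_build(state: LineState, now: float, quiet_period: float = 60.0) -> bool:
    """Decide whether this poll should start an integration build.

    The 60 second default grace period is an assumed value, not a recommendation.
    """
    # Nothing new has been committed since the last build.
    if state.head_revision == state.last_built_revision:
        return False
    # Wait until the line has been quiet for the whole grace period,
    # in case a developer is still part way through a set of commits.
    return now - state.last_commit_time >= quiet_period
```

A scheduler would call should_build at each polling interval, passing the current time, and start the build and test run whenever it returns True.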
Implementing Continuous Integration
At first, Continuous Integration will not come naturally; it is a skill that needs to be learned and practiced. Setting up an automated build process is relatively straightforward; however, gaining the discipline to aggressively write tests and commit incremental changes will take time and practice. I would recommend making sure you have your code and versions under control first (Single Repository and Active Development Line), then automating the build and any tests you already have, together with notification of their success or failure (Automated Build, Automated Testing and Status Notification). Once these practices have been implemented, developers can usually see the benefits and start changing their habits to suit Continuous Integration. Some coaching will still be required on how best to commit and test, but from then on it is usually just a matter of refinement and implementation of the other practices.
On large projects, there can often be multiple Continuous Integration "streams" - one for each system. In such a scenario more thought needs to be given to how best to implement dependencies and create the overall build pipeline. For really complex dependencies, commercial tools which are better at build dependency management are often worth investigating at this stage. Remember, however, that the tools are only as good as the effort you put into ensuring that Continuous Integration is practiced well and successfully.
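As a rough illustration of what "implementing dependencies" between streams involves, the Python sketch below orders streams so that each one builds after the streams it depends on, and works out which streams need rebuilding when one of them changes. It uses graphlib from the standard library (Python 3.9 or later); the stream names and the shape of the dependency map are hypothetical, and a real build tool will have its own way of declaring them.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the streams ordered so that each comes after everything it depends on.

    `dependencies` maps a stream name to the set of streams it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the streams that form the cycle.
        raise ValueError(f"circular build dependency: {exc.args[1]}") from exc


def streams_to_rebuild(changed: str, dependencies: dict[str, set[str]]) -> list[str]:
    """Return the changed stream plus everything downstream of it, in build order."""
    order = build_order(dependencies)
    affected = {changed}
    # Walking in build order means a stream's dependencies are classified before
    # the stream itself, so a change propagates through indirect dependents too.
    for stream in order:
        if dependencies.get(stream, set()) & affected:
            affected.add(stream)
    return [stream for stream in order if stream in affected]


# Hypothetical streams: "web" needs "services", which needs "common".
example = {
    "common": set(),
    "services": {"common"},
    "web": {"services"},
    "reports": {"common"},
}
```

With this map, a change to "services" only triggers "services" and then "web", whereas a change to "common" triggers all four streams. A stream name that does not appear in the map at all simply yields an empty list.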
References
- Continuous Integration (according to Martin Fowler)
- Continuous Integration: Improving Software Quality and Reducing Risks.
- Continuous Integration Server: Feature Matrix
Comments
Nice article, we struggle with automating deployment especially when there are so many variations of platforms, environments and targets. We have started to build up a library of scripts to for example, start/stop app servers, install, undeploy, load database ... but it is taking a long time to build up this set. Although they are unique to some degree there must be some form of ready-built stuff for this that is pluggable with workflow...
