Defining the Build Process

Posted by buildmeister on May 11th, 2006
Filed under:  deployment  release  build  process 
There are 3 comments on this article.

Introduction

Anyone who has worked on a software development project will be familiar with the term "build". Even for a complete software novice, it would not take much imagination to guess what the term refers to. Everyone has "built" something at some time in their life - maybe a Lego dinosaur, a dog kennel or, for the really adventurous, an entire house! Of course, what is being referred to in these three examples is the construction of something that has a tangible result. One of the main differences with a software build is that it doesn't necessarily have to result in a finished product. The build might fail, or the process to create it might not finish, yet in some ways such failures are seen as successes, as they have uncovered something that is wrong and can hopefully be fixed. Try applying that same concept to the construction industry: you've built the dog kennel but the dog can't fit in it, or you've built a house but forgotten the windows. In these circumstances you wouldn't get a second chance and would probably be advised to look for another job or take up another hobby pretty quickly. The obvious question is: why is building software so different?

Obviously, what we are dealing with here are the recognized limitations of the software engineering industry - software as art, as an evolving and moving target. In software development we don't necessarily have a complete blueprint, nor do we know all the answers up front. We would therefore like something to visualize - to demonstrate and talk about - as we try to find our way there. If we could develop software correctly the first time, then we would only ever need to create a single build, upon whose completion our job would be done. However, as we live in the real world and recognize our industry's limitations, we understand that we will never get things right the first time. We are dealing with people, potentially many of whom are working on the same project. They might have slightly different opinions about what the end result should be, and they might even make mistakes - they're human after all! What we need therefore are regular milestones or integration points, where we can bring together all the individual pieces of work that have been created so far. This will allow us to correct those mistakes and consolidate those opinions. In software development this is the real reason we create builds.

Build value

Ever-increasing pressures of time-to-market and cost reduction are leading organizations to shrink their release cycles and deliver functionality in smaller, more frequent increments. In order to achieve this, development teams are examining their build processes to see if they can make them quicker, leaner and more automated. Most software development projects consist of a number of developers working towards a common goal. They generally work on individual or collaborative tasks in order to reach this goal and then "integrate" them as and when they are completed. The larger the project and the more developers you have on it, the more you need some mechanism to pull these tasks together, to identify a common baseline for ongoing development and to demonstrate the progress that has been made so far. This is usually encapsulated in an Integration Build process, in which software development teams see significant value. However, for me a well defined build process can have value at a number of different levels:

  • To the individual developer - where the value is the confidence that their incremental changes can be incorporated into the team environment.
  • To the team - where the value is being able to uncover integration issues between different developer changes.
  • For the project and organization as a whole - where value is the demonstration of executable progress and the resultant metrics.

The build process can be the heartbeat of a successful project. I believe most people do not spend enough time considering how often they should build, what they should build and what the overall functions of the build process should be. Proactively assessing this is what I call finding your "project rhythm", which I discuss in detail in Architecting the Build Process.

A well defined build process can also be critical in ensuring that certain "well-known" software development challenges can be met. These challenges are often apparent first at the business level, and are expressed usually in terms similar to the following:

  • Regulatory Compliance and Governance
  • Globalisation and Distributed Development
  • Quality and Time to market

However, at the technical build level such challenges can be expressed in more direct terms and can typically be met by addressing the following:

  • Traceability and Completeness - knowing throughout the complete software development lifecycle why you are doing what you are doing and that it contains all of what you intended.
    A build process can help with traceability by automatically capturing and reporting on the changes (new features, defects and so on) that have gone into the build. This information is critical for the build's consumers, for example the Quality Assurance team, who need to know which of the defects that they raised have been addressed in the build.
  • Repeatability and Reliability - being able to do the same thing over and over again and it being correct each time.
    A build process should snapshot everything at the moment the build is created, including source file versions, compiler settings and the operating system environment itself. This information is critical for being able to reproduce an environment for fixing defects after a product has been released.
  • Agility and Speed - having a build process in which changes can be integrated quickly or as and when needed and that completes in as short a time as possible.
    A build process should be quick to set up and execute so as to meet the needs of its users. It should be capable of being executed continually - maybe many times a day. This capability is critical for delivering hot-fixes quickly but also for projects practicing Continuous Integration, where developers are working on small incremental changes and committing them frequently. A typical "bad smell" in a project is when developers are frequently waiting unacceptable amounts of time for the output of the Integration Build.
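
As a rough illustration of the repeatability point above, the sketch below records what went into a build: a content hash for each source file, the compiler settings and some basic details of the operating system environment. It is a minimal sketch in Python and not a recommended tool. The function names (file_digest, snapshot_build, write_snapshot) and the layout of the record are assumptions made purely for illustration, and in a real project the source file versions would normally come from your version control system rather than from content hashes.

```python
import hashlib
import json
import platform
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot_build(source_dir: Path, compiler_settings: dict) -> dict:
    """Record the inputs of a build so that it can be reproduced later.

    The record layout is hypothetical; adapt it to your own tooling.
    """
    sources = {
        path.relative_to(source_dir).as_posix(): file_digest(path)
        for path in sorted(source_dir.rglob("*"))
        if path.is_file()
    }
    return {
        "sources": sources,                      # stand-in for source file versions
        "compiler_settings": compiler_settings,  # e.g. flags passed to the compiler
        "environment": {
            "os": platform.system(),
            "os_release": platform.release(),
            "machine": platform.machine(),
        },
    }


def write_snapshot(snapshot: dict, target: Path) -> None:
    """Store the snapshot as JSON, for example next to the build outputs."""
    target.write_text(json.dumps(snapshot, indent=2, sort_keys=True))
```

A snapshot written alongside each Integration or Release Build gives you something concrete to compare against when a defect in a released product has to be reproduced.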

I have made reference a number of times in this section to a "well defined" build process. In order to bring some clarity to what such a process could include, I will next describe typical build "profiles" and the "functions" that a build can carry out (e.g. compiling and unit testing). I will also discuss the "stages" of the typical lifecycle that a single build goes through.

Build profiles

Every software development project involves some form of build, from the one-man "helloworld.exe" to the 500-developer, multi-site missile control system. The exact form that the build takes will depend on a number of factors, such as the chosen development language, operating system environment or development methodology. However, there are generally three profiles of builds¹ that you might want to carry out, which are illustrated in the diagram below.

[Build Frequency]

In more detail, these three profiles can be defined as follows:

  • Private Build.
    A Private Build refers to a build that is created by a developer in his/her own workspace. This type of build is usually created for the purpose of checking the ongoing status of the developer’s changes, i.e., to assess whether his/her source code compiles.
  • Integration Build.
    An Integration Build is a build that is carried out by an assigned integrator or central function. This type of build can be carried out manually by a lead developer or a member of the build team, or alternatively via an automatically scheduled program or service. This build is created to assess the effect of integrating a set of changes across a development team.
  • Release Build.
    A Release Build is a build that is carried out by a central function, usually a member of the build team. This build is created with the express intention of being delivered to a customer - either internal or external. A Release Build is also usually created in an isolated and controlled environment.

One important point to note is that although there are three explicit profiles defined here, it does not mean that each type of build should be constructed in an entirely different manner. In fact, I recommend that the same set of build scripts be utilized, irrespective of by whom and for what reason the build is being carried out. In this way, errors in the build scripts are identified at an early stage - during development or integration - not when you are just about to make a release. Also, there might be variations on these patterns. For example, integration can occur at a number of distinct stages: you might have component integration, where you are integrating the code for a single component, or system integration, where you are integrating multiple components, some of which may already have been built.
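
To make the "same set of build scripts" recommendation a little more concrete, here is a minimal sketch in Python of a single build plan that serves all three profiles, with the profile only switching a few settings on or off. The names (Profile, ProfileSettings, plan_build), the settings and the step names are hypothetical, and the choice of which steps each profile enables is an assumption for illustration, not a rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Profile(Enum):
    PRIVATE = "private"
    INTEGRATION = "integration"
    RELEASE = "release"


@dataclass(frozen=True)
class ProfileSettings:
    clean_workspace: bool  # build from a fresh, controlled workspace?
    package: bool          # bundle the outputs so they are installable?
    tag_sources: bool      # baseline the sources that were built?


# Hypothetical settings; each project would choose its own.
SETTINGS = {
    Profile.PRIVATE: ProfileSettings(clean_workspace=False, package=False, tag_sources=False),
    Profile.INTEGRATION: ProfileSettings(clean_workspace=True, package=True, tag_sources=False),
    Profile.RELEASE: ProfileSettings(clean_workspace=True, package=True, tag_sources=True),
}


def plan_build(profile: Profile) -> List[str]:
    """Return the ordered steps for a profile, all taken from one shared plan."""
    settings = SETTINGS[profile]
    steps: List[str] = []
    if settings.clean_workspace:
        steps.append("create clean workspace")
    steps += ["compile", "unit test"]  # core steps shared by every profile
    if settings.package:
        steps.append("package")
    if settings.tag_sources:
        steps.append("tag sources")
    return steps
```

Because the compile and unit test steps are shared, a mistake in them shows up in a developer's Private Build long before anyone attempts a Release Build.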

Build functions

When most people think about the build process they usually think about compilation. However, a build should be seen as an end-to-end process. As an analogy, let us refer back to our previous example of house construction. If you were constructing a house you would not say you were finished when the bricks, timber and masonry had gone up. You would also have to fit the electrics and plumbing, decorate the house and have it surveyed and safety tested so that it was of merchantable quality. These are all distinct functions in their own right. In a similar way the software build process can be broken down into the following functions:

  • Version Control
    The Version Control function carries out activities such as workspace creation and update, tagging or baselining, and version reporting. It creates an environment for the build process to run in and captures metadata about the inputs and outputs of the build process to ensure repeatability and reliability.
  • Static Analysis
    The Static Analysis function is used to check that all developers have adhered to basic coding standards and that language specific best practices have been implemented. The use of Static Analysis as part of the build process is a good way of ensuring that any code is changeable and understandable by all members of the development team.
  • Compilation
    The Compilation function turns source files into directly executable or intermediate objects. Not every project necessarily compiles code - some scripted languages can be executed directly - but the majority of projects still do. The important thing to note here is that Compilation is not the build process, just part of it.
  • Unit Testing
    The Unit Testing function is the first quality gate for the build. It is used to assess whether developers' changes work together at the code unit level. A build which passes all of its unit tests (if the tests are comprehensive and well written) has a good chance of being a quality build.
  • Data Processing
    The Data Processing function creates, parses or transforms data files into outputs. It is included here because not every build consists purely of code and tests. In some industries the compilation process is a very small part of the overall build process. For example, the multimedia games industry has time-consuming build processes that include the generation of three-dimensional graphics from models.
  • Packaging
    The Packaging function takes the outputs of the build and bundles them together so as to be complete and installable. These packages are sometimes referred to as distribution archives. In technical terms Packaging might mean bundling up a set of Java classes and libraries into an archive (a JAR or EAR file) ready to be installed onto a server. It doesn’t mean copying the outputs of a build onto a CD-ROM or DVD and casually passing it around!
  • Link Testing
    The Link Testing or Functional Integration Testing function is a secondary quality gate and is the execution of a small core subset of functional tests - usually against a deployed application. It is executed to give the Quality Assurance team confidence that the build is suitable for further testing.
  • Code Coverage
    The Code Coverage function is typically executed as a "side effect" of the Unit Testing and/or Link Testing function. It allows the project to assess how much of the total code base has been exercised and which areas of an application additional testing should concentrate on.
  • Deployment
    The Deployment function transitions the build to its runtime environment. Builds are usually only automatically transitioned to immediately related environments, e.g. Integration or System Test. Production environments are sometimes deployed to automatically as part of a secure, controlled Release Build process, but most of the time this type of deployment is carried out manually, as a process in its own right, by members of a separate Operations team.

This is neither an exhaustive nor a prescriptive list. However, it illustrates that a well defined build process includes mechanistic (Compilation), quality (Static Analysis, Unit Testing, Link Testing) and publishing (Packaging, Deployment) functions.
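
One way to picture the functions above is as an ordered list of steps, where a failing quality gate prevents the later functions from running. The following is a minimal sketch in Python under that assumption; run_build and the lambda placeholders are hypothetical and simply stand in for whatever tools your project actually invokes.

```python
from typing import Callable, List, Tuple

# A build function returns True on success and False on failure.
BuildFunction = Callable[[], bool]


def run_build(functions: List[Tuple[str, BuildFunction]]) -> List[Tuple[str, bool]]:
    """Run the build functions in order, stopping at the first failure."""
    results: List[Tuple[str, bool]] = []
    for name, function in functions:
        succeeded = function()
        results.append((name, succeeded))
        if not succeeded:
            break  # a failed gate stops the later functions from running
    return results


if __name__ == "__main__":
    # Placeholder functions; a real build would call out to actual tools here.
    pipeline = [
        ("static analysis", lambda: True),
        ("compilation", lambda: True),
        ("unit testing", lambda: False),  # simulate a failing quality gate
        ("packaging", lambda: True),
    ]
    for name, succeeded in run_build(pipeline):
        print(f"{name}: {'ok' if succeeded else 'FAILED'}")
```

Keeping each function behind the same simple interface makes it easy to add, remove or reorder functions as a project's needs change.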

Build stages

All builds flow through a similar generic lifecycle: you identify what you want to build, how you are going to build it (if it is different from last time), you execute the build and then finally you examine its results. This lifecycle is illustrated in the diagram below.

[Build Lifecycle]

Note that this lifecycle is iterative in nature, because with software we will definitely be building many times. There are four basic stages in this lifecycle, which can be defined in more detail as follows:

  • Build Identification – what to build.
    Build Identification can be either an informal process (just build the latest changes) or a formal one (only build agreed upon and complete changes). During initial coding and unit testing, when Private Builds or Integration Builds are being created, Identification is usually informal. During the later stages, or when a Release Build is to be carried out, Identification is much more formal: developers can usually only put in changes that have been agreed upon by project management.
  • Build Definition – how to build it.
    Build Definition is where build scripts (makefiles, Ant build files or other configuration scripts) are created. These scripts define how the different parts of the application should be compiled and/or linked together in order to produce a complete system or application. The scripts may also automate other parts of the process, e.g. database configuration or installation. The Definition stage is where you define the guts of any automation that is to be carried out.
  • Build Execution – invocation and running of the build.
    Build Execution is where the build scripts created in the Build Definition stage are executed. Execution can be carried out in a number of ways: by direct user invocation (from the good old command line or by clicking a button in an IDE) or automatically on a scheduled basis. To increase the speed of the Build Execution stage, two techniques can be used: Build Distribution, which distributes the build across a number of processors or machines, and Build Avoidance, which re-builds only those parts of the system that have changed since the last time the build was carried out.
  • Build Reporting – the results of the build.
    Build Reporting is where the results of the build (success or failure) are reported. There are different types and levels of reporting: basic compilation reports, unit test reports, release reports and so on. The mechanism via which reports are generated can be either manual or automatic. For example, automatic notification of build success or failure can be sent via email or instant messaging and the results published to a web site. Alternatively, reports can be generated on demand as inputs to reviews or status meetings. Build Auditing can also be considered part of this stage; this is where, as a direct result of the Build Execution stage, the versions of all the source files used, the compilation settings and the environment are automatically recorded.
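
The Build Avoidance technique mentioned under Build Execution can be sketched in a few lines. Tools such as make decide what to re-build by comparing file modification times; the Python sketch below instead compares content hashes against those recorded after the previous build, which is an assumption chosen to keep the example short. The function names (digest, record_inputs, needs_rebuild) are hypothetical, and the sketch deliberately ignores dependencies between files, which any real build tool has to track.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_inputs(sources: Iterable[Path]) -> Dict[str, str]:
    """Map each source path to its digest, to be saved after a successful build."""
    return {str(path): digest(path) for path in sources}


def needs_rebuild(sources: Iterable[Path], previous: Dict[str, str]) -> List[Path]:
    """Return the sources that are new or have changed since the recorded build."""
    return [path for path in sources if previous.get(str(path)) != digest(path)]
```

After each successful build you would save the result of record_inputs, and before the next build pass it to needs_rebuild to find the files that actually need attention.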

Note that at the core of this lifecycle is the Build Environment - the tools that will be used to implement the build process and their configuration. Also, at some point you will break out of the normal build loop, where you are really creating builds for internal use, and create a Release Build for your customer. This typically involves following the same stages but applying more thought and control to identifying the inputs, outputs and quality of the build in order to assess whether it is suitable for release.

Build roles

The outputs of your build process will often form the basis of any handover and/or communication between different users on your team. Many different "types" of users will be affected by your build process: from Developers, to Build Engineers, to Testers and also your Operations team. However there is a core set of user groups that will be more directly involved in the build process and that can be defined as follows:

  • Developers - the creators of source code, unit tests and supporting artefacts that are "built" as part of the build process.
  • Build Engineers - the creators of the build process itself (via tools and/or scripting), and who are responsible for either automating the execution of the build process or for executing it themselves directly.
  • Deployment Engineers - the creators of scripts to deploy or transition the output of the build process to its runtime environment. In large corporate organizations Deployment Engineers are often in a completely different group (IT Operations) from the IT Development team.

Note that these roles might have slightly different names in your own organization and that a single person could fulfill more than one role - the important thing is that the roles are carried out. An example of how these different roles can interact is illustrated in the diagram below:

[Build Roles]

Again, note that this interactive process might be slightly different in your own team. For example, in some teams the Build Engineer role is completely automated and the scripts to achieve it are updated by Developers as and when needed. In other teams - especially on large-scale development projects - there might be a specific Build or Release Engineering team who are solely responsible for defining and executing Integration or Release Builds across many different projects. In my opinion, however, this latter "ivory tower" approach to build process definition and execution is becoming less common, as I have seen the definition of the build process (if not its execution) moving back to being the responsibility of Development - especially for projects involved in Agile development and Continuous Integration.

Build infrastructure

In order to execute Integration or Release Builds you should use dedicated and controlled servers. Although I have seen some small projects develop, build and release from a developer's workstation, this is not a repeatable or auditable process and should be avoided. The typical components of a standard build infrastructure are as follows:

  • Developer Desktop - a workstation on which developers implement changes and (typically) conduct their Private Builds
  • SCM Server - a server on which code repositories are held and to which developers commit their changes so that they are included in Integration or Release Builds
  • Build Server - a server on which Integration or Release Builds are conducted either manually or using an automation tool
  • Build Test Server - a server to which the outputs of the Integration or Release Builds are deployed in order to assess the quality of the build and its suitability for further testing (see the Link Testing function above)

Note that some development teams conduct all of their builds on shared servers and not on Developer Desktops. In general, Developer Desktop builds are conducted mainly when some form of Integrated Development Environment (IDE) is being used, as IDEs typically have their own internal build function. An example of how these different components are related is illustrated in the diagram below:

[Build Infrastructure]

Note that the number of physical servers required will depend on the size, complexity and nature of your development environment. For example, if you are developing a product for many different platforms, e.g. Windows, Linux and Unix variants, then you would probably implement a different Build Server variant for each platform. Similarly, if your build process takes a significant amount of time to execute then you would probably implement a Build Server Farm consisting of a number of servers across which the build could be distributed. As part of your build process you might also deploy to servers other than the initial Build Test Server, for example a System Test Application Server and a System Test Database Server.

Summary

To summarize, the build process is all about ensuring that builds can be reliably constructed, their content controlled and their outputs deployed to test or run-time environments. A well defined build process can contribute significantly to the success of a project - the value it can bring in terms of visibility and incremental, demonstrable progress should not be underestimated. By breaking down the build process in terms of individual functions and understanding the lifecycle of a typical build you can start to understand more about the scope of the build process and what a typical framework would be. By referencing these concepts, you should hopefully have some idea of where to start when implementing a new build process, or what the next step would be in refining and improving an existing build process.


¹These three profiles were first discussed by Berczuk and Appleton in their book on Software Configuration Management Patterns and their additional paper with Konieczka on Build Management for an Agile Team.


Comments

Posted by Himanshi

Superb Article.. Thanks

Posted on September 24th, 2010
Posted by ragz

fabulous information

Posted on August 31st, 2010
Posted by Ricardo Aponte

Excelent information! We are in the process of developing build processes and this is right what we needed. Thank you!

Posted on May 25th, 2010
