Agile Database Change Management with dbdeploy
Filed under: database agile java tools
There are 0 comments on this article.
Agile Database Change Management
Most software applications utilize some form of database technology to persist information; with the instructions or commands required to create the database held in SQL scripts. Sometimes these are monolithic scripts containing the complete set of SQL commands to apply or more likely they are a sequence of individual scripts that represent discrete functional areas of the database. The main challenge with database scripting is not coding the individual SQL commands, rather it is in establishing an effective process to apply them. Although there might be one single production (or customer releasable) database instance, there may also be separate test and development instances. Each of these instances will be at a different levels as development and testing progresses and when a change is made in development (to add some new functionality or to fix a problem) considerable care needs to taken to ensure that the complete set of incremental changes is applied between instances. Consequently, making database changes is often seen as a straightforward process however the impacts of not managing the change correctly can conversely be catastrophic.
Agile Database Change Management is set of lightweight but effective processes, backed up by supporting tools, that enables database developers to successfully manage database changes between environments. There are a number of practices in Agile Database Change Management, but the fundamentally important ones can be summarized as follows:
- Implement development sandboxes
Create an individual database instance per developer, together with a development integration, test and production instances that model your deployment environment. - Maintain a database change log
Record the cumulative set of database changes that have been applied to each instance. - Create database delta scripts
Create a script for each individual change, one per DDL (data definition language) statement (i.e. CREATE, ALTER etc) - Use a naming convention for your scripts
Name your scripts according to the sequence that they are to be applied, and the reason why they are being applied. - Version control your scripts
Treat your database scripts as you would your source code. Use your version control repository as your master reference and ongoing development - do not pull database scripts directly from a database instance.
Most of these practices are described in more detail on Scott Ambler's excellent website www.agiledata.org and in various books including Agile Database Techniques, Refactoring Databases and Patterns of Enterprise Application Architecture. If you want to know more about the overall practices of Agile database refactoring (a superset of Agile Database Change Management) then I encourage you to make use of these references.
There are a number of tools, both commercial and open source that you can use to help you implement Agile Database Change Management. In the remainder of this article I will be describing how you can use dbdeploy in combination with Apache Ant to implement Agile Database Change Management on a Java project.
Introduction to dbdeploy
There are a number of versions of dbdeploy depending of your implementation environment. There are versions for Java, .NET and PHP; however even if you are not using any of these languages you can still use dbdeploy as a standalone tool from the command line. If this is the case then I would recommend the use of the Java version as this is the most stable and frequently updated. At a high level, dbdeploy allows you to to deploy "changesets" to database instances and manage the different change numbers for you so that you can successfully deploy or undo incremental changes. It does not directly update your database (at least not at the moment), rather it creates the update script for you to apply. This script can be executed immediately after creation if you wish using your database command line or if you are using Ant, its built in <sql> task. To allow dbdeploy to "understand" the change numbers that have been deployed to each instance you add a changelog table to each of your instances. There are a number of scripts supplied with dbdeploy for its supported databases (currently Oracle, Sybase, SQL Server, Hypersonic SQL, MySQL and HSQLDB). For reference, the SQL for the changelog table for the open source Java database HSQLDB we will be using in this article looks as follows:
CREATE TABLE changelog ( change_number INTEGER NOT NULL, delta_set VARCHAR(10) NOT NULL, complete_dt TIMESTAMP NOT NULL, applied_by VARCHAR(100) NOT NULL, description VARCHAR(500) NOT NULL ); ALTER TABLE changelog ADD CONSTRAINT Pkchangelog PRIMARY KEY (change_number, delta_set)
In order to apply changes to your database instances, dbdeploy requires that you create "delta" scripts for each of the changes that you are going to make. An example directory structure to contain these delta scripts is illustrated in the diagram below (with the scripts being contained in the db/delta directory):
It is good practice to keep each "delta" script as simple as possible, with each containing a single DDL statement (i.e. CREATE, ALTER etc). This allows individual changes to more clearly deployed and undeployed if necessary. Each of these scripts should start with a number indicating the order of deployment and a descriptive name. The content of each of the scripts should then be created in the following format:
... SQL creation statements --//@UNDO ... SQL rollback statements --//
As an example a script to create a "users" table would look like the following:
ALTER TABLE users ( username varchar(32) NOT NULL, password varchar(32) default NULL, firstname varchar(100) NOT NULL, lastname varchar(100) NOT NULL, email varchar(50) default NULL, PRIMARY KEY (username) ); --//@UNDO DROP TABLE IF EXISTS users; --//
You must use the "--//" format to delimit deploy and undo actions but other than that the script is pure SQL. With a set of such files in place you can then call dbdeploy to create a single update script to bring a database instance up to date. In the next section I will describe how to do this with a working example using Apache Ant.
Getting started with dbdeploy and Apache Ant
In this section, I will be using the reference application HappyBank. This is an online banking application which uses a database to persist customers, customer accounts and transactions. The application uses the open source Java database HSQLDB as this does not require a complex runtime installation. You can download a copy of HappyBank here. Extract the zip file to a directory and then navigate to the db directory - this will look the same as in the diagram above. Inside this directory you will see there is an sql\delta directory for the individual change scripts as well as the sql\testdata directory which contains the scripts for some sample data. Note that the sample data could also be managed with dbdeploy but would potentially meant you had to spend a long time creating undo scripts!
To get dbdeploy working you will need to download the dbdeploy distribution from here and place the included dbdeploy-ant-*.jar file in your classpath. Then you would create and call a <dbdeploy> task from Ant as follows:
<taskdef name="dbdeploy" classname="com.dbdeploy.AntTarget">
<classpath>
<fileset dir="lib">
<include name="dbdeploy-ant-*.jar"/>
<include name="hsqldb*.jar"/>
</fileset>
</classpath>
</taskdef>
<dbdeploy driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
dir="${dir.sql}\deltas"
outputfile="${dir.build}\all-deltas.sql"
dbms="${name.dbms}"
lastChangeToApply="5"
deltaSet="DevSchema"
undoOutputfile="${dir.build}\undo-all-deltas.sql" />
As you can see you need to provide standard JDBC database connection information for the database instance and the location of the delta scripts are specified via the dir attribute. dbdeploy will then create both a single database update and undo script, in this case all-deltas.sql and undo-all-deltas.sql. You can also specify the change numbers to apply via the lastChangeToApply attribute, i.e. if you have 100 delta scripts but only want to apply the changes to 90th, you would specify 90 here. If you leave this attribute out it will use all of the delta scripts in the delta directory. Finally, there is also the capability to specify a delta set name. This is useful if you want to separate your DDL (data definition language) statements from your DML (data manipulation language) statements, i.e. your test data. By default the deltaSet is called "Main".
OK, so let us see how we can use dbdeploy in a working example. Ensure that you have a working copy of Ant in your path and then navigate back to the toplevel db directory. First we will need to start up the HSQLDB database, which you can do using the following target:
Then to initialise the database you can simply execute the following bootstrap command:
which will produce output similar to the following:
Buildfile: build.xml
dbdeploy.bootstrap:
[sql] Executing resource: C:\temp\happybank\db\sql\createSchemaVersionTable.hsql.sql
[sql] 2 of 2 SQL statements executed successfully
dbdeploy.gen:
[dbdeploy] dbdeploy 3.0M1
[dbdeploy] Reading change scripts from directory C:\Temp\happybank\db\sql\deltas...
[dbdeploy] Changes currently applied to database:
[dbdeploy] (none)
[dbdeploy] Scripts available:
[dbdeploy] 1..6
[dbdeploy] To be applied:
[dbdeploy] 1..6
[dbdeploy] Generating undo scripts...
dbdeploy.update:
[sql] Executing resource: C:\Temp\happybank\db\build\all-deltas.sql
[sql] 22 of 22 SQL statements executed successfully
db.reset:
[sql] Executing resource: C:\Temp\happybank\db\sql\testdata\empty-all-tables.sql
[sql] Executing resource: C:\Temp\happybank\db\sql\testdata\customers.sql
[sql] Executing resource: C:\Temp\happybank\db\sql\testdata\accounts.sql
[sql] Executing resource: C:\Temp\happybank\db\sql\testdata\custaccts.sql
[sql] Executing resource: C:\Temp\happybank\db\sql\testdata\transactions.sql
[sql] 39 of 39 SQL statements executed successfully
bootstrap:
BUILD SUCCESSFUL
Total time: 3 seconds
Once this is complete you will then have a populated initial database.
There are a number of things going on behind the scenes in this example, so let's break them down so you understand what is going on.
Creation of a properties file
A single Ant properties file is used throughout the example to hold database and dbdeploy configuration parameters. The relevant properties are included below:
# hsqldb properties name.db = HappyBankDB name.dbms = hsql name.db.driver = org.hsqldb.jdbcDriver name.db.connection = jdbc:hsqldb:hsql://localhost/xdb name.db.user = sa name.db.password = # dbdeploy properties name.dbdeploy.schema = createSchemaVersionTable.hsql.sql
If you were using a different database provider, you would change these parameters accordingly. To make use of these properties you would include them in your build script via the usual Ant <property> task as follows:
These properties will then be used throughout the Ant tasks in the remainder of the build script.
Install the dbdeploy changelog script
As I discussed earlier, before you can use dbdeploy you need to create the changelog table, the following Ant target will simply execute the SQL script specified by the property ${name.dbdeploy.schema}:
<!-- target to bootstrap a initialize a managed database -->
<target name="dbdeploy.init" depends="-ivy.init"
description="initialize a dbdeploy managed database">
<sql driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
src="${dir.sql}/${name.dbdeploy.schema}"
onerror="continue">
<classpath>
<path refid="ivy.internal.classpath" />
</classpath>
</sql>
</target>
Note in this example you will see reference to a target called -ivy.init and a Classpath called ivy.internal.classpath. This is because I am using Apache Ivy to manage the download of library dependencies (including the dbdeploy jar file). This will be transparent to you for the purposes of running through this exercise, but please refer to this article if you want to know more about how Apache Ivy is used for the HappyBank reference application.
Create the update script
Once the database has been setup, the following dbdeploy.gen Ant target can then be used to examine the delta scripts and the database instance change level, and created the composite SQL scripts to be applied:
<!-- run dbdeploy to generate database update script -->
<target name="dbdeploy.gen" depends="-ivy.init"
description="generate database update script">
<taskdef name="dbdeploy" classname="com.dbdeploy.AntTarget">
<classpath>
<path refid="ivy.internal.classpath" />
</classpath>
</taskdef>
<dbdeploy driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
dir="${dir.sql}\deltas"
outputfile="${dir.build}\all-deltas.sql"
dbms="${name.dbms}"
lastChangeToApply="9"
undoOutputfile="${dir.build}\undo-all-deltas.sql" />
</target>
Execute the update (or undo) script
To update the database, the generated all-deltas.sql script can then be executed via the following dbdeploy.update Ant target:
<!-- update a database based on output of dbdeploy -->
<target name="dbdeploy.update" depends="dbdeploy.gen">
<sql driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
src="${dir.build}/all-deltas.sql"
onerror="abort">
<classpath>
<path refid="ivy.internal.classpath" />
</classpath>
</sql>
</target>
<!-- undo the last dbdeploy changes -->
<target name="dbdeploy.undo" depends="-ivy.init">
<sql driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
src="${dir.build}/undo-all-deltas.sql"
onerror="abort">
<classpath>
<path refid="ivy.internal.classpath" />
</classpath>
</sql>
</target>
There is also a dbdeploy.undo target which can be used to run the undo script if necessary. Note however that this undo script is only valid if the database change number remains at the current level - I will be discussing more on this later.
Populate the database with sample data
Then we can populate the database with some sample data. For development purposes I typically create a db.reset target which creates a brand new database and populates it as in the following:
<!-- populate the development database with sample data -->
<target name="db.reset"
depends="dbdeploy.init,dbdeploy.gen,dbdeploy.update"
description="update database and load test data">
<sql driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
onerror="abort">
<classpath>
<path refid="project.classpath" />
</classpath>
<transaction src="${dir.sql}/testdata/empty-all-tables.sql" />
<transaction src="${dir.sql}/testdata/customers.sql" />
<transaction src="${dir.sql}/testdata/accounts.sql" />
<transaction src="${dir.sql}/testdata/custaccts.sql" />
<transaction src="${dir.sql}/testdata/transactions.sql" />
</sql>
</target>
Note that in normal use you would use the dbdeploy.update target to deploy incremental updates as well as incremental test data.
The final master bootstrap target is then simply a shell target which calls the others as follows:
<!-- bootstrap the development database --> <target name="bootstrap" depends="init,get-db-pass,dbdeploy.init,db.reset" description="bootstrap development database"> </target>
Note that I typically place the reuseable targets dbdeploy.init, dbdeploy.gen and dbdeploy.update in their own common build script and import it into a project build script.
Migrating ongoing changes
On an ongoing development basis, you will be creating individual delta scripts and migrating them, so let's work through an example of how this would happen. Suppose for example we wanted to introduce a new database view for HappyBank which reported a customers total wealth (the sum of all of their accounts). We could create a delta script to do this as follows:
CREATE VIEW CUSTBALANCES AS SELECT ca.customerid, SUM(a.balance) AS baltotal FROM ACCOUNT a, CUSTACCT ca WHERE a.accid = ca.accid GROUP BY ca.customerid --//@UNDO DROP VIEW CUSTBALANCES --//
To save you typing, this file is contained within the example directory as " 7 - Create Customer total balance view.sql" - so it didn't get using on "bootstrap". To make use of it, copy it to the deltas directory. To deploy this change we can then simply run the dbdeploy.update target as listed earlier on as follows:
which will produce output similar to the following:
dbdeploy.gen:
[dbdeploy] dbdeploy 3.0M1
[dbdeploy] Reading change scripts from directory C:\Temp\happybank\db\sql\deltas...
[dbdeploy] Changes currently applied to database:
[dbdeploy] 1..6
[dbdeploy] Scripts available:
[dbdeploy] 1..7
[dbdeploy] To be applied:
[dbdeploy] 7
[dbdeploy] Generating undo scripts...
dbdeploy.update:
[sql] Executing resource: C:\Temp\happybank\db\build\all-deltas.sql
[sql] 2 of 2 SQL statements executed successfully
BUILD SUCCESSFUL
Total time: 1 second
If you examine the all-deltas.sql file you would see that it contains the flattened SQL as well as a command to update the changelog table as follows:
CREATE VIEW CUSTBALANCES AS
SELECT ca.customerid, SUM(a.balance) AS baltotal
FROM ACCOUNT a, CUSTACCT ca
WHERE a.accid = ca.accid
GROUP BY ca.customerid
INSERT INTO changelog (change_number, delta_set, complete_dt, applied_by, description)
VALUES (7, 'Main', CURRENT_TIMESTAMP, USER(), '7 Create Customer total balance view.sql');
COMMIT;
----- END CHANGE SCRIPT #7: 7 Create Customer total balance view.sql -----
Similarly, the the undo script will be as follows:
DROP VIEW CUSTBALANCES;
DELETE FROM changelog WHERE change_number = 7 AND delta_set = 'Main';
COMMIT;
----- END CHANGE SCRIPT #7: 7 Create Customer total balance view.sql -----
Suppose for example, if we deployed this change and found that we had errors as a result. We could subsequently "undo" the change via the dbdeploy.undo target as follows:
which will produce output similar to the following:
dbdeploy.undo:
[sql] Executing resource: C:\Temp\happybank\db\build\undo-all-deltas.sql
[sql] 2 of 2 SQL statements executed successfully
BUILD SUCCESSFUL
Total time: 1 second
Note that the undeploy script is only valid if the change number remains at this current level. Unfortunately there is currently no way to undeploy (or rollback) to an arbritary change number. To workaround this you could save the undeploy scripts, perhaps labelling them with the change numbers that they deploy from and to. One of the ways you could achieve this is via Ant property files as in the following:
<!-- execute dbdeploy and save change number -->
<target name="dbdeploy.controlled">
<!-- read the last change that was deployed from ${value.last.change} -->
<property file="dbdeploy.properties">
<!-- get the latest maximum change to deploy -->
<input addproperty="value.max.change"
message="Please enter the maximum change number to deploy:">
<!-- create the change scripts -->
<dbdeploy driver="${name.db.driver}"
url="${name.db.connection}"
userid="${name.db.user}"
password="${name.db.password}"
dir="${dir.sql}\deltas"
outputfile="${dir.build}\deltas-${value.last.change}-${value.max.change}.sql"
dbms="${name.dbms}"
lastChangeToApply="${value.max.change}"
undoOutputfile="${dir.build}\undo-deltas-${value.last.change}-${value.max.change}.sql" />
<!-- save the latest max change to the property file -->
<propertyfile file="dbdeploy.properties">
<entry key="value.last.change" type="int"
value="${value.max.change}" />
</propertyfile>
</target> What this example does is store the last change deployed in a property file (dbdeploy.properties) and prompt you for the maximum change you want to deploy. It then names the deploy and undeploy scripts using these numbers. You should save these scripts (as well as the property file) or add them to version control for later use.
In practice you need to be very careful with undeploy transactions in a database as it is possible to break your database if you deploy untested undo transactions. If you value this capability then I would recommend adopting TDD (Test Driven Development) practices for your delta scripts, ensuring that you have a full suite of unit tests to validate deploy and undeployment transactions. There are various database unit testing tools such as DbUnit which can help you with this.
References
- dbdeploy home page
- Taking Control of your Database Development - white paper
- Agile DBA - experience of using dbdeploy on a project
- Database integration in your build scripts

