What works and what doesn't work in software development? For the first 10 years or so of my career, we followed a strict waterfall development model. Then in 2008 we started switching to an agile development model. In 2011 we added DevOps principles and practices. Our product team has also become increasingly global. This blog is about our successes and failures, and what we've learned along the way.

The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The comments may not represent my opinions, and certainly do not represent IBM in any way.

Monday, October 8, 2012

DevOps Days Open Space: Making Your Automated Tests Faster

One part of my job is helping other teams adopt DevOps in general, and continuous delivery in particular.  But I have a problem: many of these teams have automated test suites that run so slowly that they only run a build, with its associated tests, about once per day.  (Automated test run times of 8 to 24 hours are not uncommon.)  There are several reasons for this, including:

  • The artifacts that are produced from the build, and then copied over to the test servers, are very large (greater than 1 GB in size).  Also, sometimes the artifacts are copied across continents.
  • Sometimes there are multiple versions of the build artifacts that must be copied to different test servers after the build.  A typical product I deal with will support at least a dozen platforms; a few support around 100 different platforms, when you multiply the number of supported operating system versions times the number of different components (client, server, gateway, etc.) times 2 (for 32- and 64-bit hardware).
  • Often, the database(s) for the product must be loaded with a large amount of test data, which can take a long time to copy and load.
  • Many products have separate test teams writing test automation.  Testers who are not developers tend to write tests that run through the UI, and those tests are usually slower than developers' code-level unit tests.
Running builds and tests often, so that developers know quickly when a change breaks something else, is a key goal of both continuous integration and continuous delivery.  Ideally, a developer should get feedback on whether their code is "ok", using a quick personal build and test run, within 5 minutes.  Anything over 10 minutes is definitely too slow: the developer will probably move on to something else, make more changes, and forget exactly what was changed for that particular test run.

Once the quick tests pass, the developer can run a full set of tests and then integrate the tested changes.  Or, in cases where a full set of tests is extremely slow, the developer can integrate his or her code changes once the quick tests pass, and then let the daily build run the full set of tests.

In this DevOps Days open space session, we brainstormed ways to make automated tests run more quickly.  We focused more on quick builds for personal tests, but most of these ideas would make the full set of tests faster too.  Many thanks to the dozens of smart people who contributed their ideas.  I don't even have their names, but they know who they are.  I'm sure we'll use several of these ideas right away.

Watch for an article with more details on each of these, coming soon...

Fail quickly

  • Run a quick smoke test first
  • Run a small set of tests that fail often next
  • Run slow tests last, or not at all
  • See also: Remove slow tests
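A simple way to fail quickly is to order test suites before running anything: smoke tests first, then the suites that fail most often, with the slowest ones last. Here is a minimal sketch in Python; the suite names and the failure/duration statistics are invented for illustration, and in practice the stats would come from your build history.

```python
def order_suites(suites):
    """Sort suites so likely failures surface first: smoke tests,
    then by historical failure rate (descending), then by duration
    (ascending), so cheap checks run before expensive ones."""
    return sorted(
        suites,
        key=lambda s: (not s["smoke"], -s["failure_rate"], s["seconds"]),
    )

# Hypothetical suite statistics gathered from past build results.
suites = [
    {"name": "ui_regression", "smoke": False, "failure_rate": 0.02, "seconds": 5400},
    {"name": "smoke",         "smoke": True,  "failure_rate": 0.10, "seconds": 60},
    {"name": "api_contract",  "smoke": False, "failure_rate": 0.25, "seconds": 300},
]

for s in order_suites(suites):
    print(s["name"])
```

With this ordering, a broken build is usually caught in the first minute or two rather than hours into the run.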

Run in parallel

  • Run test buckets in parallel
  • Use snapshots of databases or VMs to make it easier to run tests in parallel
  • Break up tests into smaller groups
  • Divide your application into components, and test the changed components
  • Automatically determine which tests to run when code changes
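Splitting tests into buckets for parallel runs works best when the buckets are balanced by historical duration, not just by test count. One common approach is a greedy longest-first split: assign each test, longest first, to whichever bucket is currently lightest. A sketch, with made-up test names and timings:

```python
import heapq

def make_buckets(tests, n_buckets):
    """Greedy longest-first bucketing: place each test into the
    currently-lightest bucket, roughly balancing total runtime."""
    heap = [(0.0, i, []) for i in range(n_buckets)]  # (total_secs, index, names)
    heapq.heapify(heap)
    for name, secs in sorted(tests.items(), key=lambda kv: -kv[1]):
        total, i, bucket = heapq.heappop(heap)
        bucket.append(name)
        heapq.heappush(heap, (total + secs, i, bucket))
    return [b for _, _, b in sorted(heap, key=lambda x: x[1])]

# Hypothetical per-test durations in seconds, e.g. from the last run's log.
tests = {"t_login": 120, "t_search": 90, "t_report": 300, "t_export": 60, "t_sync": 150}
buckets = make_buckets(tests, 2)
print(buckets)
```

Each bucket can then be dispatched to its own test server; the wall-clock time approaches the longest bucket rather than the sum of all tests.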

Save time on I/O

  • Mock responses
  • Use LXC (Linux Containers)
  • Move servers and data so they are close to one another
  • Make your test infrastructure faster
  • See also: Use snapshots of databases or VMs to make it easier to run tests in parallel
  • Cache what you can
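Mocking responses means replacing a slow network or database call with a canned answer, so the test exercises your logic without waiting on I/O. A sketch using Python's standard unittest.mock; the fetch_status function and the URL are made up for illustration.

```python
from unittest import mock
import urllib.request

def fetch_status(url):
    """Code under test: normally makes a real (slow) HTTP request."""
    with urllib.request.urlopen(url) as resp:
        return resp.status

def test_fetch_status_without_network():
    # A canned response object standing in for the real connection.
    fake = mock.MagicMock(status=200)
    fake.__enter__.return_value = fake  # "with ... as resp" yields fake itself
    with mock.patch("urllib.request.urlopen", return_value=fake):
        assert fetch_status("https://example.invalid/health") == 200

test_fetch_status_without_network()
print("mocked test passed")
```

A test like this runs in milliseconds regardless of where the real server lives, which is exactly the point when your artifacts and test servers are on different continents.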

Remove some tests

  • Remove tests that never fail
  • Remove slow tests
  • Replace some UI tests with code-level tests
  • Replace some tests with monitoring
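Finding removal candidates such as "tests that never fail" can be done by mining your test-result history. A sketch, where the record format (name, passed, seconds) and the minimum-runs threshold are assumptions; your CI system's logs would be the real source.

```python
from collections import defaultdict

def find_removal_candidates(history, min_runs=50):
    """From (test_name, passed, seconds) records, flag tests that
    have never failed across many runs -- candidates to remove,
    demote to a nightly run, or replace with monitoring."""
    stats = defaultdict(lambda: {"runs": 0, "failures": 0})
    for name, passed, _seconds in history:
        stats[name]["runs"] += 1
        stats[name]["failures"] += 0 if passed else 1
    return sorted(
        name for name, s in stats.items()
        if s["runs"] >= min_runs and s["failures"] == 0
    )

# Synthetic history for illustration.
history = (
    [("t_checksum", True, 0.5)] * 60                          # never fails
    + [("t_login", True, 2.0)] * 59 + [("t_login", False, 2.0)]  # fails sometimes
    + [("t_new", True, 1.0)] * 10                             # too few runs to judge
)
print(find_removal_candidates(history))
```

A test that has never failed in dozens of runs may still be worth keeping, so treat the output as a review list for humans, not an automatic delete list.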