Work Guidelines: Maintaining Automated Test Suites

Like physical objects, tests can break. It's not that they wear down, it's
  that something's changed in their environment. Perhaps they've been ported to 
a new operating system. Or, more likely, the code they exercise has
  changed in a way that correctly causes the test to fail. Suppose you're 
  working on version 2.0 of an e-banking application. In version 1.0, this method 
  was used to log in: 
public boolean login (String username);

In version 2.0, the marketing department has realized that password protection
might be a good idea. So the method is changed to this:

public boolean login (String username, String password);

Any test that uses login will fail. It won't
  even compile. Since not much useful work can be done without logging in, not 
  many useful tests can be written that don't use login. 
You might be faced with hundreds or thousands of failing tests.

These tests can be fixed by using a global search-and-replace tool that finds
every instance of login(something) and replaces it with login(something,
"dummy password"). Then arrange for all the testing accounts to use that
password, and you're on your way.
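For illustration, a line in one such test might change like this under that
replacement (the test account name is invented):

// Before, in version 1.0:
expect(true, product.login("testuser1"));
// After the global search-and-replace, for version 2.0:
expect(true, product.login("testuser1", "dummy password"));

Then, when marketing decides that passwords should not be allowed to contain
spaces, you get to do it all over again. This kind of thing is a wasteful
burden, especially when, as is often the case, the test changes aren't so
easily made.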

There is a better way. Suppose that the tests originally did not call the product's login
  method. Rather, they called a library method that does whatever it takes to 
  get the test logged in and ready to proceed. Initially, that method might look 
  like this: 
public boolean testLogin (String username) {
  return product.login(username);
}
 When the version 2.0 change happens, the utility library is changed to match: 
  
public boolean testLogin (String username) {
  return product.login(username, "dummy password");
}
Instead of changing a thousand tests, you change one method.
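The tests themselves go through the utility, so none of them need to change
when login does. A minimal sketch of such a test (the test name, account
name, and wire-transfer behavior are invented for illustration):

// A typical test: gets logged in through the utility, then checks
// the one behavior it exists to check.
public void testWireTransferLimit () {
  expect(true, testLogin("testuser1"));  // insulated from login's signature
  // ... exercise and check the wire-transfer limit here ...
}

Ideally, all the needed library methods would be available at the beginning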
of the testing effort. In practice, they can't all be anticipated; you might
  not realize you need a testLogin utility method 
  until the first time the product login changes. 
  So test utility methods are often "factored out" of existing tests 
  as needed. It is very important that you perform this ongoing test repair, 
  even under schedule pressure. If you do not, you will waste much time dealing 
with an ugly and unmaintainable test suite. You might well find yourself throwing
it away, or being unable to write the needed number of new tests because all
your available testing time is spent maintaining old ones.

Note: the tests of the product's login
  method will still call it directly. If its behavior changes, some or all of 
  those tests will need to be updated. (If none of the login 
  tests fail when its behavior changes, they're probably not very good at detecting 
defects.)
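Such direct tests might look like this (the account name and wrong password
are invented; the dummy password is the one arranged for above):

// Direct tests of the product's login method itself:
expect(true, product.login("testuser1", "dummy password"));
expect(false, product.login("testuser1", "wrong password"));

The previous example showed how tests can abstract away from the concrete application.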
  Most likely you can do considerably more abstraction. You might find that a 
  number of tests begin with a common sequence of method calls: they log in, set 
  up some state, and navigate to the part of the application you're testing. Only 
then does each test do something different. All this setup could, and should, be
  abstracted into a single method with an evocative name such as readyAccountForWireTransfer. 
  By doing that, you're saving considerable time when new tests of a particular 
type are written, and you're also making the intent of each test much more understandable.
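Such a method might be sketched like this (the Account type and the
createFundedAccount and navigateTo helpers are invented for illustration;
testLogin is the utility from above):

// Do everything a wire-transfer test needs before its first real step:
// log in, create an account with money in it, and navigate to the
// wire-transfer part of the application.
public Account readyAccountForWireTransfer (String username) {
  testLogin(username);
  Account account = createFundedAccount(username);
  navigateTo("wire transfer");
  return account;
}

Understandable tests are important. A common problem with old test suites is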
  that no one knows what the tests are doing or why. When they break, the tendency 
  is to fix them in the simplest possible way. That often results in tests that 
  are weaker at finding defects. They no longer test what they were originally 
intended to test.

Suppose you're testing a compiler. Some of the first classes written define
  the compiler's internal parse tree and the transformations made upon it. You 
  have a number of tests that construct parse trees and test the transformations. 
  One such test might look like this:  
/* 
 * Given
 *   while (i<0) { f(a+i); i++;}
 * "a+i" cannot be hoisted from the loop because 
 * it contains a variable changed in the loop.
 */
loopTest = new LessOp(new Token("i"), new Token("0"));
aPlusI = new PlusOp(new Token("a"), new Token("i"));
statement1 = new Statement(new Funcall(new Token("f"), aPlusI));
statement2 = new Statement(new PostIncr(new Token("i")));
loop = new While(loopTest, new Block(statement1, statement2));
expect(false, loop.canHoist(aPlusI));
 This is a difficult test to read. Suppose that time passes. Something changes 
  that requires you to update the tests. At this point, you have more product 
  infrastructure to draw upon. In particular, you might have a parsing routine 
  that turns strings into parse trees. It would be better at this point to completely 
  rewrite the tests to use it:  
loop = Parser.parse("while (i<0) { f(a+i); i++; }");
// Get a pointer to the "a+i" part of the loop. 
aPlusI = loop.body.statements[0].args[0];
expect(false, loop.canHoist(aPlusI));
 Such tests will be much easier to understand, which will save time immediately 
  and in the future. In fact, their maintenance costs are so much lower that it 
might make sense to defer most of them until the parser is available.

There's a slight downside to this approach: such tests might discover a defect
  in either the transformation code (as intended) or in the parser (by accident). 
  So problem isolation and debugging may be somewhat more difficult. On the other 
hand, finding a problem that the parser tests miss isn't such a bad thing.

There is also a chance that a defect in the parser might mask a defect in the
  transformation code. The chance of this is rather small, and the cost from it 
is almost certainly less than the cost of maintaining the more complicated tests.

A large test suite will contain some blocks of tests that don't change. They
  correspond to stable areas in the application. Other blocks of tests will change 
  often. They correspond to areas in the application where behavior is changing 
often. These latter blocks of tests will tend to make heavier use of utility
  libraries. Each test will test specific behaviors in the changeable area. The 
  utility libraries are designed to allow such a test to check its targeted behaviors 
while remaining relatively immune to changes in untested behaviors.

For example, the "loop hoisting" test shown above is now immune to
  the details of how parse trees are built. It is still sensitive to the structure 
  of a while loop's parse tree (because of the 
sequence of accesses required to fetch the sub-tree for a+i).
  If that structure proves changeable, the test can be made more abstract by creating 
  a fetchSubtree utility method:  
loop = Parser.parse("while (i<0) { f(a+i); i++; }");
aPlusI = fetchSubtree(loop, "a+i");
expect(false, loop.canHoist(aPlusI));
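One way to sketch fetchSubtree, assuming (hypothetically) that parse-tree
nodes can list their children and print themselves back as source text via
an unparse method:

// Search the tree for a node whose source text matches the target,
// returning null if no such subtree exists.
Tree fetchSubtree (Tree node, String text) {
  if (node.unparse().equals(text)) return node;
  for (Tree child : node.children()) {
    Tree match = fetchSubtree(child, text);
    if (match != null) return match;
  }
  return null;
}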
 The test is now sensitive only to two things: the definition of the language 
  (for example, that integers can be incremented with ++), 
and the rules governing loop hoisting (the behavior whose correctness it's checking).

Even with utility libraries, a test might periodically be broken by behavior
  changes that have nothing to do with what it checks. Fixing the test doesn't 
  stand much of a chance of finding a defect due to the change; it's something 
  you do to preserve the test's chance of finding some other defect someday. But 
the cumulative cost of such fixes might exceed the value of any defect the test
might someday find. It might be better to simply throw the test away and devote
the effort to creating new tests with greater value.

Most people resist the notion of throwing away a test, at least until they're
  so overwhelmed by the maintenance burden that they throw all the tests 
  away. It is better to make the decision carefully and continuously, test by 
  test, asking: 
- How much work will it be to fix this test well, perhaps adding to the
  utility library?
- How else might the time be used?
- How likely is it that the test will find serious defects in the future?
  What's been the track record of it and related tests?
- How long will it be before the test breaks again?

The answers to these questions will be rough estimates or even guesses. But
  asking them will yield better results than simply having a policy of fixing 
all tests.

Another reason to throw away tests is that they are now redundant. For example,
  early in development, there might be a multitude of simple tests of basic parse-tree 
  construction methods (the LessOp constructor 
  and the like). Later, during the writing of the parser, there will be a number 
  of parser tests. Since the parser uses the construction methods, the parser 
  tests will also indirectly test them. As code changes break the construction 
  tests, it's reasonable to discard some of them as being redundant. Of course, 
  any new or changed construction behavior will need new tests. They might be 
  implemented directly (if they're hard to test thoroughly through the parser) 
  or indirectly (if tests through the parser are adequate and more maintainable). 
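For instance, one of those early, direct construction tests might have looked
like this (a sketch; the unparse accessor is the same hypothetical one used
above):

// Early, direct test of the LessOp construction method.
lessOp = new LessOp(new Token("i"), new Token("0"));
expect(true, lessOp.unparse().equals("i<0"));

Once the parser tests exercise LessOp on every "<" they parse, fixing a broken
copy of this direct test may add little value.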
 
 
Copyright © 1987 - 2001 Rational Software Corporation