Google Testing Blog: TotT: Too Many Tests

Thursday, February 21, 2008

In the movie Amadeus, the Austrian Emperor criticizes Mozart's music as having “too many notes.” How many tests are “too many” to test one function?

Consider the method decide:

public void decide(int a, int b, int c, int d,
      int e, int f) {
  // One-letter variable names are used here only because of limited space.
  // You should use better names. Do as I say, not as I do. :-)
  if (a > b || c > d || e > f) {
    DoOneThing();
  } else {
    DoAnother();
  }
}

How many tests could we write? Exercising the full range of int values for each of the six variables would require 2¹⁹² tests (six 32-bit values, so 6 × 32 = 192 bits of input). We'd have googols of tests if we did this all the time! Too many tests.
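To get a feel for the size of that number, here is a small sketch that computes it with BigInteger. The class and method names are made up for this illustration and are not part of the original example.

```java
import java.math.BigInteger;

class ExhaustiveTestCount {
    // Number of distinct inputs to decide: six int parameters,
    // each with Integer.SIZE (32) bits, gives 2^(6 * 32) = 2^192.
    static BigInteger exhaustiveInputs() {
        int parameterCount = 6; // a, b, c, d, e, f
        return BigInteger.valueOf(2).pow(parameterCount * Integer.SIZE);
    }

    public static void main(String[] args) {
        BigInteger count = exhaustiveInputs();
        System.out.println(count);
        System.out.println("decimal digits: " + count.toString().length());
    }
}
```

The result has 58 decimal digits, roughly 6.3 × 10⁵⁷, which is far beyond anything a test suite could enumerate.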

What is the smallest number of tests we could write and still get every line executed? This would achieve 100% line coverage, which is the criterion most code-coverage tools measure. Two tests. One where (a > b || c > d || e > f) is true; one where it is false. Not enough tests to detect most bugs or unintentional changes in the code.
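A rough sketch of why two tests are not enough. It assumes, purely for illustration, that decide returns the name of the branch it took instead of calling DoOneThing() or DoAnother(). The Decider interface, buggyDecide, and passesBothTests are hypothetical names added here. Plain Java checks stand in for JUnit so the sketch runs on its own.

```java
class DecideLineCoverage {
    // Hypothetical interface so the same two tests can run against two versions.
    interface Decider {
        String decide(int a, int b, int c, int d, int e, int f);
    }

    // Assumption: returns a label instead of calling DoOneThing()/DoAnother().
    static String decide(int a, int b, int c, int d, int e, int f) {
        if (a > b || c > d || e > f) {
            return "DoOneThing";
        } else {
            return "DoAnother";
        }
    }

    // Hypothetical unintentional change: the e > f comparison was dropped.
    static String buggyDecide(int a, int b, int c, int d, int e, int f) {
        if (a > b || c > d) {
            return "DoOneThing";
        } else {
            return "DoAnother";
        }
    }

    // The two tests that give 100% line coverage: one per branch.
    static boolean passesBothTests(Decider decider) {
        boolean trueBranch = decider.decide(2, 1, 0, 0, 0, 0).equals("DoOneThing");
        boolean falseBranch = decider.decide(0, 0, 0, 0, 0, 0).equals("DoAnother");
        return trueBranch && falseBranch;
    }

    public static void main(String[] args) {
        System.out.println("original passes: "
            + passesBothTests(DecideLineCoverage::decide));
        System.out.println("buggy passes:    "
            + passesBothTests(DecideLineCoverage::buggyDecide));
    }
}
```

Both versions pass, and both reach full line coverage, even though the second one ignores e and f entirely.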

How many tests to test the logical expression and its sub-expressions? If you write a test of decide where a == b, you might find that the sub-expression a > b was incorrect and the code should have been a >= b. And it might make sense to also run tests where a < b and a > b. So that's three tests for a compared to b. For all of the parameters, that would be 3 * 3 * 3 = 27 tests. That's probably too many.
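The 27 combinations can be enumerated mechanically. The sketch below is one way to do it, again assuming decide returns a label for the branch taken. The pair helper, the loop structure, and the specific values 4, 5, and 6 are choices made for this illustration.

```java
class DecideAllRelations {
    // Assumption: returns a label instead of calling DoOneThing()/DoAnother().
    static String decide(int a, int b, int c, int d, int e, int f) {
        if (a > b || c > d || e > f) {
            return "DoOneThing";
        } else {
            return "DoAnother";
        }
    }

    // Builds {left, right} for a relation: -1 means left < right,
    // 0 means left == right, +1 means left > right.
    static int[] pair(int relation) {
        return new int[] {5 + relation, 5};
    }

    // Runs every combination of <, ==, > across the three pairs and
    // returns how many cases were checked.
    static int runAllCases() {
        int casesRun = 0;
        for (int ab = -1; ab <= 1; ab++) {
            for (int cd = -1; cd <= 1; cd++) {
                for (int ef = -1; ef <= 1; ef++) {
                    int[] p = pair(ab);
                    int[] q = pair(cd);
                    int[] r = pair(ef);
                    // DoOneThing is expected as soon as any pair is "greater than".
                    String expected = (ab > 0 || cd > 0 || ef > 0)
                        ? "DoOneThing" : "DoAnother";
                    String actual = decide(p[0], p[1], q[0], q[1], r[0], r[1]);
                    if (!expected.equals(actual)) {
                        throw new AssertionError(
                            "relations " + ab + "," + cd + "," + ef + " gave " + actual);
                    }
                    casesRun++;
                }
            }
        }
        return casesRun;
    }

    public static void main(String[] args) {
        System.out.println(runAllCases() + " cases passed");
    }
}
```

Generating the cases in a loop keeps the test short, but it is still 27 executions, and a failure message has to say which combination broke.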

How many tests to test the logical expression and its sub-expressions independently? Consider another version of decide, where the logical sub-expressions have been extracted out:

public void decide(int a, int b, int c, int d,
      int e, int f) {
  if (tallerThan(a, b)
      || harderThan(c, d)
      || heavierThan(e, f)) {
    DoOneThing();
  } else {
    DoAnother();
  }
}
boolean tallerThan(int a, int b) { return a > b; }
      // Note “package scope” rather than public; JUnit tests can access these.
boolean harderThan(int c, int d) { return c > d; }
boolean heavierThan(int e, int f) { return e > f; }

We can write four tests for decide. One where tallerThan is true. One where harderThan is true. One where heavierThan is true. And one where they are all false. We could test each of the extracted functions with two tests, so the total would be 4 + 2 * 3 = 10 tests. This would be just enough tests so that most unintentional changes will trigger a test failure. Exposing the internals this way trades decreased encapsulation for increased testability. Limit the exposure by controlling scope appropriately, as we did in the Java code above.
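The ten tests might look roughly like the sketch below. It makes several assumptions: decide returns a label for the branch taken, the methods are static for brevity, plain checks stand in for JUnit assertions, and the "false" case for each extracted function uses equal values so that a slip from > to >= would be caught. The check and runTests helpers are hypothetical.

```java
class DecideTenTests {
    // Package scope, as in the article, so tests in the same package can call them.
    static boolean tallerThan(int a, int b) { return a > b; }
    static boolean harderThan(int c, int d) { return c > d; }
    static boolean heavierThan(int e, int f) { return e > f; }

    // Assumption: returns a label instead of calling DoOneThing()/DoAnother().
    static String decide(int a, int b, int c, int d, int e, int f) {
        if (tallerThan(a, b) || harderThan(c, d) || heavierThan(e, f)) {
            return "DoOneThing";
        } else {
            return "DoAnother";
        }
    }

    // Minimal stand-in for a test framework assertion.
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError(testName);
        }
    }

    // Runs all tests and returns how many were executed.
    static int runTests() {
        int count = 0;

        // Four tests for decide: each condition true on its own, then all false.
        check(decide(2, 1, 0, 0, 0, 0).equals("DoOneThing"), "tallerThan true"); count++;
        check(decide(0, 0, 2, 1, 0, 0).equals("DoOneThing"), "harderThan true"); count++;
        check(decide(0, 0, 0, 0, 2, 1).equals("DoOneThing"), "heavierThan true"); count++;
        check(decide(0, 0, 0, 0, 0, 0).equals("DoAnother"), "all false"); count++;

        // Two tests for each extracted function: one true, one false.
        check(tallerThan(2, 1), "tallerThan: greater"); count++;
        check(!tallerThan(1, 1), "tallerThan: equal"); count++;
        check(harderThan(2, 1), "harderThan: greater"); count++;
        check(!harderThan(1, 1), "harderThan: equal"); count++;
        check(heavierThan(2, 1), "heavierThan: greater"); count++;
        check(!heavierThan(1, 1), "heavierThan: equal"); count++;

        return count;
    }

    public static void main(String[] args) {
        System.out.println(runTests() + " tests passed");
    }
}
```

Each comparison is checked in one small place, and the tests for decide only need to show that the three results are combined correctly.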

How many tests is too many? The answer is “It depends.” It depends on how much confidence the tests can provide in the face of changes made by others. Tests can detect whether a programmer changed some code in error, and can serve as examples and documentation. Don't write redundant tests, and don't write too few tests.

8 comments:

Ben, February 21, 2008 at 11:59:00 AM PST
Great example! A suggestion if I may?

The 27 test example doesn't take advantage of the independence provided by OR. One doesn't need to test every combination of pairs, as there is no conditional logic between them. Consider:

A = B, C = D, E = F : DoAnother()
A < B, C < D, E < F : DoAnother()
A > B, C < D, E < F : DoOneThing()
A < B, C > D, E < F : DoOneThing()
A < B, C < D, E > F : DoOneThing()

That's five tests, but it tests equivalence, greater than, and less than for each pair.

This provides the same confidence as the 27 test example, with half the tests of the decreased encapsulation example.

-Ben

Rob Baillie, February 22, 2008 at 1:20:00 AM PST
And if one of the 'OR's changed to an 'XOR', the tests would still pass, but the behaviour would change.

Taking advantage of the logical independence of OR produces a gap due to its reliance on the use of OR.

Mind you, whilst the level of confidence may not be the same as with the 27 test example, the drop may not be particularly significant, thus illustrating the original point I suppose.

Gary Brunton, February 22, 2008 at 8:09:00 AM PST
You mention, "Limit the exposure by controlling scope appropriately, as we did in the Java code above."

I'm not a Java guy but it appears to me that you've made the extracted functions private and not testable from outside this class.

What am I missing?

Tim, February 22, 2008 at 8:23:00 AM PST
Gary, the first comment under the first extracted conditional indicates the visibility, which is package.

Gary Brunton, February 22, 2008 at 8:40:00 AM PST
Thanks Tim.

Just not paying close enough attention.

dastels, February 25, 2008 at 2:00:00 PM PST
yes... the format of this blog isn't as code friendly as an 8.5x11 page. I've been playing with the code format. Small font, or break the lines... sometimes with resulting oddness. Comments on that appreciated.

Anonymous, March 3, 2008 at 5:37:00 AM PST
It is also interesting to look at the NPath complexity metric for a function or method as it gives you a pretty good estimate of how many tests you need.

nongolos, June 5, 2009 at 5:04:00 AM PDT
Good example of equivalence class partitioning.
