Testing on the Toilet: Test Behaviors, Not Methods
Monday, April 14, 2014
by Erik Kuefler
This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.
After writing a method, it's easy to write just one test that verifies everything the method does. But it can be harmful to think that tests and public methods should have a 1:1 relationship. What we really want to test are behaviors, where a single method can exhibit many behaviors, and a single behavior sometimes spans across multiple methods.
Let's take a look at a bad test that verifies an entire method:
@Test public void testProcessTransaction() {
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction("Pile of Beanie Babies", dollars(3)));
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}
Displaying the name of the purchased item and sending an email about the balance being low are two separate behaviors, but this test looks at both of those behaviors together just because they happen to be triggered by the same method. Tests like this very often become massive and difficult to maintain over time as additional behaviors keep getting added in—eventually it will be very hard to tell which parts of the input are responsible for which assertions. The fact that the test's name is a direct mirror of the method's name is a bad sign.
It's a much better idea to use separate tests to verify separate behaviors:
@Test public void testProcessTransaction_displaysNotification() {
  transactionProcessor.processTransaction(
      new User(), new Transaction("Pile of Beanie Babies"));
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
}

@Test public void testProcessTransaction_sendsEmailWhenBalanceIsLow() {
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction(dollars(3)));
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}
Now, when someone adds a new behavior, they will write a new test for that behavior. Each test will remain focused and easy to understand, no matter how many behaviors are added. This will make your tests more resilient since adding new behaviors is unlikely to break the existing tests, and clearer since each test contains code to exercise only one behavior.
Thanks. As a QAE, I was so tempted to write all my validations in a single test, especially when there is a continuous flow happening. Now that temptation has eased.
However, this becomes more challenging when we write UI tests with a lot of inputs before it even lands on the page to test. Where is the line to be drawn between going crazy writing an automation test for each test case and combining all tests into one giant automation test?
Nice point in the post.
To Gerald's point, I believe that test setup is indeed a nontrivial issue. However, the closest thing to a silver bullet is to keep UI tests to a minimum. Test through the user interface only the things that are behavior of your UI, not the entire functionality. If you follow (and one should follow) the concept of clean architecture (see Uncle Bob Martin's literature on this), then what we need to automate is the behavior of the application (aka the use cases of the application) via the application boundary.
Having said that, there are certain things that are not the behavior of your application's use cases but of the UI as it is implemented right now. There are two important points here. One, there is stuff in the UI that needs to be tested even when the bulk of our effort goes into automating use cases through the boundary. There is no denying it. Two, the UI is what it is now, and it will change without any change in the business. You won't want to refactor your automated cases because your UI has changed. So only test through the UI what is really a UI behavior.
We have tried Selenium and found it to be difficult to maintain. Currently looking at Graphene. HTH!
--Nafees
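A rough sketch of the split described in this comment: use-case behavior is checked at the application boundary, and only display behavior is checked against the UI layer. Everything here is a hypothetical stand-in (`purchase`, `PurchaseResult`, `render`, the threshold value), and a hand-rolled `check` replaces JUnit so the snippet runs on the JDK alone.

```java
class BoundaryTestDemo {
    static final int LOW_BALANCE_THRESHOLD = 10; // assumed value, in dollars

    // Hypothetical use-case result: plain data, no UI types.
    static class PurchaseResult {
        final String itemName;
        final boolean balanceIsLow;

        PurchaseResult(String itemName, boolean balanceIsLow) {
            this.itemName = itemName;
            this.balanceIsLow = balanceIsLow;
        }
    }

    // Hypothetical application boundary: the use case itself.
    static PurchaseResult purchase(int balance, String itemName, int cost) {
        return new PurchaseResult(itemName, balance - cost < LOW_BALANCE_THRESHOLD);
    }

    // Hypothetical thin UI layer: only turns a result into display text.
    static String render(PurchaseResult result) {
        String text = "You bought a " + result.itemName;
        return result.balanceIsLow ? text + " (your balance is low)" : text;
    }

    // Use-case behavior, checked at the boundary with no UI involved.
    static void purchaseFlagsLowBalance() {
        PurchaseResult result =
                purchase(LOW_BALANCE_THRESHOLD + 2, "Pile of Beanie Babies", 3);
        check(result.balanceIsLow);
    }

    // UI behavior, checked against a hand-built result.
    static void renderShowsItemName() {
        String text = render(new PurchaseResult("Pile of Beanie Babies", false));
        check(text.equals("You bought a Pile of Beanie Babies"));
    }

    // Minimal assertion helper standing in for JUnit's assertTrue.
    static void check(boolean condition) {
        if (!condition) throw new AssertionError("check failed");
    }

    public static void main(String[] args) {
        purchaseFlagsLowBalance();
        renderShowsItemName();
        System.out.println("boundary and UI checks passed");
    }
}
```

If the display text changes, only `renderShowsItemName` needs attention; the boundary test is untouched.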
Great entry. Do you still suggest keeping the name of the method in the test name? We currently do that (as you do in your example "processTransaction_..."), but I often have the feeling this might get quite messy once you want to apply a refactoring. How do you handle that?
I would treat too many 'processTransaction_' methods as duplication, even though it's (only) about method names. Just search-and-replacing its occurrences in the test code when you rename the production code would ignore that code smell. Moving the test methods to their own class instead (and removing the 'processTransaction_') is more than just cosmetics. You get slimmer, more focused test classes for different behaviours of your production code. Constantly applying this technique when you get the feeling of messiness will help you cut your class accordingly once it becomes too clumsy during development and needs to be broken into several smaller classes.
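One way to picture this suggestion is a test class named after the behavior, so the method names no longer carry a 'processTransaction_' prefix. This is only a minimal sketch: `Account`, `LowBalanceEmailTest`, and the threshold value are hypothetical stand-ins for the article's classes, and plain static methods with a hand-rolled `check` replace JUnit so the snippet runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Test class named after the behavior; method names describe outcomes only.
class LowBalanceEmailTest {
    static void sendsEmailWhenBalanceDropsBelowThreshold() {
        Account account = new Account(Account.LOW_BALANCE_THRESHOLD + 2);
        account.processTransaction(3);
        check(account.emailSubjects().size() == 1);
        check(account.emailSubjects().get(0).equals("Your balance is low"));
    }

    static void sendsNothingWhenBalanceStaysAtOrAboveThreshold() {
        Account account = new Account(Account.LOW_BALANCE_THRESHOLD + 5);
        account.processTransaction(3);
        check(account.emailSubjects().isEmpty());
    }

    // Minimal assertion helper standing in for JUnit's assertTrue.
    static void check(boolean condition) {
        if (!condition) throw new AssertionError("check failed");
    }

    public static void main(String[] args) {
        sendsEmailWhenBalanceDropsBelowThreshold();
        sendsNothingWhenBalanceStaysAtOrAboveThreshold();
        System.out.println("all low-balance email checks passed");
    }
}

// Hypothetical production code: a tiny stand-in for the article's processor.
class Account {
    static final int LOW_BALANCE_THRESHOLD = 10; // assumed value, in dollars
    private int balance;
    private final List<String> emailSubjects = new ArrayList<>();

    Account(int balance) {
        this.balance = balance;
    }

    // Deducts the cost and records an email once the balance is below the threshold.
    void processTransaction(int cost) {
        balance -= cost;
        if (balance < LOW_BALANCE_THRESHOLD) {
            emailSubjects.add("Your balance is low");
        }
    }

    List<String> emailSubjects() {
        return emailSubjects;
    }
}
```

Renaming `processTransaction` in the production code now touches call sites only, not test names.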
You bring up very good points with this article -- namely creating more discrete tests that can be used to better cover the intended behaviour of the product under test. Tests are more modular, easier to maintain, and easier to adapt when the product under test changes.
One issue with breaking up larger tests into "smaller tests" (like the example provided in the blog post) is that the smaller tests now become dependent on the order of execution -- the tests are no longer atomic and must be run in a specific order. For example, testProcessTransaction_sendsEmailWhenBalanceIsLow() can only be executed after testProcessTransaction(); otherwise it will fail falsely because testProcessTransaction() wasn't called first. This is an especially prickly issue if you're working with automation stacks like Java and JUnit 4.8, which do not guarantee the order of execution.
When you break down one large test into multiple small ones, you need to call your method(s) under test in each of them. That makes them independent, given that your @Before provides a clean setup. In this particular example this is accomplished pretty easily. Nevertheless, as soon as repositories or databases are involved, it can become an art to keep run times within acceptable boundaries due to expensive setup and complex teardown mechanisms. This should not concern you in isolated unit tests though.
Actually, it's a good feedback mechanism that the order of JUnit tests differs depending on your system. It tells you whether your tests are independent or not :-)
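The replies above come down to this: if each test builds its own fixture and calls the method under test itself, execution order stops mattering. Here is a rough sketch under stated assumptions: a hypothetical `freshFixture()` helper plays the role of JUnit's @Before, a hand-rolled runner shuffles the tests, and `Fixture`, `process`, and the threshold value are made up for illustration so the snippet runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OrderIndependenceDemo {
    static final int LOW_BALANCE_THRESHOLD = 10; // assumed value, in dollars

    // Hypothetical fixture: everything a single test touches.
    static class Fixture {
        int balance;
        final List<String> emails = new ArrayList<>();
        final StringBuilder uiText = new StringBuilder();
    }

    // Plays the role of a JUnit @Before method: clean state for every test.
    static Fixture freshFixture(int balance) {
        Fixture fixture = new Fixture();
        fixture.balance = balance;
        return fixture;
    }

    // Stand-in for the method under test.
    static void process(Fixture fixture, String item, int cost) {
        fixture.balance -= cost;
        fixture.uiText.append("You bought a ").append(item);
        if (fixture.balance < LOW_BALANCE_THRESHOLD) {
            fixture.emails.add("Your balance is low");
        }
    }

    static void displaysNotification() {
        Fixture fixture = freshFixture(100);
        process(fixture, "Pile of Beanie Babies", 3);
        check(fixture.uiText.toString().contains("You bought a Pile of Beanie Babies"));
    }

    static void sendsEmailWhenBalanceIsLow() {
        Fixture fixture = freshFixture(LOW_BALANCE_THRESHOLD + 2);
        process(fixture, "Pile of Beanie Babies", 3);
        check(fixture.emails.size() == 1);
    }

    // Minimal assertion helper standing in for JUnit's assertTrue.
    static void check(boolean condition) {
        if (!condition) throw new AssertionError("check failed");
    }

    // Runs every test in a random order and returns how many ran.
    static int runShuffled() {
        List<Runnable> tests = new ArrayList<>();
        tests.add(OrderIndependenceDemo::displaysNotification);
        tests.add(OrderIndependenceDemo::sendsEmailWhenBalanceIsLow);
        Collections.shuffle(tests);
        for (Runnable test : tests) {
            test.run();
        }
        return tests.size();
    }

    public static void main(String[] args) {
        System.out.println(runShuffled() + " tests passed in shuffled order");
    }
}
```

Neither test reads state left behind by the other, so any ordering passes; a test that only passes in one order is relying on shared state.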