Monday, October 15, 2007

Overview of Infrastructure Testing

Posted by Marc Kaplan, Test Engineering Lead

At Google, we have infrastructure that is shared between many projects. This infrastructure creates a situation where we have a many dependencies in terms of build requirements, but also in terms of test requirements. We've found that we actually need two approaches to deal with these requirements depending on whether we are looking to run larger system tests or smaller unittests, both of which ultimately need to be executed to improve quality.

For unittests, we are typically interested in only the module or function that is under test at the time, and we don't care as much about downstream dependencies, except insofar as they relate to the module under test. So we will typically write test mocks to mock out the downstream components that we aren't interested in actually running that simulate their behaviors and failure modes. Of course, this can only be done after understanding how the downstream module works and interfaces with our module.

As an example of mocking out a downstream component in Bigtable, we want to simulate the failure of Chubby , our external lockservice, so we we write a Chubby test mock that simulates the various ways that Chubby can interact with Bigtable. We then use this for the Bigtable unittests so that they a) run faster, b) reduce external dependencies and c) enable us to simulate various failure and retry conditions in the Bigtable Chubby related code.

There are also cases where we want to simulate components that are actually upstream to the component under test. In these cases we write what is called a test driver. This is very similar to a mock, except that instead of being called by our module (downstream) it calls our module (upstream). For example, if Bigtable component has some Mapreduce specific handling, we might want to write a test driver to simulate these Mapreduce-specific interfaces so we don't have to run the full Mapreduce framework inside our unittest framework. The benefits are all the same as those of using test mocks. In fact, in many cases it may be desirable to use both drivers and mocks, or perhaps multiple of each.

In system tests where we're more interested in the true system behaviors and timings, or in other cases where we can't write a driver or mocks we might turn to fault injection. Typically, this involves either completely failing certain components sporadically in system tests, or injecting particular faults via a fault injection layer that we write. Looking back to Bigtable again, since Bigtable uses GFS when we run system tests, we are running fault injection for GFS by failing actual masters and chunkservers sporadically, and seeing how Bigtable reacts under load to verify that when we deploy new versions of Bigtable that they it will work given the frequent rate of hardware failures. Another approach that we're currently work on is actually simulating the GFS behavior via a fault injection library so we can reduce the need to use private GFS cells which will result in better use of resources.

Overall, the use of Test Drivers, Test Mocks, and Fault Injection allows developers and test engineers at Google to test components more accurately, quickly, and above all helps improve quality.


  1. Marc, thanks for a great article. I have recently completed work on a successful project using similar techniques.

    We ended up with a system that had far fewer bugs than normal and also we had far greater confidence in the system.

    Agility Thoughts

  2. Really nice article...thanks fir sharing


  3. It seems that all referred web-links are now broken.


The comments you read and contribute here belong only to the person who posted them. We reserve the right to remove off-topic comments.