Wednesday, August 27, 2008

Taming the Beast (a.k.a. how to test AJAX applications) : Part 2


Posted by John Thomas and Markus Clermont


This is the second of a two-part blog series titled 'Taming the Beast : How to test AJAX applications'. In part one we discussed some philosophies around web application testing. In this part we walk through a real example of designing a test strategy for an AJAX application by going 'beyond the GUI'.

Application under test

The sample application we want to test is a simple inventory management system that allows users to increase or decrease the number of parts at various store locations. The application is built using GWT (Google Web Toolkit) but the testing methodology described here could be applied to any AJAX application.

To quickly recap from part one, here's our recipe for testing goodness:
  1. Explore the system's functionality
  2. Identify the system's architecture
  3. Identify the interfaces between components
  4. Identify dependencies and fault conditions
  5. For each function
    1. Identify the participating components
    2. Identify potential problems
    3. Test in isolation for problems
    4. Create a 'happy path' test
Let's look at each step in detail:

1. Explore the system's functionality

Simple as this sounds, it is a crucial first step to testing the application. You need to know how the system functions from a user's perspective before you can begin writing tests. Open the app, browse around, click on buttons and links and just get a 'feel' of the app. Here's what our example app looks like:
The app has a NavPane to filter the inventory by locations, list number of items in each location, increase/decrease the balance for items and sort the list by office and by product.


2. Identify the architecture
Learning about the system architecture is the next critical step. At this point think of the system as a set of components and figure out how they talk to each other. Design documents and architecture diagrams are helpful in this step. In our example we have the following components:
  • GWT client: Java code compiled into JavaScript that lives in the user's browser. Communicates with the server via HTTP-RPC.
  • Servlet: standard Apache Tomcat servlet that serves the "frontend.html" (main page) with the injected JavaScript and also serves RPCs to communicate with the client-side JavaScript.
  • Server-side implementation of the RPC-Stubs: The servlet dispatches the RPC over HTTP calls to this implementation. The RPCImpl communicates with the RPC-Backend via protocol-buffers over RPC.
  • RPC backend: deals with the business logic and data storage.
  • Bigtable: for storing data
It helps to draw a simple diagram representing the data flows between these components, if one doesn't already exist:
In our sample application, the RPC-Implementation is called "StoreService" and the RPC-Backend is called "OfficeAdministration".

3. Identify the interfaces between components

Some obvious ones are:
  • gwt_module target in Ant build file
  • "service" servlet of Apache Tomcat
  • definition of the RPC-Interface
  • Protocol buffers
  • Bigtable
  • UI (it is an interface, after all!)

4. Identify dependencies and fault conditions
With the interfaces correctly identified, we need to identify dependencies and figure out input values that are needed to simulate error conditions in the system.

In our case the UI talks to the servlet which in turn talks to StoreService (RPCImpl). We should verify what happens when the StoreService:
  • returns null
  • returns empty lists
  • returns huge lists
  • returns lists with malformed content (wrongly encoded, null or long strings)
  • times out
  • gets two concurrent calls
In addition the RPCImpl (StoreService) talks to the RPC-Backend (OfficeAdministration). Again we want to make sure the proper calls are made and what happens when the backend:
  • returns malformed content
  • times out
  • sends two concurrent requests
  • throws exceptions
To achieve these goals, we will want to replace the RPCImpl (StoreService) with a mock that we can control, and have the servlet talk to the mock. The same is true for the OfficeAdministration - we will want to replace the real RPCBackend with a more controllable fake, and have StoreService communicate with the mock instead.

To get a better overview, we will first look at individual use-cases, and see how the components interact. An example would be the filter-function in the UI (only items belonging to a location that is checked in the NavPane are displayed in the table).

Analyze the NavPane filter

  • Client
    • Gets all offices from RPC
    • On select, fetch items with RPC. On completion, update table.
    • On deselect, clear items from table.
  • RPCImpl
    • Gets all offices from RPC-Backend
    • Fetches all stock for an office from RPC-Backend
  • RPC-Backend
    • Scan Bigtable for all offices
    • Query stock for a given office from Bigtable.
Our next step is to figure out the "smallest test" that can give us confidence that each of the components works as expected.

Test client-side behavior
Make sure that de-selecting an item removes it. For that, we need to be sure what items will be in the list. A fake RPCImpl could do just that - independent of other tests that might use the same datasource.

The task is to make the Servlet talk to the "MockStoreService" as RPCImpl. We have different possibilities to achieve that:
  • Introduce a flag to switch
  • Use the proxy-pattern
  • Switch it at run time
  • Add a different constructor to the servlet
  • Introduce a different build-target that links to the fake implementation
  • Use dependency injection to swap out real for fake implementations
Any one of these options would do the job depending on the application. Solutions like adding a new constructor to the servlet would need production code to depend on test code, which is obviously a bad idea. Switching implementations at run time (using class loader trickery) is also an option but could expose security holes. Dependency injection offers a flexible and efficient way to do the same job without polluting production code.
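To make the idea concrete, here is a minimal hand-rolled sketch of the swap in plain Java. It is not GuiceBerry and not the application's real code: the StoreService method itemsForOffice, the InventoryTable class, and the office and item names are all assumptions made up for illustration. The point is only that the code under test asks for an interface, so a test can hand it a MockStoreService with known contents and check that de-selecting an office removes its items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical RPC interface; the post does not show StoreService's real methods.
interface StoreService {
  List<String> itemsForOffice(String office);
}

// Test double with canned data, so the test knows exactly what the table holds.
class MockStoreService implements StoreService {
  @Override
  public List<String> itemsForOffice(String office) {
    return Arrays.asList(office + "/bolts", office + "/nuts");
  }
}

// Stand-in for the client-side table logic (plain Java, not real GWT code).
class InventoryTable {
  private final StoreService service; // injected, never constructed here
  private final List<String> rows = new ArrayList<>();

  InventoryTable(StoreService service) {
    this.service = service;
  }

  // On select, fetch the office's items and add them to the table.
  void select(String office) {
    rows.addAll(service.itemsForOffice(office));
  }

  // On deselect, clear that office's items from the table.
  void deselect(String office) {
    rows.removeIf(row -> row.startsWith(office + "/"));
  }

  List<String> rows() {
    return new ArrayList<>(rows);
  }
}

class InventoryTableDemo {
  public static void main(String[] args) {
    InventoryTable table = new InventoryTable(new MockStoreService());
    table.select("Zurich");
    table.select("London");
    table.deselect("Zurich");
    System.out.println(table.rows()); // prints [London/bolts, London/nuts]
  }
}
```

Because InventoryTable only knows the interface, production wiring can pass the real implementation while tests pass the mock, and no production class has to reference test code.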

There are various frameworks to allow this form of dependency injection. We want to briefly introduce GuiceBerry as one of them.

GuiceBerry is a framework that lets you use a composition model for the services your test needs. In other words, if your test depends on certain services you can have those services "injected" into your tests using a popular dependency injection tool called Guice.

In our example we need to annotate the RPCImpl object with "@Inject" in the servlet test class and create an alternate implementation called MockStoreService to swap in at run time. Here's a code snippet that shows how:

@GuiceBerryEnv(StoreGuiceBerryEnvs.NORMAL)
public class StorePortalTest extends TestCase {

  @Inject
  StoreServiceImpl storeService;

  public void testStorePortal() {
    ...
    storeService.doSomething();
    ...
  }
}


In the code snippet above, note the @GuiceBerryEnv and @Inject annotations. That's GuiceBerry magic that allows us to inject a StoreServiceImpl object into the StorePortalTest class. The construction of the StoreServiceImpl is done inside a GuiceBerry environment class called NormalStoreGuiceBerryEnvs (and linked to StorePortal via the StoreGuiceBerryEnvs class). To inject a mock RPCImpl into StorePortalTest we would need to create a MockStoreGuiceBerryEnvs (which would instantiate a mock StoreService) and swap that for NormalStoreGuiceBerryEnvs at run time. All we need to do is to specify the following JVM args for the test ...

JVM_ARGS="-DNormalStoreGuiceBerryEnvs=MockStoreGuiceBerryEnvs"

This is just a quick peek at how GuiceBerry works. Go to the official GuiceBerry website to learn more.

This will be enough to decouple the client from the rest of the system. GWTTestCase does the rest of the job on the client side. You can find more details here. Don't forget to inject all possible failure scenarios through the MockStoreService.

Let's see what we found out so far:
  • We know that
    • UI callbacks work correctly
    • Interaction UI - Frontend works fine
    • Expected errors are handled adequately by the UI
  • We don't know whether
    • things are rendered correctly
    • things we expect to be on a page are really there
Although we already found out quite a lot about the UI, it is too early to be confident that the client works as expected. We need to know more about the UI to be able to answer the remaining two questions.

This is where some more traditional techniques, namely automated UI tests, enter the stage.
  • Add JavaScript hooks into the page, that return the elements (JSNI is the way to go here)
  • Use Selenium for UI tests (using both the hooks and the MockStoreService). All we need to do is check whether
    • the elements exist
    • all the buttons (which need to be clicked on) are clickable
    • scrollbars are added when needed
We don't need to do more work here - the GWTTestCase helped us to determine that the "Model" and the "Controller" work properly. All we needed to look at is the "View".

One problem we have often had with Selenium tests in the past was that people relied overly on XPath queries to retrieve the elements from webpages. Of course, when the DOM changed it caused many tests to break. One way to work around that is to introduce JavaScript hooks. They are only added when the application runs with a special "testing" flag and they directly return the elements needed.

You might wonder why this approach is any better? Well for one, we can catch problems earlier, and fix them without even looking at the tests that use them. A small and fast JsUnit test can be used to determine whether a hook is broken. If so, it is only a line of code to fix the problem.

Let's review what we have found out so far:
  • We know that
    • UI callbacks work correctly
    • Interaction UI - Frontend works fine
    • Expected errors are handled adequately by the UI
    • Things are rendered appropriately
    • DOM is correct
  • We don't know whether
    • Other (non-UI) components work as expected

Test the StoreService (RPCImpl)

The methods in StoreService (RPCImpl) need a lot of good unit testing. If we write a good amount of unit tests, we probably already have a MockOfficeAdministration (RPC-Backend) that we can use for our further testing efforts.

The main value we can add here is to verify that (1) each interface method in the StoreService behaves correctly, even in the face of communication errors with the RPC-Backend and (2) each method behaves as expected. By using a MockOfficeAdministration as RPC-Backend, we don't have to worry about setting up the data (plus injecting faults is easy!)

Besides testing the basic functionality, e.g.
  • Are all the records that we expect retrieved
  • Are no records that shouldn't be retrieved passed on to the caller
  • Does the application behave correctly, even if no records are found
... we can now also look at
  • Malformed or Unexpected data
  • Too much data
  • Empty replies
  • Exceptions
  • Time-outs
  • Concurrency problems
How can we replace our real RPC-Backend with the mock? That shouldn't be all that difficult, as using an RPC mechanism already forced us to define interfaces for the server. All we need to do is implement a mock-RPC-Backend and run that instead. You might want to consider running the mock-RPC-Backend on the same machine as the tests, to make your tests run faster.

Some example test cases at this level are:
  • Retrieve the list of all offices. Let the mock-RPC-Backend
    • return no office
    • return 100 offices, 1 malformed encoded
    • return 100 offices, 1 null
    • ...
    • throw an exception
    • time out
  • Retrieve product / stock for an office. Let the mock-RPC-Backend return
    • ...
  • Retrieve a product for an office. Let the mock-RPC-Backend block, and
    • issue a second query for the same product at the same time (and to make it more interesting, play with the results that the mock could return!).
  • ....
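A couple of the cases above can be sketched in plain Java as follows. Everything here is an assumption made for illustration: the listOffices method, the StoreServiceSketch class, and in particular the policy it implements (skip null entries, answer with an empty list when the backend throws or returns null). The real StoreService may well be expected to behave differently; the sketch only shows how a mock backend makes such faults trivial to inject.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical shape of the RPC-Backend interface; the real one is not shown in the post.
interface OfficeAdministration {
  List<String> listOffices();
}

// Mock whose reply or failure is chosen by the test.
class MockOfficeAdministration implements OfficeAdministration {
  private final List<String> reply;
  private final RuntimeException failure;

  private MockOfficeAdministration(List<String> reply, RuntimeException failure) {
    this.reply = reply;
    this.failure = failure;
  }

  static MockOfficeAdministration returning(List<String> reply) {
    return new MockOfficeAdministration(reply, null);
  }

  static MockOfficeAdministration throwing(RuntimeException failure) {
    return new MockOfficeAdministration(null, failure);
  }

  @Override
  public List<String> listOffices() {
    if (failure != null) {
      throw failure;
    }
    return reply;
  }
}

// Stand-in for StoreService (RPCImpl). Assumed policy: skip null entries and
// answer with an empty list when the backend fails or returns null.
class StoreServiceSketch {
  private final OfficeAdministration backend;

  StoreServiceSketch(OfficeAdministration backend) {
    this.backend = backend;
  }

  List<String> getOffices() {
    List<String> raw;
    try {
      raw = backend.listOffices();
    } catch (RuntimeException e) {
      return Collections.emptyList();
    }
    List<String> clean = new ArrayList<>();
    if (raw != null) {
      for (String office : raw) {
        if (office != null) {
          clean.add(office);
        }
      }
    }
    return clean;
  }
}

class StoreServiceSketchDemo {
  public static void main(String[] args) {
    List<String> offices = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      offices.add(i == 42 ? null : "office-" + i); // 100 offices, 1 null
    }
    StoreServiceSketch oneNull =
        new StoreServiceSketch(MockOfficeAdministration.returning(offices));
    StoreServiceSketch failing =
        new StoreServiceSketch(
            MockOfficeAdministration.throwing(new IllegalStateException("backend down")));
    System.out.println(oneNull.getOffices().size()); // prints 99
    System.out.println(failing.getOffices().size()); // prints 0
  }
}
```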
Let's see what we have found out so far: We know that
  • the UI works in isolation as expected
  • the StoreService (RPCImpl) appropriately invokes the RPC-Backend-Service
  • the StoreService (RPCImpl) properly handles any error-conditions
  • A little bit about the app's behavior under concurrency
We don't know whether
  • the RPC-Backend-Service really expects the behavior the StoreServiceImpl exposes.
It is easy to see that we can do the same exercise for OfficeAdministration (RPC-Backend) and possibly use a MockBigtable implementation. After that, we would know that:
  • Backend correctly reads from Bigtable
  • Business logic in the backend works correctly
  • Backend knows how to handle error-conditions
  • Backend knows how to handle missing data
We don't know whether
  • Backend is used correctly, i.e. in the way it is intended to be used
Test the OfficeAdministration (RPC-Backend) and StoreService (RPCImpl)

Now let us verify the interaction between OfficeAdministration (RPC-Backend) and StoreService (RPCImpl). This is an essential task, and is not really that difficult. The following points should make testing this quick and easy:
  • Easy to test (through Java API)
  • Easy to understand
  • Ideally contains all the business logic
  • Available early
  • Executes fast (MockBigtable is an option here)
  • Maintenance burden is low (because of stable interfaces)
  • Potentially subset of tests as for StoreService (RPCImpl) alone
Let's see what we have found out so far: We know that
  • the UI works in isolation as expected
  • the OfficeAdministration (RPC-Backend) and the StoreService (RPCImpl) work together as expected
We don't know whether
  • The results find their way to the user

Last but not least ... system test!

Now we need to plug all the components together and do the 'big' system test. In our case, a typical set up would be:
  • Manipulate the "real" Bigtable and populate with "good" data for our test
    • 5 offices, each with 5 products and each with a stock of 5
  • Use Selenium (with the hooks) to
    • Navigate via the Navbar
    • Exclude an item
    • Add an item
    • ...
We now know that all components plugged together can handle one typical use case. We should repeat this test for each function that we can invoke through the UI.

The biggest advantage, however, is that we just need to look for communication issues between all 3 building blocks. We don't need to verify boundary cases, inject network errors, or other things (because we have already verified that earlier!)


Conclusion
Our approach requires that we
  • Understand the system
  • Understand the platform
  • Understand what can go wrong (and where)
  • Start early with our tests
  • Invest in infrastructure to run our tests (mocks, fakes, ...)
What we get in return is
  • Faster test execution
  • Less maintenance for the tests
    • shared ownership
    • early execution > early breakage > easy fix
  • Shorter feedback loops
  • Easier debugging / better localization of bugs due to fewer false negatives.

Root Cause of Singletons

by Miško Hevery

Since I have gotten lots of love/hate mail on the Singletons are Pathological Liars and Where Have All the Singletons Gone posts, I feel obliged to do some root cause analysis.

Let's get the definition right. There is Singleton the design pattern (notice the capital "S", as in the name of something) and there is a singleton as in one of something (notice the lower case "s"). There is nothing wrong with having a single instance of a class; there are lots of reasons why you may want to do that. However, when I complain about Singletons, I complain about the design pattern. Specifically: (1) a private constructor and (2) a global instance variable which refers to the singleton. So from now on when I say Singleton, I mean the design (anti)pattern.

I would say that at this point most developers recognize that global state is harmful to your application design. Singletons have a global instance variable which points to the singleton. The instance is global. The trouble with global variables is that they are transitive. It is not just the global variable marked with static which is global, but any other variable/object which is reachable by traversing the object graph. All of it is global! Singletons are usually complex objects which contain a lot of state. As a result, all of the state of a Singleton is global as well. I like to say that "Singletons are global state in sheep's clothing." Most developers agree that global state is bad, but they love their Singletons.
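A small sketch may help show what "transitive" means here. The class names (Settings, FeatureA, FeatureB) and the "fast-path" flag are invented for illustration. Settings follows the pattern described above: a private constructor plus a global instance. The list it holds is not static, yet it is reachable from the static field, so two classes that never mention each other end up sharing state.

```java
import java.util.ArrayList;
import java.util.List;

// The pattern under discussion: private constructor plus a global instance.
class Settings {
  private static final Settings INSTANCE = new Settings();

  // Not static, but reachable from INSTANCE, and therefore just as global.
  private final List<String> flags = new ArrayList<>();

  private Settings() {}

  static Settings getInstance() {
    return INSTANCE;
  }

  List<String> flags() {
    return flags;
  }
}

class FeatureA {
  // Nothing in this signature says that global state is being written.
  void enable() {
    Settings.getInstance().flags().add("fast-path");
  }
}

class FeatureB {
  // Nothing in this signature says that global state is being read.
  boolean usesFastPath() {
    return Settings.getInstance().flags().contains("fast-path");
  }
}

class GlobalStateDemo {
  public static void main(String[] args) {
    FeatureB b = new FeatureB();
    System.out.println(b.usesFastPath()); // prints false
    new FeatureA().enable();
    System.out.println(b.usesFastPath()); // prints true
  }
}
```

FeatureB's answer changed although nothing was passed to it, which is exactly the kind of hidden coupling that makes tests depend on the order in which they run.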

The moment you traverse a global variable your API lies about its true dependencies (see: Singletons are Pathological Liars). The root problem is not the Singleton design pattern; the root problem here is the global reference to the singleton. But the moment you get rid of the global variable you get rid of the Singleton design pattern. So from my point of view blaming Singletons or blaming global state is one and the same. You can't have a Singleton design pattern and at the same time not have the global state.

Someone pointed out that any design pattern can be abused. I agree, but with the Singleton design pattern, I don't know how I can possibly use it in a good way. The global reference and hence the global state is ever so present. Now, in my line of work I don't see too much global state in the classical sense of the word, but I see a lot of global state masquerading as Singletons. Hence, I complain about Singletons. If I complained about global state, no one would care, as that is old news.

Now, there is one kind of Singleton which is OK. That is a singleton where all of the reachable objects are immutable. If all objects are immutable then the Singleton has no global state, as everything is constant. But it is so easy to turn this kind of singleton into a mutable one that it is a very slippery slope. Therefore, I am against these Singletons too, not because they are bad, but because it is very easy for them to go bad. (As a side note, Java enumerations are just this kind of singleton. As long as you don't put state into your enumeration you are OK, so please don't.)

The other kind of Singletons, which are semi-acceptable, are those which don't affect the execution of your code. They have no "side effects". Logging is a perfect example. It is loaded with Singletons and global state. It is acceptable (as in it will not hurt you) because your application does not behave any differently whether or not a given logger is enabled. The information here flows one way: from your application into the logger. Even though loggers are global state, since no information flows from loggers into your application, loggers are acceptable. You should still inject your logger if you want your test to assert that something is getting logged, but in general loggers are not harmful despite being full of state.

So the root cause is "GLOBAL STATE!" Keep in mind that global state is transitive, so any object which is reachable from a global variable is global as well. It is not possible to have a Singleton and not have global state. Therefore, the Singleton design pattern cannot be used in "the right way." Now you could have an immutable singleton, but outside of limited use as enumerations, they have little value. Most applications are full of Singletons which have lots of global state, and where the information flows in both directions.

Thursday, August 21, 2008

TotT: Sleeping != Synchronization

You've got some code that uses threads, and it's making your tests flaky and slow. How do you fix it? First, most of the code is probably still single-threaded: test those parts separately. But how to test the threading behavior itself?

Often, threaded tests start out using sleeps to wait for something to happen. This test is trying to verify that DelegateToIntern spawns its work argument into a parallel thread and invokes a callback when it's done.

def testInternMakesCoffee(self):

  self.caffeinated = False
  def DrinkCoffee(): self.caffeinated = True

  DelegateToIntern(work=Intern().MakeCoffee, callback=DrinkCoffee)

  self.assertFalse(self.caffeinated, "I watch YouTubework; intern brews")
  time.sleep(60) # 1min should be long enough to make coffee, right?
  self.assertTrue(self.caffeinated, "Where's mah coffee?!?")



Aside from abusing your intern every time you run the test, this test takes a minute longer than it needs to, and it may even fail when the machine (or intern!) is loaded in odd ways. You should always be skeptical of sleep statements, especially in tests. How can we make the test more reliable?

The answer is to explicitly control when things happen within DelegateToIntern with a threading.Event in Python, a Notification in C++, or a CountDownLatch(1) in Java.

# Requires: from threading import Event
def testInternMakesCoffee(self):
  is_started, can_finish, is_done = Event(), Event(), Event()

  def FakeCoffeeMaker():
    is_started.set() # Allow is_started.wait() to return.
    # Wait up to 1min for can_finish.set() to be called. The timeout
    # prevents failures from hanging, but doesn't delay a passing test.
    can_finish.wait(timeout=60) # .await() in Java

  DelegateToIntern(work=FakeCoffeeMaker, callback=is_done.set)

  is_started.wait(timeout=60)
  self.assertTrue(is_started.is_set(), "FakeCoffeeMaker should have started")
  self.assertFalse(is_done.is_set(), "Don't bug me before coffee's made")

  can_finish.set() # Now let FakeCoffeeMaker return.
  is_done.wait(timeout=60)
  self.assertTrue(is_done.is_set(), "Intern should ping when coffee's ready")



Now we're guaranteed that no part of the test runs faster than we expect, and the test passes very quickly. It could run slowly when it fails, but you can easily lower the timeouts while you're debugging it.
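For Java readers, the same test can be sketched with CountDownLatch(1) in place of each Event. This is an illustration under assumptions, not the original code: delegateToIntern here is a made-up stand-in that runs the work on a new thread and then invokes the callback, and the test method returns a boolean instead of using a test framework so that the sketch stays self-contained.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class Delegator {
  // Hypothetical Java analogue of DelegateToIntern: run the work on a new
  // thread, then invoke the callback on that same thread.
  static Thread delegateToIntern(Runnable work, Runnable callback) {
    Thread t = new Thread(() -> {
      work.run();
      callback.run();
    });
    t.start();
    return t;
  }
}

class InternLatchDemo {
  // Returns true if the work started, the callback had not fired while the
  // work was still blocked, and the callback fired once the work was released.
  static boolean internMakesCoffee() {
    CountDownLatch isStarted = new CountDownLatch(1);
    CountDownLatch canFinish = new CountDownLatch(1);
    CountDownLatch isDone = new CountDownLatch(1);

    Runnable fakeCoffeeMaker = () -> {
      isStarted.countDown(); // Allow isStarted.await() to return.
      try {
        // The timeout prevents failures from hanging, but does not delay a passing test.
        canFinish.await(60, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };

    Delegator.delegateToIntern(fakeCoffeeMaker, isDone::countDown);

    try {
      boolean started = isStarted.await(60, TimeUnit.SECONDS);
      boolean doneTooEarly = isDone.getCount() == 0;
      canFinish.countDown(); // Now let fakeCoffeeMaker return.
      boolean done = isDone.await(60, TimeUnit.SECONDS);
      return started && !doneTooEarly && done;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(internMakesCoffee()); // prints true
  }
}
```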

We'll look at testing for race conditions in a future episode.

No interns were harmed in the making of this TotT.

Remember to download this episode of Testing on the Toilet and post it in your office.

Where Have All the Singletons Gone?

by Miško Hevery

In Singletons are Pathological Liars we discussed the problems of having singletons in your code. Let's build on that and answer the question "If I don't have singletons how do I ensure there is only one instance of X and how do I get X to all of the places it is needed?"

An OO application is a graph of objects. There are three different kinds of graphs I think of when I design an application:

  1. Collaborator Graph: This is the graph of objects that would be emitted if you serialized your application. This shows which objects are aware of which others. (through object's fields)

  2. Construction Graph: This graph shows which object created which other ones.

  3. Call Graph: This graph shows which other methods each method calls. A stack-trace would be a single slice through this graph.


If the new operators are mixed with application logic (see: How to Think About the new Operator) then the Construction Graph and the Collaborator Graph tend to be one and the same. However, in an application which uses Dependency Injection the two graphs are completely independent. Let's have a closer look at our CreditCardProcessor example. Suppose this is our collaborator graph which we need to execute a request.


The above shows the application collaborator graph. The letter (S/R) in the corner designates object lifetime; either Singleton or Request scope. Now, just to be clear, there is nothing wrong with having a single instance of a class. The problem arises only when the singleton is available through a global "instance" variable as in Singleton.getInstance().

The HTTP request would come to AuthenticatorPage, which would collaborate with Authenticator to make sure the user is valid and forward a valid request on to ChargePage, which would then try to load the user from UserRepository and create the credit card transaction, which would be processed by CreditCardProcessor. This in turn would collaborate with OfflineQueue to get the work done.

Now, in order to have a testable codebase we have to make sure that we don't mix the object construction with application logic. So all of the above objects should rarely call the new operator (value objects are OK). Instead each of the objects above would declare its collaborators in the constructor. AuthenticatorPage would ask for ChargePage and Authenticator. ChargePage would ask for CreditCardProcessor and UserRepository. And so on. We have moved the problem of construction elsewhere.

In our tests it is now easy to instantiate the graph of objects and substitute test-doubles for our collaborators. For example, if we would like to test the AuthenticatorPage, we would instantiate a real AuthenticatorPage with a mock ChargePage and a mock Authenticator. We would then assert that a request which comes in causes appropriate calls on Authenticator, and on ChargePage only if authentication is successful. If the AuthenticatorPage were to get a reference to Authenticator from global state or by constructing it, we would not be able to replace the Authenticator with a test-double. (This is why it is so important not to mix object construction with application logic. In the unit-test what you instantiate is a sub-set of the whole application. Hence the instantiation logic has to be separate from application logic! Otherwise, it's a non-starter.)
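Such a test could look roughly like the sketch below. The method names (isValid, handle), the use of interfaces for Authenticator and ChargePage, and the user names are assumptions made for illustration; the post does not show the real signatures. The test doubles are simple lambdas rather than a mocking library.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed collaborator interfaces; the post does not show their real methods.
interface Authenticator {
  boolean isValid(String user);
}

interface ChargePage {
  void handle(String user);
}

// The class under test declares its collaborators in the constructor.
class AuthenticatorPage {
  private final ChargePage chargePage;
  private final Authenticator authenticator;

  AuthenticatorPage(ChargePage chargePage, Authenticator authenticator) {
    this.chargePage = chargePage;
    this.authenticator = authenticator;
  }

  // Forward the request only if authentication succeeds.
  void handle(String user) {
    if (authenticator.isValid(user)) {
      chargePage.handle(user);
    }
  }
}

class AuthenticatorPageDemo {
  // Sends each user through a real AuthenticatorPage wired to test doubles
  // and returns the users that reached the ChargePage double.
  static List<String> forwardedFor(String... users) {
    List<String> forwarded = new ArrayList<>();
    Authenticator onlyAlice = user -> "alice".equals(user); // test double
    ChargePage recorder = forwarded::add;                   // test double
    AuthenticatorPage page = new AuthenticatorPage(recorder, onlyAlice);
    for (String user : users) {
      page.handle(user);
    }
    return forwarded;
  }

  public static void main(String[] args) {
    System.out.println(forwardedFor("alice", "mallory")); // prints [alice]
  }
}
```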

So now the problem is, how do we construct the graph of objects?


In short we move all of the new operators to a factory. We group all of the objects of similar lifetime into a single factory. In our case all of the singletons end up in ApplicationFactory and all of the Pages end up in RequestFactory. The main method of our application instantiates an ApplicationFactory. When we call build() the ApplicationFactory in turn instantiates its list of objects (Database, OfflineQueue, Authenticator, UserRepository, CreditCardProcessor and RequestFactory). Because each of the objects declares its dependency, the ApplicationFactory is forced to instantiate the objects in the right order. In our case it must instantiate the Database first and then pass the reference to UserRepository and OfflineQueue. (The code will simply not compile any other way.)

Notice that when we create a RequestFactory we must pass in references to the Authenticator, UserRepository and CreditCardProcessor. This is because when we call build() on RequestFactory it will try to instantiate AuthenticatorPage which needs the Authenticator. So we need to pass the Authenticator into the constructor of RequestFactory and so on.

At run-time an HTTP request comes in. The servlet has a reference to RequestFactory and calls build(). The servlet now has a reference to the AuthenticatorPage and it can dispatch the request for processing.
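The two factories might be sketched like this. The class names come from the description above, but their bodies are empty stand-ins, and the exact shape of build() (here, ApplicationFactory.build() returns the RequestFactory) is an assumption made so the example fits in a few lines.

```java
// Simplified stand-ins for the objects in the collaborator graph.
class Database {}

class OfflineQueue {
  final Database db;
  OfflineQueue(Database db) { this.db = db; }
}

class UserRepository {
  final Database db;
  UserRepository(Database db) { this.db = db; }
}

class Authenticator {}

class CreditCardProcessor {
  final OfflineQueue queue;
  CreditCardProcessor(OfflineQueue queue) { this.queue = queue; }
}

class ChargePage {
  final CreditCardProcessor processor;
  final UserRepository users;
  ChargePage(CreditCardProcessor processor, UserRepository users) {
    this.processor = processor;
    this.users = users;
  }
}

class AuthenticatorPage {
  final ChargePage chargePage;
  final Authenticator authenticator;
  AuthenticatorPage(ChargePage chargePage, Authenticator authenticator) {
    this.chargePage = chargePage;
    this.authenticator = authenticator;
  }
}

// Request scope: builds a fresh page graph for every request.
class RequestFactory {
  private final Authenticator authenticator;
  private final UserRepository users;
  private final CreditCardProcessor processor;

  RequestFactory(Authenticator authenticator, UserRepository users,
      CreditCardProcessor processor) {
    this.authenticator = authenticator;
    this.users = users;
    this.processor = processor;
  }

  AuthenticatorPage build() {
    ChargePage chargePage = new ChargePage(processor, users);
    return new AuthenticatorPage(chargePage, authenticator);
  }
}

// Singleton scope: the constructors force this order of instantiation.
class ApplicationFactory {
  RequestFactory build() {
    Database database = new Database();
    OfflineQueue queue = new OfflineQueue(database);
    Authenticator authenticator = new Authenticator();
    UserRepository users = new UserRepository(database);
    CreditCardProcessor processor = new CreditCardProcessor(queue);
    return new RequestFactory(authenticator, users, processor);
  }
}

class FactoryDemo {
  public static void main(String[] args) {
    RequestFactory requests = new ApplicationFactory().build(); // once, in main
    AuthenticatorPage first = requests.build();                 // once per request
    AuthenticatorPage second = requests.build();
    System.out.println(first != second);                            // prints true
    System.out.println(first.authenticator == second.authenticator); // prints true
  }
}
```

Each request gets its own pages, while the Authenticator and the other long-lived objects exist exactly once, without any static field or getInstance() method.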

Important things to notice:

  • Every object only has references to what it needs directly! No passing around of objects which are not directly needed by the code. There is no global state at all. Dependencies are obvious since each object only asks for what it needs.

  • If an object needs a reference to a new dependency it simply declares it. This change only affects the corresponding factory, and as a result, it is very isolated.

  • All of the new operators end up in the factories; application logic is devoid of new operators.

  • You group all of the objects with the same lifetime into a single factory (If the factory gets too big you can break it up into more classes, but you can still think of it as a single factory)

  • The problem of "how do I ensure that I only have one of something" is nicely sidestepped. You instantiate only a single ApplicationFactory in your main, and as a result, you only instantiate a single instance of all of your singletons.


Now the factories become largely a series of object creations. Totally boring stuff, so boring a computer could generate the code. Simply look at the constructor and recursively instantiate whatever the constructor wants. Wait, a computer can generate it! It's called PicoContainer or Guice! So you don't actually have to write the factories.

Sunday, August 17, 2008

Singletons are Pathological Liars

by Miško Hevery

So you join a new project, which has an extensive mature code base. Your new lead asks you to implement a new feature, and, as a good developer, you start by writing a test. But since you are new to the project, you do a lot of exploratory "What happens if I execute this method" tests. You start by writing this:

testCreditCardCharge() {
  CreditCard c = new CreditCard(
      "1234 5678 9012 3456", 5, 2008);
  c.charge(100);
}

This code:

  • Only works when you run as part of the suite.
  • When run in isolation, throws NullPointerException.
  • When you get your credit card bill, you are out $100 for every time the test runs.

Now, I want to focus on the last point. How in the world did the test cause an actual charge on my credit card? Charging a credit card is not easy. The test needs to talk to a third party credit card web-service. It needs to know the URL for the web-service. It needs to authenticate, pass the credentials, and identify who the merchant is. None of this information is present in the test. Worse yet, since I don't even know where that information is present, how do I mock out the external dependencies so that every run does not result in $100 actually being charged? And as a new developer, how was I supposed to know that what I was about to do was going to result in me being $100 poorer? That is "Spooky action at a distance!"

But why do I get NullPointerException in isolation while the test works fine when run as part of the suite? And how do I fix it? Short of digging through lots of source code, you go and ask the more senior and wiser people on the project. After a lot of digging, you learn that you need to initialize the CreditCardProcessor.

testCreditCardCharge() {
  CreditCardProcessor.init();
  CreditCard c = new CreditCard(
      "1234 5678 9012 3456", 5, 2008);
  c.charge(100);
}

You run the test again; still no success, and you get a different exception. Again, you chat with the senior and wiser members of the project. Someone tells you that the CreditCardProcessor needs an OfflineQueue to run.

testCreditCardCharge() {
  OfflineQueue.init();
  CreditCardProcessor.init();
  CreditCard c = new CreditCard(
      "1234 5678 9012 3456", 5, 2008);
  c.charge(100);
}

Excited, you run the test again: nothing. Yet another exception. You go in search of answers and come back with the knowledge that the Database needs to be initialized in order for the Queue to store the data.

testCreditCardCharge() {
  Database.init();
  OfflineQueue.init();
  CreditCardProcessor.init();
  CreditCard c = new CreditCard(
      "1234 5678 9012 3456", 5, 2008);
  c.charge(100);
}

Finally, the test passes in isolation, and again you are out $100. (Chances are that the test will now fail in the suite, so you will have to surround your initialization logic with "if not initialized then initialize" code.)

The problem is that the APIs are pathological liars. The credit card pretends that you can just instantiate it and call the charge method. But secretly, it collaborates with the CreditCardProcessor. The CreditCardProcessor API says that it can be initialized in isolation, but in reality, it needs the OfflineQueue. The OfflineQueue needs the database. To the developers who wrote this code, it is obvious that the CreditCard needs the CreditCardProcessor. They wrote the code that way. But to anyone new on the project, this is a total mystery, and it hinders the learning curve.

But there is more! When I see the code above, as far as I can tell, the three init statements and the credit card instantiation are independent. They can happen in any order. So when I am re-factoring code, it is likely that I will move and rearrange the order as a side-effect of cleaning up something else. I could easily end up with something like this:

testCreditCardCharge() {
CreditCardProcessor.init();
OfflineQueue.init();
CreditCard c = new CreditCard(
"1234 5678 9012 3456", 5, 2008);
c.charge(100);
Database.init();
}

The code just stopped working, but I had no way of knowing that ahead of time. Most developers would be able to guess that these statements are related in this simple example, but on a real project, the initialization code is usually spread over many classes, and you might very well initialize hundreds of objects. The exact order of initialization becomes a mystery.

How do we fix that? Easy! Have the API declare the dependency!

testCreditCardCharge() {
Database db = new Database();
OfflineQueue q = new OfflineQueue(db);
CreditCardProcessor ccp = new CreditCardProcessor(q);
CreditCard c = new CreditCard(
"1234 5678 9012 3456", 5, 2008);
c.charge(ccp, 100);
}

Since the CreditCard charge method declares that it needs a CreditCardProcessor, I don't have to go ask anyone about that. The code will simply not compile without it. I have a clear hint that I need to instantiate a CreditCardProcessor. When I try to instantiate the CreditCardProcessor, I am faced with supplying an OfflineQueue. In turn, when trying to instantiate the OfflineQueue, I need to create a Database. The order of instantiation is clear! Not only is it clear, but it is impossible to place the statements in the wrong order, as the code will not compile. Finally, explicit reference passing makes all of the objects subject to garbage collection at the end of the test; therefore, this test can not cause any other test to fail when run in the suite.

The best benefit is that now, you have seams where you can mock out the collaborators so that you don't keep getting charged $100 each time you run the test. You even have choices. You can mock out CreditCardProcessor, or you can use a real CreditCardProcessor and mock out OfflineQueue, and so on.
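Here is a minimal sketch of such a seam. It assumes CreditCardProcessor is (or can be extracted into) an interface. The article never shows the processor's methods, so process(...) and the RecordingProcessor test double are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Assumed seam: the article does not show this type's methods.
interface CreditCardProcessor {
    void process(String cardNumber, int amount);
}

// Test double (hypothetical): records charges instead of contacting a bank.
class RecordingProcessor implements CreditCardProcessor {
    final List<Integer> charged = new ArrayList<>();

    @Override
    public void process(String cardNumber, int amount) {
        charged.add(amount);
    }
}

class CreditCard {
    private final String number;
    private final int expiryMonth;
    private final int expiryYear;

    CreditCard(String number, int expiryMonth, int expiryYear) {
        this.number = number;
        this.expiryMonth = expiryMonth;
        this.expiryYear = expiryYear;
    }

    // The collaborator is declared in the signature, as in the article.
    void charge(CreditCardProcessor processor, int amount) {
        processor.process(number, amount);
    }
}

class CreditCardChargeDemo {
    public static void main(String[] args) {
        RecordingProcessor fake = new RecordingProcessor();
        CreditCard c = new CreditCard("1234 5678 9012 3456", 5, 2008);
        c.charge(fake, 100);
        // The test can inspect what was requested; no real money moves.
        System.out.println("Recorded charges: " + fake.charged);
    }
}
```

Because the test hands in the double itself, no OfflineQueue or Database has to exist at all for this particular test.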

Singletons are nothing more than global state. Global state makes it so your objects can secretly get hold of things which are not declared in their APIs, and, as a result, Singletons make your APIs into pathological liars.
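To make the "secretly get hold of things" part concrete, here is a sketch of what the hidden dependency might look like from the inside. The article only shows how these classes are called, so the implementations below (including the constructor-based OfflineQueue and the reset() helper) are assumptions for illustration:

```java
// Hypothetical implementations; the article shows only the call sites.
class Database {
    private static Database instance;  // the global state

    static void init() {
        instance = new Database();
    }

    static Database getInstance() {
        if (instance == null) {
            throw new IllegalStateException("Database.init() was never called");
        }
        return instance;
    }

    // Test-only escape hatch; needing one is itself a symptom of global state.
    static void reset() {
        instance = null;
    }
}

class OfflineQueue {
    private final Database db;

    // The signature claims "no dependencies", but the body reaches for the global.
    OfflineQueue() {
        this.db = Database.getInstance();
    }
}

class HiddenDependencyDemo {
    public static void main(String[] args) {
        try {
            new OfflineQueue();
        } catch (IllegalStateException e) {
            System.out.println("Surprise: " + e.getMessage());
        }
        Database.init();
        new OfflineQueue();  // works now, though nothing in the API said why
    }
}
```

Nothing in the OfflineQueue constructor signature tells the caller that Database.init() must run first; the only way to find out is to hit the exception or read the source.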

Think of it another way. You can live in a society where everyone (every class) declares who their friends (collaborators) are. If I know that Joe knows Mary but neither Mary nor Joe knows Tim, then it is safe for me to assume that if I give some information to Joe he may give it to Mary, but under no circumstances will Tim get hold of it. Now, imagine that everyone (every class) declares some of their friends (collaborators), but other friends (collaborators which are singletons) are kept secret. Now you are left wondering how in the world Tim got hold of the information you gave to Joe.

Here is the interesting part. If you are the person who built the relationships (code) originally, you know the true dependencies, but anyone who comes after you is baffled, since the friends which are declared are not the sole friends of objects, and information flows in some secret paths which are not clear to you. You live in a society full of liars.

Thursday, August 14, 2008

TotT: 100 and counting

(This week, TotT issued our 100th internally published episode. That's more than have been published to this Testing Blog -- after all, the internal episodes had an 8-month head start, and many would make no sense to readers outside our own stalls -- but we thought you'd still enjoy reading about the history and future of TotT.)

Did you know there was a time before Testing on the Toilet? It's true! 19.3% of Googlers remember back before TotT's weekly entertainment and testing advice. In this 100th episode, let's reminisce a bit, then look toward our future ... and how you can help keep our toilets humorous and informative.

After our first episode (May 2, 2006), TotT was met with some skepticism from Googlers and others, including Slashdot.org, who said "It [is] faintly reminiscent of a cult." But soon Google embraced TotT. A few weeks later, someone complained on a mailing list: "Why wasn't I informed of this [new testing] technique at my nearby toilet?" Nooglers eagerly sign up to distribute episodes with our motto: “Debugging sucks. Testing rocks!”

Some mottos that didn't make the cut:
"Testing on the Toilet: it's not just for pregnancy anymore!"
"Make software testing your number one priority!"
"Testing: you can't just wash your hands of it."

Now, TotT appears weekly:
  • In hundreds of stalls in 30 Google offices
  • With episodes covering many programming languages and application domains
  • Written by volunteer authors from offices worldwide.
  • TotT is also published to fans outside our walls on Google's public Testing Blog.

It's all done by a volunteer, grassroots effort of dedicated, passionate Googlers. This bottom-up activism – engineers making other engineers' lives better – is a hallmark of Google culture.

Other grouplets have adopted TotT's techniques to effectively spread their own messages in flyers like Hiring on the Table, and “TotT Presents” guest spots have shown items such as Production on the Potty.



Little known fact: TotT's testing advocacy is a leading factor in the recovery of the red kangaroo population, due to the drop in demand for “build red” phosphorus.

Remember to download this special episode of Testing on the Toilet and post it in your office.

Thursday, August 07, 2008

TotT: A Matter of Black and White

The progressive developer knows that in this complex modern world, things aren't always black-and-white, even in testing. Sometimes we know the software won't return the best answer, or even a correct answer, for all input. You may be tempted to write only test cases that your software can pass. After all, we can't have automated tests that fail in even one instance. But, this would miss an opportunity.


Speaking of black-and-white, take decoding of two-dimensional barcodes, like QR Codes. From a blurry, skewed, rotated image of a barcode, software has to pick it out, transform it, and decode it:




http://google.com/gwt/n?u=bluenile.com


Even the best software can't always find that barcode. What should tests for such software do?


We have some answers, from experience testing such software. We have two groups of black-box tests that verify that images decode correctly: must-have and nice-to-have. Tests verify that the must-have set – the easy images – definitely decode correctly. This is what traditional tests would include, which typically demand a 100% pass rate. But we also see how well we do on the more difficult nice-to-have set. We might verify that 50% of them decode, and fail otherwise.


The advantage? We can include tougher test cases in unit tests, instead of avoiding them. We can observe small changes – improvements as well as degradations – in decode accuracy over time. It doubles as a crude quality evaluation framework.
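As a rough sketch of the two-tier idea, the following computes a pass rate over a set of inputs and applies a different threshold to each set. The stand-in "decoder" (a predicate on file names) and all class and file names are hypothetical; a real test would call the actual decoder on real images:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class DecodeAccuracy {
    // Fraction of inputs the decoder handles; an empty set counts as fully passing.
    static <T> double passRate(List<T> inputs, Predicate<T> decodes) {
        if (inputs.isEmpty()) {
            return 1.0;
        }
        long passed = inputs.stream().filter(decodes).count();
        return (double) passed / inputs.size();
    }
}

class DecodeAccuracyDemo {
    public static void main(String[] args) {
        // Stand-in decoder (assumption): "decodes" any image whose name lacks "blurry".
        Predicate<String> decodes = name -> !name.contains("blurry");

        List<String> mustHave = Arrays.asList("clean-1.png", "clean-2.png");
        List<String> niceToHave = Arrays.asList(
            "skewed-1.png", "blurry-1.png", "rotated-1.png", "blurry-2.png");

        // Must-have set: every image has to decode.
        if (DecodeAccuracy.passRate(mustHave, decodes) < 1.0) {
            throw new AssertionError("a must-have image failed to decode");
        }

        // Nice-to-have set: fail only if fewer than 50% decode.
        double rate = DecodeAccuracy.passRate(niceToHave, decodes);
        if (rate < 0.5) {
            throw new AssertionError("nice-to-have pass rate too low: " + rate);
        }
        System.out.println("nice-to-have pass rate: " + rate);
    }
}
```

Logging the nice-to-have rate on every run, rather than only checking the threshold, is what lets the suite double as the crude quality tracker described above.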


Where can this progressive thinking be applied? Maybe when your code...


Only needs to be correct in most cases. As here, write tests to verify easy cases work, but also that some hard cases pass too.


Needs to be fast. You write unit tests that verify it runs "fast enough" on simple input. How about writing tests that make sure it runs "fast enough" on most of some larger inputs too?


Is heuristic. You write unit tests that verify that the answer is “really close” to optimal on simple input, but also that it's “kind of close” on difficult input.


By the way, did we mention project ZXing, Google's open-source decoder project? Or that Print Ads is already helping clients place these two-dimensional barcodes in the New York Times? Or that there are other formats like Data Matrix? Or that you can put more than just a URL in these barcodes? This is a technology going global, so it's time to read up on it.


Remember to download this episode of Testing on the Toilet and post it in your office.

Wednesday, August 06, 2008

Writing Testable Code

by Miško Hevery

So you decided to finally give this testing thing a try. But somehow you just can't figure out how to write a unit test for your class. Well, there are no tricks to writing tests; there are only tricks to writing testable code. If I gave you testable code, you would have no problem writing a test for it. But somehow you look at your code and say, "I understand how to write tests for your code, but my code is different." Well, your code is different because you violated one or more of the following rules. (I will go into the details of each in separate blog posts.)

  1. Mixing object graph construction with application logic: In a test, you want to instantiate a portion of your application (ideally just the class under test), apply some stimulus to the class, and assert that the expected behavior was observed. In order to instantiate a class in isolation, we have to make sure that the class itself does not instantiate other objects (and that those objects do not instantiate more objects, and so on). Most developers freely mix the "new" operator with the application logic. In order to have a testable code base, your application should have two kinds of classes: the factories, which are full of "new" operators and are responsible for building the object graph of your application but don't do anything else, and the application logic classes, which are devoid of the "new" operator and are responsible for doing the work. In tests we want to test the application logic. And because the application logic is devoid of the "new" operator, we can easily construct an object graph useful for testing, where we can strategically replace the real classes with test doubles. (see: How to Think About the “new” Operator with Respect to Unit Testing)

  2. Ask for things, don't look for things (aka Dependency Injection / Law of Demeter): OK, you got rid of the new operators in your application code. But how do you get hold of the dependencies? Simple: just ask for all of the collaborators you need in your constructor. If you are a House class, then in your constructor you will ask for the Kitchen, LivingRoom, and BedRoom; you will not call the "new" operator on those classes (see 1). Only ask for things you directly need: if you are a CarEngine, don't ask for FuelTank, only ask for Fuel. Don't pass in a context/registry/service-locator. So if you are a LoginPage, don't ask for UserContext; instead ask for the User and the Authenticator. Finally, don't mix the responsibility of work with configuration: if you are an Authenticator class, don't pass in the path of the configuration information, which you then read inside the constructor to configure yourself; just ask for the configuration object and let some other class worry about reading the object from the disk. In your tests you will not want to write a configuration to disk just so that your object can read it in again. (see: Breaking the Law of Demeter is Like Looking for a Needle in the Haystack)

  3. Doing work in constructor: A class under test can have tens of tests. Each test instantiates a slightly different object graph and then applies some stimulus and asserts a response. As you can see, the most common operation you will do in tests is instantiation of object graphs, so make it easy on yourself and make the constructors do no work (other than assigning all of the dependencies into the fields). Any work you do in a constructor, you will have to successfully navigate through on every instantiation (read: every test). This may be benign, or it may be something really complex like reading configuration information from the disk. And it is not just the direct tests for the class that will have to pay this price; so will any related test which tries to instantiate your class indirectly as part of some larger object graph which the test is trying to create.

  4. Global State: Global state is bad from a theoretical, maintainability, and understandability point of view, but is tolerable at run-time as long as you have one instance of your application. However, each test is a small instantiation of your application, in contrast to the one instance of the application in production. The global state persists from one test to the next and creates mass confusion. Tests work in isolation but not together. Worse yet, tests fail together but the problems cannot be reproduced in isolation. The order of the tests matters. The APIs are not clear about the order of initialization and object instantiation, and so on. I hope that by now most developers agree that global state should be treated like GOTO.

  5. Singletons (global state in sheep's clothing): It amazes me that many developers will agree that global state is bad, yet their code is full of singletons. (By singletons I mean classes which enforce their own singleton-ness through a private constructor and a global instance variable.) The core of the issue is that global instance variables are transitive! All of the internal objects of the singleton are global as well (and the internals of those objects are global as well... recursively). Singletons are by far the most subtle and insidious thing in unit testing. I will post more blogs on this topic later, as I am sure it will draw comments from both sides.

  6. Static methods (or living in a procedural world): The key to testing is the presence of seams (places where you can divert the normal execution flow). Seams are essentially polymorphism (polymorphism: at compile-time, the method you are calling cannot be determined). Seams are needed so that you can isolate the unit under test. If you build an application with nothing but static methods, you have a procedural application. Procedural code has no seams; at compile-time it is clear which method calls which other method. I don't know how to test an application without seams. How much a static method will hurt from a testing point of view depends on where it is in your application's call graph. A leaf method such as Math.abs() is not a problem, since the execution call graph ends there. But if you pick a method in the core of your application logic, then everything behind the method becomes hard to test, since there is no way to insert test doubles (and there are no seams). Additionally, it is really easy for a leaf method to stop being a leaf, and then a method which was OK as static no longer is. I don't know how to unit-test the main method!

  7. Favor composition over inheritance: At run-time you cannot choose a different inheritance, but you can choose a different composition. This is important for tests, as we want to test things in isolation. Many developers use inheritance for code reuse, which is wrong. Whether or not inheritance is appropriate depends on whether polymorphism is going on. Inheriting from AuthenticatedServlet will make your subclass very hard to test, since every test will have to mock out the authentication. This clutters the focus of the test with the things we have to do to successfully navigate the superclass. But what if AuthenticatedServlet inherits from DbTransactionServlet? (That gets so much harder.)

  8. Favor polymorphism over conditionals: If you see a switch statement, you should think polymorphism. If you see the same if condition repeated in many places in your class, you should again think polymorphism. Polymorphism will break your complex class into several smaller, simpler classes which clearly define which pieces of the code are related and execute together. This helps testing, since a simpler, smaller class is easier to test.

  9. Mixing Service Objects with Value Objects: There should be two kinds of objects in your application. (1) Value-objects: these tend to have lots of getters / setters, are very easy to construct, are never mocked, and probably don't need an interface (Example: LinkedList, Map, User, EmailAddress, Email, CreditCard, etc.). (2) Service-objects, which do the interesting work: their constructors ask for lots of other objects for collaboration, they are good candidates for mocking, and they tend to have an interface and multiple implementations (Example: MailServer, CreditCardProcessor, UserAuthenticator, AddressValidator). A value-object should never take a service-object in its constructor (since then it is not easy to construct). Value-objects are the leaves of your application graph and tend to be created freely with the "new" operator directly in line with your business logic (an exception to point 1, since they are leaves). Service-objects are harder to construct and as a result are never constructed with a new operator in-line (instead use a factory / DI-framework for the object graph construction). Service-objects don't take value-objects in their constructors, since DI-frameworks tend to be unaware of how to create a value-object. From a testing point of view we like value-objects, since we can just create them on the fly and assert on their state. Service-objects are harder to test, since their state is not clear and they are all about collaboration; as a result we are forced to use mocking, something which we want to minimize. Mixing the two creates a hybrid which has none of the advantages of value-objects and all the baggage of service-objects.

  10. Mixing of Concerns: If summing up what the class does includes the word "and", or the class would be challenging for new team members to read and quickly "get", or the class has fields that are only used in some methods, or the class has static methods that only operate on parameters, then you have a class which mixes concerns. These classes are hard to test, since there are multiple objects hiding inside of them, and as a result you are testing all of the objects at once.


So here is my top 10 list on testability. The trick is translating these abstract concepts into concrete decisions in your code.
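As one possible translation of items 1 to 3, here is a small sketch built on the LoginPage example from the list. The post names User, Authenticator and LoginPage but none of their methods, so every signature below, along with InMemoryAuthenticator and LoginPageFactory, is an assumption made for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Value object: trivial to construct, never mocked.
class User {
    final String name;

    User(String name) {
        this.name = name;
    }
}

// Service object behind an interface, so tests can substitute a double.
interface Authenticator {
    boolean authenticate(String userName, String password);
}

class InMemoryAuthenticator implements Authenticator {
    private final Map<String, String> passwords;

    // Asks for already-loaded configuration instead of a file path (item 2).
    InMemoryAuthenticator(Map<String, String> passwords) {
        this.passwords = passwords;
    }

    @Override
    public boolean authenticate(String userName, String password) {
        return password != null && password.equals(passwords.get(userName));
    }
}

class LoginPage {
    private final User user;
    private final Authenticator authenticator;

    // The constructor only assigns its collaborators (items 2 and 3).
    LoginPage(User user, Authenticator authenticator) {
        this.user = user;
        this.authenticator = authenticator;
    }

    boolean login(String password) {
        return authenticator.authenticate(user.name, password);
    }
}

// Factory: the one place where service objects are wired with "new" (item 1).
class LoginPageFactory {
    static LoginPage create(User user, Map<String, String> passwords) {
        return new LoginPage(user, new InMemoryAuthenticator(passwords));
    }
}

class LoginPageDemo {
    public static void main(String[] args) {
        Map<String, String> passwords = new HashMap<>();
        passwords.put("joe", "secret");
        LoginPage page = LoginPageFactory.create(new User("joe"), passwords);
        System.out.println("real wiring, correct password: " + page.login("secret"));

        // In a test, a lambda can stand in for the real Authenticator.
        LoginPage denied = new LoginPage(new User("joe"), (name, pw) -> false);
        System.out.println("test double that always denies: " + denied.login("secret"));
    }
}
```

Because LoginPage never calls "new" on its collaborators, a test can build it directly with whatever Authenticator suits the scenario and skip the factory entirely.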

Friday, August 01, 2008

GTAC Attendance Application deadline Aug15

Posted by Lydia Ash, GTAC Conference Chair - 2008

A brief reminder that there are only two weeks left to submit applications for this year's Google Test Automation Conference. The application deadline is August 15th, after which the selection process will begin.

The Call For Attendance application is available on the Google website here. http://services.google.com/events/gtac2008

See you in October!