Wednesday, January 24, 2007

TotT: Better Stubbing in Python

So you've learned all about method stubs, mock objects, and fakes. You might be tempted to stub out slow or I/O-dependent built-ins. For example:


def Foo(path):
  if os.path.exists(path):
    return DoSomething()
  else:
    return DoSomethingElse()

def testFoo(self):  # Somewhere in your unit test class
  old_exists = os.path.exists
  try:
    os.path.exists = lambda x: True
    self.assertEqual(Foo('bar'), something)
    os.path.exists = lambda x: False
    self.assertEqual(Foo('bar'), something_else)
  finally:
    # Remember to clean up after yourself!
    os.path.exists = old_exists

Congratulations, you just achieved 100% coverage! Unfortunately, you might find that this test fails in strange ways. For example, suppose DoSomethingElse checks for the existence of a different file:


def DoSomethingElse():
  assert os.path.exists(some_other_file)
  return some_other_file

Foo will now raise an AssertionError on its second invocation: os.path.exists is still stubbed to return False for every path, so the assertion inside DoSomethingElse fails even if some_other_file really exists.
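
To see the failure for yourself, here is a minimal sketch. It assumes hypothetical stand-ins that are not in the original code: DoSomething returns a placeholder string, and a temporary file plays the role of some_other_file.

```python
import os
import tempfile

# Hypothetical stand-in for some_other_file: a real file that does exist.
_handle, some_other_file = tempfile.mkstemp()
os.close(_handle)


def DoSomething():
    return 'something'  # Hypothetical return value.


def DoSomethingElse():
    assert os.path.exists(some_other_file)
    return some_other_file


def Foo(path):
    if os.path.exists(path):
        return DoSomething()
    else:
        return DoSomethingElse()


old_exists = os.path.exists
try:
    # Global stub: every caller of os.path.exists now sees False.
    os.path.exists = lambda x: False
    try:
        Foo('bar')
        failed = False
    except AssertionError:
        # The stub also answered False for some_other_file,
        # even though that file is really on disk.
        failed = True
finally:
    os.path.exists = old_exists
    os.remove(some_other_file)

print(failed)  # prints True
```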


You could avoid this problem by stubbing or mocking out DoSomethingElse, but the task might be daunting in a real-life situation. Instead, it is safer and faster to parameterize the built-in:


def Foo(path, path_checker=os.path.exists):
  if path_checker(path):
    return DoSomething()
  else:
    return DoSomethingElse()

def testFoo(self):
  self.assertEqual(Foo('bar', lambda x: True), something)
  self.assertEqual(Foo('bar', lambda x: False), something_else)

Remember to download this episode of Testing on the Toilet, print it, and flyer your office.

22 comments:

  1. Interesting technique. I think the article would be improved if it also mentioned that the technique comes at the cost of adding an extra, unintuitive parameter to your code base. Subsequent maintainers of the code will probably find it confusing to see path_checker as a parameter, which makes the codebase less maintainable.

  2. Is that really any better? Now the function signature is polluted with arguments for everything that I might want to use this technique for -- which could be considerable.

  3. I think pretty much anything in this area quickly turns into an ugly hack, but modifying Foo's parameters is probably worse than the alternatives.

    A transparent solution can be built by inspecting the stack and checking who the caller is (via Python's inspect module): if it's the function you want to test, use the dummy os.path.exists; otherwise use the real one. I would provide an example, but whitespace formatting is not supported in the comments here...

    That said surely DoSomething and DoSomethingElse should be stubbed out as well? It's only Foo being tested ;)

  4. Another possibility is to make the stub check the argument, return a value for 'bar' and call the original function for anything else.

    os.path.exists = lambda x: x == "bar" or old_exists(x)

    Sorry if my code isn't much cop, I don't use Python much.

  5. Cool!

    I don't use Python much, but I support the technique of accepting stubbing as a worthwhile evil and designing it into the code. It's not really "polluting" the interface, because it IS the interface.

  6. Hey,

    How about posting an RSS feed for the PDFs? I could hook it up via a script to print them directly to my printer at the office :)

    Cheers

  7. Is Foo just a helper method for your unit test class? If so, I can see value in the new parameter but if not, it does seem like a hack.

    Full disclosure: I'm pretty new to Python so maybe I'm missing something.

  8. This comment has been removed by the author.

  9. This comment has been removed by the author.

  10. What about extracting the os.path.exists call into a helper that is injected at construction of the class containing Foo? This way you're not polluting the parameter list but still providing a separation of concerns.

  11. I think this issue -- at least the particular one you give -- is a sign of a smell further up in the stack. If you really are doing lots of file operations, you should really write files. The stubbing and mocking you'll have to do to avoid it just isn't worth it, and automatically wiping your scratch test files before the test runs is easy enough as well.

    Once you are comfortable with the testing of the file-related operations, you shouldn't keep testing them. Either stub those out, or just keep using scratch areas for the files.

    Similarly, if you rely on some other external service (the service in this case being the filesystem), putting in a trivial service is often better than futzing with replacing functions that are otherwise quite sufficient. A good example of a tool that allows this is wsgi_intercept, which lets you attach pretend HTTP apps to arbitrary host names. Of course, that's basically the original technique that you wanted to avoid; that someone else did it in a library kind of makes it okay. If you were mocking something that wasn't yet mocked, I'd recommend something more like your technique.

    Putting in a trivial service implementation should just be a matter of configuration: setting an address to http://localhost, configuring the base file path, or pointing to a fake SMTP server. You have to do that anyway regardless of testing, and that's a good place for your mocking.

  12. I hope this goes without saying, but in production code the filesystem should be considered untrusted and an assertion about its contents would never be valid. Instead all possible failures should be handled. (Unless you *are* the filesystem.) I am assuming this is just for the sake of example though.

  13. I think it's only a little bit of a hack, and really just because of the default parameter. You can take the same idea and make it cleaner by wrapping all your filesystem operations (or system calls) into your own functions/classes. For a language such as Python that may seem a little extreme, but it is standard operating procedure for languages like C and C++.

  14. I often use class attributes instead of polluting function signatures. There's just one little trick: if you want to assign a function to a class attribute without turning it into a method, you need to wrap it in staticmethod():

    class Whatever(object):
        # hook for unit tests
        path_checker = staticmethod(os.path.exists)

        def foo(self):
            return self.path_checker(self.filename)

    By the way, could you please refrain from inflicting the PEP-8-violating 2-space-indentation internal Google coding style on the rest of the world? Thanks!

  15. Very cool and a great idea. I'll print this and send it to my dev people...

    Regards!

  16. Caution about the python lambda. Word on the street -- it is going away.

  17. Hey, there's a nicer way of setting a mock object without having it passed in as a parameter.

    At least, it works in C++ and Java - I'm not sure how OO works in Python because I know nothing about it.

    In the class you want to test, say you initialise path_tester = os.path.exists; in a constructor. We replace this with a protected factory method call:
    path_tester = getPathTester();

    Then we implement our protected factory method:
    protected getPathTester() {
      return os.path.exists;
    }


    But when we need to mock it, we create an anonymous inner class which inherits from the original, overriding getPathTester() to return our mock object that says yes or no or whatever (and perhaps, records that it was called x times).
    I found this technique on IBM's developerworks somewhere. Hopefully it makes sense in Python as well...

  18. I use this technique a lot in my tests.

    I avoid adding code to my production implementation to support the mock objects. Instead, I expand the mock object to return a canned answer for matching paths and to use the underlying implementation as a fall-through case. As a bonus, you can chain multiple mock objects together to mock out several paths.


    It is pretty trivial to expand my example to take either a list of paths or a function to determine whether the provided path matches.

    My example looks really ugly in the comments, but you can find it on pastebin.com here

  19. That's interesting. Actually, every dynamic language (JavaScript, ActionScript, Ruby, Python...) can do that with something like method overriding.

  20. It's interesting. Actually, the same trick is available in nearly all dynamic languages: JavaScript, ActionScript, Python, Ruby, etc. It's just method/property overriding.

  21. It's a Python-only technique. For those who don't use Python, optional parameters with default values are never a nice idea. For one parameter they do a great job, but once you start using two unrelated optional values it starts to be a mess, and optional parameters must be avoided.

    def Foo(path, foo_checker=default_foo_checker, bar_checker=default_bar_checker):
        # ...
        pass

    With something like that, in C/C++/Java, you can call:
    Foo(path)
    Foo(path,my_foo_checker)
    Foo(path,my_foo_checker,my_bar_checker)

    If you want to pass Foo(path,my_bar_checker), you can't with C/C++/Java, or you need to do:
    Foo(path,default_foo_checker,my_bar_checker)

    Erk? Why would I need to know what the default foo_checker is? Why would I even need to know that there is a foo_checker argument for another obscure test I don't even want to know about?

    With Pythonic code, you can still write:

    Foo(path)
    Foo(path,my_foo_checker)
    Foo(path,my_foo_checker,my_bar_checker)

    but you can also write:
    Foo(path,foo_checker=my_foo_checker)
    Foo(path,bar_checker=my_bar_checker)
    Foo(path,foo_checker=my_foo_checker,bar_checker=my_bar_checker)

    And even this call works:
    Foo(path,bar_checker=my_bar_checker,foo_checker=my_foo_checker)

    So, for all those who don't know Python and aren't at ease with optional parameters: you are right! Optional parameters in C/C++/Java are not really something to abuse, but it's not the same with Python.

    That said, I wouldn't have done it that way. I still agree with william: it would still be confusing.

    I would have done it this way:

    First, I would have used an "IOObject" to do that kind of I/O, so that the I/O component can be swapped.

    Then, I would have created that IOObject once (per class or per instance, it depends), and finally, I would have replaced that IOObject (for example, with a child class) for the test.

    Note that it's more or less what marius is doing. If you still want to use your optional parameter, you can pass the IOObject to the constructor.

    To oisin: yes, it would work the same in Python. I still prefer Marius's technique (the object is created as a static attribute, and only the unit test changes that attribute) over subclassing, because I'm not comfortable testing a subclass of the class I really want to test, although I agree it's technically exactly the same.

  22. Why isn't dosomething stubbed out?
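
The constructor-injection idea raised in comments 10 and 21 could look roughly like this. It is only a sketch: the FileProcessor class and its return values are hypothetical and appear neither in the post nor in the comments.

```python
import os


class FileProcessor(object):
    """Hypothetical class that owns Foo; used only for illustration."""

    def __init__(self, path_checker=os.path.exists):
        # The dependency is chosen once, at construction,
        # so the method signatures stay free of test-only parameters.
        self._path_checker = path_checker

    def Foo(self, path):
        if self._path_checker(path):
            return 'something'
        else:
            return 'something_else'


# Production code would call FileProcessor() and get the real
# os.path.exists; a test injects a stub instead.
always = FileProcessor(path_checker=lambda x: True)
never = FileProcessor(path_checker=lambda x: False)

print(always.Foo('bar'))  # prints something
print(never.Foo('bar'))   # prints something_else
```

Compared with the per-call parameter, this keeps Foo's signature unchanged, at the price of requiring Foo to live in a class.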

