Tuesday, February 09, 2010

Testing in the Data Center (Manufacturing No More)

By James A. Whittaker

W. Edwards Deming helped to revolutionize the process of manufacturing automobiles in the 1970s. A decade later the software industry ran with the manufacturing analogy, and the result was nearly every waterfall, spiral or agile method we have. Some, like TQM, Cleanroom and Six Sigma, are obvious descendants of Deming, while others were just heavily influenced by his thinking. Deming was the man.

I repeat, was. My time testing in Google's data center makes it clear that this analogy just doesn't fit anymore. I want a new one. And I want one that helps me as a tester. I want one that better guides my behavior.

We just don't write or release software the way we used to. Software isn't so much built as it is grown. Software isn't shipped ... it's simply made available by, often literally, the flip of a switch. This is not your father's software. 21st century development is a seamless path from innovation to release where every phase of development, including release, is happening all the time. Users are on the inside of the firewall in that respect and feedback is constant. If a product isn't compelling we find out much earlier and it dies in the data center. I fancy these dead products serve to enrich the data center, a digital circle of life where new products are built on the bones of the ones that didn't make it.

In our father's software and Deming's model we talk about quality control and quality assurance while we play the role of inspector. In contrast, my job seems much more like that of an attending physician. In fact, a medical analogy gives us some interesting parallels for thinking about software testing. A physician's hospital is our data center: there is always activity and many things are happening in parallel. Physicians have patients; we have applications and features. Their medical devices are our infrastructure and tools. I can picture my application's features strewn across the data center in little virtual hospital beds. Over here is the Gmail ward, over there is Maps. Search, of course, has a wing of its own and Ads, well, they all have private rooms.

In a hospital, records are important. There are too many patients with specific medical conditions and treatment histories for any physician to keep straight. Imagine walking up to the operating table without examination notes and diagnoses. Imagine operating without a constant stream of real-time health data.

Yet as software testers we find ourselves in this situation often. That app lying in our data center has been tested before. It has been treated before. Where are our medical notes?

So let's add little clipboards to the virtual data center beds in which our apps lie. Let's add equipment to take vitals and display them for any attending tester to see. Like human patients, apps have a pulse; data runs through code paths like blood through veins. There are important things happening, countable events that lead to statistics and indicators and create a medical history for an attending tester to use in whatever procedure they must now perform. The work of prior testers need not be ignored.
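To make the clipboard idea a little more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a description of any real data center tooling: the names `AppChart`, `ChartNote`, `record_event` and `add_note`, and the `requests` and `errors` event names, are all hypothetical. It only shows countable events turning into an indicator, with prior testers' notes kept attached to the app.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChartNote:
    """One entry in an app's 'medical history' (hypothetical structure)."""
    tester: str
    text: str
    timestamp: datetime


@dataclass
class AppChart:
    """A hypothetical clipboard that stays attached to a single app."""
    app_name: str
    vitals: Counter = field(default_factory=Counter)
    history: list = field(default_factory=list)

    def record_event(self, name: str, count: int = 1) -> None:
        """Count an observable event, such as a request or an error."""
        self.vitals[name] += count

    def add_note(self, tester: str, text: str) -> None:
        """Append a tester's note so the next attending tester can read it."""
        self.history.append(ChartNote(tester, text, datetime.now(timezone.utc)))

    def error_rate(self) -> float:
        """Derive one indicator from the raw counts (0.0 if no requests yet)."""
        requests = self.vitals["requests"]
        return self.vitals["errors"] / requests if requests else 0.0

    def summary(self) -> str:
        """One-line readout for whoever is on call."""
        return (
            f"{self.app_name}: {self.vitals['requests']} requests, "
            f"error rate {self.error_rate():.1%}, "
            f"{len(self.history)} prior notes"
        )


# Example: an assumed app with a few recorded vitals and one prior note.
chart = AppChart("maps")
chart.record_event("requests", 200)
chart.record_event("errors", 3)
chart.add_note("tester_a", "Exploratory pass on route planning; no blockers found.")
print(chart.summary())
```

The design choice that matters here is that the counts and the notes live on the same object as the app they describe, so the history travels with the patient rather than sitting in a separate document.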

It's an unsettling aspect of the analogy that I have put developers in the role of creator, but so be it. Like other metaphorical creators before them they have spawned intrinsically flawed creatures. Security is their cancer, privacy their aging. Software is born broken and only some things can be fixed. The cancer of security can only be managed. Like actual aging, privacy is a guarantee only young software enjoys. Such is the life of a data center app.

But it is the monitors and clipboards that intrigue me. What do they say of our digital patients? As an app grows from concept into adolescence, what part of its growth do we monitor? Where is the best place for our probes? How do we document treatment and evaluations? Where do we store the notes about surgeries? What maladies have been treated? Are there problematic organs and recurrent illnesses? The documents and spreadsheets of the last century are inadequate. A patient's records are only useful if they are attached to the patient, up-to-date and in full living color to be read by whatever attending tester happens to be on call.

This is the challenge of the new century of software. It's not a process of get-it-as-reliable-as-possible-before-we-ship. It's health care, cradle to grave health care ... prevention, diagnosis, treatment and cure.

So slip into your scrubs, it's going to be a long night in the ER.


  1. James, you are a poet and a philosopher. You could have enriched the world's greatest libraries with your treatise on the human condition. What are you doing as a tester? Why James, why?

  2. So I'm not the only one who sees the hospital analogy!

    What I find interesting is this: hospitals on TV always seem to be run by doctors (see MASH, Scrubs, Grey's Anatomy, House...). And yet very, very few of the software projects I've been on were run by software developers. In dozens of projects I've worked on, the project managers were almost always MBA types. One project I was on was managed by a marketing major. Very few of those managers had much, if any, experience writing code.

    Granted, I'm drawing a parallel with television pseudo-reality. Maybe the idea that hospitals are run by old, grizzled veteran doctors doesn't hold water. But the analogous setup sure seems to work better for software.

  3. Excellent post! It's not often I hear a new (to me) analogy about software development. And I like the one of health care.

  4. Interesting analogy. As a "doctor", I often feel like there is a health insurance bureaucracy questioning my practices and wanting me to treat every patient the same. And don't forget the pharmaceutical companies trying to convince me that their magic pills will cure our patients.

  5. James - as one who has led testing for Electronic Medical Record apps, I appreciated your take. In this space there is less 'building' of new code-based functions and more connecting of existing toolsets. This industry tends toward late adoption of new approaches/techniques. So hurry up and get real-time monitoring of test processes hammered out. We'll see it about 2-3 years from now.

  6. Wow. Great analogy. Wonder if this should be a pattern similar to the Gang of Four's other ones?

  7. Deming still applies to software development and testing, but not Deming as applied to manufacturing. Deming as applied to product development is perfectly consistent with your observations.

    I agree with the manufacturing-no-more thing, but I think it's a mistake to couple Deming to manufacturing just because software people mistakenly saw software development as manufacturing and proceeded to conflate that with Deming.

    Realizing that software development isn't manufacturing is a great step forward. Perspectives like flow-based product development management reflect these realizations, but don't conclude that Deming is obsolete. They conclude that our presumption that software development is manufacturing was never correct.

    Deming is still applicable to product development, but we have to realize that we're doing product development.

  8. This has been an uncanny week or two in software testing for me, and it includes your post. As the whole Toyota debacle has been unfolding, I've been in the midst of reading "Artful Making" by Austin and Devin. (Your CEO wrote the foreword.) It mentions Deming in several places. Your post indicates that you feel the ground shifting beneath our definitions of software engineering, development and testing? I do as well.

  9. A very interesting analogy between medicine and software testing.
    To carry this further:

    I) Software testing

    A) System testing is like doing a master health check up for a human being
    You run through a predetermined set of tests designed to check that every function of the software / human being works well.
    The extra point: at a particular age the human being is more prone to certain diseases (cardiac, blood pressure, eyesight...), and the health checkup will be designed accordingly. What is similar in the case of software?
    Modifications to software can be likened to aging. Modifications take two forms: a) fixing bugs, both those found during initial development and those found after release; b) adding new features after the software has been released.
    Each of these kinds of ‘aging’ has to be handled differently: type a is handled by feature tests and, more importantly, regression tests, while type b is handled primarily by feature tests, then by integration tests and less by regression tests.

    B) Unit testing is like testing the function of a particular human organ (like doing eye tests or checking blood pressure)
    Here the requirements are spelt out in detail, and particular tests are carried out to check that the tested unit / organ handles extreme situations acceptably.

    C) A fault in the software is reported in the field. Now, similar to an emergency situation in the hospital, the test engineer should be able to narrow down the reported fault to possible scenarios and reproduce it within the test environment. This gives the developer a means to identify which module(s) are faulty and how, and to rectify them. It also provides a means to test that the fault is now handled.

    And yes, we need to keep track of the history of the patient (bugs in a particular piece of software, to aid regression testing), the group of patients (bugs in the type of software being tested; similar bugs are likely in certain scenarios: test patterns!), and the environment in which the patient lives (bugs in the accessory software that the software being tested interacts with).

  10. Very thought provoking and original. It resonates greatly with our approach to testing the website here at Netflix.

  11. Metaphors are the problem, not the solution.

  12. Isaac wrote: "What I find interesting is this: hospitals on TV always seem to be run by doctors"

    In real life, hospitals are *not* run by doctors. The doctors get to sit in committees with managers, and the managers run the hospitals.

  13. Your analogies explained the use of "Engineering Analytics" in an original and refreshing way. However, in most modern data center applications, not only does data flow through the veins of the app, but the app itself gets patched. Using your analogy, think of it as a patient getting a blood transfusion as well as organ patching while a couple or more new limbs are added to the body!

    It would be a great thing if one could not only visualize the path coverage of the app with real-time data in the data centers but also distinguish how much of that traffic is going through "aged" code vs newly "patched" code.

    Regarding your Deming reference, you have almost "Glenn Becked" him ;)

  14. Hi James,

    Definitely a thought-provoking post. I really like the analogy between the medical and software fields, especially the part about testers collaborating with each other and sharing their thoughts on an application. However, from a practical point of view, I wonder what the contents of the medical clipboard should be. Is it an indication of the problematic areas of the software? Is it general notes about every single part of the application under test? Is it notes of the "we've been here, we've tested that" nature? Or maybe it's a combination of all of these.

  15. I can understand your analogy with the medical environment, although I feel that there is too much metaphor and too few real/practical examples.

  16. Matt Doar wrote: "In real life, hospitals are *not* run by doctors."

    And that's a pity, isn't it? My experience with software development is that having managers who are completely inexperienced in the field practiced by the people they manage results in very poor management decisions.

    Back to James' original post: I'm not sure that "creator" is the right place to put developers in the analogy. Certainly they are present and instrumental in the "birth" of software. But I wouldn't say a developer's role in creation is their primary function on a software project. Reaching again into my own experiences, I spend far more time fixing and enhancing existing software than creating brand new software.

    Software just sort of happens under the right circumstances, if steps aren't taken to prevent it. We developers may help to bring about those circumstances sometimes, but mostly we seem to work to bring software to maturity, keep it going for a few years, and then ease it into end of life. I'm not sure what that makes us.

  17. Rather than "creator", maybe we can call developers their intellectual "parents"? They should know the most about the applications, obviously, but perhaps not all their extra-curricular activities?

    I feel that extends the metaphor for their involvement in treatment as well. ;)

  18. Hi James, interesting analogy, keep updating. This kind of analogy is rare to find on the Internet; yep, you need to dive deep into the subject to get it.

  19. I would say the parents are the data center software packages that were developed 5 years ago. Within the past 2-3 years, so many new DCIM software solutions have appeared that enhance both energy efficiency and data center management. Tracking assets from any device in any location allows the IT manager to literally monitor the life of the data center in real time.

