Tuesday, January 21, 2014

The Google Test and Development Environment - Pt. 3: Code, Build, and Test

by Anthony Vallone

This is the third in a series of articles about our work environment. See the first and second.

I will never forget the awe I felt when running my first load test on my first project at Google. At previous companies where I’ve worked, running a substantial load test took quite a bit of resource planning and preparation. At Google, I wrote fewer than 100 lines of code and was simulating tens of thousands of users after just minutes of prep work. The ease with which I was able to accomplish this is due to the impressive coding, building, and testing tools available at Google. In this article, I will discuss these tools and how they affect our test and development process.

Coding and building

The tools and process for coding and building make it very easy to change production and test code. Even though we are a large company, we have managed to remain nimble. In a matter of minutes or hours, you can edit, test, review, and submit code to head. We have achieved this without sacrificing code quality by heavily investing in tools, testing, and infrastructure, and by prioritizing code reviews.

Most production and test code is in a single, company-wide source control repository (open source projects like Chromium and Android have their own). There is a great deal of code sharing in the codebase, and this provides an incredible suite of code to build on. Most code is also in a single branch, so the majority of development is done at head. All code is also navigable, searchable, and editable from the browser. You’ll find code in numerous languages, but Java, C++, Python, Go, and JavaScript are the most common.

Have a strong preference for an editor? Engineers are free to choose from many IDEs and editors. The most common are Eclipse, Emacs, Vim, and IntelliJ, but many others are used as well. Engineers who are passionate about their preferred editors have built up and shared some truly impressive editor plugins/tooling over the years.

Code reviews for all submissions are enforced via source control tooling. This also applies to test code, as our test code is held to the same standards as production code. The reviews are done via web-based code review tools that even include automatically generated test results. The process is very streamlined and efficient. Engineers can change and submit code in any part of the repository, but it must get reviewed by owners of the code being changed. This is great, because you can easily change code that your team depends on, rather than merely request a change to code you do not own.
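
The ownership rule can be pictured with a small Python sketch. It assumes a hypothetical layout in which each directory may list its owners and ownership is inherited from parent directories, similar to the OWNERS files in the open source Chromium tree. The article does not say how the internal tooling stores or checks ownership, so the table and the names owners_for and is_approved are illustrative only.

```python
import posixpath

# Hypothetical ownership table: directory -> people allowed to approve changes there.
OWNERS = {
    "": {"root-admin"},
    "storage": {"alice", "bob"},
    "storage/client": {"carol"},
}

def owners_for(path, owners=OWNERS):
    """Collect owners of the file's directory and of every parent directory."""
    result = set()
    directory = posixpath.dirname(path)
    while True:
        result |= owners.get(directory, set())
        if not directory:
            return result
        directory = posixpath.dirname(directory)

def is_approved(changed_files, approvers, owners=OWNERS):
    """A change passes only if every touched file has an owner among the approvers."""
    return all(owners_for(f, owners) & set(approvers) for f in changed_files)
```

Under these assumptions, a change that touches storage/client/api.py can be approved by carol, alice, or bob, while a change that also touches a file in another team's directory needs an approver from that directory's owners too.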

The Google build system is used for building most code, and it is designed to work across many languages and platforms. It is remarkably simple to define and build targets. You won’t be needing that old Makefile book.

Running jobs and tests

We have some pretty amazing machine and job management tools at Google. There is a generally available pool of machines in many data centers around the globe. The job management service makes it very easy to start jobs on arbitrary machines in any of these data centers. Failing machines are automatically removed from the pool, so tests rarely fail due to machine issues. With a little effort, you can also set up monitoring and pager alerting for your important jobs.

From any machine you can spin up a massive number of tests and run them in parallel across many machines in the pool, via a single command. Each of these tests is run in a standard, isolated environment, so we rarely run into the “it works on my machine!” issue.

Before code is submitted, presubmit tests can be run that will find all tests that depend transitively on the change and run them. You can also define presubmit rules that run checks on a code change and verify that tests were run before allowing submission.
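
Finding the tests that depend transitively on a change amounts to walking the dependency graph backwards from the changed targets. Here is a minimal Python sketch, assuming a hypothetical in-memory graph, made-up target names, and a convention that test targets end in "_test"; none of these details come from the internal build system.

```python
from collections import defaultdict, deque

# Hypothetical build graph: target -> targets it depends on directly.
DEPS = {
    "//storage:lib": [],
    "//storage:client": ["//storage:lib"],
    "//storage:lib_test": ["//storage:lib"],
    "//storage:client_test": ["//storage:client"],
    "//search:index": [],
    "//search:index_test": ["//search:index"],
}

def affected_tests(changed, deps=DEPS):
    """Return every test target that depends, directly or transitively, on a changed target."""
    # Invert the graph so we can walk from a target to the targets that use it.
    dependents = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            dependents[dep].add(target)

    # Breadth-first walk over the reversed edges, starting from the changed targets.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        for parent in dependents[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(t for t in seen if t.endswith("_test"))
```

In this toy graph, changing //storage:lib selects both //storage:lib_test and //storage:client_test, because the client test reaches the library through //storage:client, while the search tests are left alone.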

Once you’ve submitted test code, the build and test system automatically registers the test, and starts building/testing continuously. If the test starts failing, your team will get notification emails. You can also visit a test dashboard for your team and get details about test runs and test data. Monitoring the build/test status is made even easier with our build orbs designed and built by Googlers. These small devices will glow red if the build starts failing. Many teams have had fun customizing these orbs to various shapes, including a Statue of Liberty with a glowing torch.

Statue of LORBerty

Running larger integration and end-to-end tests takes a little more work, but we have some excellent tools to help with these tests as well: integration test runners, hermetic environment creation, virtual machine service, web test frameworks, etc.

The impact

So how do these tools actually affect our productivity? For starters, the code is easy to find, edit, review, and submit. Engineers are free to choose tools that make them most productive. Before and after submission, running small tests is trivial, and running large tests is relatively easy. Since tests are easy to create and run, it’s fairly simple to maintain a green build, which most teams do most of the time. This allows us to spend more time on real problems and less on the things that shouldn’t even be problems. It allows us to focus on creating rigorous tests. It dramatically accelerates a development process that can prototype Gmail in a day and code, test, and release service features on a daily schedule. And, of course, it lets us focus on the fun stuff.

Thoughts?

We are interested to hear your thoughts on this topic. Google has the resources to build tools like this, but would small or medium-sized companies benefit from a similar investment in their infrastructure? Did Google create the infrastructure or did the infrastructure create Google?

6 comments:

  1. Hi Anthony,

    I think a lot of the advantage comes from sheer scale, and I wonder what the numbers are for the items you mention in your blog post. We are in the process of doing what you have already done, and the upfront investment is quite high with no real quantifiable model to calculate the outcome... we are doing it because we think/assume this is the best way forward. OK, we have very clever/experienced people here so there is some justification... but I wonder how long it took Google to achieve this "testing nirvana".

    Cheers

  2. Thank you so much for sharing this with us Anthony,

    While reading it I did have a question that came to mind. You mention "simulating tens of thousands of users after just minutes of prep work". I'm very curious as to what technology you're using to simulate the users. Is it something like Selenium WebDriver running a real browser, or maybe a headless browser like PhantomJS or HtmlUnit? Or are you using an HTTP-based simulation like JMeter or Gatling?

    I work for a company that provides JMeter as a service so we ourselves have dealt quite a bit with large scale distributed testing and I'd love to hear what technology Google is using in-house :)

    Thanks,
    Ophir from BlazeMeter

    Replies
    1. Hi Ophir,

      We have several internal load testing frameworks to choose from, covering various languages and use cases. In my case, I was testing Google Cloud Storage, so I needed one that was geared toward testing a RESTful service. The selected framework is based on FunkLoad (http://funkload.nuxeo.org/). We have other frameworks that are based on JMeter as well. All of these frameworks depend heavily on the job management tools mentioned above in order to start, monitor, and stop load generating jobs on many machines.

      -Anthony

  3. It's always useful for small companies, even a company of one person, to invest in tooling that makes coding, configuration management, testing, and deploying easier and less error-prone. But it's not something you necessarily do on day 1 (some things you do straight from the start). It's something you do when there's enough friction that you can see a tool would be useful, but before there's so much friction that you are hampered by "manual" process or have no facility to do something (like a particular kind of testing).

  4. Thanks Anthony,
    A nice and informative post. If we look at the development scenario from a startup's standpoint, it will be a little costly, but once they have product development in line they can move to this implementation. Still, customers prefer more delivered in less time, so spending time on non-functional testing unless asked by the customer is not a good choice.

    On the other hand, one option is to automate code testing during off hours. For example, the team can submit code at the end of the day, and all of it can be run through automated deployment and testing with tools like Jenkins. When the run completes, the testing framework should send an email with the pass/fail status.

  5. I have a basic question: how is test case creation done at Google? Are you engaging your talented resources in the manual, mundane task of test case creation?

