The Chromium Projects

Except as otherwise noted, the content of this page is licensed under a Creative Commons Attribution 2.5 license, and examples are licensed under the BSD License.

Using Valgrind

The Chromium project runs runtime analysis tools on its test suite to detect memory errors in C++ code, which lets us find bugs earlier in development. On Linux and Mac, the tool we use is Valgrind. This page describes how Valgrind is used in the Chromium project. It assumes that you have access to a Mac or Linux system and some basic Valgrind knowledge.

There is also experimental support for running Valgrind with Windows binaries on Linux by using Wine. This is documented on a separate page.

Getting Started

First, make sure the system can build Chromium properly already.
Second, make sure you have valgrind binaries in your client.

You don't need to build your app specially to use valgrind. However, avoid running tests and apps directly under valgrind: a few environment variables must be set for valgrind to work as expected. You'll probably want to use one of the wrapper scripts described below; they get it right.

It is best to build Chromium and/or its tests with -O1 (or -O0) if you're going to be valgrinding, so for casual use, just do a debug build.

Later, if you find the runs take too long, switch your optimization level to -O1 and it should go a bit faster.  You may need to make warnings nonfatal, as different warnings pop up at different optimization levels.
On Linux, you can do this by creating the file ~/.gyp/include.gypi containing
{
  'variables': {
    'debug_optimize': '1',
    'werror': '',
  },
}
or by setting the equivalent environment variable (should be documented in the gyp documentation, but isn't yet).  You have to then run 'gclient runhooks --force' to get gyp to regenerate the build files.

You can also use a release build, but you'll want to change its optimization level to -O1, else lots of suppressions won't match.
Also, you should mark your build as a Valgrind build so that Valgrind annotations are not stripped.
On Linux, you can do this by creating the file ~/.gyp/include.gypi containing
{
  'variables': {
    'release_valgrind_build': 1,
    'release_optimize': '1',
    'werror': '',
  },
}
and then, again, running 'gclient runhooks --force' to regenerate the build files.
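Either of the two include.gypi variants above can be written from a small shell helper instead of by hand. This is only a sketch: write_valgrind_gypi is a hypothetical name (not something in the Chromium tree), and the optional second argument exists only so the helper can be tried against a scratch directory before touching ~/.gyp.

```shell
# Sketch: write one of the include.gypi variants shown above.
# write_valgrind_gypi is a hypothetical helper name, not part of the Chromium tree.
# Usage: write_valgrind_gypi debug|release [target_dir]   (target_dir defaults to ~/.gyp)
write_valgrind_gypi() {
    mode="$1"
    gyp_dir="${2:-$HOME/.gyp}"
    case "$mode" in
        debug)
            vars="    'debug_optimize': '1',"
            ;;
        release)
            vars="    'release_valgrind_build': 1,
    'release_optimize': '1',"
            ;;
        *)
            echo "usage: write_valgrind_gypi debug|release [dir]" >&2
            return 2
            ;;
    esac
    mkdir -p "$gyp_dir" || return 1
    # Unquoted heredoc delimiter so that $vars is expanded; the single quotes are literal.
    cat > "$gyp_dir/include.gypi" <<EOF
{
  'variables': {
$vars
    'werror': '',
  },
}
EOF
}
```

After something like `write_valgrind_gypi release`, you still have to run 'gclient runhooks --force' as described above. Note that the helper overwrites any existing include.gypi in the target directory.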

Using Valgrind with Chromium itself

The easiest way to run Chromium under valgrind is with the wrapper tools/valgrind/valgrind.sh, e.g.
     For make:
sh tools/valgrind/valgrind.sh out/Debug/chromium
     For scons:
sh tools/valgrind/valgrind.sh sconsbuild/Debug/chromium
This will set up the environment nicely, and will break into gdb at the first error. Read the script to see what it does; to disable the debugger, remove the --db-attach=yes option.
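If you switch between make and scons builds, a small helper can pick whichever Debug binary exists. This is a sketch only: find_chromium_binary is a hypothetical name, and it assumes the two output paths shown above, relative to the top of your source tree.

```shell
# Sketch: locate a Debug chromium binary in the make or scons output directory.
# find_chromium_binary is a hypothetical helper name; the two paths are the ones used above.
# Usage: find_chromium_binary [source_root]   (source_root defaults to the current directory)
find_chromium_binary() {
    root="${1:-.}"
    for candidate in "$root/out/Debug/chromium" "$root/sconsbuild/Debug/chromium"; do
        # Require a regular, executable file; a directory of that name does not count.
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no Debug chromium binary found under $root" >&2
    return 1
}
```

You could then run, for example, `sh tools/valgrind/valgrind.sh "$(find_chromium_binary)"`. The make path is tried first, so it wins if both builds exist.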

Using Valgrind with the Chromium test suite

The easiest way to run Chromium's tests is using the wrapper tools/valgrind/chrome_tests.sh, e.g.
sh tools/valgrind/chrome_tests.sh -t ui
This takes care of nasty little details (e.g. it sets some environment variables that make memory allocation more valgrind-friendly, and valgrinds subprocesses without valgrinding python).

Blacklisting Tests

Because some tests run slowly or poorly under valgrind, the chrome_tests.sh script reads a blacklist of Tests To Not Run, called <module>/data/valgrind/<exe name>.gtest.txt, e.g. base/data/valgrind/base_unittests.gtest.txt.

Running a Single Test

The chrome_tests.sh script accepts the --gtest_filter option (see the gtest manual), so you can do things like:
sh tools/valgrind/chrome_tests.sh -t ui --gtest_filter=DownloadTest.DownloadMimeType
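gtest filters may contain several patterns separated by colons, and patterns may use the * wildcard. If you have a list of suspect tests, a helper along these lines can build the filter string; join_gtest_filter is a hypothetical name used only for this sketch.

```shell
# Sketch: join test names or patterns into a single colon-separated gtest filter.
# join_gtest_filter is a hypothetical helper name.
join_gtest_filter() {
    filter=""
    for name in "$@"; do
        # Add a ':' separator only when the filter already has an entry.
        filter="${filter:+$filter:}$name"
    done
    echo "$filter"
}
```

For example, `sh tools/valgrind/chrome_tests.sh -t ui --gtest_filter="$(join_gtest_filter DownloadTest.DownloadMimeType 'DownloadTest.*')"` would pass both patterns in one run. Quote patterns containing * so the shell does not expand them.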

Suppressing Errors


We suppress some of Valgrind's warnings, either because they are from system libraries we can't do anything about, or because we already have bugs filed in the Chromium issue tracker.  The chrome_tests.sh script reads overall suppressions from several sources:
  • tools/valgrind/memcheck/suppressions.txt
  • tools/valgrind/memcheck/suppressions_mac.txt (only on Mac)
In general, any suppression that is there because of a bug in chromium should be named bug_NNNNN where NNNNN is the chromium bug number, and the changeset that adds that suppression should include the string http://crbug.com/NNNNN in its description.
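To check how well a suppressions file follows the bug_NNNNN convention, you can list the suppression names and pick out the ones without a bug number. The sketch below assumes the standard Valgrind suppression layout, where each entry starts with a line containing only { and the name is on the next non-blank line. Both function names are hypothetical.

```shell
# Sketch: list suppression names in a Valgrind suppressions file.
# Assumes the standard layout: a line containing only "{", then the name on the next non-blank line.
# list_suppression_names and suppressions_without_bug are hypothetical helper names.
list_suppression_names() {
    awk '
        NF == 1 && $1 == "{" { want_name = 1; next }
        want_name && NF      { print $1; want_name = 0 }
    ' "$1"
}

# Print only the names that do not start with bug_ followed by digits.
suppressions_without_bug() {
    list_suppression_names "$1" | grep -v -E '^bug_[0-9]+'
}
```

For example, `suppressions_without_bug tools/valgrind/memcheck/suppressions.txt` would show the entries that exist for reasons other than a filed Chromium bug, such as system library issues.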

Figuring Out Which Test Caused a Warning

It's often difficult to figure out which test is at fault just from looking at valgrind results on the buildbot.  If you're lucky, one of the functions in the backtrace will have the name of the test suite and test case in it.    Otherwise you have to rerun the test locally to figure it out.  There are two ways to go here: either run the chrome_tests.sh wrapper with the --generate_suppressions flag, in which case the error will appear inline with the list of tests (but possibly corrupted if multiple processes are outputting at the same time), or rerun the tests in tiny shards.

To shard gtest-based tests, you can use gtest's environment variables, e.g.

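# Split the suite into 100 shards and run each shard under valgrind, one log file per shard.
export GTEST_TOTAL_SHARDS=100
export GTEST_SHARD_INDEX=0
# Named "suite" rather than "test" to avoid confusion with the shell's test command.
suite=ui
while [ "$GTEST_SHARD_INDEX" -lt "$GTEST_TOTAL_SHARDS" ]
do
    sh tools/valgrind/chrome_tests.sh -t "$suite" > "${suite}_$GTEST_SHARD_INDEX.log" 2>&1
    GTEST_SHARD_INDEX=$((GTEST_SHARD_INDEX + 1))
done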
This will run the test program in 100 separate little runs, each of which covers one or just a few tests.  You can then subdivide further if needed by using the --gtest_filter option (see the gtest manual).
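Once the shard logs exist, the next step is usually finding which of them mention the warning you are chasing. A minimal sketch, assuming the log naming used by the loop above (prefix, underscore, shard index, .log); logs_mentioning is a hypothetical name, and the search string is whatever fragment of the buildbot backtrace you want to match.

```shell
# Sketch: print the shard logs that contain a given fixed string.
# logs_mentioning is a hypothetical helper name; the log naming follows the loop above.
# Usage: logs_mentioning STRING PREFIX   (searches PREFIX_*.log in the current directory)
logs_mentioning() {
    needle="$1"
    prefix="$2"
    # -l prints only file names; -F treats the needle as a literal string, not a regex.
    grep -l -F -- "$needle" "${prefix}"_*.log
}
```

For example, `logs_mentioning 'SomeFunctionFromTheBacktrace' ui` (with a real function name from the stack trace in place of that placeholder) lists the shards worth rerunning with --gtest_filter.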

If the failure was in the layout tests, sharding is done differently.
chrome_tests.sh (well, really, chrome_tests.py) implements a stateful sharding system for layout tests; the state lives in a text file named valgrind_layout_chunk.txt.  Each time you run chrome_tests.sh -t layout, it runs the next chunk of twenty minutes or so of layout tests.  Thus the valgrind bots running layout tests tend to go red briefly as they hit a test that has a valgrind error, then green again right away as they move on to the next shard.

Here's a script showing how to run all layout tests locally, one test per log file.  You can use this on several boxes at once, all starting at different offsets, to try to get through all the tests in a weekend.

n=0
echo $n > valgrind_layout_chunk.txt
while true
do
   time sh tools/valgrind/chrome_tests.sh -t layout -v -n 1 > runlots.$n.log
   n=`expr $n + 1`
done
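To split the work across several boxes, each box needs its own starting offset and a bounded number of chunks. The following is a sketch of the loop above with those two parameters added. run_layout_chunks is a hypothetical name, and CHROME_TESTS is a hypothetical override variable that exists only so the loop can be dry-run with a stand-in command.

```shell
# Sketch: run COUNT layout-test chunks starting at offset START, one log file per chunk.
# run_layout_chunks and the CHROME_TESTS override are hypothetical, for illustration only.
# Usage: run_layout_chunks START COUNT
run_layout_chunks() {
    n="$1"
    end=$(( $1 + $2 ))
    # Seed the stateful sharding file so the wrapper starts at this box's offset.
    echo "$n" > valgrind_layout_chunk.txt
    while [ "$n" -lt "$end" ]; do
        # Left unquoted on purpose so the default expands to a command plus its argument.
        ${CHROME_TESTS:-sh tools/valgrind/chrome_tests.sh} -t layout -v -n 1 > "runlots.$n.log" 2>&1
        n=$(( n + 1 ))
    done
}
```

One box might run `run_layout_chunks 0 2000` while another runs `run_layout_chunks 2000 2000`. Unlike the loop above, this version also captures stderr in the log and stops after COUNT chunks.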

For UI tests you can run with --tool_flags="--nocleanup_on_exit", then you can look in valgrind.tmp/memcheck* and see which test caused the failure.  This only works because UI tests are run individually.

Valgrind on the Buildbots

The Chromium Buildbot status page includes output from many, many buildbots.  The valgrind ones are cleverly hidden on a separate Memory waterfall and on the right hand side of the Experimental page (where you'll find one mac and five linux bots).  Each bot runs a different set of tests through valgrind. At the moment, the valgrind bots go red when Valgrind reports an unsuppressed warning or when any of the tests fail. 

The Linux layout tests are sometimes red.  (There are about 8000 layout tests, and the valgrind bot rotates through them in chunks of 100 or so at a time, so it will flicker between red and green as it passes through different areas of the layout tests.)  Most of the layout warnings have bugs filed but only the most common have suppressions.

Workflow

Generally, when you have found a warning with Valgrind, here's what to do:
  1. Search for the stack trace in the bug tracking system and in the suppressions file; maybe it's an old known issue, and the existing suppression just needs widening to handle the optimizer making a different inlining decision, or maybe to handle a change in the signature of one of the functions involved.  One often starts out with a specific suppression, and then has to make it more generic by substituting the "..." wildcard or removing the most distant callers.  Or maybe it's a test that's known to crash and you need to also blacklist it under Valgrind, by adding it to e.g. base/data/valgrind/base_unittests.gtest.txt.
  2. If the culprit is obviously a recent change, talk to the author of the change, and see if they're willing/able to fix it.
  3. If that doesn't work, but you can fix the bug yourself, go for it.  If the bug has an entry in the issue tracker, please mark it as 'Started' before you start working on it. 
  4. If you can't resolve the issue, then file a bug (first check the issue tracker's list of open bugs found using Valgrind).
  5. If nobody's fixing the bug, and the tree is red because of it, add a suppression for it to bring the tree back to green (otherwise developers will start ignoring the valgrind red/green status).

Where To Start

Please be sure to mark any bug you're working on as 'Started' in the issue tracker before you start working on it.  (And don't pick a bug if someone else has started it.  It's probably ok to pick a bug that was assigned to someone else long ago but hasn't been started.)

Note: before trying to reproduce valgrind warnings mentioned in bug reports, you probably need to delete the valgrind suppression files, or at least the suppression for that particular bug, else valgrind won't display it.

Ideas for how to start:

- Pick one of the many open Valgrind-related bugs, reproduce it locally, understand it, and fix it.

- Pick a unit test, run it under valgrind as discussed above, look at the warnings Valgrind gives, match them up with a bug, and then try to fix that bug.  If you succeed (which is often difficult, since if pointer bugs were easy, they wouldn't still be there), consider removing the suppression from the suppression file in the same changeset as the bug fix.  Be sure to include a link to the bug report (e.g. http://crbug.com/10679) in your changeset description.

- Find tests that fail under Valgrind (but have no valgrind warnings), and either figure out why they're failing, or file bugs.
You might start by looking at the valgrind buildbot logs. They're buried kind of deep: click on a valgrind buildbot, click on a green (no valgrind warnings) run, then click on the tests within the run, and look for FAILED messages. Then reproduce locally, i.e. run the test normally to verify it passes on your machine, then rerun it under valgrind to see if it fails without a valgrind warning; if it does, you've found one.

- Look for suppressions which are no longer needed (Valgrind produces a list of which ones *have* been used), and both remove them and close the associated bug reports.  Beware, though: some valgrind warnings don't show up on all hardware, or only show up in one out of twenty runs, so check the bug report carefully and run valgrind on that test twenty times before marking it closed.
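Running a test twenty times is easier with a small loop that keeps each run's log and counts the failures. This is a sketch: count_failures is a hypothetical name, and it assumes the command you give it exits nonzero when valgrind reports an unsuppressed warning, which you should confirm for the wrapper you use.

```shell
# Sketch: run a command RUNS times, keep one log per run, and print how many runs failed.
# count_failures is a hypothetical helper name. It assumes a nonzero exit status means
# the run hit a problem (for example an unsuppressed valgrind warning).
# Usage: count_failures RUNS command [args...]
count_failures() {
    runs="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" > "repeat.$i.log" 2>&1 || failures=$(( failures + 1 ))
        i=$(( i + 1 ))
    done
    echo "$failures"
}
```

For example, `count_failures 20 sh tools/valgrind/chrome_tests.sh -t ui --gtest_filter=DownloadTest.DownloadMimeType` prints the number of failing runs out of twenty and leaves repeat.0.log through repeat.19.log in the current directory for inspection.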


Finding Races

The above discussion centered around Valgrind's default tool, Memcheck.
Valgrind has other interesting tools; some of them, e.g. ThreadSanitizer, can find data races.

Challenges

Using valgrind to find bugs in the Chromium source tree is challenging for several reasons:
  • Running a test under Valgrind is 10x to 20x slower than running it natively
  • The UI tests don't currently shut down gracefully, leading to spurious task leaks, etc.  Evan Stade looked at this for a while.  
  • The display manager on the Mac crashes on some machines when you run Chromium under Valgrind!  (We're reporting this to Apple.)