This page has details to help Chromium sheriffs. For more information and procedures, see Tree Sheriffs and read the theses on Sheriffing.
For passing the torch, you can also leave notes here.
What to watch
- Failures-only waterfall. It will show you only the bots a sheriff would need to look at.
- Console view to make sure testing has not fallen too far behind.
- IRC #chromium on freenode.
- Be available on IM.
- Do not ignore the Reliability tester. It's very important for Chromium stability.
- Do not ignore ChromeOS bots. These bots build and run Chrome for ChromeOS on Linux and ChromiumOS respectively and are as important as win/mac/linux bots. If you're not sure how to fix an issue, feel free to contact ChromiumOS sheriffs.
- Do not ignore ASan bots. They live on the "Memory waterfall", but the regular sheriffs are nevertheless required to watch them. Bugs reported by ASan usually cause memory corruption in the wild, so do not hesitate to revert the change or disable the failing test (ASan does not support suppressions).
When to close the tree
- A test went red: Tree maybe closed
- If the cause is obvious (the FooShouldWork test broke, and someone just checked in changes to foo_utils.cc), the tree can stay open. Revert the change and send the review to its author.
- If the cause isn't obvious, close the tree. Ask everyone on the blamelist to help track it down and revert the patch as soon as found.
- A test occasionally goes red: Tree open
- This is a flaky test. If the change that caused it is obvious, revert that change.
- If the change isn't obvious, keep the tree open but disable the test and file a bug. See below for details.
- webkit_tests went red or got new regressions: Tree maybe closed
- Layout tests are just like other kinds of tests, except that sometimes we file and mark their new failures rather than fixing them right away. See below for details.
- One category of bot fails to build or has a swarm of test failures: Tree closed
- If all the debug, release, Vista, XP, etc. builds go red, act as with a single test going red.
- One bot went red: Tree open
- If only one buildbot is having problems (can't update, can't compile, exploding in some other way), the tree can stay open while it's fixed. We have reasonable redundant coverage now. Ask a trooper for help.
- An update failed: Tree maybe closed
- Try again from the internal waterfall. Ask a colleague for the URL. If it keeps failing or gives a worrisome error, contact a trooper.
- "extract build" is orange, or fails once: Tree open
- Orange "extract build" means it's using the latest built revision and not the one it's supposed to. If it does not work the second time, contact a trooper.
- A slave is hung at a step: Tree maybe closed
- If a slave hangs, sometimes just cancelling the build may not work. In that case call a trooper.
- Small insects crawling on stems and leaves seem to be eating sap: Tree infested
- The tree probably has aphids. Release ladybugs nearby to eat them.
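The build-related rules above can be summarized as a small lookup. This is only a sketch of the list, not a tool that exists in the tree: the names Symptom, TreeState, DefaultTreeState and ResolveRedTest are hypothetical and chosen for illustration.

```cpp
#include <cassert>

// Hypothetical names; each case mirrors one build-related bullet above.
enum class Symptom {
  kTestWentRed,
  kTestOccasionallyRed,
  kWebkitTestsRed,
  kBotCategoryRed,
  kSingleBotRed,
  kUpdateFailed,
  kExtractBuildOrange,
  kSlaveHung,
};

enum class TreeState { kOpen, kMaybeClosed, kClosed };

// Default tree state for a symptom, before the blamelist has been examined.
TreeState DefaultTreeState(Symptom symptom) {
  switch (symptom) {
    case Symptom::kTestOccasionallyRed:
    case Symptom::kSingleBotRed:
    case Symptom::kExtractBuildOrange:
      return TreeState::kOpen;
    case Symptom::kBotCategoryRed:
      return TreeState::kClosed;
    case Symptom::kTestWentRed:
    case Symptom::kWebkitTestsRed:
    case Symptom::kUpdateFailed:
    case Symptom::kSlaveHung:
      return TreeState::kMaybeClosed;
  }
  return TreeState::kMaybeClosed;  // Not reached; keeps compilers quiet.
}

// Resolves "maybe closed" for a red test: an obvious culprit means revert and
// stay open; otherwise close the tree while the blamelist investigates.
TreeState ResolveRedTest(bool cause_is_obvious) {
  return cause_is_obvious ? TreeState::kOpen : TreeState::kClosed;
}
```

A "maybe closed" result is a prompt to look closer, not a verdict. For a red test, ResolveRedTest(true) corresponds to the foo_utils.cc case, where the tree stays open and the change is reverted.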
$ cd $TMP_DIR; drover --revert 12345
$ git checkout trunk; git pull; gclient sync
$ git svn find-rev r12345 # -> a git hash
$ git checkout -b revert_foo trunk
$ cd $SRC # a gcl/svn repo
- Unless this is Incredibuild flakiness, REVERT.
- If this is Incredibuild flakiness, just force a clobber.
- Waiting for a fix is not a good idea. Just revert until it compiles again.
- If it's not clear why compile failed, contact a trooper.
Handling misbehaving (flaky, failing, timing-out, crashing) tests
If no recent commit is the source of the problem then proceed with the steps below.
- Head over to crbug.com and file a bug. Make sure to include sample output from the test (since the buildbots don't keep the data forever). Make sure to assign an owner, usually whoever modified the test last from a
- Disable the test by prefixing DISABLED_ to the test case name:
- TEST(FooTest, FooShouldWork) becomes TEST(FooTest, DISABLED_FooShouldWork)
- After that, the test will not be executed when the test binary is run. However, it will still be compiled, so the code will not rot.
- Add a comment above the test with a link to the bug on crbug.com
- FLAKY_ and FAILS_ should only be used temporarily to diagnose why a test is failing on the bots.
- In the change description add the line "BUG=xyz" where xyz is the bug number
- If possible, include a link to the flakiness dashboard for the test in the bug report
- For an excellent example take a look at the following:
Note that ASan, unlike other memory tools on the Chromium waterfall, does not support gtest_exclude.txt files. Please disable the tests failing under ASan as described above (possibly under #if defined(ADDRESS_SANITIZER)).
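A minimal sketch of the disabling steps above, including an ASan-only variant. So that it compiles without gtest, it uses stand-in stringifying macros (SKETCH_STR, SKETCH_STR_IMPL) and a helper IsDisabled; these names, the MAYBE_ alias, and the "xyz" bug number are illustrative assumptions. In real test code you would write the gtest TEST macro shown in the comments.

```cpp
#include <cassert>
#include <string>

// Stand-ins (assumptions) so the sketch compiles without gtest. The two-level
// macro expands its argument before turning it into a string.
#define SKETCH_STR_IMPL(x) #x
#define SKETCH_STR(x) SKETCH_STR_IMPL(x)

// Disabled under ASan only; the test keeps running on every other bot.
// Flaky under ASan, see http://crbug.com/xyz (placeholder bug number).
#if defined(ADDRESS_SANITIZER)
#define MAYBE_FooShouldWork DISABLED_FooShouldWork
#else
#define MAYBE_FooShouldWork FooShouldWork
#endif

// With gtest this would read: TEST(FooTest, MAYBE_FooShouldWork) { ... }
const std::string kConditionalName = SKETCH_STR(MAYBE_FooShouldWork);

// Unconditional form: TEST(FooTest, DISABLED_FooShouldWork) { ... }
const std::string kDisabledName = SKETCH_STR(DISABLED_FooShouldWork);

// Mirrors the rule described above: a test whose name starts with DISABLED_
// is still compiled but is not executed.
bool IsDisabled(const std::string& test_name) {
  return test_name.rfind("DISABLED_", 0) == 0;
}
```

The conditional form keeps coverage on the bots where the test still passes, which is why it may be preferable to an unconditional DISABLED_ when only ASan is affected.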
Handling failing perf expectations (like the sizes step)
When a step turns red because perf expectations weren't met, follow the instructions on the perf sheriffs page to handle it. It can also help to get in touch with the developer who landed the change, along with the current perf sheriff, to decide how to proceed.
Coordinating WebKit breakages / fixes
Tips and Tricks
How to read the tree status at the top of the waterfall
It's up to the main sheriffs to keep an eye on the Official waterfall.
- Chromium / Webkit / Modules rows contain all the bots on the main waterfall.
- Official and Memory bots are on separate waterfalls, but the view at the top shows their status.
The memory sheriff helps with tending the Memory FYI tree, and the webkit sheriff helps out with the Webkit bots.
Checking whether a test is flaky
Merging the console view
If you want to know when revisions have been tested together, open the console view and click the "merge" link at the bottom.
- Open a GChat session with your fellow sheriffs. This is useful for coordinating outside of IRC. (e.g. lunch breaks, who will pursue what, etc)
- Open a shared GDoc and use it to track open issues. For example, if a test starts flaking, drop in the dashboard links. Take notes about your discoveries, CLs, crbugs, owners, etc. If anything outlasts your shift, put it in the Sheriff Log.
NOTE: If your shift spans a weekend, you aren't expected to sheriff on the weekend (you do have to sheriff on the other days, e.g. Friday and Monday). The same applies to holidays in your office.