The Chromium Projects

Except as otherwise noted, the content of this page is licensed under a Creative Commons Attribution 2.5 license, and examples are licensed under the BSD License.

WebKit Gardening

WebKit gardening is the process of regularly incrementing webkit_revision in the Chromium DEPS file, staying as close as possible to the tip of WebKit trunk, while ensuring that these updates don't bring regressions into the Chromium tree. A regression is a failure of any of the build steps on the Chromium waterfall.

The revision increment is commonly called a roll, though you are free to call it anything you'd like. The gardeners use the Chromium canary bots, the WebKit waterfall, the flakiness dashboard, and the rebaselining tool as their tools. All Chromium engineers who are actively involved in WebKit development participate in WebKit gardening.

Tools

  • Chromium builders on build.webkit.org. These build WebCore along with the Chromium WebKit API and unit tests. Currently, they don't run any tests and are only used for detecting and troubleshooting compile breakages. Our plan is to port parts of test_shell as Chromium DumpRenderTree and run layout tests on these builders.
  • Canary bots. These are experimental waterfall builders, configured to pull from the tip of trunk WebKit, showing regressions as they happen upstream. Canary bots run test_shell_tests, webkit_unit_tests, and webkit_tests (layout tests) steps in addition to building.
    Tip: Your sanity depends on well-performing canaries. Prior to your turn at gardening, check to make sure there are no odd or infrastructure-related failures (purple bots, thousands of layout test failures after innocent changes) on the canaries and ask for help if you don't know how to fix them.
  • Flakiness Dashboard. The name is a bit of a misnomer, since the tool was first developed to monitor flakiness of the layout tests. Naming aside, this is a great tool for layout test regression diagnostics. The dashboard represents successive layout test runs as a timeline, making it easy to quickly pinpoint regression culprits and reasons for failure.
  • Rebaselining Tool. This tool pulls expected results (or baselines) from the build bots.

Prerequisites

A gardener should be both a WebKit and a Chromium committer. There's a gardening schedule. If you're in rotation, there will be an email notifying you of the upcoming stint.

How to Garden

As a gardener, you directly contribute to the quality of Chromium. Any regressions you bring in may end up in released code. An innocent-looking single layout test failure may bring half of the Web to its knees. Don't bring the Web to its knees. You are the final line of defense.

Here is a set of general rules, distilled from our collective gardening experiences:
  • Approach regressions one-by-one, earliest first. Viewing regressions in the context of changes that caused them will help you better understand what should be done to address them.
  • Fix regressions quickly. When compile is broken, you can't see if new regressions are being checked in. The larger the revision range between the regression and the fix, the harder it will be to land the roll, which brings us to ...
    Tip: Append &reload=60 to the canary waterfall URL to let it auto-refresh the page for you.
  • Roll in small increments. This can't be emphasized enough, so let's repeat: Roll in Small Increments! Don't get stuck in the situation where you landed a 100-revision roll, burned the tree to the ground, and don't know where to begin figuring out what went wrong. If you have a choice, stay within 10-15 revisions per roll.
  • Study revisions before rolling. A quick glance within the range of your roll (http://trac.webkit.org/log/?rev=[NEW_REV]&stop_rev=[OLD_REV]&verbose=on) goes a long way in finding the commits that could cause trouble. Size up your opponent. Doing this frequently will also give you a better sense of the code base and how changes affect it.
  • Use try bots to test your roll. Try bots at least run ui_tests, which should give you a good indication of whether something's gone awry in the revision range.
  • Use layout try bots -- because they build in Debug. Any ASSERTs will be revealed on the try bots. Run your roll through both the standard set and the layout test try bots (gcl try foo --bot layout_win,layout_mac,layout_linux).
    Tip: The Linux layout bot runs at almost twice the speed of the other platforms and will likely show the same debug crashes as other platforms.
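
The "study revisions" and "roll in small increments" rules can be sketched in a few lines of C++. This is only an illustration: tracLogUrl and isSmallRoll are hypothetical helper names, the URL follows the Trac pattern quoted in the rules above (read here as ending in &verbose=on), and the 15-revision limit is simply the upper end of the suggested 10-15 range.

```cpp
#include <cassert>
#include <string>

// Builds the Trac log URL for the commits between oldRev and newRev,
// using the URL pattern quoted in the rules above.
std::string tracLogUrl(int oldRev, int newRev)
{
    assert(oldRev < newRev); // a roll always moves forward
    return "http://trac.webkit.org/log/?rev=" + std::to_string(newRev)
        + "&stop_rev=" + std::to_string(oldRev) + "&verbose=on";
}

// Hypothetical helper: true when a roll stays within the suggested size.
// The default limit of 15 is an assumption taken from the "10-15" advice.
bool isSmallRoll(int oldRev, int newRev, int maxRevisions = 15)
{
    return newRev > oldRev && newRev - oldRev <= maxRevisions;
}
```

For a roll from revision 50000 to 50012 (made-up numbers), tracLogUrl(50000, 50012) gives the log page to skim before rolling, and isSmallRoll(50000, 50012) confirms the roll is within the suggested size.
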
Here's a typical gardening checklist:
  1. Find the Last Good Revision. Use canary waterfall to find the last green cycle for all platforms. Roll to it first (see "Roll" step below).
    Tip: Append &num_events=500 to the waterfall URL to reduce the number of "[next page]" clicks.
  2. Diagnose Regression. From the last good revision, look at the regression. Compile breaks are pretty evident on the waterfall, and so are layout test failures. In addition to turning red on the waterfall, layout test regressions add a link to flakiness dashboard, which shows details on each new failure. Use this information to determine regression type:
    1. Breaks Everyone. The commit introduced a compile break or a test regression for all or multiple ports.
      Tip: Because the WebKit waterfall doesn't run pixel tests, you may see broken layout tests on the canaries only -- even though it's a regression for everyone. Be suspicious of this when the failures are just IMAGE (especially on the Mac canary, since the output there should almost always match platform/mac).
      Tip: Watch for performance regressions on the canaries, since there are no other public WebKit trunk performance dashboards.
    2. Breaks Chromium. This could be compile breaks or test regressions. The likely causes are Chromium port differences: KURLGoogle, WebKit API, Skia, V8 bindings, or any Chromium-specific code in WebKit. You can usually tell which ones by studying the commit that introduced the regression. Possible symptoms are:
      1. Compile break after a change to JavaScriptCore or bindings files.
      2. Compile break in Chromium WebKit API.
      3. We fail tests that other platforms pass.
      4. Unit test failure.
    3. Breaks No One. The commit added new layout tests or changed expectations. Because of port differences, expected results checked into platform/mac (the de-facto standard WebKit port) may not match Chromium actuals: text metrics, V8 message or convention differences, Skia-vs.-CoreGraphics rendering, etc.
      Tip: Use flakiness dashboard to examine the history of the affected test(s), and see how expectations have changed using "show expectations" links. Make sure you're looking at the "WebKit Canary" results (link at the top of the dashboard).
    4. Adds New Testing Capabilities. This is usually a subset of "breaks-no-one", but it's worth putting into its own category. To provide regression test coverage, the existing harness is continually extended with new methods, properties, and objects. All of them have to be developed separately for each port. Typically, the initial commit contains only one port's implementation, meaning that the corresponding layout test(s) will appear as failing for other ports.
  3. Apply Treatment.
    1. If this is a "breaks-everyone" regression or "breaks-chromium" that's not related to V8 bindings or KURLGoogle:
      1. Notify the author of the change. Comment on the bug mentioned in the commit. Ping them on #webkit.
      2. If the author is unavailable or unable to fix the problem quickly, roll out the change, whether it's a compile breakage or a test regression.
        Tip: Use webkit-patch rollout for an expedient and pleasant roll-out experience.
      3. Alternatively, if you see an easy fix to the problem, do it yourself. You can commit build fixes without a review. Just make sure to replace Reviewed by ... with Not Reviewed, build fix. in the ChangeLog.
    2. For V8 bindings or KURLGoogle-related "breaks-chromium" regressions:
      1. If this is a Chromium-specific change, apply Chromium tree rules and roll out the patch.
      2. If you see an easy fix (usually JSC bindings changes should more or less mirror necessary changes in V8 bindings), fix it.
      3. Otherwise, ping for help from V8 bindings experts on #chromium or chromium-dev and collaborate to come up with a fix. Conveniently, we now have experts on V8 bindings across the globe.
    3. For "breaks-no-one" regressions, prepare new or updated baselines using the rebaselining tool (use the -w flag to pull baselines from the canaries).
      A Special Message From the Layout Test Task Force: Since you have the first-hand (well, technically second-hand) account of this happening, you as the gardener are in the best position to fix it. The worst thing you can do is roll this in as a "breaks-chromium" regression. Test expectation archeology is painful and tedious; many souls have burned out doing it. Just look at the test_expectations.txt file: it's like a phyllo dough of random failures with varying degrees of documentation attached to them.
    4. For "new-testing-capabilities" regressions, file a bug to implement the missing test capability and add the affected test(s) to test_expectations.txt.
      Tip: Use "CREATE BUG" link on the flakiness dashboard to make sure all labels are set up correctly on the bug. This will massively help with triage.
  4. Roll.
    1. Change webkit_revision in the DEPS file.
    2. Create and submit change list for review.
    3. Submit your change to try bots.
      Tip: If your change list contains new pixel baselines, the try bot won't like them. Tweak your change to exclude them until landing.
    4. Commit the roll.
    5. Watch the waterfall. Despite your best efforts, you may still turn some builders red. Work with the sheriff on the possible ways to fix. In non-trivial situations, such as undetected "breaks-everyone" or "breaks-chromium", consider reverting your roll and continuing on isolating and fixing the problem upstream (See "Diagnose Regression" step).
  5. Clean Up Expectations. While gardening, you may encounter newly passing tests. If you see those, make sure to remove them from expectations and close corresponding bugs. It will make you feel great.
  6. Repeat.
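
The "Diagnose Regression" and "Apply Treatment" steps amount to a small decision table. The sketch below is one way to write it down in C++; the enum and function names are made up for illustration, each Treatment value only labels the first action the checklist suggests, and real gardening still needs the judgment described in the steps themselves.

```cpp
// The four regression types from the "Diagnose Regression" step.
enum class RegressionType {
    BreaksEveryone,
    BreaksChromium,
    BreaksNoOne,
    AddsNewTestingCapabilities
};

// Illustrative labels for the treatments in the "Apply Treatment" step.
enum class Treatment {
    NotifyAuthorOrRollOut,       // ping the author, roll out upstream if needed
    FixOrAskBindingsExperts,     // fix it, or get help with V8 bindings
    Rebaseline,                  // prepare new or updated baselines
    FileBugAndUpdateExpectations // file a bug, edit test_expectations.txt
};

// Maps a diagnosed regression to its first treatment.
// bindingsOrKurlRelated only matters for BreaksChromium: it marks breaks
// caused by V8 bindings or KURLGoogle differences.
constexpr Treatment firstTreatment(RegressionType type, bool bindingsOrKurlRelated)
{
    switch (type) {
    case RegressionType::BreaksEveryone:
        return Treatment::NotifyAuthorOrRollOut;
    case RegressionType::BreaksChromium:
        return bindingsOrKurlRelated ? Treatment::FixOrAskBindingsExperts
                                     : Treatment::NotifyAuthorOrRollOut;
    case RegressionType::BreaksNoOne:
        return Treatment::Rebaseline;
    case RegressionType::AddsNewTestingCapabilities:
        return Treatment::FileBugAndUpdateExpectations;
    }
    return Treatment::NotifyAuthorOrRollOut; // not reached for valid enum values
}
```

For example, firstTreatment(RegressionType::BreaksChromium, true) yields Treatment::FixOrAskBindingsExperts, while the same type with false falls back to notifying the author.
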

Things That Can Go Wrong

Gardening is tricky business. Some have equated it to dancing on a minefield, or running in front of a speeding train -- figuratively, of course. Nevertheless, it's important to recognize that unexpected bad things will happen and to be prepared. Here are a few known "hair-on-fire" situations and solutions to them:
  • Purple canaries. Work with the sheriff or Green Tree Task Force to fix the bots.
  • Windows canary's webkit_tests report "crashed or hung", appears to exit before testing. Log in remotely to the canary, find env.exe in the Task Manager and kill it.

Two-sided Patches

Occasionally, your change may span both WebKit and Chromium. These types of patches are called two-sided, and you should avoid them whenever possible. They're quite often a huge amount of trouble and usually take far more effort to land than it would take to split them up.

Making a two-sided patch a multi-step patch:

Often a two-sided patch can be done as an initial patch, the other side, and then a cleanup patch.  For example, if I was changing a method in the WebKit API that looked like this:

virtual void setItem(const WebString& key, const WebString& newValue, const WebURL& url, bool& quotaException, WebString& oldValue) = 0;

And I want to change the bool& to a more general Result type, I can make the code look like this:

virtual void setItem(const WebString& key, const WebString& newValue, const WebURL& url, Result& result, WebString& oldValue)
{
    bool quotaException = false;
    setItem(key, newValue, url, quotaException, oldValue);
    result = quotaException ? ResultBlockedByQuota : ResultOK;
}
// FIXME: Remove soon (once Chrome has rolled past this revision).
virtual void setItem(const WebString& key, const WebString& newValue, const WebURL& url, bool& quotaException, WebString& oldValue)
{
    Result result;
    setItem(key, newValue, url, result, oldValue);
    quotaException = result != ResultOK;
}

Since the method started off as pure virtual, I know it must be implemented in Chromium.  Thus I don't need to worry about an infinite recursion problem, even though these both call each other.  Once this is landed and rolled in, I can then change which method Chromium uses.  And, finally, I can then remove this little hack from WebKit.
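
To see why the two overloads don't recurse forever, here is a self-contained sketch of the same trick. Everything in it beyond the setItem signatures is an assumption made for illustration: ExampleStorageArea and OldStyleStorageArea are made-up class names, WebString and WebURL are replaced by std::string stand-ins, and the 8-character "quota" is a toy rule.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the real WebKit API types, to keep the sketch self-contained.
using WebString = std::string;
using WebURL = std::string;

// Hypothetical API class holding both the new and the old signature.
class ExampleStorageArea {
public:
    enum Result { ResultOK, ResultBlockedByQuota };
    virtual ~ExampleStorageArea() = default;

    // New signature: by default it forwards to the old one.
    virtual void setItem(const WebString& key, const WebString& newValue, const WebURL& url, Result& result, WebString& oldValue)
    {
        bool quotaException = false;
        setItem(key, newValue, url, quotaException, oldValue);
        result = quotaException ? ResultBlockedByQuota : ResultOK;
    }

    // Old signature: by default it forwards to the new one.
    virtual void setItem(const WebString& key, const WebString& newValue, const WebURL& url, bool& quotaException, WebString& oldValue)
    {
        Result result = ResultOK;
        setItem(key, newValue, url, result, oldValue);
        quotaException = result != ResultOK;
    }
};

// Hypothetical embedder-side class that still overrides only the OLD
// signature, as Chromium would before it switches over.
class OldStyleStorageArea : public ExampleStorageArea {
public:
    void setItem(const WebString&, const WebString& newValue, const WebURL&, bool& quotaException, WebString& oldValue) override
    {
        oldValue = m_value;
        quotaException = newValue.size() > 8; // toy quota rule
        if (!quotaException)
            m_value = newValue;
    }

private:
    WebString m_value;
};

// Calls the NEW signature through the base class; the call reaches the
// embedder's old-style override exactly once and then returns.
void exampleUsage()
{
    OldStyleStorageArea impl;
    ExampleStorageArea& area = impl;
    ExampleStorageArea::Result result = ExampleStorageArea::ResultOK;
    WebString oldValue;
    area.setItem("key", "a value that is too long", "http://example.test/", result, oldValue);
    assert(result == ExampleStorageArea::ResultBlockedByQuota);
}
```

The recursion is only cut because the subclass overrides at least one of the two overloads. A subclass that overrode neither would bounce between the two defaults forever, which is why the trick relies on the method having started out as pure virtual.
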

Every two-sided patch is different and will likely need different tricks for different situations, but if you really give it a shot, you can often make your life much simpler.

Landing a two-sided patch:

If you absolutely can't split up your patch, this is the procedure:
  1. Make sure the canary had a green run.  If there is a gardener on-duty, coordinate with them on this and the following steps.
  2. Land your patch upstream.
  3. Roll up to the last green revision, which should be the revision prior to your landing.
  4. Prepare a patch that rolls one revision forward, including your fix. Make sure it builds locally and/or use try bots for other platforms.
  5. Land the patch.
Warning: If you do step 2 and then step 3 takes a long time for some reason or another, we highly recommend you roll your patch out of upstream.  The problem is that, as long as there's a compile break, your (or the gardener's) options are much more limited and you have much less visibility as to what's going on.  It's much better to wait a couple of hours and then try the process over again.

If you are in a time crunch and can't avoid the two-sided patch, you're much better off doing a non-two-sided hack initially and then coming back and doing the two-sided patch later.