For Chromium sheriffing, we are migrating the wiki into markdown in the repo - see Chromium Sheriffing for the latest information on Chromium Sheriffing.
For Chromium OS Sheriffing see Sheriff FAQ: Chromium OS
I'm a new sheriff, now what?
What is a sheriff?
Sheriffs should have a strong bias towards actions that keep the tree green and then open. For main waterfall bots that are on the Commit Queue, if a simple revert can fix the problem, the sheriff should revert first and ask questions later. For bots that are not covered by the Commit Queue, if the author is online, it's fine to ask them to fix the problem asynchronously (it shouldn't be blocking anyone, and it's not the author's fault, as the change landed through the CQ).
What is Sheriff-o-Matic?
Sheriff-o-Matic (sheriff-o-matic.appspot.com) is a tool to automate the work of sheriffing. It shows you just the failures, and it automatically intersects regression ranges across the bots for you. Currently it is used for the Chromium tree. If you're interested in making it work for other trees, see: crbug.com/409693.
Findit is the culprit finder and auto-revert service for compile/test failures and flaky failures. Once a failure or flake occurs on CI/CQ, Findit is automatically triggered to identify and auto-revert the bad commits. Results are surfaced to Sheriff-o-Matic too.
Flake Portal is the entry point for flakiness. Its functionality includes detecting and ranking flaky tests by negative impact, analysis to find the culprit, verifying whether a test is still flaky, and so on.
Feedback is welcome at findit@chromium.org or by filing a bug.
What is a trooper?
Troopers know more about maintaining the build infrastructure. They're the people to look for when the bots need an OS update, a machine goes offline, checkouts are failing repeatedly, and so on.
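The regression-range intersection that Sheriff-o-Matic is described as doing above can be illustrated with a minimal sketch. This is not Sheriff-o-Matic's actual code; it assumes each bot reports a range as a pair of integer revision numbers (last passing, first failing), and the bot names and numbers are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Assumption: a regression range is (last_passing_revision, first_failing_revision),
# meaning the culprit landed after the first value and at or before the second.
Range = Tuple[int, int]


def intersect_ranges(ranges: Sequence[Range]) -> Optional[Range]:
    """Return the overlap of all ranges, or None if they share no overlap."""
    if not ranges:
        return None
    last_pass = max(r[0] for r in ranges)   # latest revision known to pass
    first_fail = min(r[1] for r in ranges)  # earliest revision known to fail
    if last_pass >= first_fail:
        # No common window: the failures probably have different causes.
        return None
    return (last_pass, first_fail)


# Hypothetical per-bot ranges for the same failing test.
bot_ranges = {
    "bot_a": (100, 120),
    "bot_b": (110, 130),
    "bot_c": (105, 115),
}
print(intersect_ranges(list(bot_ranges.values())))
```

The more bots that fail on the same step, the narrower the shared window tends to be, which is why intersecting across bots helps a sheriff pick a revert candidate faster.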
What is a gardener?
Gardeners are watchers of particular component interactions. They generally watch a component's release or development and move the version included forward when it is compatible. This is called "rolling DEPS", and consists of committing a CL that changes the (Subversion) revision number in some
Of particular interest to the Chromium projects are the gardeners who watch the interaction between Skia and Chromium, and those who watch the interaction of Chromium and ChromiumOS.
Our 7 out of 8 goal
Common tasks
Closing or opening the tree
The tree status can be closed or open. These status levels control the activity of the commit queue. If the tree is open, the commit queue runs as normal.
Effectively communicating tree closure
Annotate the tree status with information about what is known about the status of build failures. For example, automatic closure messages such as...
... should be changed to:
... to indicate that committer 'johnd' has been notified of the problem and is looking into it.
Once a fix has been checked in, sheriffs often use status:
... to indicate that a fix/revert has been checked in and the tree will likely be opened soon.
Alternatively, if the sheriffs decided to revert first and ask questions later, then the tree status should be changed to:
Effectively communicating tree repairs
If the tree has been closed for an extended time, particularly if the breakage covered more than one working timezone (US Pacific, US Eastern, Europe, Asia), it is considered best practice to communicate what was needed to fix the breakage. That way the next sheriff knows what's been happening, and people in other timezones know what to do next time it breaks the same way.
If the fix was simple, it can be listed in the tree-open status message, such as...
If a more detailed fix was needed, send email to the chromium-dev mailing list explaining what happened. It's a good idea to CC the current and upcoming sheriffs too.
Tips and tricks
It's clobberin' time!
Sometimes you just need to clobber (i.e. force a full, clean rebuild of) some class of bots (win, mac+ninja, linux asan using make, etc.). You can do this by landing a landmine change. Docs are here: Chromium Clobber Landmines.
Note: if a specific CL is causing bots to break unless they are clobbered, that CL should be reverted first and fixed to avoid this.
Use chromium extensions
See Useful extensions for chromium developers for more information.
If a specific builder is affected by infrastructure problems, or something else that causes the builder to consistently fail, consider using the tree status to mark the builder as experimental while investigating. By marking it as experimental, the CQ will ignore failures from that builder and will not wait for it to complete, but it will still kick off that builder each CQ run. File a bug for the issue, and set the tree status like this:
You can provide a comma-separated list of builders after EXPERIMENTAL=.
PFQ Builds
Documentation is here in the Pre Flight Queue documentation
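Returning to the EXPERIMENTAL= annotation described above, here is a rough sketch of how a tool might read the builder list out of a tree status message. It is only an illustration: the status text and builder names are hypothetical, and it assumes the list contains no spaces and ends at the first whitespace, which may not match how the real CQ parses it.

```python
import re
from typing import List


def experimental_builders(status: str) -> List[str]:
    """Extract the comma-separated builder names that follow EXPERIMENTAL=."""
    # Assumption: the list has no embedded spaces and stops at whitespace.
    match = re.search(r"EXPERIMENTAL=(\S+)", status)
    if match is None:
        return []
    # Drop empty entries produced by stray or trailing commas.
    return [name for name in match.group(1).split(",") if name]


# Hypothetical status message; the real wording is up to the sheriff.
status = "Tree is open EXPERIMENTAL=builder_a,builder_b"
print(experimental_builders(status))
```

A status without the annotation yields an empty list, so a caller can treat "no experimental builders" and "no annotation" the same way.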
Sheriff schedule
Build sheriff calendar (authoritative)
The authoritative list is on Google Calendar. Here's how to add the sheriff calendar to yours:
If you just want to see the calendars without adding them to your calendar, just follow these links:
To see who the sheriff is, click an event and look at the guest list. (Yes, it would be nice if it showed the people in the event title, but then there's the issue of the event title and the guest list getting out of sync -- no easy answer.)
To find when a specific person is going to be sheriff, use Google Calendar's advanced search box (click the down-triangle in the main search box), select the appropriate sheriff calendar, and type the person's username into the "Who" box.
The script/process that updates the calendars can be found in svn://svn.chromium.org/chrome-internal/trunk/tools/build/scripts/tools/sheriff.
Find out who is currently the sheriff
There are a couple of ways to see who the current sheriff is:
How to swap
Schedule conflicts happen. You may need to swap your assigned rotation dates. A good approach is to pull up the sheriff calendar and reach out to individuals (in person or by hangouts) with rotations on dates near yours -- they're often more willing to swap. It is not ok to only send an email to a mailing list and skip your shift because nobody replied; make a bigger effort if necessary.
To swap shifts with someone, add them to the rotation so that the builders and other tools display the proper people as sheriffs. Instructions for swapping in rota-ng are here.
Out of office
You can mark yourself as out-of-office using RotaNG to ensure that you're not scheduled to sheriff while you're on vacation or on leave. Note that this only works if you mark yourself as OOO before you're scheduled. If you've already been scheduled to sheriff, you should try to swap with someone. If swapping is impossible, or if you're leaving the Chromium project permanently, file a bug here.