Sheriff Log: Chromium OS

2016-02-05
Gardener: stevenjb
  • 584722 chromeos-chrome build failure: "No package 'gtk+-2.0' found" while running pkg-config with media.gyp

2016-02-04
Sheriff: dhendrix
  • 584542: sys-apps/toybox failing to compile on amd64-generic
  • 473899: paygen "Not all images for this build are uploaded", smaug has been seeing this for months.
  • 569358: pool: bvt, board: x86-mario in a critical state. (assigned now)
  • 584447: pool: bvt, board: veyron_mickey in a critical state. (assigned)
  • 571757: [sanity] provision Failure on expresso-release/R49-7760.0.0. Note: This manifested itself as a swarming failure when I updated the bug (#68).
2016-02-03
Sheriff: johnylin, grundler, dbasehore
  • 561036: FIXED: paygen timing out: dshi appears to have fixed this
  • 574915: VMTest failures in desktopui_ScreenLocker - jdufault investigating
  • 578771: GPT Header Issue
  • 579119: Unittest timeout
  • 581639: IGNORE: lakitu_mobbuild fails cloud_SpinyConfig: turning down this build (sosa)
  • 582144: FIXED: security_ASLR: reverting the change fixed the problem (https://chromium-review.googlesource.com/324950)
  • 582325: veyron-b: rialto-services emerge fail
  • 582521: FIXED? error in gsutil: samus canary builds succeeded on Feb 02 19:15. Also seen on daisy.
  • 583081: FIXED: autotest-chrome build failures (https://chrome-internal-review.googlesource.com/#/c/247126/)
  • 583535: FIXED: login_* test failures: reverted https://codereview.chromium.org/1646223002 (alchuith, dup:583382)
  • 583684: FIXED: CommitQueueSync repo sync: manifest referred to a tag instead of branch
2016-02-02
Sheriff: grundler, dbasehore
  • 561036: paygen timing out on release builders
  • 574915: VMTest failures in desktopui_ScreenLocker (later forked into three bugs)
  • 581639 - lakitu_mobbuild fails cloud_SpinyConfig (known issue)
  • 582521 - samus canary failed because of error in gsutil
  • 583375: provision thrashing causing canary/beta build timeouts (kevcheng)
  • 583382: login_* tests failing (may be dup of 574915 or others)

2016-02-01
Sheriff: bleung, puthik
  • 582531 - flaky HWTest for Pineview / strago-b / sandybridge
  • 583375 - canary and beta builds can cause provision thrashing which can cause hwtests to time out

2016-01-29
Sheriff: bleung, puthik
  • 582521 - samus canary failed because of error in gsutil
  • 581639 - lakitu_mobbuild fails cloud_SpinyConfig
  • 576879 - pool: bvt, board: candy in a critical state.
  • 582325 - veyron-b: rialto-services emerge fail

2016-01-28
Sheriff: bhthompson, shchen, hychao
  • 582144: security_ASLR test failing on glados, strago, strago-b with Unhandled TypeError

2016-01-27
Sheriff: bhthompson, shchen, jchuang
  • 581598: archive stage failure at BuildAndArchiveFactoryImages 
  • 581624: gd-2.0.35 build failed on guado_moblab
  • 581630: docker build failed on lakitu_next
  • 543649: smaug paygen failing with "Not all images for this build are uploaded, don't process it yet" (does not cause canary failure, low priority)
  • 581631: cheets_SettingsBridge: Timed out waiting for condition: Android font size set to smallest
  • 581639: GCETest fail at 01-cloud_SpinyConfig on lakitu_mobbuild

2016-01-26
Sheriff: robotboy, semenzato, jchuang
  • 580184: PFQ failed to build, related to missing chromeos/ime/input_methods.h
  • 561036: paygen timing out on release builders
  • 581382: perf_dashboard_shadow_config.json syntax error led to parse job failure (causing several timeouts)

2016-01-25
Sheriff: littlecvr
  • 486098: Builder failure HWTest Code 3 - not enough detail to debug
  • 561036: paygen timing out on release builders
  • 547055: Jecht Group Failed Archive Step

2016-01-22
Sheriff: littlecvr
  • 547055: Jecht Group Failed Archive Step
  • 578771: Paygen error: GPT_ERROR_INVALID_HEADERS
  • 558266: [au] autoupdate_Rollback Failure on ultima-release/R49-7655.0.0
  • 580184: Master: PFQ failed to build, related to missing chromeos/ime/input_methods.h
  • 580261: Update/provisioning timeouts during tests due to slow network
  • 579811: lakitu-release build continuously failed at GCETest

2016-01-21
Sheriff: deymo, zqiu, hungte
Chromeos Gardener: jennyz
  • 580184: Master: PFQ failed to build, related to missing chromeos/ime/input_method.h

2016-01-20
Sheriff: stevefung, dlaurie, hungte
Chromeos Gardener: jennyz
  • 579565: M49: PFQ Failing chromite unit testing on lumpy.

2016-01-14
Sheriff: stevefung, dlaurie
  • 322443: M49 PFQ failing unit tests
2016-01-14
Sheriff: vapier, zeuthen
  • 577549: lakitu_mobbuild_paladin fails at mariadb
  • 577542: build_packages fails at chromeos-mrc on strago canary and paladin build
  • 577836: lakitu_mobbuild_paladin fails at serf
2016-01-13
Sheriff: cychiang
  • 576905: pool: bvt, board: veyron_mighty in a critical state.
  • 576992: util-linux-2.25.1-r1 build failure on cyan canary build
  • 577025: TestFailure(paygen_au_dev,autoupdate_EndToEndTest.paygen_au_dev_full,Failed to perform stateful update on chromeos2-row2-rack10-host9)
  • 571747: TestFailure(sanity,provision,Failed to perform stateful update on chromeos4-row2-rack3-host1)
  • 505744: TestFailure(sanity,provision,Unhandled AutoservSSHTimeout: ('ssh timed out', * Command: )
  • 571884: [bvt-inline] security_ASLR Failure: No such file or directory: '/proc/32189 32187/maps'. (on PFQ)
  • 577549: lakitu_mobbuild_paladin fails at mariadb
  • 577542: build_packages fails at chromeos-mrc on strago canary and paladin build
2016-01-12
Sheriff: cychiang
  • 576525: chromeos-bootimage build failure on nyan_blaze: Unknown blob type 'boot' required in flash map
  • 576526: cheets_PerfBootServer failure at wait_for_adb_ready
  • 529612: lakitu_mobbuild: cloud_CloudInit fails in VMTest
  • 576549: lakitu_mobbuild canary build fails at GCE test because of quota exceeded
  • 576545: rambi-a-release group clapper build_packages fails at net-misc/strongswan
  • 571749: TestFailure(sanity,provision,Failed to perform stateful update on chromeos4-row5-rack8-host11)
  • 571747: TestFailure(sanity,provision,Failed to perform stateful update on chromeos4-row2-rack3-host1)
  • 505744: TestFailure(sanity,provision,Unhandled AutoservSSHTimeout: ('ssh timed out', * Command: )
  • 576608: security_AccountsBaselineLakitu fails with Baseline mismatch
2016-01-06
Sheriff: moch, zachr
  • 572745: [bvt-cq] graphics_GpuReset Failure on falco-chrome-pfq
  • 574870: [sanity] dummy_PassServer.sanity_SERVER_JOB Failure on veyron-b-group canary
  • 574915: VMTest failures in desktopui_ScreenLocker, security_ASLR, login_LoginSuccess
  • 574303: provision Failure on cyan-release

    2016-01-05
    Sheriff: moch, zachr
    • 574501: amd64-generic ASAN vmtests failing (desktopui_ScreenLocker, buffet_InvalidCredentials, buffet_IntermittentConnectivity)

    2016-01-04
    • 574197 Peach group Canary failing since 12/29
    Gardener: stevenjb@/jdufault@
    • 574104: LKGM builder needs to be updated to git
    • 573961: Peach pit failures
      • Forcing a rebuild, looks like it might be infra flake: 'Failed to install device image using payload at...'
    • 574198: PFQ flake, security_SandboxStatus
    2015-12-28
    Sheriff: itspeter
    Investigated the status of all builds over the weekend of 12/25-12/27. Below are the outstanding / repeating failures:
    • pineview-release-group HWTest [x86-XXX] [bvt-inline]
      • from build #1602 - #1604
      • 569357: Indicates the following need to be recovered.
        • chromeos4-row5-rack13-host3,chromeos4-row5-rack13-host7,chromeos4-row5-rack13-host9
        • chromeos4-row5-rack13-host3,chromeos4-row5-rack13-host5,chromeos4-row5-rack13-host7
    • strago-release-group HWTest [ultima] 
      • [bvt-inline] from build #1597 - #1601, #1603, #1604
      • [sanity] from build #1602
        • 505744: TestFailure(sanity,provision,Unhandled AutoservSSHTimeout: ('ssh timed out', * Command: )
    • strago-release-group HWTest [cyan]
      • [sanity]: from build #1597, #1600, #1601, #1602, #1604
        • 547536: provision flake: Failed to install device image using payload (#1597, #1600, #1602, #1604)
        • 568708: DownloaderException: Could not find *_full_* in Google Storage (#1601)
      • [bvt-inline]: from build #1598, #1599, #1603
    • Still unable to root-cause the [bvt-inline] failures across different boards.
    2015-12-25
    Sheriff: itspeter
    • 547536: provision flake: Failed to install device image using payload
      • Repeatedly occurred on strago
    • 571884: [bvt-inline] security_ASLR Failure: No such file or directory: '/proc/32189 32187/maps'.
      • Repeatedly occurred on jecht, strago
    • 571730: Flaky VMTest security_ASLR: Command <pidof debugd> failed
      • Repeatedly occurred on jecht (572093 merged), glados, auron-b
    • 568708: DownloaderException: Could not find *_full_* in Google Storage
      • Repeatedly occurred on rambi-d, ivybridge
    • 505744: TestFailure(sanity,provision,Unhandled AutoservSSHTimeout: ('ssh timed out', * Command: )
    2015-12-23/24
    Sheriff: wuchengli
    • 485197: Provision failure downloading stateful.tgz
    • 568708: DownloaderException: Could not find *_full_* in Google Storage
    • 546630: Peppy Paladin Provision Error
    • 571874: BuildPackages failed on gobi-firmware
    • 571730: Flaky VMTest security_ASLR: Command <pidof debugd> failed
    • 547536: provision flake: Failed to install device image using payload
    • 551003: cyan-cheets: [Errno 28] No space left on device
    • 548114: Autotest client terminated unexpectedly: We probably lost connectivity during the test.
    • 571884: [bvt-inline] security_ASLR Failure: No such file or directory: '/proc/32189 32187/maps'.
    • 569357: pool: bvt, board: x86-mario in a critical state.
    2015-12-21/22
    Sheriff: tfiga, tbroch, martinroth
    • 571599: daisy: Missing Manifest files in overlay-daisy
    • 571505: hwlab down due to DB capacity (PFQ failing HWTest [sanity])
    • 48735: missing Manifest in overlay-guado-private
    • 571221: Builders failing at "running steps via annotated script" stage

    2015-12-17/18
    Sheriff: josephsih, tbroch, martinroth
    • 569620: bvt-inline and paygen time out in canaries.

    2015-12-14/15
    Sheriff: gedis, benzh
    • 569620: bvt-inline and paygen time out in canaries
    • 569487: security_ASLR failures in lakitu canary
    • 569726: cautotest is down
    • 569979: Paygen fails on all canary builders
      • according to @deymo, should cycle green. If not, ping @deymo, @gedis, @fdeng
    • 569983: Unittest failure in test_count_jobs: When n jobs are inserted, n jobs should be counted within a day range
    2015-12-14
    Sheriff: kitching
    Gardener:
    • 439136: Existing issue with google-breakpad on auron-b group canary
    • 569487: security_ASLR failures in lakitu canary

    2015-12-11
    Sheriff: kitching, aaboagye
    Gardener: achuith
    • 569163: Many CQ paladins failed at CommitQueueSync step.
      • See also b/26161444
      • Subsequent CQ run was unencumbered.
    • 568473: CL:317573 chumped right before 18:00 PST; current canary builds should finish EOD
      • Still seeing canary failures even though CL:317573 landed. Recommend that they revert for now as it's blocking lakitu release.
        • It's actually a different error, but similar error string. Fix is in CL:317780.
          • Chumped before the 1800 PST launch of canaries. Hopefully that will be the last of the paygen issues.
    • CQ master failed due to CL:317573 being chumped (gob_util.py got Conflict: change is closed), should finish EOD

    2015-12-10
    Sheriff: aaboagye
    Gardener: achuith
    • 568473: Paygen error (Payload integrity check failed: Unsupported minor version: 3)
      • Fix should be going in, in CL:317573
    • 568496: tricky PFQ graphics_GpuReset Failure
      • Was just a one-off due to the DUT rebooting.
    • 550826: amd64-generic ASAN failed on buffet_Registration / buffet_BasicDBusAPI
    • One hiccup with the CQ master PublishUprevChanges stage.
      • Bug filed here - 568780: CQ PublishUprevChanges stage uses repo list as it existed before applying changes

    2015-12-08
    Sheriff: drinkcat
    Gardener:
    • 567936: lakitu-incremental failed with GCETest errors (for some reason it did not run for a week...) => Fixed+Verified
    • 550826: amd64-generic ASAN failed on buffet_Registration / buffet_BasicDBusAPI
    • 567989: SyncChrome issue is killing all the canaries, fortunately does not affect paladins (yet?) => Fixed+Verified
    • 568095: daisy_skate Bluetooth issue that I believe is causing some test flakiness (spotted it in Chrome PFQ daisy_skate) => bluetooth update reverted across all kernels
    • 568473: Paygen error (Payload integrity check failed: Unsupported minor version: 3)
    2015-12-08
    Sheriff: dgreid, scollyer
    Gardener: achuith, afakhry
    • Signer timing out
    • 567797: HW and VM tests failures due to adb_wrapper.AdbWrapper.KillServer error on chromeos

    2015-12-07
    Sheriff: jcliang
    • 529905: bobcat failed to setup_board
    • 47849: update-signal-relay build failed on daisy_winter canary
    • 543649: smaug canary failing paygen stage

    2015-12-04
    Gardener: stevenjb
    • 566057 AdbClientSocketTest.TestFlushWithoutSize and AdbClientSocketTest.TestFlushWithData flaky
    • Continuing to investigate 566152: VMTest failures in login_RemoteOwnership, login_LoginSuccess, login_OwnershipApi, login_GuestAndActualSession
    • 566503 VMTest failure: security_NetworkListeners

    2015-12-03
    Sheriff: olofj, wiley, posciak
    Gardener: stevenjb
    • 565228: Multiple canaries failing after 5 failed attempts to start VMs
    • 565349: dev server fails to start in mario-incremental
    • 566152: VMTest failures in login_RemoteOwnership, login_LoginSuccess, login_OwnershipApi, login_GuestAndActualSession

    2015-12-02
    Sheriff: olofj, wiley, posciak
    • 564870: ERROR: <class 'chromite.lib.parallel.ProcessSilentTimeout'>
    • 514802: Provision fails with "start: Job is already running: autoreboot"
    • 564336: buildbot internal failure is not supposed to cause tree throttling

    2015-11-26
    Sheriff: cywang
    Gardener: jennyz
        • 561939: image signer stage is slow
        • 561990: Rikku: missing Manifest of chromeos-factory-board package
        • 563877: CQ failing to create valid manifest
        • 563878: crbug.com/new shortcut broken
        2015-11-25
        Sheriff: sonnyrao, avakulenko, cywang
        Gardener: ihf
          • 561208: {Rambi-a, jecht} group machines not available in test lab for HWTests.
          • 561214: HWTestDumpJson ERROR: No JSON object could be decoded
          • 561244: bvt test got aborted but the real test job completed successfully
          • 554043: UnitTest timeouts in the CQ
          • 560915: disable Bluez flaky unit test
          2015-11-24
          Sheriff: sonnyrao, avakulenko, yoshiki
          Gardener: ihf
          • 556785: builders fail due to timeouts: build/unit test stages take over 1 hour and the process is killed.
          • Test failure in lakitu_mobbuild canary, reverted CL: 239368
          • 561036 paygen timing out on release builders

            2015-11-20
            Sheriff: ejcaruso, briannorris, wnhuang
            Gardener: afakhry
            • 554222: provision failure on falco and daisy_skate PFQs. AutoservSSHTimeout.

              2015-11-18 and 2015-11-19
              Sheriff: wnhuang, davidriley
              Gardener:
              • 558366: storm group canary build_packages failed at wireless-regdb
              • 47849: update-signal-relay build failed on daisy_winter canary
              • 207003: peach_pit build_packages failed at exynos-pre-boot
              • 557578: veyron_minnie-cheets fails at chromeos-bsp-minnie-private
              • 452759: unit test timeouts on auron, rambi-a, glados
              • 516795: builds failing for exceeding the 8-hour timeout (auron, rambi, veyron, slippy, ivybridge)
              • 558457: ChromePFQ is all red.
              • 557449: cros_trunk is red.
              2015-11-17
              Sheriff: waihong, bfreed, dhendrix, jchuang
              Gardener:
              • 207003: butterfly & leon build_packages failed in chromeos-touch-firmware-samus
              • 557245 (was 549044): Canary failure: The Paygen stage failed: Image signing timed out
              • 557214: build310-m2 failing to repo sync, causes CommitQueueCompletion to fail: tricky-paladin did not start
              • 557106 and 557107: Samus canary failures (HW issues)
              • 557238: Veyron_minnie recovery image signing issue ("veyron_minnie has broken appid setting")
              • 557314: Tree closer: Pre-CQ Sync stage fails to pick up CLs
                • I do not know how to find pre-cq problems in general, but these provide clues:
                  • https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/pre-cq
                  • https://uberchromegw.corp.google.com/i/chromeos/builders/Pre-CQ%20Launcher
              • 557364: Need to recover Rialto BVT machines to get TPM into a good state.
              • 207003: peach_pit build_packages failed in chromeos-touch-firmware-pit.
              • 552648, 536689, 535928: Multiple network_VPNConnect.l2tpipsec_xauth failures

              2015-11-16
              Sheriff: kcwu, waihong, bfreed
              Gardener:
              • 556529: Samus build_packages failed in chromeos-touch-firmware-samus
              • 25691600: Network/Hardware Issue with chromeos4-devserver2, possible cause of 540587: provision Failure (Failed to perform stateful update)
              • 556671: veyron canary: timeout_util_unittest failed
              2015-11-13
              Sheriff: kcwu
              Gardener:
              • 540587: provision Failure (Failed to perform stateful update)

              2015-11-10 and 2015-11-11
              Sheriff: jrbarnette, dianders, jchuang
              Gardener: 
              • 551279 x86-zgb paladin timeout in p2p, modem-manager-next unittest (3 fails in a row again)
              • 553424: login problems, including "Malformatted response" in login_OwnershipTaken and "Timed out going through login screen" in others.
              • 554043: unittest timeouts

              2015-11-06 and 2015-11-09 
              Sheriff: waihong, rspangler, hychao, dhendrix
              Gardener: dzhioev
              • 552452 glados group canary: Failed to create HWID v3 bundle
              • 543958 veyron-b-release-group: The priority of the inactive kernel partition is less than that of the active kernel partition.
              • Attempt to resolve some spammy, mass-autofiled bugs (there seem to be a lot that have gone unnoticed for several weeks):
                • 553442: Remote power management failing for many builders
                • 553579: video_VideoDecodeAccelerator failure seen on many builders
                • 553424: login_OwnershipTaken failing on multiple builders
                • 553575: p11_replay/chaps causing network_VPNConnect.l2tpipsec_cert test to fail
                • 553548: video_VEAPerf fails on many BVT machines
                • 549910: touch_TouchscreenScroll failures on Samus and Sumo
                • 553226: cyan, celes, veyron_rialto missing from KernelVersionByBoard "expected" file in autotest
                • For next sheriff rotation: If you get bored, please look at other autofiled issues and attempt to triage and find owners for them: https://code.google.com/p/chromium/issues/list?q=label:autofiled

              2015-11-04 and 2015-11-05
              Sheriff: dlaurie, cychiang
              Gardener: tdanderson
              • 551451 Failing BrokerFilePermission.* sandbox death tests are preventing a PFQ uprev
              • 547057 Paygen timeout
              • 547434 Paygen autotest client terminated unexpectedly
              • 548037 Paygen command execution failure
              • 551586 Paygen failed to create cache file
              • 545065 login_GuestAndActualSession_SERVER_JOB failure
              • 500094 Builder load in unittest causes some unittests to timeout
              • 542558 Mario HWtest pool health
              • 550768 bluez timeouts causing builders to fail randomly (NOT FIXED YET)
              2015-11-03
              Sheriff: bleung, deymo, cychiang
              • 550768 strago-paladin and tricky-paladin timeout while building bluez-5-r40
              • 550826 amd64-generic ASAN failed on buffet_Registration / buffet_BasicDBusAPI
              • 550840 [samus] bvt-inline test failed to remove /var/tmp/messages.autotest_start
              • 549472 [bvt-inline] security_SandboxStatus Failure on lumpy-chrome-pfq/R48-7595.0.0-rc1
              • 548535 [bvt-cq] video_ChromeRTCHWDecodeUsed Failure on tricky-chrome-pfq/R48-7589.0.0-rc1
              • 549044 The Paygen stage failed: Image signing timed out. Failure on samus-canary/7608.0.0
              • 546457 veyron_mighty internal server error on HWTest Failure on veyron_group_canary/R48-7608.0.0
              • 542558: pool: bvt, board: x86-mario in a critical state
              • 544654 [paygen_au_dev] autoupdate_EndToEndTest.paygen_au_dev_full Failure on candy-release/R48-7608.0.0
              • 551279 x86-zgb paladin timeout in unittest
              2015-11-02
              Sheriff: bleung, deymo, reveman
              Gardener: 
                • 548755 BranchUtil failure on canary master -> external and internal manifest out of sync
                • 546871 panther: git package seems corrupt while building

                2015-10-29
                Sheriff: chihchung, semenzato, shchen
                Gardener: abodenha
                  • 549044 Paygen image signing timed out
                  • 547541 CQ Failing in PublishUprev repeatedly
                  2015-10-28
                  Gardener: abodenha
                  • 548693 video_ChromeRTCHWDecodeUsed test has been flaking since Oct 9
                  • 548544 Compile failure on 8010 Builder #48.0.2548.0
                  2015-10-27 and 2015-10-28
                  Sheriff: abrestic, dbasehore
                  • 548257 paygen failures which moved to ASAN only failures
                  • 547057 paygen timeouts
                  • 548723 autotest timeouts on ivybridge and slippy devices
                  • 548755 BranchUtil failure on canary master
                  • 548804 manifest broken by duplicate remote

                  2015-10-27
                  Gardener: stevenjb
                  • 548358: PDFExtensionTest.Load failing on cros_trunk

                  2015-10-23
                  Sheriff: johnylin, alecaberg, shawnn
                  Gardener: jonross
                  • 431486: Multiple PFQ failure: shill uprev build failure
                  • 546865: Chrome PFQ master failed while running annotated script
                  • 518591: auron-release-group: HWTest flaky for test provision
                  • 546871: panther: git package seems corrupt while building
                  • 546921: lakitu-incremental: build error for chaps token_manager
                  • 546630: PFQ failure: Peppy Paladin provisioning failures.
                  • 415617: PFQ failure: moblab_RunSuite test failing in lab when trying to determine test platform (guado_moblab paladin)
                  • 545779 PFQ not upreving builds. I believe it is related to the linked bug.
                  • 547055 Canary: Jecht group failed archive step
                  • 547057 Canary: Rambi Paygen failure, timeout after tests pass.
                  • 547116 CrOS trunk, Linux ChromiumOS, v8 roll broke a test
                  OLDER ENTRIES MOVED TO THE ARCHIVE so this page doesn't take forever to load.  See Sheriff Log: Chromium OS (ARCHIVE!)