Sheriff Log: Chromium OS

2015-08-31
CrOS gardener: 
Sheriffs: cywang
  • 526629: HWTest [clapper] [bvt-inline] test timeout failure?
    • test seems finished after the test was stopped by 'timeout'?
  • 526641: pineview group: failed to build sys-apps/util-linux-2.25.1-r1
2015-08-24
CrOS gardener: 
Sheriffs: dhendrix
  • 524814: Canaries are falling over on autoupdate_EndToEndTest.paygen_au_canary_test_full
  • Provision failures that seemed to fix themselves.
    2015-08-21
    CrOS gardener: tbrazic
    Sheriffs: ejcaruso, bfreed, hychao
    • 517876: DUTs lost RPC connections
    • 523189: login_OwnershipTaken failures on multiple boards

    2015-08-20
    CrOS gardener:
    Sheriffs: ejcaruso, bfreed, itspeter
    • 522851: [Pri-0] Google cloud storage exception causing archive steps to fail across major buildbots
      • Closing the tree because this masks all the underlying errors.
      • crosreview.com/294781 reverts crosreview.com/286913 which changed how we build and install rsa_id files.
      • Note: We cannot login to bots to confirm the missing file, but the log entry "CommandException: No URLs matched: /b/cbuild/external_master/buildbot_archive/daisy-incremental/R46-7381.0.0-b24819/id_rsa" gives the best clue.
    • 522785: buildpackages [x86-alex] [afdo_use] is flaky in pineview group canary
    • 518591: samus-release: provision failure, infra flake; succeeded in the next build.
    • 503526: ivybridge-freon-release-group: DUT pool is too small
    • 496036: sandybridge-release-group: DUT pool is too small
    • 523170: gizmo canary fails during BuildPackages
    • 523173: nyan group canary timed out during paygen
    • 523174: auron group canary has issues with rpm unit tests
    • 523139: sandybridge-freon group canary, x86-alex_freon paladin not found in master and won't build

    2015-08-19
    CrOS gardener: girard
    Sheriffs: dlaurie, wiley, itspeter
    • 522533: lab problems creating various failures
    • 522540: amd64-generic and x86-generic chromium PFQ failures in security_OpenFDs test
    • 522528: HWTest run_suite failures
    • 522410: ap-daemons fails to build on storm, blocking many pre-cq runs
    2015-08-18
    CrOS gardener: girard
    Sheriffs: dlaurie, wiley, owenlin
    • Spontaneous network failure! Tests won't succeed without network.
    • 522130: LKGM timeouts on Chrome PFQ (fixed)
    • 522139: Paygen timeouts
    • 522141: sandybridge-freon-release-group builder needs to be removed
    • 522147: mario-incremental failing (fixed)
    2015-08-17
    CrOS gardener: girard
    Sheriffs: davidriley, waihong, owenlin
    • 521642: daisy-skate build failed on database error
    2015-08-14
    Sheriffs: davidriley, waihong, kcwu
    • 520931: provision Failure on samus-release, looks like a hardware flake.
    • 521046: VMTest kernel_CryptoAPI failed on lakitu canary
    • 521018: Several canary groups timed out on the step "steps" (no attribute 'PrintBuildbotStepFailure')

    2015-08-13
    Sheriffs: jrbarnette, armansito
    • Multiple canary failures in the AM, especially 520311.
      • Pinned Chrome; waiting for the overnight canaries to prove whether that fixed it.

    2015-08-07
    CrOS gardener: michaelpg
    • 516978: No space left on device -> stateful partition size increased
    • 518015: Bots haven't signed the new CLA; LKGM candidates not uploading -> CLA enforcement rolled back
    2015-08-06
    Sheriffs: cychiang, deymo, adlr
    • 517308: security_test_image failed with wrong fs type to mount recovery_image.bin. --> The image seems good, and security_test_image passes when run locally.
    • 517388: mipsel-o32-generic full failed at ChromeSDK due to --hash-style defaulting to gnu
    • 517348: [paygen_au_dev] autoupdate_EndToEndTest.paygen_au_dev_test_full Failure on peach_pi-release/R46-7335.0.0
    • 517351: [sanity] provision Failure on lumpy-release/R46-7335.0.0 stage_artifacts timed out
    • 517460: build_package failed at chromeos-chrome
    Chrome Gardener: stevenjb
    • 517238: RESOLVED: ExtensionTestMessageListener::WaitUntilSatisfied() causing flake across a large number of tests
      This turned out to actually be 515914 - browser_tests step fails even though all tests pass
    • 516978: P0 STILL IN PROGRESS: piglit file collision causes build_image failure. --> No space left on device
      This is causing PFQ failures
    • 517593: Frequent browser_tests time out with (TIMED OUT) in log
      This isn't causing any detected failures because the tests get retried, but it does slow down the tests a little and generates confusion when other issues (e.g. 515914) show up.

    2015-08-05
    Sheriffs: cychiang, dianders, denniskempin
    • 516978: piglit file collision causes build_image failure. --> No space left on device
    • 493752: Lab DHCP failures lead to "Host did not return from reboot". It causes PFQ build failure.
    • 517027: jecht and veyron_pinky not available in the lab.
    • 516795: veyron group canary is too close to its timeout. ---> seeing other boards fail for the same reason.
    • 514700: Samus failures due to stateful partition of Samus DUTs being bad.  ext4_lookup errors, "cannot remove" errors, etc.
    • 515880: No more samus devices in lab that are good (probably because of above bug).
    2015-08-04
    Sheriffs: djkurtz, dianders, denniskempin
    • 488291: Looks like vm_disk: failed_SimpleTestVerify is flaky and throttled the tree on amd64-generic ASAN.
    • chromeos-hwid broke in paladin several times.  Probably <https://chromium-review.googlesource.com/#/c/283583/>.
    • 516750: Samus failures due to stateful partition of Samus DUTs being bad.  ext4_lookup errors, "cannot remove" errors, etc.
    • FYI: binutils roll happening
    • 515528: AU tests failing.  I don't think they always ping this bug, though...
    • 516793: cbuildbot timeouts are hard for sheriffs to decipher.
    • 516795: veyron group canary is too close to its timeout.
    • 491290: SSH issue

    2015-08-03
    Sheriffs: djkurtz, moch, vbendeb
    • 516283: chromite: Unittest failure on veyron_pinky in mobmonitor/checkfile/manager_unittest: URLError: <urlopen error [Errno 111] Connection refused>
    • 516286: chromite: Unittest failure on "auron group canary" in sync_stages_unittest timeout?
    • 516295: zako BuildPackages fails in media-libs/fontconfig
    • 515528: [paygen_au_canary] autoupdate_EndToEndTest.paygen_au_canary_test_delta Failure on falco_li-release/R46-7315.0.0
      • The autoupdate_EndToEndTest.* tests seem to be failing across many boards over the past ~3 days:
      • https://code.google.com/p/chromium/issues/list?can=2&q=autoupdate_EndToEndTest+modified-after%3Atoday-4&sort=-modified&colspec=ID+Pri+M+Stars+ReleaseBlock+Cr+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=tiles
    • 516371: Signer test failing due to unexpected kernel parameter (CL reverted)
    • 516430: x86-alex paladin: bvt-cq timed out
    • 516457: HWTestStage fails with run_suite.py
    • 488291: amd64-generic ASAN failing on desktopui_ScreenLocker
    2015-07-31
    Sheriffs: vbendeb, moch, pprabhu
    • 515560: CrOS Tree Closure: ChromeOS killing DUTs in the lab
    • 515905: SyncChrome fails on all Chrome PFQ (non-issue: caused by Chrome pinning because of the above issue; PFQ expected to fail)
    • 515937: Temporary workarounds in lab to get DUTs off of a bad build
        2015-07-29 and 2015-07-30
        CrOS Gardener: alemate (29), afakhry (30-31)
        Sheriffs: pprabhu, grundler, zuethen, 
          • 501178: 13 groups timeout... HWTest of many boards are not running. Closed the tree.
          • 515479: kvm is missing on a lot of precq bots breaking vmtest (Fixed by davidjames)
          • 515201: Chrome crashes (AddInputDevice) affecting other services (Fixed in Chrome; see also 515154 43375)
          • 514700: Samus: file system corrupted after Kingston FW update (reverted Kingston FW update)
          • 434755: daisydog constantly restarting (Fix in CQ - x86_64 v3.8 kernel missing /dev/watchdog since last fall)
          • 515576: shill crashes took a bunch of DUTs "out of service": alex/zgb/peppy/lumpy/...  (chumped Fix in shill)
          • HW TestLab Network failure: outbound traffic was losing 60% or more packets (subsided; still monitoring)
          • 515905: SyncChrome fails on all Chrome PFQ (Chrome pinned to old version?)
          • 515302: SitePerProcessBrowserTest.DiscoverNamedFrameFromAncestorOfOpener is failing on the official cros-trunk.
          • 515567: Compile failures on cros-trunk due to an #include of the generated header "ui/gfx/vector_icons2.h".
          • 516052: More Flaky tests in AutofillInteractiveTest on the official cros-trunk.
          2015-07-27 and 2015-07-28
          CrOS Gardener: bruthig, alemate
          Sheriffs: avakulenko, abrestic, vapier/henrysu
          • 514257: Lost a bunch of DUTs due to AFE going down last night
          • 514364: coreboot branch update caused some builders to go red.
          • 512996 BrowserEncodingTest.TestEncodingAutoDetect timing out on cros_trunk
          • 513593: [bvt-inline] security_SandboxStatus Failure on lumpy-chrome-pfq/R46-7296.0.0-rc1
          • 514401 Multiple builds are failing SyncChrome with error: "reference is not a tree: 8a0429e414914781450ca007f20b0e511b3acff7"
          • 514499 provision_AutoUpdate failures on Rambi
          • 460860 more login_Cryptohome failures
          • 514504 desktopui_ScreenLocker failure on Auron (flake?)
          2015-07-22
          CrOS Gardener: bruthig
          Sheriffs:
          • 513593: [bvt-inline] security_SandboxStatus Failure on lumpy-chrome-pfq/R46-7296.0.0-rc1
          2015-07-21
          CrOS Gardener: jonross
          Sheriffs: quiche, shchen
          • 512417 New FecSendPolicy test is crashing on CrOS Trunk
          • 512427 Chrome PFQ failing on autotest-tests-ownershipapi
          • 512435 Canary board timeouts, missing boards for HWTest
          • 465862 Flake in desktopui_ScreenLocker, with ASAN
          • 509274 Canary timeout on candy
          • 512577 CrOS Trunk failing on ClickModifierTest
          • 513618 mipsel-o32-generic-full failed building Chrome

          2015-07-20
          CrOS Gardener: jonross
          Sheriffs: bleung, hungte, shawnn
          • 512010 Chrome OS Canary failure in BranchUtilTest failures: no such option --nobuildbottags
          • 512024 Master release failure
          • 511680 Samus HWTest failure
          • 491361 Strago VMTest failure
          • 508637 Jecht-family failure
          • 511542 winky DUT shortage
          • 512174 CrOS Commit Queue HWTest failures: /var/log/storage_info.txt does not exist

          2015-07-17
          Sheriffs: bleung, shawnn
          • Known issues causing multiple canary failures:
            • 484726: autoupdate_Rollback failing, possibly due to DNS issues in lab
            • 508637: rikku-family stuck at login screen
            • 510909: Paygen kernel hash issue. Should be resolved.
            • 505744: AutoservSshTimeout
          • 511317: login_OwnershipTaken timing out (silently?) 
          • 511502: libstrongswan missing symbols
          2015-07-15
          Sheriff: dbasehore, alecaberg
          • 510759: Paygen lock not acquired
          • 481092: builder must call SetVersionInfo first
          • 510909: paygen failure, kernel hash doesn't match
          • 509837: amd64 ASAN flake

          2015-07-10
          Sheriff: zqiu, marcheu, wuchengli
          • 510074 amd64-generic-llvm builder unittest failures
          • 509779 Flaky HWTest failures
          2015-07-10
          Sheriff: furquan, charliemooney
          • 465862 Flaky screen lock test
          • 508637 Rikku: Login screen problem in HWTest
          • 491290 Flaky SSH Failure

          2015-07-06
          Sheriff: puthik, rspangler, seanpaul
          • 507279 Lumpy/Falco pfq hwtest failing on timeout 
          • 501966 pool: bvt, board: lulu in a critical state
          • 507372 Drone refresh/execute took over 50s
          • 470701 Flaky BVT security_Firewall failure, "Mismatched firewall rules"

          2015-07-01
          Sheriff: posciak
          • 505918 CheckFileModificationTest timeouts
          • 506030 Payload generation failures
          • 506037 autotest-chrome build failures on missing dependencies

          2015-06-26
          Sheriff: cernekee, reinauer, kpschoedel
          • 505108: wolf-paladin and wolf-canary are failing, lab is closed.  Somebody will fix this Monday.
          • 505051: "Mismatched firewall rules" test failure on x86-generic build
          • 504947: HWTest failures on ivybridge, daisy
          • 504861: ASAN buildbot failures
          • 504860: HWTest did not complete due to infrastructure issues
          2015-06-25
          Sheriff: cernekee, reinauer, wiley
          • 504602: builders cannot get veyron_pinky chrome prebuilts
          • 504476: pre-cq builders are not running
          • 504400: Crash in SpokenFeedbackEventRewriterDelegate
          • 472895: |AFDO generate| should only run if Chrome has changed
          2015-06-24
          Sheriff: dtor, wiley, gedis
          • 465862: amd64-generic ASAN failure (desktopui_ScreenLocker fails - hitting regularly)
          • 430836: autoupdate_Rollback failure
          • 491598: platform_Powerwash flake
          • 412795: Refresh Packages is down
          • 488580: image_to_vm failing
          • 472895: Canaries failed while syncing Chrome
          2015-06-23
          Sheriff: bowgotsai
          • 488291: flaky on login_LoginSuccess test
          • 465862: amd64-generic ASAN failure
          • 503188: HWTest failure
          • 488580: parrot canary: image_to_vm failing

          2015-06-22
          Sheriff: bowgotsai
          • 488291: flaky on login_LoginSuccess test
          • 465862: desktopui_ScreenLocker keeps failing on x86-generic ASAN VMTest
          • 502897: AU test failed on "No module named netifaces"
          • 502909: Multiple HWTest failures
          • 502910: CommitQueueSync failure
          • 503001: cbuildbot command failed on multiple PFQ builders

          2015-06-16/17
          Sheriff: josephsih
          • 500640: Multiple HWTest failures
          • 500423: Payload integrity check failed: install_operations[490](MOVE): MOVE operation cannot have extent with start block 0
          • 481092: ManifestVersionedSync: RuntimeError: builder must call SetVersionInfo first
          • 483661: x86-generic full: vmtest failed in SimpleTestUpdateAndVerify
          • 501178: CanaryCompletion: 20 groups timed out
          • 444876: Clear and Clone chromite: remote: User Is Over Quota

          2015-06-15
          Sheriff: filbranden
          • 500640: Multiple HWTest failures
            • Hardware is borked and we had a deputy outage, so we had to work around it by disabling the hw_tests that were failing.
            • CL 277672 disabling the tests.
            • ChromeOS Infra team to revert that CL once the hardware is working again.
          • 500394: build_package fails on several PFQ buildbots since 6/12
          • 500423: Payload integrity check failed: install_operations[490](MOVE): MOVE operation cannot have extent with start block 0
            • Still open. If it keeps the tree closed, we need to prioritize a revert (is that even possible?) or push a rushed fix.
          2015-06-15
          Sheriff: itspeter
          • 500394: build_package fails on several PFQ buildbots since 6/12
          • 500423: Payload integrity check failed: install_operations[490](MOVE): MOVE operation cannot have extent with start block 0
          2015-06-12
          Sheriff: itspeter
          • amd64-generic ASAN VMTest failure
            • 465862: Build hasn't succeeded since 2015-06-10; desktopui_ScreenLocker keeps failing
          • x86-generic ASAN VMTest failure
            • 488291: Seems to be flaky on login_LoginSuccess test
            2015-06-05
            Sheriff: tyanh
            Gardener: girard
            • Infra failures on Canary bots
              • 491290: autoupdate_Rollback Failure on rambi-release/R45-7142.0.0
              • 497035: autoupdate_EndToEndTest.paygen_au_canary_test_full Failure on link-release/R45-7142.0.0
              • 497059: HWTest suite prep aborted on Stumpy_moblab Canary
            • 497092: Chrome PFQ failing at BuildPackages across all builders - reverted patch - next build should be okay
              2015-06-04
              Sheriff: tyanh
              Gardener: girard
                • Infra failures on Canary bots
                  • 496552: [au] autoupdate_Rollback; Unhandled AutoservRunError: command execution error
                  • 496526: [sanity] provision; Unhandled HTTPError: HTTP Error 500: Internal Server Error
                  • 476324: HWTest provision failure
                  • 460925: HWTest/sanity provision; Unhandled TimeoutException: Call is timed out
                  • Host did not return from reboot
                • 496523 mipsel fails continuously on PFQ

                2015-06-03
                Gardener: jonross
                • 460925 Chrome PFQ seeing Infra failures in HWTest, assigned to Infra
                  • Affecting: daisy_skate, falco, lumpy, peach_pit, 
                • 465162 469495 Infra failures blocking the PFQ
                • A series of Infra failures for Chrome OS Canary bots were auto-filed. Not enough bots in the pool.
                • CL 274236 removed two bots from the Waterfall (x86-generic-tot-chrome-pfq-informational, amd64-generic-tot-chrome-pfq-informational)
                • 496273 mipsel-o32-generic has a gyp error preventing a build.
                • 496293 Chrome OS PFQ is trying 45.0.2421.0, which does not contain the fix to 494912 which would unblock the PFQ
                • 496325 Flaky OAuth test on Linux Chromium OS
                2015-06-02
                Sheriff: robertshield, aaboagye, ssl
                Gardener: jonross
                • Chrome PFQ failures still holding up upreving.
                  • 494041, 494912: daisy_skate && lumpy PFQ failing on video_VideoSanity.
                    • The test was landed in this change; assigned to the developer who landed the tests.
                    • Seems like an actual regression in video playback for ARM.
                    • Regression is from Chrome side between 45.0.2416.0 and 45.0.2417.0. Reverting the video changes in this diff does not seem to fix the issue. Trying to revert the WebRTC roll, but it's failing locally. I've reached out to the WebRTC sheriff.
                    • Suspects 80f289fe303323361d07c5b58b23f8499903a154 and c794eda78e9ba3c46b550b433e9fe5a248d40104, as the bug is not present with hardware acceleration off. Apparently my local build has an issue, as I still have the failure with hardware acceleration off.
                  • 413961: falco failed on stateful_update. It looks like the download got interrupted.
                • Canary Failures
                  • I believe these were network related... some are reporting high flakiness
                  • 476368, 495463, and 428345 have been filed against the Canary failures.
                  • 493219: nyan group failed with Connection reset by peer.

                2015-05-30/31, 2015-06-01
                Sheriff: ssl, aaboagye
                • 482284: ivybridge-freon-release-group: The BuildImage [stout_freon] [afdo_use] failed - cannot open ‘/dev/loop0p1’ for reading: Permission denied
                • p/40797: oak fails to cross-compile img-ddk properly
                • 477928: (quawks, x86-zgb) autoupdate-Rollback - ssh: Could not resolve hostname chromeos2-row2-rack7-host6: Name or service not known (network went away briefly?? DNS issue?)
                • 493533: asan bots failed quipper unittests. x86-generic ASAN has been broken since 5/28.
                  • Mainly due to the quipper unittest failing, but also due to login_LoginSuccess failure. See 488291.
                  • Waiting on this CL.
                • Chrome PFQ failures
                  • 494041, 494912: daisy_skate && lumpy PFQ failing on video_VideoSanity.
                  • 494909: lumpy PFQ failing on desktopui_FlashSanityCheck.
                • 495281: One off wolf-tot-paladin failed during HWTest [sanity] - provision: ABORT: reboot command failed
                • 493718: Chrome PFQ failing to uprev Chrome commit on tricky. tricky is experimental, so these reports are a mistake. Disregard these emails.
                • 493219: Some canaries failed due to FAILED RPC CALL. (beltino-a/b group, sandybridge-freon, ivybridge-freon) Manifested as timed out and Connection reset by peer.
                • CL:274726: Fix lakitu build_image error when modifying kernel command line.

                OLDER ENTRIES MOVED TO THE ARCHIVE so this page doesn't take forever to load.  See Sheriff Log: Chromium OS (ARCHIVE!)