Add your notes to the top, blog style.
2013-05-21 Tue & 2013-05-22 Wed semenzato, dgarrett, cywang - Ongoing -- (hwtest) Flaky power_Resume test on canary builders: 242788, 220014
- Ongoing -- (buildbot) autotest-telemetry build failed on PFQ, ASAN builders: 242770
- Ongoing -- (autest) Flaky autoupdate_EndToEndTest: 235608
- Ongoing -- (vmtest) Hung then killed on Falco, Peppy canaries: 242470
2013-05-13 Mon & 2013-05-14 Tues charliemooney, sheu - Ongoing -- Lots of problems with the AU rebooting canary builders: 235608
- Fixed -- The PFQs are mad about their dependencies when building expected_deps: 240601
- Fixed -- Some PFQs were crashing due to a typo: 239754
- daisy_spring canary closed tree with media-libs/secomx build failure: crbug.com/239474. Possibly due to new clang syntax checking for cros_workon-able packages.
grundler,olofj,milleral - stout canary closed tree with AUtest failure: crbug.com/234725
- mario incremental failed: reopened tree since it feels like flake
2013-05-07 Tue grundler,olofj,milleral - A dennisjeffrey CL killed the Commit Queue. Because it moved an autotest from one package to another, it affected successive tests as well. The fix was to add a "!" (remove) dependency to remove/update the origin of the files before installing the new package. Kudos to davidjames for clobbering everything and explaining how to fix it.
- dgreid's CL 49812 and CL 49921 enabled functionality that is broken in the two-day-old Chrome version that ChromeOS is currently using. ToT Chrome is fixed, but ChromeOS didn't pick up ToT last night due to other Chrome nightly build failures. dgreid will resubmit once ChromeOS has a newer Chrome.
- CL adding apiclient to test image broke on canaries with a dev_install failure on VMTest. See crbug.com/238653, and CLs 49815 and 50308.
2013-05-06 Mon josephsih, piman - mario incremental: BuildPackages failed due to a platform2 ebuild (https://gerrit.chromium.org/gerrit/#/c/37366/). Reverted the patch, and the builder cycled green.
- link canary: autest [au] failed report. crbug.com/237122
- network_LTEActivate flakiness crbug.com/238404
2013-05-02 Thu josephsih - link canary, parrot canary failed at vmtest: "Unhandled JSONInterfaceError : Unable to get browser_pid over automation channel on first attempt."
- Root cause: "crossystem hwid" failed. cat: /sys/devices/platform/chromeos_acpi/HWID: No such file or directory.
- Filed a bug crbug.com/237719 which was merged to crbug.com/223728
- x86-alex canary: vmtest failed "Unhandled AutomationCommandTimeout: Chrome automation timed out after 45 seconds for {"skip_image_selection": true, "command": "SkipToLogin"}"
2013-05-02 Thu posciak, garnold, seanpaul - x86-alex canary failed hwtest step with "Unhandled PackageInstallError: Installation of pyauto_dep(type:dep) failed"
- Couldn't root cause it, so filed a bug at crbug.com/237508 and reopened
- security_HciconfigDefaultSettings autotest failures due to https://code.google.com/p/chrome-os-partner/issues/detail?id=15059
- Session manager did not restart after logout error on CryptohomeIncognitoUnmounted, filed crbug.com/237601
- Filed crbug.com/237690 for address sanitizer segfault on amd64-generic during vmtest
2013-05-01 Wed posciak, garnold, seanpaul - x86-mario canary failed au step with “FAIL: Unhandled timeout: timed out”
- stumpy, stout & daisy also failed on autest step
- suspect there was an AU outage/problem last night which caused this
- Filed http://crbug.com/237122 to track
- 03:04 lumpy nightly chrome pfq failed in VMTest
- this crash (https://storage.cloud.google.com/chromeos-image-archive/lumpy-chrome-pfq/R28-4071.0.0-rc2/chrome.20130501.035940.437.dmp.txt) is being tracked in http://crbug.com/233241
- 05:34 stout32 hwtest failed with “ERROR: All hosts with HostSpec ['board:stout32', 'pool:bvt'] are dead!”
- All stout32 hosts in cautotest are marked “Repair Failed”
- Filed http://crbug.com/237127
- 05:34 parrot canary failed in unittest
- seanpaul not sure what the problem is, so filed http://crbug.com/237143
- I think it's caused by https://gerrit.chromium.org/gerrit/49643, reverted with https://gerrit.chromium.org/gerrit/#/c/49721/ and reopened
2013-04-29 Mon waihong - Parrot canary failure reported 2 days ago. The 2 most recent parrot builds went green and other builds also look good. Reopened the tree.
- Daisy canary failed, autoupdate_EndToEndTest could not verify that update was successful, crbug.com/23626
2013-04-26 Fri rcui, taysom, spang - Link failed again in power_Resume
- Stout BVT: power_Resume: Sanity check failed: did not try to suspend - crbug.com/235847
- Lumpy canary failed on repeat of crbug.com/231095
- Lumpy paladin failure in desktopui_ScreenLocker test crbug.com/235949
- Parrot flaky test login_CryptohomeUnmounted crbug.com/223728
- Lumpy chrome crash crbug.com/231095 - this may be a new problem but we have only seen it on lumpy
- Asan builder failing BuildPackages on the chromium.memory waterfall - crbug.com/235988
2013-04-25 Thur rcui, taysom, spang
2013-04-18 Thur mtennant, jrbarnette, dshi (hwlab), mukai (Chrome on ChromeOS) - devinstall_test failure on all canaries - crbug.com/233217
- chrome crashes in CrosLanguageOptionsHandler::GetLanguageListInternal on a few builders - crbug.com/233241
2013-04-17 Wed mtennant, jrbarnette, dshi (hwlab), mukai (Chrome on ChromeOS)
2013-04-15 Mon vapier, jwerner - crosbug.com/p/17615 (power_Resume failure "Could not find start_resume_time entry" due to SSD hardware flake)
- crbug.com/22168 (unexpected reboot during login_LoginSuccess... can probably happen during all UITests)
- VMTest testUpdateKeepStateful error (cannot connect to KVM instance)... suspected flake
- crbug.com/232085: python 2.7 upgrade breaking hwtests
- coreboot repo shuffling; any coreboot related errors -> reinauer
2013-04-12 Fri fjhenigman, yusukes, dbasehore, rbyers (Chrome on ChromeOS) - crbug.com/230529
- A couple of cases of lab flake
2013-04-11 Thu fjhenigman, yusukes, dbasehore sjg now, rbyers (Chrome on ChromeOS)
2013-04-10 Wed katierh, clchiou, haruki, gedis(shadow) - Unhandled AutomationCommandTimeout for {"skip_image_selection": true, "command": "SkipToLogin"} - already noted at crbug.com/223728
2013-04-09 Tue katierh, clchiou, gedis(shadow) - Daisy power_resume failure - already noted at crbug.com/189108
- ConnectionHealthChecker failures across the board - reverted https://gerrit.chromium.org/gerrit/#/c/47248 - crbug.com/229752
- butterfly autoupdate_EndToEndTest.npo_test_delta flake - bug filed crbug.com/229749
2013-04-08 Mon petkov, quiche, pstew
2013-04-05 Fri petkov, quiche, pstew
2013-04-03 Wed gabeblack, dgreid, sheckylin - 01:19 autoupdate_EndToEndTest.parrot_nmo_test_delta flakiness.
- 8:30 everything broken, EndToEndTest, Autoupdate, desktop_VideoSanity, all failing on different boards.
- 10:45 try to re-open after disabling VideoSanity, AUTest and power_Resume flakes.
2013-04-02 Tue rminnich, sonnyrao: west coast
2013-04-01 Mon rminnich, sonnyrao: west coast - 10am Link Canary Failed due to Archive step Time Out
- 10am Daisy Canary has been red all weekend -- found out about crbug.com/224871
- 1pm stout canary failed with Archive time out - opened crbug.com/225505
- 2pm x86-zgb canary failed with Archive time out - crbug.com/225505
- 8pm build packages started failing due to a gtest uprev and an associated python bug - https://gerrit.chromium.org/gerrit/#/c/46420/
- Chrome ebuild also failed to uprev due to above issue
2013-03-29 Fri bfreed, vbendeb: west coast
2013-03-28 Thu bfreed, vbendeb: west coast - 17:55pm - again connectivity issue, on x86 generic ASAN
- 17:37pm tree reopened
- 17:26 pm - another connectivity failure, davidjames took "amd64 generic ASAN" builder down as it seems more prone to experiencing this problem
- 16:29 pm Tree reopened, crbug.com/224811 filed
- 16:16pm "Could not resolve host: commondatastorage.googleapis.com"
- 16:15pm - tree reopened
- 15:39pm - "Unable to look up nv-tegra.nvidia.com (port 9418) (Name or service not known)" crbug.com/224819 filed to deal with external dependency
- 2:45pm: "no space left on device" on incremental builder, fixed by davidjames.
- 2pm: Same pool:bvt issue as below, this time with x86-zgb.
- 1pm: As with now-closed crbug.com/220032, "All hosts with HostSpec ['board:parrot', 'pool:bvt'] are dead". Suspect lab issue.
- Can view the list by going to http://cautotest/afe/#tab_id=hosts, then selecting Platform "parrot", then selecting Label "pool:bvt".
- 3am: vmtest failure closed tree on "amd64 generic ASAN". Subsequent builds worked, so maybe denniskempin fixed it.
2013-03-27 Wed wdg,dparker: west coast - 4pm: crbug.com/223728 Closed tree on "butterfly canary" Command "crossystem hwid" failed
- 2pm: crbug.com/223728 Closed tree on "mario incremental" Command "crossystem hwid" failed
- 1pm: Shill build failure closed tree on "x86 generic ASAN" and "amd64 generic ASAN". Reverted shill change https://gerrit.chromium.org/gerrit/#/c/46667/
2013-03-26 Tue wdg,dparker: west coast - 3pm: crbug.com/224403 Closed tree on "x86-zgb canary" autotest_rpc_client.py -- writing off as test flake but starting to think we blame brand new chrome version...
- 3pm: crbug.com/224077 Closed tree on "daisy canary" Device rebooted during power_Resume.
- 3pm: crbug.com/161406 Closed tree on "x86-mario canary" Unhandled AutomationCommandTimeout
- 2pm: crbug.com/223956 Closed tree on "x86 generic full". login_CryptohomeUnmounted failed but may be an underlying test framework issue.
- 8am: crbug.com/223956 (not a tree-closer, but...) Build 1187, Parrot Canary: Failed cbuildbot failed vmtest failed report
2013-03-25 Mon adlr,dhendrix: west coast - crbug.com/223661 (python free()'ing invalid pointers) strikes multiple times.
2013-03-22 Fri adlr,dhendrix: west coast - 1pm: crbug.com/217288 timeout during archive
2013-03-21 Thu quiche,wiley: west coast - 8am: XXX chromium.chromiumos VMTest failure
- 2am: crbug.com/222603 update engine failure on parrot-canary
- 1am: crbug.com/222021 desktopui_VideoDecodeAcceleration failure on x86-zgb
- 12am: crbug.com/222021 desktopui_VideoDecodeAcceleration failure on x86-mario
2013-03-20 Wed quiche,wiley: west coast - 11pm: crbug.com/222021 desktopui_VideoDecodeAcceleration failure on x86-alex
- 7pm: crbug.com/222021 desktopui_VideoDecodeAcceleration failure on x86-alex, x86-mario, x86-zgb
- 7pm: crbug.com/222660 AUTest failure on x86-mario
- 5pm: crbug.com/222021 desktopui_VideoDecodeAcceleration failures on x86-alex, x86-mario, x86-zgb
- 5pm: buildbot failures on amd64-generic-incremental, due to disk filling up
- 1pm: crbug.com/222021 desktopui_VideoDecodeAcceleration failures on x86-alex, x86-mario, x86-zgb
- 8am: crbug.com/222041 build_RootFilesystemSize failure on link
- 8am: crbug.com/222021 desktopui_VideoDecodeAcceleration failures on x86-mario, x86-alex
- 8am: daisy incremental failure: kernel gerrit mirror out-of-sync
- 4am: chrome PFQ failure on amd64-generic: kernel gerrit mirror out-of-sync
- 1am: crbug.com/222041 build_RootFilesystemSize failures on link, stout
- 1am: crbug.com/222021 desktopui_VideoDecodeAcceleration failures on x86-alex, x86-mario, daisy, x86-zgb, stumpy
2013-03-19 Tue tbroch,thieule: west coast - 1pm: crbug.com/221258 kernel warning in power_Resume on daisy
- 8am: crbug.com/222041 build_RootFilesystemSize fails as rootfs <100MB across most x86 systems
- 8am: crbug.com/187993 experimental_desktopui_VideoSanity
- 8am: network problem leading to vmtest fail
2013-03-18 Mon tbroch,thieule: west coast - 3pm: Transient network problem while emerging chrome
- 9am: crbug.com/217288 UploadArtifact task timeout (1800secs)
- 8am: crbug.com/215358 intermittent (hopefully) 'Exception: Missing uploads.'
- 8am: crbug.com/187993 experimental_desktopui_VideoSanity
2013-03-14 Thursday dlaurie, sbasi - 8am: ARM build broken overnight due to build flags change, reverted here: https://gerrit.chromium.org/gerrit/#/c/45430/
- 8am: GDB issues causing problems for Chrome PFQ, this "fixed itself" on retry
- 1pm: Commit queue stuck, mario-paladin waiting for alex-paladin
2013-03-07 2013-03-08 Tues-Wed ferringb, charliemooney
2013-03-08, Fri sjg, sabercrombie
2013-03-08, Fri rspangler, sabercrombie - Flake on amd64 generic ASAN uploading results to google storage
- Canaries failing due to https://gerrit.chromium.org/gerrit/#/c/44890/; can't download libva-1.1.0.tar.bz2. Uploaded what we hope is the right file. It turns out that was not the right thing to do. The problem stemmed from two versions of libva carrying the 1.1.0 designation, which led to an old cached version messing up the download process on the canary buildbots. Mike Frysinger removed these old files.
- Stout canary failed with "NoHostsException: All hosts with HostSpec ['board:stout', 'pool:bvt'] are dead!" - http://crosbug.com/39746. johndhong and jrbarnette investigated; lots of systems are in Repair Failed state, probably due to a DHCP problem this morning. They kicked off a verify on all hosts, and the hosts started coming back on their own.
- Paladins failed with "ERROR: Project name mismatch for /mnt/host/source/src/platform/depthcharge (found chromiumos/platform/depthcharge, expected chromeos/platform/depthcharge)". Probably caused by rev 1 of https://gerrit-int.chromium.org/#/c/33529/.
- mario paladin was stuck waiting for stout paladin, but stout was idle. Aborted mario paladin build; all paladins seem to be building normally now.
2013-03-04 - 2013-03-05, Mon-Tue dianders, dkrahn - dianders: ASAN failures (use after free). Appears to be intermittent, but a real bug. http://crbug.com/179796
- dianders: ASAN failure "No such file or directory: '/home/.shadow'". Digging into logs showed cryptohome not starting. Digging more showed "cryptohome: symbol lookup error: /usr/lib64/libchaps.so: undefined symbol: __asan_handle_no_return". Liam identified as https://gerrit.chromium.org/gerrit/#/c/44508/. Reverted and chumped. Re-opened http://crosbug.com/32017 to track. Re-opened tree.
- Parrot canary failed with http://crosbug.com/32539.
- dianders: Some strange transitory failures across many builders with "update_scripts Sync buildbot slave files failed ( 9 secs )". Didn't seem serious and went away on its own, but David James tracked it down as http://crbug.com/180099.
- dianders: Failure with SDK builder on vboot_reference (it couldn't find <tss/tcs.h>). Filed http://crosbug.com/39531. Chumped in a CL that ought to fix this.
- dianders: Tree was closed overnight with x86 generic full failure. A timeout building chromite? Didn't reproduce...
- dianders: Hit the x86 generic full failure again. Filed http://crosbug.com/39565.
- More 'update_scripts' failures: tracking in crbug.com/180099.
- dianders: Got a BVT failure in experimental_desktopui_VideoSanity on x86-alex canary. Filed http://crosbug.com/39586.
2013-02-28 - 2013-03-01, Thur-Fri sheu, dgarrett - (Almost) all quiet on the Western Front
- git infrastructure issue takes down a bunch of builders: crbug.com/179141
- rename of gerrit-int repos without updating manifest takes down more builders: crosbug.com/39448
2013-02-26 - 2013-02-27, Tue-Wed grundler, benchan - daisy power_Resume failing on chromeos1-host5-rack4 crosbug/39260
- documented daisy repro case. (crosbug.com/39153)
- link canary failure (crosbug/p/17893) (found dups of this bug too)
- stout canary failure (crosbug.com/39272)
- alex/stumpy failed power_Resume due to new warning in Kernel - was reverted (crosbug.com/p/17609)
- parrot-canary failed due to "Session manager did not restart" (after following a chain of "merged into" --> http://crbug.com/167671)
2013-02-22 - 2013-02-25, Fri-Mon ?
2013-02-20 - 2013-02-21, Wed-Thu taysom, garnold, zork
2013-02-19 - 2013-02-20 Mon, Tues ellyjones, reinauer, sque
Feb 14, 15 Thu, Fri snanda, posciak
Feb 12, 13 Tue, Wed dparker, semenzato
Feb 8, 11 Fri, Mon jaysri, milleral, olege - Filed crosbug.com/p/17781 for power_Resume gen6_gt_check_fifodbg issue
- Someone put a test that belongs in autotest-chrome into autotest-tests again, so BuildTarget is having to repeat emerging of autotest-tests again.
Feb 4, 5 Mon, Tue rharrison - Failure on security_RestartJob for x86-mario canary, looks like flake, filed crosbug.com/38628. Reopened tree
- Failure on power_Resume for link canary. looks like crosbug.com/37596. Reopened the tree
- Came into a red tree on Monday (failure on stumpy), all the builders were green. Assuming it was flake, possibly from the fun with the HW lab over the weekend
Jan 31, Feb 1 Thu, Fri vpalatin, bleung, hungte, benrg
Jan 29, 30 Tue, Wed bfreed, bhthompson, katierh, petermayo
- tree still red Tuesday morning due to crosbug.com/38334 - revert of nss/nspr upgrade is resulting in segfault in local shlibsign. These are security packages that might also cause the sandbox failure of crosbug.com/38309.
- tree throttled Wednesday morning due to crosbug.com/33611 - timing out on VMTest update steps. Failed to get through normal channels to find a flaw; rebooting the mario paladin build slave was sufficient.
Jan 25, 28 Fri, Mon mtennant, gabeblack - crosbug.com/38334 - revert of nss/nspr upgrade is resulting in segfault in local shlibsign. Found after hours by vapier. P0 TreeCloser unresolved.
- crosbug.com/38309 - Chrome crash on startup in renderer thread. Causing major problems and possible overnight red tree. P0 TreeCloser unresolved.
- crosbug.com/38324 - vmtest testInterruptedUpdate failure in canary builds.
- crosbug.com/38303 - git clone command in Chrome/Chromium PFQ builders suddenly asking for password. Resolved.
- crosbug.com/38279 - shill unittests segfault, intermittent, fix at: https://gerrit.chromium.org/gerrit/#/c/42113/. Tree throttled as fix worked its way through commit queue then all canaries. Bug got through commit queue originally because it is intermittent. Resolved.
- crosbug.com/38238 - stout canary - vmtest - testInterruptedUpdate - cannot allocate memory
Jan 23,24 Wed,Thur rcui, sjg, dbasehore
Jan 22, Tue jamescook (chrome-on-cros) - crosbug.com/38117 - PyAutoFunctionalTests.FULL flakily reporting sig 6 from an intentional Chrome crash
Jan 17 Thurs, Fri mkrebs, rminnich, sheckylin
- crosbug.com/37343: Xorg signal 6
- crosbug.com/33611: on stout, seems not to be fixed, missing pxe rom for virtio.
- crosbug.com/33611 again: "amd64 generic full" closed the tree this time.
- If you're going to be helpful and post error messages, the best way to be sure you don't say anything you shouldn't is to mention the error and the software, but not the file name.
- We ought to just fix this vm error due to a missing pxe_virtio.bin. I will see what I can do.
- crosbug.com/37682: Repeat failure in HWTest: "x86-mario-release/R26-3571.0.0/bvt/platform_CryptohomeMount ABORT:".
- crosbug.com/38054 (created): login_CryptohomeMounted failed with "Login timed out". Couldn't find a similar bug that was open (crosbug.com/33613 seemed to be the closest closed issue).
- [mkrebs] Saw a bunch of "Chrome PFQ Failing to uprev Chrome" emails on the 17th. Was allegedly a failure to build a certain package, but the fix was taking a while to land. They seem to be fine now, but my best guess is that ellyjones@ actually got them working early on the 18th (on IRC he mentioned something about restarting the mario paladin around that time).
Jan 15 Tue, Wed chinyue, sonnyrao, yusukes - crosbug.com/37889: x86-alex-release/R26-3560.0.0/bvt/experimental_kernel_fs_Inplace_SERVER_JOB FAIL: HTTP Error 500: Internal Server Error
- crosbug.com/37899: "desktopui_ScreenLocker failure in bvt on parrot-r26" hits on stumpy canary, stout canary, x86-alex, parrot bvt, possibly x86-generic as well
- chromium for chromium-os builder started failing VMTests due to automation timeouts around 1pm on Wednesday, might affect Chrome on ChromeOS starting Thursday
Jan 11, 14 Fri, Mon dgreid, pstew - crosbug.com/37716: HWTest [bvt] failed at login_CryptohomeMounted: Cryptohome created a vault but did not mount (and Host did not return from reboot) - parrot canary
- crosbug.com/p/11474: power_Resume test failing with "gen6_gt_check_fifodbg.isra.6+0x36/0x48()"
- crosbug.com/37861: power_Resume test failing with "EarlyWakeupError(1)"
Jan 9, 10 Wed, Thu djkurtz, jrbarnette, olofj
Jan 7, 8, Mon, Tue clchiou, jwerner, josephsih
All canaries have been failing randomly in login_Cryptohome* tests due to crbug.com/168540. Chrome team has pushed a fix that should get synched during the night between Jan 8th/9th. If the same issue still shows up after that, please let them know!
- crbug.com/168540: parrot-canary: login_CryptohomeMounted
- crosbug.com/37684: Updater failed and many *_SERVER_JOB failed on daisy canary
- crosbug.com/37682: HWTest [bvt] failed on platform_CryptohomeMount on x86-mario canary
- crosbug.com/32181: try_new_image: Host did not return from reboot. Connection timed out.
- crosbug.com/37676: stumpy-canary and lumpy-canary died from an experimental test because the crash server timed out on symbolizing the crash dumps
- crbug.com/168540: parrot-canary and kiev-canary: login_CryptohomeUnmounted. This can probably happen on all the login_Cryptohome* tests.
- crosbug.com/p/17115: power_Resume fails on stout... flaky NIC sometimes fails to resume
- crosbug.com/37596: power_Resume abort bvt
- crbug.com/168540: x86-alex canary: login_CryptohomeMounted : Session manager did not restart after logout
2013 Jan 3, Jan 4, Thu, Fri dbasehore, wfrichar, hychao
dkrahn, sque, miletus - crosbug.com/37337: vmtest login_CryptohomeMounted: browser hang during shutdown (multiple occurrences)
- crosbug.com/37461: vmtest Unable to connect to X server causing 2400 second timeout (multiple occurrences)
- crosbug.com/37504: desktopui_VideoSanity fails to load video (not a tree closer)
- crosbug.com/36949: stout BVT. platform_Pkcs11Events (not a tree closer, multiple occurrences)
- crosbug.com/32539: python2 sig 6 during login_BadAuthentication test
- crosbug.com/33613: login_CryptohomeIncognitoUnmounted of VMTest has failed in login timed out for >5 times
- crosbug.com/37522: Login_BadAuthentication failed during HWTest (BVT) on Alex
Dec 20, Dec 21, Thu, Fri rspangler, dhendrix, dgozman - crosbug.com/37337: vmtest login_CryptohomeMounted due to chrome crash (multiple failures)
- crosbug.com/37372: vmtest login_CryptohomeUnmounted due to chrome or X crash
- crosbug.com/35458: vmtest login_CryptohomeUnmounted times out waiting for UI to restart at the end of the test
- crosbug.com/33611: vmtest unable to connect to remote host (ssh: connect to host 127.0.0.1 port 9222: Connection refused)
- crosbug.com/32382: vmtest desktopui_ScreenLocker failing
- crosbug.com/p/11474: stumpy-canary is failing power_Resume test with warning in i915_drv.c.
- crosbug.com/36986: daisy incremental build failure, believe git mirror was out-of-sync ("git-2_branch: changing the branch failed")
- kiev, daisy, stout paladins failed a build, and mario paladin was stuck waiting for them. Killed mario and forced a rebuild. (In retrospect, just killing mario paladin was probably sufficient)
- crosbug.com/37461: vmtest Unable to connect to X server causing 2400 second timeout
- crbug.com/167342: trying to get some Chrome devs to look into Chrome shutdown crash (which in turn caused session manager timeouts and VMTest failures)
- crosbug.com/37368: vmtest login_CryptohomeMounted timeout waiting for login prompt
Dec 18, Dec 19, Tue, Wed kochi (non-PST), dlaurie, puneetster - started open with status "hwtest failure = dependencies_info not being generated properly -> crosbug.com/37326".
- crbug.com/140385: login_CryptohomeMounted timed out; happened 3 times on x86 generic incremental.
- crosbug.com/37332: desktopui_ScreenLocker fail with timeout on mario incremental. happened only once.
- crosbug.com/37333: empty dependency_info causing hw_tests failure: LOTS
- autotest-tests failing the first build and succeeding on retry, suspect desktopui_VideoSanity, email sent to developer
- butterfly-canary failed with "Could not parse devserver log" possibly crosbug.com/34768, was successful on next build
- amd64-generic-full failed vmtest login_CryptohomeMounted due to chrome crash, filed crosbug.com/37337
- tlsdate issue determining its release number and causing failures in uprev step, fixed with https://gerrit.chromium.org/gerrit/39915
- login_CryptohomeUnmounted causing Chrome/X to crash, filed crosbug.com/37372
- 12/19 11AM: Still seeing lots of vmtest failures due to issue 37337
- x86-zgb canary failed BuildTarget step for zgb_he phase because build_packages was killed, filed crosbug.com/37388
Dec 14, Dec 17, Fri, Mon sabercrombie, thieule, zoro, rongchang
Dec 10, Dec 11, Mon, Tues charliemooney, tbroch, milleral(10th), yjlou(11th) - llvm.org went down, taking out chromiumos sdk during buildtarget
- crosbug.com/37120: A BuildTarget reported back with a warning from a python crash while building chrome.
- crosbug.com/37129: buildbot threw an exception during VMTest due to a failed assertion.
- crosbug.com/37086: Daisy TPM-related activities need >= 2 min to complete, not the current 45 sec. Fix is in and propagating.
Dec 4, Dec 5, Tue, Wed quiche, anush, spang - crosbug.com/35908: hit this on an x86-generic-full build and a daisy-canary build
- crosbug.com/36986: daisy incremental build failure, believe git mirror was out-of-sync
- crosbug.com/36969: link canary BVT failure, tree cycled green
- x86-mario canary failure, google storage flake
- crosbug.com/36949 - stout BVT. platform_Pkcs11Events (not a tree closer, multiple occurrences)
- crosbug.com/36661 - stout BVT. platform_Pkcs11ChangeAuthData (not a tree closer)
- crbug.com/157246 - caused a snow BVT failure (not a tree closer)
- false alarm email for buildbot failure stout-canary. sbasi checked the BVT results and says the tests passed; suspects a network flake caused buildbot to believe the BVT failed.
Nov 30, Dec 3, Fri, Mon dparker, piman, fjhenigman (Mon. only)
- google storage flake during archive step on x86-alex
- crosbug.com/36886 - kiev BVT. Power_resume fail on reading RTC fail after 10 retries.
- crosbug.com/36554 - daisy BVT. platform_CryptohomeChangePassword fails to migrate password
- crosbug.com/35458 - mario-r23 BVT. login_CryptohomeUnmounted times out waiting for the UI to restart at the end of the test.
- stout-canary. HWtest failure due to infrastructure problems in the hwtest lab.
- x86-mario canary. ABORT on security_ptraceRestrictions. Believed to be a test flake or lingering fallout from test lab going down (?)
- crosbug.com/p/11474. stumpy-canary x 2. Power_resume error with warning in i915_drv.c.
- crosbug.com/36004. Power_resume failure reading RTC on kiev & lumpy canaries.
Analysis of the BuildTarget warnings
WARNING: The following packages failed the first time, but succeeded upon retry. This might indicate incorrect chromeos-base/autotest-tests-0.0.1-r3342 autotest-tests-0.0.1-r3342: ERROR:root:Dependency pyauto_dep does not exist so the problem could be a change introduced between those two times. milleral on irc suggested a "test was likely added to autotest-tests.___.ebuild that needs to be in autotest-chrome.____.ebuild" but I don't see a change there at the right time.
Nov 20-21, Tue, Wed reinauer, sleffler, fjhenigman - crbug/36566 CQ build failures in update_engine with "unrecognized command line option "-Wno-c++11-extensions""; fixed by kliegs
- crbug/29895 filed by Prashanth for x86-alex-r23 bvt failure in power_Resume
- crbug/35908 desktopui_UrlFetch.not-live FAIL hit three times overnight in the chrome pfq
- All quiet on Tue the 20th
Nov 16, Nov 19, Fri, Mon waihong (tpe), keybuk, garnold
Nov 14 - Nov 15, Wed, Thu jamescook (cros gardener) - crbug.com/161329 BVT chrome sig 11 on shutdown, crash in ash GetDisplayManager() due to metrics logging, official builds only
- lumpy (perf) failing HWTest, "All hosts are dead" in [try_new_image] results status.log, infrastructure problem, fixed
- crbug.com/161073 ChromeOS Crash in WindowOpenPanelTest.ClosePanelsOnExtensionCrash
- crosbug.com/36370 Snow: BVT login_LoginSuccess failure due to cryptohome / TPM issue (only affects chromeos1-rack5-host3, maybe preMP hardware issue?)
Nov 9, Fri puneetster, sheu, kinaba
Nov 6, Tue gpike, grundler, reveman - crosbug.com/35907 parrot canary, Crash in HWTest - enterprise_DevicePolicy
- crosbug.com/36058 Chrome PFQ, Chrome/Init getting a lot of SIGBUS errors preventing Chrome from revving during VMTests
- crosbug.com/36097 parrot canary, desktopui_NaClSanity: Failed to installed SecureShell extension
- crosbug.com/35648 x86-alex canary, experimental desktopui_DocViewing failure closed tree
Nov 5, Mon josephsih - crosbug.com/32028 x86-alex canary, Archive bug, command timed out: 9000 seconds without output (davidjames fixed it.)
- crosbug.com/36032 chromiumos sdk failed SDKTest. make: *** [build/shims/shill-pppd-plugin.so] Error 1
- crosbug.com/35908 amd64 generic full: Timeout in UrlFetch.not-live
Oct 31 - Nov 1, Wed, Thu taysom, petermayo, wdg - crosbug.com/35865 Kiev paladin hwclock bug, same on link, timeout in URLFetch
- crosbug.com/35908 Daisy flake; said there were no changes but widevine was failing to link properly
- crosbug.com/35648 daisy, parrot problems
- reverted change I02955c8e
- Google died but it got better.
- crosbug.com/35958 daisy incremental ran out of space, clobbered chroot
- CQ got stuck
Oct 29 - Oct 30, Mon, Tue wfrichar, pstew, cwolfe
Oct 23 - Oct 24, Tue, Wed katierh, olege, mkrebs
Oct 19 - Oct 22, Fri, Mon jrbarnette, mtennant, hungte
Oct 17 - Oct 18, Wed, Thu dgreid, dbasehore - Day starts with tree closed due to Link now being over-size. crosbug.com/p/35412
- crosbug.com/34788 lumpy canary: HWTest failed likely due to lab networking issue
- crosbug.com/35469 link canary: warning on build due to missing coreboot dependency
- Single VMTest failures on all canaries (passed afterwards)
Oct 15 - Oct 16, Mon, Tue bfreed, vbendeb - crosbug.com/35199 hit the mario and zgb canaries.
- A few hours later, canaries now fail HWTest with "TimeoutError: Timeout occurred- waited 8400 seconds." cmasone is investigating network outage.
- crosbug.com/35347 link canary: desktopui_DocViewing fails in doc_viewing.DocViewingTest.testOpenOfficeFiles with "Extension could not be installed".
- crosbug.com/35354 link canary: desktopui_NaClSanity fails in secure_shell.SecureShellTest.testLaunch with "Extension could not be installed".
- crosbug.com/35357 link canary: desktopui_DocViewing fails in doc_viewing.DocViewingTest.testOpenOfficeFiles with "Chrome automation timed out after 45 seconds"
- Throttling the tree. I see consistent failures on various tests and on "try-new-image-*".
- Not sure if this is server overload or chrome causing the failures. Nothing points to chrome-os, best I can tell.
- A set of 3 CLs broke shill in a lumpy PFQ. https://gerrit.chromium.org/gerrit/#/c/35702/ fixed it.
- crosbug.com/35388 x86 alex canary: HWTest during SuitePrep: Connection timed out
Oct 11 - Oct 12, Thu, Fri rcui, sjg - Link failed on BVT HWTest again
- crosbug.com/35222: HWTest fails power_Resume with 'Autotest client terminated unexpectedly'
- Noticed that failing test has a status log which shows success. According to sosa this is a network flake. Ignoring.
- crosbug.com/33613: login_CryptohomeIncognitoUnmounted timeout.
Oct 9 - Oct 10, Tue, Wed rharrison, bleung, sonnyrao
- crosbug.com/35147: Daisy full failing due to issue with binutils (Appears to be a repeat of crosbug.com/34667)
- crosbug.com/35148: amd64 generic incremental timed out after 8 hours on BuildTarget (Pinged troopers@, since this bot appears to be sick)
- crosbug.com/34567, crosbug.com/35151, crosbug.com/35150: Link failed on BVT HWTest
- crosbug.com/33613, crosbug.com/35151, crosbug.com/35150: x86-zgb failed on BVT HWTest
- crosbug.com/35162: qemu-kvm failed to link with glib-2.32.4-r1
- crosbug.com/35173: Came into very red tree due to bad WebKit roll and failure of the PFQs to prevent Chrome on ChromeOS from updating. The issue arose because we were patching WebKit in ChromeOS; there is a thread discussing that we shouldn't do this again. Many late-arriving bots failed after the fix was in, and the tree had to be reopened.
- crosbug.com/35201: some canary builders (parrot, stumpy, kiev) failed in svn update. Connection reset by peer
Oct 3 - Oct 4, Wed, Thu gpike, sjg, kamrik - crosbug.com/34990: power_Resume.py failed trying to treat IP address as a float
- crbug.com/150568: butterfly R24 Chrome crash in ExtensionAppProvider (same bug has hit R23 recently) (twice)
- crosbug.com/34825: svn flakiness downloading / unpacking chromeos_chrome (again)
Oct 1 - Oct 2, Mon, Tue piman, rspangler, ellyjones
Sept 27 - Sept 28, Thu, Fri rspangler, keybuk, rongchang - crosbug.com/34825: svn flakiness downloading / unpacking chromeos_chrome (twice).
- ManifestVersionedSync failed on all canaries. rcui, ferringb determined gerrit replication was failing and fixed it.
- chromium:150568: canaries failed with "FAIL: Unhandled JSONInterfaceError: Chrome automation failed" (multiple times)
- VMTest timeout: x86_generic_incremental.
Sept 25 - Sept 26, 2012, Tue, Wed dianders, davidjames, yoshiki - crosbug.com/34571 crbug.com/150604: Numerous test failures in BVT and VMtest with Unhandled JSONInterfaceError: Chrome automation failed prior to timing out ...
- crosbug.com/34126: Chrome PFQ vmtest failure - alex and lumpy - Failed to install SecureShell extension - Fixed, but see 34796 below
- crbug.com/152189: Daisy chrome PFQ: create_nmf.py: Not a valid NaCL executable - Fixed
- crosbug.com/34785: desktopui_DocViewing failed on lumpy canary - Any repeats?
- crbug.com/151855: hitting canaries (like butterfly build 367); originally this was thought to be crbug.com/150604 but that's because I didn't dig deep enough (I just saw the "Chrome automation failed..."). You need to dig into the artifacts and look for the "dmp.txt" file to see the real chrome crash. - Hitting all the time
- crosbug.com/34796: Secure Shell did not get correct exit message
- Saw some strange try_new_image failures in https://uberchromegw.corp.google.com/i/chromeos/builders/stumpy%20canary/builds/1934. milleral thought they were just warnings so no bug filed, but he's going to look at them. Failures are due to crosbug.com/34788.
- crosbug.com/34576: 'desktopui_LoadBigFile: ERROR: The big file did not load' during x86-mario hw
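The note above about digging into the artifacts for the "dmp.txt" file can be turned into a quick search. A minimal sketch in Python, assuming the builder artifacts have already been downloaded and unpacked into a local directory. The function name and the directory layout are hypothetical; only the "dmp.txt" suffix comes from the note.

```python
import os


def find_crash_dumps(artifacts_dir, suffix="dmp.txt"):
    """Return sorted paths of files under artifacts_dir whose names end with suffix.

    artifacts_dir is assumed to be a local copy of a builder's artifacts;
    nothing here talks to the waterfall or to Google Storage.
    """
    matches = []
    # os.walk visits every subdirectory; a missing directory yields nothing.
    for root, _dirs, files in os.walk(artifacts_dir):
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Calling find_crash_dumps on an unpacked artifacts directory lists every candidate dump, so a generic "Chrome automation failed" message can be checked against a real Chrome crash before picking a bug to update.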
Sept 21 - Sept 24, 2012, Fri, Mon olofj, dparker, chinyue
Sept 19 - Sept 20, 2012, Wed - Thu marcheu, thieule, falken, sbasi, armansito
Sept 13 - Sept 14, 2012, Thu - Fri jaysri, gabeblack, sheckylin
Sept 12, 2012, Wed semenzato, pstew
Sept 11, 2012, Tues wdg, semenzato, pstew
Sept 7 - Sept 10, 2012, Fri - Mon rcui, tbroch, josephsih
tlambert, vbendeb, kochi
tlambert, vbendeb, kochi (9/5-6 JST)
Sept 4, 2012, Tues mtennant, sonnyrao, vapier, kochi (9/5-6 JST) - Tree started the day closed, due to crosbug.com/34102, a vmtest flake due to chrome timeout. See run for mario incremental.
- Two internal Chrome PFQ builders are also failing, since at least last Thursday, which has effectively caused the version of Chrome to be pinned.
- http://chromegw/i/chromeos/builders/lumpy%20nightly%20chrome%20PFQ (crosbug.com/34129)
- http://chromegw/i/chromeos/builders/alex%20nightly%20chrome%20PFQ/ (crosbug.com/34126 created and assigned to UI). Efforts to enlist Chrome sheriffs and ChromeOS chrome gardener did not get anywhere.
- Another instance of crosbug.com/34102. The current owner is out of office today, krisr re-assigned to craigdh.
- This time crosbug.com/34102 hit the "x86 generic full" builder. The bug is getting attention from test team now.
- Another instance of crosbug.com/34102 on Mario Incremental -- added logs to the bug
- x86-alex failed HWTest, sosa commented on IRC "looks like a false negative as i was rebooting/restarting the devservers when this happend so the update payloads weren't avialable on the devserver" -- re-opened and watching other canaries still running HWTest
- meanwhile, hit another instance of 34102 on Mario Incremental
- then another instance of 34102 on x86-mario Canary -- HWTest didn't seem to run (was orange)
Sept 3, 2012, Mon mtennant, sonnyrao, vapier - Labor Day holiday in United States
Aug 31, 2012, Fri adlr, ferringb - Sameer checked in a kernel change that caused all(?) machines to oops, reboot after ~10 seconds. Reverted the change. crosbug.com/34081
- crosbug.com/34102
Aug 29, 2012, Wed miletus, garnold, mkrebs - Tree closed due to "Kernel image is larger than 8 MB" (crosbug.com/34039). Reverted changes that added parted to initramfs.
- Note: Reverts finally got merged in at about 8pm, so builds started before that could still fail (depending on their kernel size).
- tree closure following x86 generic full VM test failure due to python crash; filed http://code.google.com/p/chromium-os/issues/detail?id=34025, tree re-opened.
- Autotest failure: "Not logged in" error in platform_Pkcs11Persistence (possibly crosbug.com/32166).
- Autotest failures: several more "supplied_Compositor sig 11" failures (crosbug.com/33906). Also a "supplied_nacl_helper_boo sig 11" failure, which I added to that issue since it's also Chrome.
Aug 28, 2012, Tue miletus, garnold, mkrebs - x86-alex and x86-mario canaries failed in hwtest (login_CryptohomeIncognitoUnmounted and login_CryptohomeUnmounted, respectively); investigation reveals network issues related to http / mysql server, tree re-opened.
- lumpy, x86-mario and x86-zgb canaries failed in hwtest; latter two due to login issues, former on desktopui_{KillRestart,AccurateTime}. variety of failing bots suggests a transient flakiness. lab sheriff (jrbarnette) informed, tree re-opened.
- Autotest failures: Bunch of failures with "Login timed out" and "chrome_200_percent.pak". Turns out the chrome_200_percent errors are a red herring (they don't cause failures: crbug.com/143850). These are really login issues (crosbug.com/33841).
- Autotest failures: "supplied_Compositor sig 11" in desktopui_DocViewing (crosbug.com/33906).
Aug 24 Fri djkurtz (TPE), dgreid, katierh - lumpy canary failed enterprise_DevicePolicy http://crosbug.com/33435
- alex canary failed, enterprise_DevicePolicy, power_Resume (one login failure and an instance of crosbug.com/33435)
- zgb canary failed imaging chromeos-rack6-host7 - multiple network failures on this board
Aug 22 - Aug 23 Wed/Thu taysom, dhendrix, dgozma - x86-alex canary and x86-zgb canary failed in HwTest during login
- x86 generic incremental failed in flaky FMTtest
- For login problems (crosbug.com/33841)
Aug 20 - Aug 21 Mon/Tue cywang (TPE)
Aug 16 - Aug 17 Thu/Fri waihong (TPE), posciak (MTV), bfreed (MTV) - x86-mario canary failed with a Chrome crash: crbug.com/143495
- chromium.chromiumos amd64 failing most of the day, crosbug.com/33613
- flaky chromiumos-sdk: gtk-doc failing in configure, but intermittently
- Flaky tegra2 full: the archive stage has been failing intermittently due to crosbug.com/30031; will be getting rid of tegra2 bots Fri or Mon
- Several packages failed with "select error: (4, 'Interrupted system call')", suspect something killed a build: crosbug.com/33617
- mario and alex canaries failed due to HWTest losing connections; should resolve itself.
- Failed to connect to virtual machine: crosbug.com/33611
- security_ptraceRestrictions failing: http://code.google.com/p/chromium-os/issues/detail?id=33531
- security_ASLR failing: http://code.google.com/p/chromium-os/issues/detail?id=33590
- Filed http://crosbug.com/33613 for more than 5 recent builds that failed with login timeouts.
Aug 10 - Aug 13 Fri/Mon sleffler (SFO), quiche (MTV)
Aug 8 - Aug 9 Wed/Thu sheu (MTV), bhthompson (MTV) for chromeos-factory - Intermittent flakes from security_SeccompSyscallFilters tracked in crosbug/33403. Revert of promotion to bvt chumped in.
- parrot canary failure due to 27c54ab in third_party/coreboot; fix chumped in.
Aug 4 - Aug 5 Sat/Sun - I'm not actually sheriff today, but this is a note to sheriffs over the weekend and early Monday: there's a possible unit test failure in shill that made its way into the tree which could fail in build and cause a failure. If this happens, feel free to submit https://gerrit.chromium.org/gerrit/29242/ in order to fix it. It's waiting for normal review, but if it does end up causing trouble, chumping it is the right thing to do. (pstew)
Aug 2 - Aug 3 Thu/Fri fjhenigman (WAT), benrg, snanda
Jul 31 - Aug 1 Tue/Wed ?
Jul 25 - Jul 26 Wed/Thu dennisjeffrey (MTV), sosa (MTV), hungte (TPE) - bot hung after successfully completing archive stage but before the report stage; forcefully killed by buildbot after 9000 seconds. Seems to be a rare flake. Filed http://crosbug.com/32944.
- lots of errors connecting to Google Storage (curl failures). Google Storage team was contacted and they fixed the problem on their end. Followed up by filing http://crosbug.com/32986 to track the task of updating the version of gsutil used on the chromeOS builders (a recommendation by the Google Storage team).
- another "python2 sig 6" error. Updated existing bug http://crosbug.com/32539, which is currently under investigation.
Jul 23 - Jul 24 Mon-Tue dkrahn(MTV), dtu(MTV)
Jul 19 - Jul 20 Thu-Fri puneet(MTV), rminnich(MTV), seanpaul(MTV)
Lots of failures due to network issues, the biggest symptom being curl failures.
July 17 - July 18 Tue/Wed msb(MTV), kamrik(WAT)
- crosbug.com/32539: pyautolib sig6 crash - test passes but leaves a crash file behind. Saw this thrice.
- Bunch of tegra flakiness issues. Told to ignore.
Jul 13 - Jul 16 Fri/Mon grundler(MTV), sabercrombie(MTV)
Canaries were mostly fine on Friday. More failures on Monday:
- crosbug.com/32439: "zgb failed on update-engine". Saw similar AU timeouts on lumpy, x86-mario, and zgb. UPDATE: "Issue was devserver overloading and deploying apache and fixing crashes that happened every test run has resolved this issue."
- crosbug.com/32385: "mod_image_for_recovery failed on arm-daisy canary". Saw this once.
- crosbug.com/32539: pyautolib sig6 crash - test passes but leaves a crash file behind. Saw this once.
Jul 11 - Jul 12 Wed-Thu nirnimesh(MTV), piman(MTV)
Canaries kept breaking due to update_engine problems.
- butterfly canary failed VMTest with 'No space left on device' on image (not host). Updated existing bug crosbug.com/32454
- x86-zgb canary failed HWTest with "Host did not return from reboot." Updated existing bug crosbug.com/32181
- tegra2_kaen canary failed HWTest with "update-engine failed". Updated existing bug crosbug.com/32129
Jul 5 - Jul 6 Thu-Fri chinyue(TPE), dhendrix (MTV), ferringb (MTV) - Thu Jul 05, 06:30 UTC: amd64 generic full failed: update_engine unittest takes too long to finish. (http://crosbug.com/32096)
- Fri Jul 06 - ?: update_engine unittest fails on multiple internal builders during the FilesystemCopierAction test (http://crosbug.com/29841#c42)
- Thu Jul 05, 07:33 UTC: stout canary failed: ManifestVersionedSync took too long (6+ hours) and thus BuildTarget didn't have enough time to finish. Seems a glitch, re-opened tree.
Jul 3 - Jul 4 Tue-Wed nirnimesh(MTV), rharrison(WAT) - chromium.chromiumos bots were dying in the VMTest, Chrome sheriffs fixed that.
- Potentially saw this filter through to x86 alex canary. Filed crosbug.com/32382
- mario canary failed a couple of times due to HWTest losing connections overnight, resolved itself.
- amd64 generic full failed due to unit tests taking too long. Filed crosbug.com/32380. This occurred again on x86 alex canary.
- FilesystemCopierActionTest.RunAsRootSimpleTest in update_engine failed for no apparent reason. Filed crosbug.com/32366
- stumpy canary failed in HWTest with "StageBuildFailure" and "500 Internal Server Error". Filed crosbug.com/32361
- Saw instance of prebuilts getting a 500 on upload
29 Jun-2 Jul Fri-Mon rspangler(MTV), mtennant(MTV), waihong (TPE)
27-28 Jun 2012 Wed-Thu benchan (MTV), dparker (MTV), josephsih (TPE) - x86-alex canary and x86-zgb canary failed => crosbug.com/32181.
- Failed at HWTest [bvt]: try_new_image FAIL: Host did not return from reboot.
- This might be related to crosbug.com/31748: the system failed to respond on the network, causing the reboot timeout. Alex and zgb seem particularly hard hit.
- amd64 generic full failed => crosbug.com/30518
- Failed at cros_run_vm_update in VMTest. Networking sometimes failed to come up maybe due to a bug in VM network driver.
- lumpy canary failed => crosbug.com/32195 . Unhandled AssertionError: Could not create /home/chronos/Consent To Send Stats. during VMTest. No obvious cause. Reopened the tree and kicked the builder to see if problem reoccurs. Other canaries are passing.
- lumpy/stumpy/tegra2_kaen canary failed => crosbug.com/32228.
- Failed at HWTest [bvt]. Seemed to be network problem.
25-26 Jun 2012 Mon-Tues dianders (MTV), bfreed (MTV), clchiou (TPE) - ~8am MTV: amd64-generic-inc is failing, but looks like a builder issue (as found by kliegs / ellyjones). Tree still open. Looking for a trooper; fixed by pschmidt. resolv.conf was empty on the builder
- Kaen canary has been failing since last Friday. 2086 - 2090 were various HWTest failures. Now it doesn't even do the update. http://crosbug.com/32129 for the update problem. Not a closer, so assuming bug filed is enough.
- Autotest failure in bvt on x86-mario-r22 R22-2490.0.0. Flake? Don't see info about the failure.
- chromium.chromiumos failure: http://crosbug.com/32139
- All canaries died. Theory by davidjames is <https://gerrit.chromium.org/gerrit/19401>. Revert is here: <https://gerrit.chromium.org/gerrit/#change,26077>
- x86-mario canary died. Reported http://crosbug.com/32166.
- tegra2_kaen canary died the same way it was dying Friday night. That is an improvement over the weekend failures. http://crosbug.com/32012.
- tegra2_kaen and x86-mario canaries died. tegra2_kaen canary => crosbug.com/32012; x86-mario => crosbug.com/32166
- Think x86-mario may be a flake and just a longer timeout needed? Need owner
- Not sure about tegra2_kaen
- parrot canary failure http://crosbug.com/32173
- Retry didn't help. Trying a clobber retry.
- kliegs reverted lumpy hwtest connection to the bots: http://chromegw/i/chromeos/changes/2521
- Uprev failing; kliegs manually modified .repo/manifests on mario paladin and kicked bots. This looks to have fixed uprev failures and vmtests also passing. Still hobbled by lumpy hwtest failures (timeouts take 30mins).
- All canaries failing with HWTest [bvt] Suite prep 502 Proxy Error (crosbug/31921). tammo: Tree throttled, as I have no idea what to do about this.
- Tree throttled for vmtest failures; MTV sheriffs left for the day w/o resolution (PSA posted to chromium-os-dev@)
- Paladins were stuck, so force-stopped the alex and stumpy paladins and did a clobber + force build on mario.
- Lumpy paladin hw tests are timing out backing up the CQ by ~15mins. Attached to existing crosbug/31916.
- Autotest failure in bvt on stumpy-r22 (R22-2471.0.0): after the test passed, Chrome crashed, and there was no stacktrace due to http://crosbug.com/31151 ; ddrew created http://crosbug.com/32038
- Looks like a network issue caused gsutil to hang (link canary); created crosbug/32028.
19-20 Jun 2012 Tue-Wed taysom, wfrichar, kliegs, vapier
- Tree closure due to RPC failure by build server http://crosbug.com/31981
- Tree closure due to failure to upload prebuilts to Google Storage (gsutil flake; at http://crosbug.com/31580)
- Tree closure due to race condition in cleaning up. Appeared to be the same as http://crosbug.com/30031
- Chromiumos-tegra2 failed due to disk full - the build people with access to that server were in Las Vegas
15 & 18 Jun 2012 Fri & Mon sosa, quiche, djkurtz - Fri Jun 15, 06:15 UTC: "parrot canary" closed: crosbug.com/31883
- Sat Jun 16, 07:30 UTC: network flake during BVT on lumpy canary
- Sun Jun 17, 07:32 UTC: 502 Proxy Error during suite prep on x86-alex canary
jrbarnette, rcui, kinaba
11-12 Jun 2012 Mon-Tue bleung, petkov, thieule
7-8 Jun 2012 Thu-Fri craigdh
4 Jun Mon fjhenigman, dtu, tlambert - 8:37am PDT - amd64 generic incremental closed tree when vm16-m2 disk filled up - could not find a trooper but Peter Mayo helped - thanks Peter
- 2:05pm PDT - x86 zgb canary failure first thought to be upload_symbols flake, but investigation indicates those errors are not fatal - looking for real cause
- 2:40pm PDT - paladins blew up real good, vapier identified and fixed it as a permissions issue - thanks vapier
- 3:46pm PDT - mario incremental http://crosbug.com/30880
- 7:14pm PDT - lumpy canary http://crosbug.com/18587
1 Jun Fri fjhenigman, dtu, tlambert
31 May Thu sque, pstew, josephsih
30 May Wed sque, pstew, josephsih - Unable to generate file identifier for ec.obj. hungte reverted it. (http://crosbug.com/31386)
- Dependency failure for autotest-deps-piglit-0.0.1-r1450 on amd64-generic (looks like a flake, but http://crosbug.com/31389)
- update_engine_unittests flake (http://crosbug.com/29841) -- this issue continues to close the tree.
- Network problem?
29 May Tue miletus, semenzato, sonnyrao - amd64-generic full failed on Archive, opened new bug crosbug.com/31332
- VMTest flake on mario-incremental - crosbug.com/31067
- unpinned Chrome from 21.0.1150.3
23-24 May 2012 Thu-Fri sabercrombie(MTV), vbendeb(MTV), kochi(TOK) - closure by VMTest flake (http://crosbug.com/31067)
- tree broken by libssl update. ellyjones fixed it. Ongoing problems caused by failure to rebuild binpkgs dependent on openssl.
- Chrome build broke various UI tests: http://code.google.com/p/chromium-os/issues/detail?id=31291. Pinned Chrome to 21.0.1150.3.
- mario-incremental VMTest failure with two apparent variants of http://crosbug.com/31067
- link paladin u-boot build failure -- change reverted
- cros_mark_as_stable broken. fix chumped in.
- gcc change broke builds. reverted.
22 May 2012 Tue cwolfe, marcheu, micahc - the experimental "unified lumpy paladin" is down on disk full; it is being moved to another machine so does not need a cleanup
- mario-incremental failed with "login_CryptohomeIncognitoMounted ... Chrome did not reopen the testing channel after login as guest" http://crosbug.com/20286
- unclutter was causing retries in various builds; fixed by cwolfe https://gerrit.chromium.org/gerrit/23227
- link canary failure on chromeos-u-boot; fixed by sjg
- everything else failed on svn server problems; fixed by maruel and nsylvain
18-21 May 2012 Fri-Mon dkrahn, puneetster, hashimoto - VMTest failure: crosbug.com/31067.
- Tree broken by manifest change: olofj fixed.
- Archive failure on lumpy canary: crosbug.com/30854.
- Update engine failure on tegra2_kaen canary continues: crosbug.com/31019.
- Multiple occurrences of storage error: 'transfer failed with bytes remaining': davidjames filed crosbug.com/31103.
- gpsd timeout on x86-generic full: crosbug.com/31096.
- Another google storage failure on tegra2-full: 'No valid URLs found' exception: davidjames in contact with storage team.
- Autotest timeout on amd64-generic for test: SimpleTestUpdateAndVerify. Subsequent VMTest stage passed.
17 May 2012 Thu dlaurie, grundler, yusukes(TOK)
16 May 2012 Wed dlaurie, grundler, yusukes(TOK) - Came in to red tree, all canaries failed in vmtest. http://crosbug.com/30952 Eventually we pinned Chrome to 21.0.1137.5, but that was not usable for ARM so it is being moved forward again.
- Chromium OS also has vmtest failure attributed to http://src.chromium.org/viewvc/chrome?view=rev&revision=137395
- 2pm: amd64-generic-incremental failure: collect2: ld terminated with signal 7 [Bus error]. This was my fault (dlaurie) for not applying the binhost change to other targets when I pinned chrome.
- 2:45pm: amd64-generic-incremental is out of disk space, escalated to troopers
- arm-daisy canary build failed (closed tree) due to missing dependency in chromeos-bootimage (was built in parallel and "usually" built)
- 5pm: network timeout trying to retrieve cros_sdk
14 May 2012 Mon mkrebs, bhthompson, falken(TOK)
[TOK] - mkrebs: filed crosbug.com/30880 for "tegra2_kaen canary": JSONInterfaceError => GetNextEvent => "received empty response"
[TOK] - x86-mario and stumpy canaries failed, maybe same as http://crosbug.com/30854
- ferringb@ on IRC: some duplicated output issues, "if you see anything screwed up, for example, if vmtest has parts of unittest logs in it, please open bugs for it w/ links to the specific failures"
- also: builders that hang without output for a long time are probably out of disk space. CLs are coming.
4 May 2012 Fri snanda, ellyjones, katierh - crosbug.com/30575 - arm generic full builder had a timeout on gpsd though it builds locally and did not break other builders. Will watch the next build (already in progress)
3 May 2012 Thu pstew, gpike - BVT test run last night still has failures in graphics_WindowManagerGraphicsCapture, but now they are just failures and not segmentation faults. Test disabled, so it should not feature in the BVT on 4 May. crosbug.com/27587.
- Monitoring login_CryptohomeIncognitoUnmounted failure which seems to have failed VMTest on a couple of platforms last night, but is cycling green.
- Issued crbug.com/126133 for crashing on Link paladin in chrome!BaseTab::AdvanceLoadingAnimation. Chrome gardener is flackr@, not bshe@ as the waterfall shows, due to swap. This issue has been claimed and was verified in early builds on May 4.
- Persistent "Timed out waiting to revert DNS." messages on Link paladin builds. crosbug.com/30472 This appears to be a side effect of the bug above causing tests to end prematurely. Submitted a CL which landed before the Chrome change, so we were able to confirm that it fixes this secondary issue.
- Transient failure in VMTest on x86-generic. Filed bug crosbug.com/30518.
- A couple of glitchy builds due to some dependency swaps for the parted package. Should have cycled through all builds, but contact benchan@ if parted features in any build failures tomorrow.
2 May 2012 Wed pstew, gpike - BVT failure in graphics_WindowManagerGraphicsCapture (segmentation fault). Assigned crosbug.com/30402 to ihf@ who wants to "wait and see" how it does in BVT tonight.
- Canaries are red: crosbug.com/30376. Fixed by scottz@ who reverted the offending change.
- Creation of "swap.conf" in chromeos-init conflicts with platform-specific swap.conf: crosbug.com/30397. Reverted this, and micahc@ will land a more comprehensive change.
- UploadPrebuilts phase for multiple architectures failing with "GSResponseError:: status=502, code=None, reason=Bad Gateway." Appears to have been a temporary server failure -- monitoring.
1 May 2012 Tue jrbarnette, thutt, inter alia - Tree started the west coast day green
- Occasional update_engine unit test failures due to crosbug.com/29841
- There are ongoing changes underway trying to get to the root cause.
- One canary failure due to an ill-timed change to the dev server.
30 Apr 2012 Mon jrbarnette, inter alia - Apparently, nothing has happened for the past week and a half.
- The tree started the west coast day (and week) green.
- Minor failures during the day; known bugs (to be documented later).
- At the time of West coast sign-off, there is an ongoing outage due to multiple canary failures
- HWTest updates got 404 errors downloading stateful.tgz; root cause unknown
- jrbarnette is declaring it "transient" - time will tell whether this is right.
19 Apr 2012 Thu gmorain, piman, kamrik - Found the tree red with HW test stage failed for alex, zgb, stumpy and lumpy canaries. In all cases the HW test stage failed early, before running any test, with the error message "FAIL: Update failed. Timed out waiting for system to mark new kernel as successful."
- While trying to figure what it was, most of the builders cycled green. The two HW test failures in the new builds seem to be browser crashes, opening the tree.
- 19:43 UTC - Tree went red again on HW test failure on ZGB and Alex with error message that looked like a browser crash. After some investigation it appears to be http://crosbug.com/29701 which was reported several hours earlier. Also reported in http://crosbug.com/29725
13 Apr 2012 Fri dgreid, tlambert - Had an instance of 28631, reopened.
- platform_CryptohomeAuthTest was failing for most of the day and not closing the tree. Found the offending commit and reverted it.
12 Apr 2012 Thu dgreid, tlambert - Error with HwTest reimaging systems cleared up.
- One instance of 26646
11 Apr 2012 Wed dianders, thieule, chinyue (TPE) - scottz: Unfortunate user error failure on HWTest: http://code.google.com/p/chromium-os/issues/detail?id=29322
- crosbug.com/26646 again on stumpy canary
- chinyue 06:32 UTC: VMTest failed, crostestutils.lib.dev_server_wrapper.DevServerException: Timeout waiting for the devserver to startup. (reopen http://crosbug.com/20251)
- chinyue 07:01 UTC: http://crosbug.com/26646 again on mario incremental
- chinyue 07:16 UTC: http://crosbug.com/26646 again on x86-mario canary
- chinyue 09:26 UTC: http://crosbug.com/29278, still investigating...
- dianders 12:27 MTV: x86-mario canary in HWTest.
- Not much was logged in the link pointed to by the waterfall. It sounds like that's because this wasn't a failure of the test but perhaps a failure in running the test (?).
- scottz says he knows the problem and is working on it. Filing a bug for himself. TBD: bug #?
- thieule 12:40 MTV: alex-he failure is 26646
- thieule 3:04p MTV: Temporary bots failure due to chromeos-chrome needing libjpeg, vapier says they should cycle green once they pull in chromeos-chrome 20.0.1098.1_rc-r2.
- thieule/dianders 5:30 MTV: Lots of canaries died due to failure to build private version ixchariot. dianders reproduced locally. Found that ixchariot used the cros-binary eclass, which had changed today. Revert of the eclass fixed ixchariot build, so chumped it in.
10 Apr 2012 Tues dianders, thieule, chinyue (TPE) - x86-zgb canary builder failed (http://crosbug.com/29192)
- gclient sync failed when building chromeos-chrome (http://crosbug.com/29193)
- dianders 10:30a MTV: tree was left closed at start of shift with message: http://crosbug.com/29193. Note that at around 10am that bug had been marked as fixed. David James said that several syncs had passed, so not keeping tree closed for this.
- dianders 10:30a MTV: David James pointed out that http://crosbug.com/29138 was still causing VM test bots to fail (old temp files still left over). He is fixing.
- dianders 11:03a MTV: Noticed that Lumpy paladin builder failed with something similar to yesterday's http://crosbug.com/29161. Ben confirmed that this was the same as http://crosbug.com/26646.
- dianders 12:30p MTV: chromium-os-sdk is broken (and has been for a little bit--didn't notice with all of the other redness). Proposed fix is here: https://gerrit.chromium.org/gerrit/19907.
- dianders 12:45p MTV: Checking to see if latest mario incremental failure is another http://crosbug.com/26646. Asking Ben (who is AFK) and digging myself.
- Ben says it's a dupe. Updated the bug. Note that according to Ben there doesn't appear to be any good way to tell between this bug and any other hang of chrome at bootup. ...but if we got sig 6 or sig 11, we'd know it was a crash and different.
- dianders 1:30p MTV: Kaen paladin is dead, which blocks all internal paladins. Escalating to troopers (both via email and IRC).
- Latest update on the machine: It found errors during a disk check and is now trying to fix the errors. Continuing to escalate.
- Going to move to another machine. http://crosbug.com/29224 is the bug to track that.
- Fixed now.
- dianders 2:45p MTV: Since CQ was so flaky for internal stuff, I ended up chumping people's changes in if they passed enough stuff (as suggested by davidjames). Ignored instances of 29224 and the fact that they hadn't gone through Kaen.
- thieule 4:47p MTV: arm generic full builder ran out of disk space. davidjames mentioned that the builder only has 60GB of disk so it can only hold about 3 builds. Opened http://crosbug.com/29246.
- thieule 5:04p MTV: arm-ironhide canary fails to emerge kernel, http://crosbug.com/29247.
6-9 Apr 2012 Fri-Mon tbroch(fri), sjg(mon), mtennant, tammo
4-5 Apr 2012 Wed-Thur tbroch, dparker, kinaba
2-3 Apr 2012 Mon-Tue sheckylin, olofj, dkrahn - autotest repo failed to replicate automatically, davidjames replicated manually and logged crbug.com/121806.
- Noticed alex-canary has failed the last two builds due to lab environment issues. Talked to johndhong, opened crosbug.com/28847.
- Reverted a CL that broke the commit queues: https://gerrit.chromium.org/gerrit/19493.
- Persistent bug crosbug.com/26646 (‘Timed out waiting for login’ in VMTest)
- New bug crosbug.com/28789 ('Timeout waiting for the devserver to startup' in VMTest)
29-30 Mar 2012 Thu-Fri grundler, dlaurie, seanpaul - crosbug.com/27992 on "lumpy canary"
- crosbug.com/26646 Lots of VMTest failures "Timed out waiting for login"
- crosbug.com/28581 stumpy-canary has been broken for a week, was isolated and hopefully fixed Friday
- upstream merge of modemmanager-next caused breakage due to interface change. shill was updated to compensate.
- A few builders ran out of space. Commit was landed to auto-clean ccache directory.
- R19 x86-alex pre-flight stuck after vmtest failure, discovered and escalated to troopers late Friday...
- Detailed notes at goto.google.com/adcgi
27-28 Mar 2012 Tue-Wed reinauer, taysom - New problem: crosbug.com/28631 Failed cbuildbot failed archive failed report. Assigned to ferringb
- crosbug.com/26646 flake: VMTest fails in various login tests. This happened several times
- crosbug.com/28374 flake: Timed out waiting for system to mark new kernel as successful. Multiple times but not as many as 26646
- Problem with chrome - reinauer will need to describe.
23-26 Mar 2012 Fri-Mon puneetster, snanda (PST), waihong (non-US)
adlr, bfreed (PST), kochi (non-US) - ARM build failure (-nopie error): only internal builds are affected; toolchain is rebuilt in chroot and this is fixed.
- crosbug.com/28226 cgroup unhandled crash
- crosbug.com/26646 flake: VMTest fails with Timed out waiting for login prompt
- Transient gclient sync failure on Chrome.
- VMTest failed with Error parsing data because invalid syntax, but the Report log says Exception __main__._ShutDownException: _ShutDownException('Received signal 15; shutting down',)
- BuildTarget failed with Unavailable repository 'gentoo' referenced by masters entry due to https://gerrit.chromium.org/gerrit/#change,18809.
20 Mar 2012 Tue sonnyrao/dennisjeffrey (PST), yjlou (non-US) - Mon 22:51 UTC Tree closed. "amd64 generic full" bot running out of memory. Filed http://crbug.com/119009 and re-opened tree.
- Build team switched amd64-generic over to a Builder with more memory and we closed 119009
- Tue 06:30 UTC Tree closed. Transient download error. Tree re-opened.
- Tue 20:17 UTC Tree closed. tegra2_seaboard failed BuildBoard due to Arm hardening options being enabled in GCC.
- Saw another http://crosbug.com/26646 flake.
15 Mar 2012 Thu keybuk (PST), katierh (PST)
13 Mar 2012 Tue waihong (TPE)
12 Mar 2012 Mon vbendeb (PST), bhthompson (PST)
- 11:05 PST - kernel rebase to 3.2 had to be reverted (https://gerrit.chromium.org/gerrit/#change,17846 https://gerrit-int.chromium.org/#change,13546), but another problem crept in (http://crosbug.com/27657), bhthompson working to resolve.
- 11:25 PST - The failing cashew unittest has been temporarily disabled (https://gerrit.chromium.org/gerrit/#change,17847). Tree status changed to open.
- 12:40 PST - The tree is closed again with the same unittest failure. The ebuild uprev has not happened.
- The uprev in fact happened at 12:23 (after the failed build started), reopening the tree
- With the kernel revert in place and the cashew unit test disabled, the tree cycles green and stays fairly clean in the afternoon apart from a couple of flakes.
- The kernel image size problem still needs to be dealt with, as does the cashew unit test, but they should not be impactful to tree status in the interim.
09 Mar 2012 Fri vbendeb (PST), bhthompson (PST) - Had to revert sshfs-fuse update to 2.4 as it was breaking on ARM due to instability marker in the new ebuild https://gerrit.chromium.org/gerrit/#change,17773
- Kernel 3.2 update was pushed on Friday but the impacts were not felt until the evening, leaving the tree red for the weekend.
08 Mar 2012 Thu quiche (PST), micahc (PST), hungte (non-US) - 17:48 PST - tegra2_kaen canary failure, sjg pushed https://gerrit.chromium.org/gerrit/#change,17630 to fix.
- 16:11 PST - x86-zgb canary, crosbug.com/27521
- 15:36 PST - lumpy canary failure, during PublishUprev
- 15:00 PST - assist with resolving chrome-PFQ failure (in chrometest)
- 12:43 PST - tegra2 seaboard full failure, reverted gerrit.chromium.org/gerrit/17523
** update: not reverted. quiche prepared the revert CL, but didn't push it. (confused by UI)
- 04:36 PST - alex_he canary ManifestVersionedSync failure (crosbug.com/27521)
07 Mar 2012 Wed quiche (PST), micahc (PST), hungte (non-US) - 16:36 PST - assist with chromium.chromiumos failure (gerrit.chromium.org/gerrit/17551)
- 12:41 PST - mario incremental failure due to TreeCloser crosbug.com/26646.
- 09:59 PST - CleanUp failed on x86 generic full. crosbug.com/5409, CL in review.
- 07:20 PST - CleanUp failed on amd64 generic full. petermayo rebooted the bot.
- 02:24 PST - x86-mario canary failure due to HWTest. crosbug.com/27287
06 Mar 2012 Tue - 11 PST - amd64-generic full hit crosbug.com/25618
- 6 PST - tegra2-full complaining of disk full
- Periodic crosbug.com/26646
05 Mar 2012 Mon - 9:43 PST - chromiumos-sdk hitting link errors in SDKTest; marcheu and zbehan got it fixed.
- Periodic crosbug.com/26646 all weekend.
01 Mar 2012 Thurs - 04:00; llvm change landed tightening const strictness, breaking stumpy/lumpy canaries. https://gerrit.chromium.org/gerrit/#17133. Revert chumped in, canaries restarted manually.
- 10:14 PST - ScreenLocker smoke test failing. keybuk says the test was relying on a bug. http://crosbug.com/27146. Reopened for now.
- More chrome flakiness on internal builders - crosbug.com/26646
- 4:30 PST - x86 chrome PFQ failing to emerge chromeos-base/chromeos-0.0.1-r153, trying clobber build.
28 Feb 2012 Tue kliegs (EST), davidjames (PST), nsanders (PST) - Paladin bots aren't closing the tree - crosbug.com/27060
- 8:00 PST - chrome buildspec fixed (was missing new chromite dependency whitelist). Chrome PFQs running now to roll ebuild. Finished at 9:45 AM and canaries were kicked off manually. Change was picked up by canaries and they went green.
- 7:39 PST - Canaries still red; the Chrome 1055 buildspec was unsuccessful, so there is no new chromeos-chrome ebuild. This means vapier's revert was not picked up, so unused variables are still creating errors on the canaries. Leaving tree throttled; have pinged Chrome sheriffs and PMs to help resolve.
27 Feb 2012 Mon
26 Feb 2012 Sun - 6:40pm, ferringb reopens for http://crosbug.com/26646 flake on mario-incremental.
- 5:43pm, vapier reopens, files http://crosbug.com/26886
25 Feb 2012 Sat - 11:08: All internal canaries are taken out by http://crosbug.com/26886. Tree remains closed until Sunday.
24 Feb 2012 Fri dtu (MTV), mkrebs (MTV) - 10:48 PST - All full and incremental builders, along with chromiumos on the Chrome waterfall, fail due to path conflict in DEPS. rcui reverts.
- 12:26 PST - x86-alex_he canary closes on chromium-os:26646. mkrebs reopens.
23 Feb 2012 Thu - 14:06 PST - Builder could not find "configure" executable on x86 generic full, arm generic full, tegra2 full, and tegra2 seaboard full. Fixed by zbehan.
- 18:44 PST - Bug 26646 hit x86-mario incremental.
22 Feb 2012 Wed - 12:22 PST - ExtensionTerminalPrivateApiTest.TerminalTest failed in Linux ChromeOS Tester. Possibly fixed, so no bug filed.
- 14:38 PST - Chromium-OS:26646
18 Feb 2012 Sat
17 Feb 2012 Fri
16 Feb 2012 Thur
15 Feb 2012 Wed
14 Feb 2012 Tue dianders (MTV), ferringb (MTV), clchiou (TPE), tammo (TPE) - overnight - bvt failure in zgb (couldn't access www.google.com). Interim failure?
- overnight - x86-mario canary failure => http://crosbug.com/26347 according to tree status history; clchiou has throttled because of this
- 8:35am - tree was open/green when dianders got in (ellyjones opened)
- 11:00am - x86 pineview full: ferringb IDed as an instance of http://code.google.com/p/chromium-os/issues/detail?id=21559
- dianders: nope. Actually http://crosbug.com/24935; I put a pling in that bug. Still agree that it shouldn't close tree, since it's not a new issue.
- 12:47p - lumpy canary: Another instance of http://crosbug.com/26347. Inserted pling and bumped to P1. dianders: kicked lumpy canary build (1:15p) when I noticed that it wouldn't retry for a while.
- 2:00p - dianders noticed that chromium.chromiumos build was broken (though no email).
- Filed: http://crosbug.com/26377
- Didn't close tree, since this doesn't appear to be a treecloser.
- 2:16p - gcc 4.6.2 (CL 15461) landed at 1:58p without a required dependency keyworded breaking amd64 i7 full, reverted via CL 15845.
- 3:49p - x86 generic incremental: vmtest, Actually http://crosbug.com/24935; time to escalate?
- 4:31p - x86 generic incremental: vmtest, http://crosbug.com/21559 (supplied_chrome crash). Added note to the bug, including a little bit of debugging. Not sure there's much we can do here.
- 6:22p - x86 zgb-he canary, update delta again: http://crosbug.com/24280.
- 6:31p - lumpy, cros_au_test_harness.py timeouts: http://crosbug.com/24crosbug.com/26347280.
13 Feb 2012 Mon msb, mkrebs (MTV), kamrik (EST)
10 Feb 2012 Fri
msb, mkrebs (MTV), kamrik (EST) - x86 canary bots are all red as of 2230 PDT last night - filed as http://crosbug.com/26168
- full buildbots have been running erratically for a week - e.g., the last run of x86-generic-full was on 2012-02-08 and the last run of arm-generic-full was on 2012-02-08 (with the previous run on 2012-02-01!)
- Tree closed at 2012-02-09 2244 PDT, reopened by ellyjones at 0628 PDT, reclosed by ellyjones at 0730 PDT. Still closed as of 0741 PDT.
- After much wailing and gnashing of teeth, ellyjones, kliegs and zbehan track the problem down to the binutils-2.21-r3 image produced by the chromiumos-sdk bot; if the same package is compiled locally (from the same git commit-id), the failure disappears. Tree still closed as of 0828 PDT.
- vapier points out that the act of rebuilding switches you back to bfd instead of gold, thus hiding the problem from earlier tests; back to square one
- CLs 15340 and 15176 were reverted and the SDK builder fired to rebuild the SDK. The new SDK works (verified around 12:40 PST)
- The problem was due to the -frecord-gcc-switches flag and the way gold gets linked with glibc, which is linked with GNU ld.
- supplied_chrome sig 11 in security_ProfilePermissions.login again
- crosbug.com/25742
- http://chromegw/i/chromeos/builders/lumpy64%20PFQ/builds/534/steps/VMTest%20%5Blumpy64%5D/logs/stdio
9 Feb 2012 Thu
sque, dkrahn (MTV) - Nightly chrome PFQ for amd64-corei7 failed causing ToT canary builders to continue to use chrome 19.0.1034.0 and fail again at 4:30am. Chrome pinned to 19.0.1033.0_rc-r1 to allow ToT canary builders at 10:30am to cycle green. Issue crbug.com/113475 has been logged and is assigned. When it is fixed and all nightly chrome PFQs cycle green chrome needs to be unpinned.
- Alex PFQ: supplied_chrome sig 11 in suite_Smoke/login_LoginSuccess.default
- x86-generic PFQ: "Unhandled IOError: close() called during concurrent operation on the same file object."
9 Feb 2012 Thu
djkurtz (TPE) - tegra2 kaen PFQ fails to build uboot (reverted by ferringb; original issue investigated by clchiou)
- supplied_chrome sig 11 in suite_Smoke/login_CryptohomeUnmounted
- CHUMP caught a CL pushed past CQ that broke the tree, reverted by vapier
8 Feb 2012 Wed
sque, dkrahn (MTV) - repo sync issue, should have been reverted by ferringb.
- Chrome nightly PFQ failed, should check 4:30am build, should have new build spec by then.
- lumpy64 canary failed in chrome build. There was a change to a language .xtb file that broke Chrome and was reverted, but the revert was not pulled into Chrome OS.
- amd corei7 nightly chrome PFQ: Chrome build failure, likely same cause as above.
6 Feb - 7 Feb 2012 Mon/Tue dparker, wfrichar (MTV)
31 Jan - 1 Feb 2012 Tue/Wed jrbarnette, dgarrett (MTV), falken (Tokyo)
27-30 Jan 2012 Fri/Sat/Sun/Mon ellyjones, dennisjeffrey, vbendeb
25-26 Jan 2012 Wed/Thu - Lumpy64 and Pineview archive failures found to be a problem with dumpsym choking on files of the wrong architecture, filed as http://crosbug.com/25496. Mkrebs working on a fix as a p0 item, so tree reopened.
- Lumpy PFQ failure due to chrome aura crash: http://crosbug.com/25454 (Build log: http://chromegw/i/chromeos/builders/lumpy%20PFQ/builds/595). ChromeOS tree reopened since this is a chrome aura issue.
- Same aura crash also believed to be behind stumpy PFQ failure: http://chromegw/i/chromeos/builders/stumpy%20PFQ/builds/603
- Filed bug 25467 for Lumpy canary failure with cmt_drv.so and gobi.so
- Filed bug 25468 for dump_syms problem (ERROR : Unable to dump symbols for /build/x86-pineview/usr/lib/debug/boot/vmlinux:
dump_syms: src/common/linux/dump_symbols.cc:169: const Elf32_Shdr* {anonymous}::FindSectionByName(const char*, const Elf32_Shdr*, const Elf32_Shdr*, int): Assertion `nsection > 0' failed.)
23-24 Jan 2012 Mon/Tue - Chrome PFQ was still red at the beginning of the shift. It was red since last Thursday, meaning chrome would not roll and pick up critical fixes, holding up dev channel builds
- Aura has started to flake - filed crosbug.com/25454
19-20 Jan 2012, Thu/Fri anush, puneetster, miletus
17 Jan 2012, Tuesday
nirnimesh, sosa, josephsih - Out of disk space on a bot - issue 110480 filed for long term fix.
13 Jan 2012, Friday jennyz, achuith, zvorygin
12 Jan 2012, A Rainy Thursday jamescook, keybuk, vapier (east coast)
11 Jan 2012, A Wednesday jamescook, keybuk, vapier (east coast) - Hit kvm ssh timeout again crosbug.com/24280
- build_image failure with "mount: you must specify the filesystem type" crosbug.com/24975
- x86-alex & tegra2_seaboard toolchain master bots dead due to sync error -> trooper reset them
- x86-alex 0.11.241.B factory & pre-flight bots dead for a while crosbug.com/24983
- x86-zgb release factory-980.B bot has been down for a while crosbug.com/24971
- google-breakpad failed its unit tests crosbug.com/24982
- new dev-libs/glib pkg failed in toolchain fortify smoketest; dev was informed of CQ usage and multiple CL's landed to resolve
10 Jan 2012, Tuesday davidjames, tbarzic - Saw a couple shutdown crashes in Chrome:
1/9/2012, Monday davidjames, tbarzic, jglasgow (east coast) - Tree still throttled; Looking for jhorwich to organize a sheriff summit.
- Found ZGB PFQ reporting errors (http://chromegw/i/chromeos/builders/zgb%20PFQ/builds/193).
"Found nothing new to build, trying again later. If this is a PFQ, then you should have forced the master, which runs cbuildbot_master Found no work to do." Tried to force a build via the web page, but not sure if that is what the error message means.
- Found TOT PFQ has not run since Friday, despite a long (28) queue of changes. http://chromegw/i/chromeos/builders/TOT%20Pre-Flight%20Queue. Filing bug. Also observed that the ToT CQ has 749 pending requests. No troopers on IRC or responding to email.
1/7/2012, Saturday jhorwich, tlambert, jglasgow (east coast) - Tree was throttled; it's possible to push things past the PFQ with "Publish and Submit" after verify.
- You may need to remove gerrit as a reviewer to do this, since it will -2 you on the PFQ.
- If future sheriffs use this as a workaround for the VMTest problem, keep the following in mind:
- You need to watch the tree for non-VMTest failures.
- Consider "Publish and Submit" for CLs that were rejected by PFQ failures involving VMTest if and only if the failures are shutdown related; this allows people to make forward progress towards deadlines despite the VMTest issue.
- Watching is no more onerous than reopening the tree every 27-35 minutes because of VMTest barfing (this was my Thu night and Fri day).
- jhorwich has the ability to get core dumps now; I'm talking to Randall Monday about making this a crossystem option. I honestly believe that he is a victim of VMTest in this, like the rest of us, and that we need to examine the (non)role of unit tests in diagnosing the test framework vis a vis tree closures:
- In case that's not clear, let me bluntly say that a chrome failure from a passed test should not be a tree closer.
- If we want to test Chrome fragility on shutdown, that should be a separate test; to my mind it would be of dubious value:
- Chrome crashes -> restart Chrome -> gaia login
- Chrome doesn't crash -> restart Chrome -> gaia login
- We need a sheriff's summit; if no one else calls one next week, I will.
1/6/2012, Friday
jhorwich, tlambert, jglasgow (east coast) - jglasgow: Found stumpy PFQ failing, filed crosbug.com/24790, decided to fix rather than revert since VMTest was holding tree closed anyway
- VMTest failures due to Chrome crashes are a huge problem, but better debugged by jhorwich and those with Chrome experience.
- Sorry to disagree here, but the VMTest failures are a meta-failure in the test infrastructure, and do not affect the validity of the test results. They are bugs, but they are bugs that should not result in tree closure.
- jglasgow: Filed crosbug.com/24795 for PFQ uprev failures
1/5/2012, Thursday
jhorwich, tlambert, jglasgow (east coast) - Filed a tree closer crosbug.com/24733 because the platform_ToolchainOptions test was failing due to bluez. Thanks zbehan for helping to look at this. It is not clear what change caused this to start failing.
- Filed a tree closer crosbug.com/24760 because uboot was failing to build on tegra2. Thanks David James who pointed this out, and vpalatin who quickly grabbed the bug. Lots of red on the tree due to chrome sig11 certainly affected the sheriff's ability to notice this -- but we should have been more vigilant in making sure we understood all the red builders.
- Chrome sig11 bug quite prevalent today. jhorwich noted 9 instances during MTV shift. Got a good stack trace on x86-alex canary build 1478, added to crosbug.com/19204
- Only other closure during shift was a straightforward build breakage (gerrit 13738) which was reverted
- jhorwich reproduced a chrome sig11 on local VM, is going to attempt to debug root cause Friday
- tlambert reopened over the sig11; mostly jhorwich was faster
- Added entry to the Sheriffs FAQ
- we need to update the builder/closer list
- temporary link to "all" for when you can't find the builder
12/28/2011, Wednesday sonnyrao, mtennant - Noticed that sheriffs cannot push through gerrit with red tree. Filed crosbug.com/24630.
- Starting around 2pm started to see Chrome sig 11 crashes on internal zgb PFQ and TOT PFQ. Filed crosbug.com/24646.
- Most Chrome sig 11 core files were unusable, but identified one as crosbug.com/19204. Re-opened bug.
12/27/2011, Tuesday derat, dlaurie, nkostylev
12/26/2011, Monday derat, dlaurie, nkostylev - transient x86-alex canary vmtest "Timed out waiting for login prompt" (crosbug.com/23199)
12/22/2011, Thursday cwolfe - transient alex PFQ vmtest "Timed out waiting for login prompt" (crosbug.com/23199)
- transient link PFQ vmtest sig 11 on login_BadAuthentication (didn't find a bug; should be one already)
- transient x86-pineview-pull svn error with webrtc (didn't find a bug; should be one already)
- Test failures on Chromium.ChromiumOS Linux ChromeOS Aura (crbug.com/108434 and crbug.com/108436)
12/21/2011, Wednesday marcheu, quiche - Another occurrence of crosbug.com/23413 (on x86 pineview full)
- Build failure in chromiumos sdk. Due to race condition in groff ebuild. (crosbug.com/24481).
- workaround by marcheu
- fixed by vapier
- Build failure in x86 generic commit queue. Due to missing sandbox exception for fontconfig. (crosbug.com/24488, fixed by vapier)
- Build failure in Chromium.ChromiumOS (aura). Fixed in ToT chrome.
- Build failure on x86 generic PFQ (due to https://gerrit.chromium.org/gerrit/13273). ferringb reverted.
- Build failure on tegra2_kaen-aura canary (due to https://gerrit.chromium.org/gerrit/13216). marcheu reverted.
- Another occurrence of crosbug.com/23413 (on zgb PFQ)
12/20/2011, Tuesday rharrison, kamrik shadowing - Came onto a red tree, due to Stumpy PFQ being forced directly instead of TOT PFQ
- Filing a bug about the error message not being descriptive enough, crosbug.com/24421
- Created a CL https://gerrit.chromium.org/gerrit/13235 to make error message more descriptive
- Kicked the TOT PFQ and reopened the tree
- VMTest Failure on amd full generic, created crosbug.com/24422
- Looked into crosbug.com/22577
- Pinged nkostylev, altimofeev, and flackr to make sure it was being looked at
- Approved CL for altimofeev to change test run order to try to get more information
- Another occurrence of crosbug.com/23199 (on link PFQ)
12/15/2011, Thursday ihf, gpike - Hit: chrome crash in suite_Smoke/desktopui_ScreenLocker. crosbug.com/22577
- One case of: chrome crash in suite_Smoke/security_ProfilePermissions.login. crosbug.com/23258
- The 3AM build of x86-zgb_he full release-R17-1412.B Build #21 failed: FAIL Archive (1:35:59) with BackgroundException.
- The previous one (#20) failed in VMTest stage while unzipping the image.
- ... and before that, #19 failed in VMTest stage during ../platform/crostestutils/generate_test_payloads/cros_generate_test_payloads.py
- The 3AM build of x86-mario full release-R17-1412.B Build #21 also failed during ../platform/crostestutils/generate_test_payloads/cros_generate_test_payloads.py. The first problem may have been: "mount: you must specify the filesystem type"
12/14/2011, Wednesday ihf, gpike - Hit issues of: chrome crash in suite_Smoke/desktopui_ScreenLocker. crosbug.com/22577
- Hit another issue of: VMTest ERROR: Test that updates to itself. crosbug.com/20427
- Problems on alex_he canary with recovery and vmlinuz images. Filed crosbug.com/24242.
- chromiumos sdk broken: xmlrpc-c-1.18.02: curlmulti.c: curl/types.h missing. Filed crosbug.com/24235.
12/12/2011, Monday glotov - stumpy-canary link error in power_manager, cannot reproduce locally. Clobber does not help either. Filed crosbug.com/24091.
- Lumpy-binary fails on building chromeos-u-boot-0.0.1-r336: boot_kernel.c:206:26: error: 'CHROMEOS_BOOTARGS' undeclared. Filed crosbug.com/24136.
12/10/2011, Saturday
cwolfe (drive-by, times unknown)
- ARM release bots attempting to run vm_tests. Same as (crosbug.com/21536). Probably from gerrit/12702, e-mailed rcui
- Widespread build errors on pepper-flash "HTTP Error 403: User Rate Limit Exceeded" (crosbug.com/23511)
- stumpy canary link error in power_manager; cannot reproduce, probably just needs a clobber after the 403 clears up
- Still some VMTest problems
12/9/2011, Friday
rspangler, chocobo, jglasgow, ellyjones
- 1130 PST: VMTest chrome-static crashes timed out (crosbug.com/21559)
- 1255 PST: VMTest timed out (crosbug.com/23413)
- 1300 PST: VMTest login timeout (crosbug.com/23199)
- 1405 PST: VMTest failure (crosbug.com/20427)
- 1530 PST: VMTest flakiness (crosbug.com/23778)
- 1600 PST: And more VMTest problems (crosbug.com/22577)
12/8/2011, Thursday
rspangler, chocobo, jglasgow, ellyjones
12/7/2011, Wednesday
thutt, thieule, yusukes
- 1033 PST: Canary builders broke (crosbug.com/23882), this was fixed and canary builder subsequently passed. Some builders used dash instead of bash.
- 1041 PST: Aura Chrome PFQ incorrectly configured, petermayo is working on a fix
- 1127 PST: Chrome crashed during VMTest (crosbug.com/23884)
- 1404 PST: Autotest client terminated unexpectedly (crosbug.com/20427), this could be related to crosbug.com/22333?
- 1717 PST: Chrome crashed during VMTest (crosbug.com/23884)
- 1729 PST: Chrome crashed during VMTest (crosbug.com/22577)
12/6/2011, Tuesday
thutt, thieule, yusukes
- 1059 PST: AU VM Test failure (crosbug.com/22333)
- 1140 PST: Timed out waiting for login prompt (crosbug.com/23199)
- 1415 PST: Chrome SEGV (crosbug.com/23675)
- 1722 PST: Timed out waiting for login prompt (crosbug.com/23199)
- 1740 PST: Commit queue hung and was restarted (crosbug.com/23864)
11/28/2011, Monday
ers, sleffler, stevenjb
- BVT failures for zgb are chrome sig 11's that appear unrelated but the dump logs are zero length
- BVT failures for mario sig 11 in synTPenh
- 1146 EST: looks like http://crosbug.com/23199 occurrences are still closing the tree frequently
- All three chrome PFQ builds had failed with "Clear and Clone chromite" errors (couldn't find branch named 'release'). A forced build on the arm generic chrome PFQ resulted in success, so I reopened the tree and forced builds on the other chrome PFQ bots.
11/10/2011, Thursday
- 0800 PST: 500 internal server error uploading prebuilt. Bug filed: http://crosbug.com/22804
- 0900 PST: x86 PFQ failure in autotest due to pygtk rev. pygobject was updated and PFQ clobbered.
- 1100 PST: TOT PFQ failure in autotest due to pygtk/pygobject rev. Next build was successful.
- 1500 PST: Adobe pulls all Linux Flash 10 binaries. Bastards. http://crosbug.com/22837 I updated the adobe-flash ebuild to use flash11.
- 1700 PST: VMTest failures due to broken flash, my ebuild did not install into correct directories...
NOTE: autotest/pygtk/pygobject failures were related to python ebuild from a couple days ago
11/3/2011, Thursday
- 0700 PST: Chrome build was broken early in the morning. We kept the ChromeOS tree open. Resolved around 11:30.
- Red canaries were expected to run overnight, but they are still red Friday morning.
11/2/2011, Wednesday
- 1356 PST: tegra2_seaboard-tangent-binary failed with a too large u-boot image. reinauer fixed this.
- 1415 PST: transient VMTest failures from 2:15 to around 3:00.
Still open:
- Want to understand how to make sure we get chrome stack crawls. WIP at: <http://crosbug.com/21559> and <http://crosbug.com/22047>, which I think are different bugs.
- amd64-generic-full still failing (less important).
- BVT tests getting Synaptics sig 11s (http://crosbug.com/13377) and chrome sig 11s (not too surprising given the ones we see below).
10/28/2011, Friday
- 1625 PST: chromium.chromiumos failure. VMTest stage timed out after 9000 seconds. ericroman reverted a webkit roll.
- 1620 PST: reopened tree
- 1540 PST: restarted most internal builders. (restarted any builder that had a build fail due to the network issue; did not restart builders that were idle at the time of network failure)
- 1530 PST: network issues resolved
10/27/2011, Thursday
- 2150 PST: network issues causing failure on internal builders (crosbug.com/22216)
- 1718 PST: chromium.chromiumos closes due to gclient sync failure on chromeos-chrome. ericroman reopens.
10/26/2011, Wednesday
- 0922 EST: VMTest Failed due to not being able to access update server
- Bug filed by petermayo as http://code.google.com/p/chromium-os/issues/detail?id=22111
- Reopened, since it only occurred for one bot
- 1443 EST: Tree closes because of failure to fetch webkit from svn.webkit.org. Not supposed to happen. Is crosbug.com/17959
- 1529 EST: Tree closes because of a build failure in chromium's chromium. Not supposed to happen; sosa and petermayo are fixing this.
- 1643 EST Another instance of crosbug.com/17959 from svn.webkit.org.
10/25/2011, Tuesday
- 1022 EST: Failures due to issues with cros_run_vm_test from http://gerrit.chromium.org/gerrit/#change,10599 . Reverted as http://gerrit.chromium.org/gerrit/#change,10647
- Multiple re-closures due to slow builds hitting this issue after the revert
10/24/2011, Monday
- 5:34p: Mosys ebuild failure. Reverted here: http://gerrit.chromium.org/gerrit/10605
- 5:12p: Another flaky sig11 in ChromiumOS (x86) (chromium.chromiumos). Haven't investigated, but it went away.
- 4:51p: Another flaky sig11 in alex-binary. http://crosbug.com/21559
- 3:15p (dianders): ChromiumOS (x86) (chromium.chromiumos) build failed. 2 issues:
- http://crosbug.com/22025
- First sig11 didn't give a stack crawl. Seems to be a different problem than http://crosbug.com/21559 (??)
- 2:31p (dianders): Stumpy canary 493 fails. Different than 492, but probably also flaky. http://crosbug.com/22019 filed.
- Early afternoon (dianders): Digging into overnight BVT failures. 2 of them thought to be another instance of http://crosbug.com/13377
- Morning (dianders): Digging into stumpy 492. Filed http://crosbug.com/22005 w/ info. Going to see what happens w/ 493.
- 9:45a PT: Started (West coast) day with:
- Tree opened with http://crosbug.com/21624 caveat (though already fixed). Kicked binary builders to try bugfix.
- Linux ChromeOS build failing (http://build.chromium.org/p/chromium.chromiumos/waterfall?builder=Linux%20ChromeOS). Looks like a flaky build. http://crbug.com/100538 seemed to be talking about this test, so added a comment.
- Chromium OS SDK looks like it's still probably broken. http://crosbug.com/21973
- Several emails about BVT failures
- Last stumpy canary (492) was a failed one.
- security_ProfilePermissions.login ERROR: Unhandled JSONInterfaceError: Automation call {'username': 'performancetestaccount@gmail. ...
10/21/2011, Friday
See also this doc: https://docs.google.com/a/google.com/document/d/17eHo0cN9gOEcdH43AQujNIKosRFjaHHeVdJyZ6jYJYY/edit
- 4:00pm PT: vpalatin points out that the (less important) amd64-generic-full is failing. http://crosbug.com/21970
- 3:16pm PT: Failure w/ shflags and testUpdateKeepStateful (http://crosbug.com/21966).
- 3:16pm PT: fix to chromium.chromeos waterfall <http://gerrit.chromium.org/gerrit/10526>
- 2:14pm PT: fix to crosbug.com/21945 is pushed.
- 2:14pm PT: another instance of 21945
- 2:14pm PT: another kernel build failure (same problem--revert hasn't made it everywhere).
- 1:30pm PT: ...kernel build failure again (another case fixed by revert below)
- 1:00pm PT: kernel build failure; fix by reverting -> http://gerrit.chromium.org/gerrit/10509
- 10:38am PT: stumpy canary failure attributed to Bigstore; reportedly a power event in the data center.
- 10:23am PT: ellyjones reports kernel failure; fix: -> http://gerrit.chromium.org/gerrit/10497
- 9:00am PT: Started the (West coast) day with
- http://crosbug.com/21945
- Chrome SEGV failures lumped under http://crosbug.com/21559
- Chrome PFQs all down
- chromium.chromeos broken (and has been for several days).
10/20/2011, Thursday; 10/19/2011, Wednesday
10/17/2011, Monday
dgozman
- 9:40am. Tegra build fails. http://crosbug.com/21751.
10/12/2011, Wednesday
olege, semenzato, gpike
- 9pm: getting lots of chrome sig 11 during vmtests. Cause unknown.
- 5:32pm: webkit.2011101101.patch needs update. Updated http://crosbug.com/21624, got petermayo to work on a fix; restarted 3 bots after fix was done
- 5:15pm: cmasone kindly fixed a crash reporter bug introduced this morning.
- 4:43pm: hit 19204 again
- 4:16pm: webkit.2011101101.patch needs update. Opened http://crosbug.com/21624
- 2:21pm: autoupdate vmtest failed. Under psychological pressure, Chris Sosa admitted seeing this before. Opened http://crosbug.com/21610
- 9am: disabled desktopui_UrlFetch.not-live, thereby sweeping http://crosbug.com/21566 under the rug.
- sheriffs could not submit a change bypassing the commit queue. Chris Sosa fixed this.
- afternoon: svn checkout for chromeos-chrome failed again. Opened http://crosbug.com/21598.
- 8am. "arm generic full" failed on BuildTarget. svn checkout failed during building chromeos-chrome. Built fine on the next try.
- 0am - 8am. Multiple occurrences of http://crosbug.com/21566.
10/11/2011, Tuesday
olege, semenzato, gpike
times in PST unless otherwise marked
- 2:30pm: 21517 is fixed (xorg.conf missing in arm builds). This was making the arm canaries red.
- 12:20pm. Arm build broken by change 55311 at 11:45, fixed by change 55319 at 12:40.
- 11am. Another occurrence of http://crosbug.com/21402, assertion failure in google breakpad.
- 10:30am. http://crosbug.com/19204 happened twice. Raised priority and reopened.
- 8am. Oleg reverted a change apparently responsible for vmtest failure on stumpy. (http://gerrit.chromium.org/gerrit/#change,9841)
10/6/2011, Thursday
Sheriffs: derat, stevenjb
10/5/2011, Wednesday
Sheriffs: derat, stevenjb
Tree started closed with two issues: