Meeting 2016 02 08

Lars Bergstrom edited this page Feb 8, 2016 · 1 revision

Agenda items

  • Changes to Linux automation builders landed (larsberg)
  • Requesting comments on proposed automation changes for new platforms (larsberg)
  • jsup status (jack, till)
  • intermittents (mbrubeck)

Attending

  • jdm, edunham, ajeffrey, till, manish, ms2ger, frewsxcv, simonsapin, jack, mbrubeck, azita, larsberg, bholley

Last week

CI bits

  • larsberg: We now have dedicated Linux machines. We still have intermittent test failures, but we should no longer be losing machines due to network or EC2 latent builder issues.
  • larsberg: Also, please comment on https://github.com/servo/saltfs/issues/215 ! We're looking into ways to scale up our infra without adding more fragile buildbot builders and would like your feedback before we do the work.

JS spidermonkey upgrade

  • jack: People are curious where we're at.
  • till: Haven't looked at it since early December. Doesn't make sense to land it before I get promises in (https://bugzilla.mozilla.org/show_bug.cgi?id=911216). After I land them, that's the next task in the queue. First, getting the update landed, then more automation to make updates less painful.
  • larsberg: Is this the stuff we discussed in Orlando? Auto-publish nightly, etc.?
  • till: Yes! It would be great to be on a more equal footing with Gecko. Just a test that checks that our bindings layer compiles with the changes we make in jsapi would get us a substantial part of the way. We'd still be broken by API changes that come with semantic changes (if we don't add tests), but that's a much more complicated problem: getting our integration tests onto the try server.
  • till: We talked about creating the JSAPI C++ abstractions while compiling on the m-c try server infra. Then, just like we put firefox and js shell nightly builds up, we'd do that with the SM bindings, too. We have everything we need in place there, including the LLVM version we needed to run the rust-bindgen bits. We need to figure out if we can create them all on one platform. I talked with mwu about it briefly, and we should, but otherwise we will have to build them on each platform. Doable, but more work.

Intermittent test failures

  • larsberg: Is there anything we can do to help?
  • mbrubeck: This week I will be doing a bunch of runs with rr to see if we can get anything useful. If people have approaches to suggest or try, that would be welcome! Also, Keith (kjchang) disabled some of the worst offenders, which will help us scrape along until we figure it out.
  • larsberg: I can set up a regression machine to help identify them.
  • bholley: Could we always run tests under rr? Is that feasible?
  • larsberg: I don't think we have anything that would prevent it.
  • mbrubeck: We could try doing full test suite runs under rr! One unfortunate thing is that some of the intermittents go away under rr. It's good for green test runs, but could mask race conditions. It probably changes the timing.
  • bholley: Does a run without rr map more accurately to the user, or is it a slightly different set of noise? It would be great to have this in place if it works!
  • mbrubeck: I'll make that a goal of my work.
  • simonsapin: I think rr runs programs on a single thread, which will mask a lot of Servo race conditions.
  • jdm: Single CPU.
  • manish: Rust should prevent anything that would require simultaneity.
  • bholley: Maybe have a separate platform rr run? With the rr oranges, we could fix it!
  • jdm: It would also help with bad runs that have zero Rust backtrace but an actual segfault, since we don't have minidumps.
  • manish: If data races are a problem, I think you can run tsan on Rust.
  • larsberg: I suspect rr will catch most of our bugs... incremental layout and concurrency-related issues are our two biggest issues.
  • jack: Does disabling incremental layout help?
  • mbrubeck: We could disable it for just certain tests that fail on it.
  • larsberg: I like that better - makes the incremental layout-fragile tests explicit in the files.
  • larsberg: Does rr run on OSX?
  • bholley: Linux only, for the foreseeable future. roc might be able to get it to work on Mac, but Windows is a LOT more effort.
  • till: cjones wrote a blog post with an enumeration of the questions that we'd need to answer to port rr to OSX: https://joneschrisg.wordpress.com/2015/01/29/rr-on-os-x-can-it-be-ported/
  • bholley: Big thing recently is that they fixed the bug with vmware, which lets it run inside of a virtual machine.
  • edunham: Would throwing in a slow linux builder help find the intermittents on mac?
  • larsberg: I suspect we have a small number of underlying bugs, and the hardware configuration just changes which tests they affect.
  • bholley: Does buildbot finish the builds after the first failure?
  • mbrubeck: I think so. The results are all there in the buildbot logs, but homu just doesn't add a new GitHub comment, so we don't notice them unless we go looking.
  • jack: Maybe we should fix this in homu!
  • jdm: I filed it, and barosl replied: https://github.com/barosl/homu/issues/98 and http://logs.glob.uno/?c=mozilla%23servo&s=21+Oct+2015&e=21+Oct+2015&h=barosl#c286809
  • jdm: Looks like we'll have to do it.