WeeklyTelcon_20170905

Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision

Open MPI Weekly Telcon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • Geoff Paulsen (IBM)
  • Artem (Mellanox)
  • Ralph Castain (Intel)
  • Howard
  • Brian Barrett
  • Todd Kordenbrock
  • Jeff Squyres (Cisco)
  • Joshua Ladd
  • Joshua Hursey
  • David Bernholdt
  • Thomas Naughton
  • Geoffroy Vallee
  • Nathan Hjelm

Agenda

Review v2.0.x Milestones v2.0.4

  • Nothing new to report.

Review v2.x Milestones v2.1.2

  • v2.1.2
    • Howard put out an RC last week.
    • Wants to put out a new RC that includes some fixes.
    • PMIx 1.2 might have added a requirement for a newer version of automake
    • README update from Paul Hargrove.
    • Allinea is still seeing intermittent issues in the v2.1.2 release candidate (Issue 3660).

Review v3.0.x Milestones v3.0

  • Didn't get an RC out.
  • Discussed and agreed to PR4167
  • Josh Hursey's patch for dynamic components linking against their project library.
  • Someone reported that DBM is still not fully operational in v3.0.x. It was taken out of the known issues because we thought it was working; adding it back to the README.
  • Issue3299
    • Should fix it for v3.0.x, but don't hold v3.0.0 for it.
  • Schedule
    • Get a v3.0.0 beta out tomorrow morning.

Review Master Pull Requests

  • libpmix failed in linking against libopal.
    • atomic opal_atomic - 64bit in 32bit? In PMIX?
    • Nathan merged in something on Sept 1st.

MTT / Jenkins Testing

  • NFS path testing - Annoying check in opal, and associated test.
    • The intent is to ensure that shared memory filesystem components are not put into NFS locations.
    • Failed on Cray. Fragile test. Let's either fix the test to be more portable and touch fewer filesystem things, or remove it.
    • Probably doesn't belong in make check.
    • A developer might want to look at this, but a user probably doesn't want to do this.
    • Has performance implications, if shared memory components are put onto NFS.
    • We have had users who got bit by this, but Make Check might not have been their answer.
    • Brian will file an issue.
  • Brian is going to add -m32 builds to our test suite.
    • Not actual IA32 hardware, but close enough.
  • Brian tried some new ways of building the tarball, but it failed, so that work is delayed until Thanksgiving.
  • Root filesystem on webserver failed, because jenkins failed.
    • Increased the log rotation to prevent this, but the last Debian update overwrote the log rotation changes.
    • As a result, too many logs were being preserved.
    • The server tries to reach out to the MacOS Jenkins server, fails, and logs that failure. This happens continuously.
      • Will work with Nathan to have his server update jenkins, not jenkins reach out.
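
The NFS path concern above (shared memory backing files ending up on a network filesystem) can be illustrated with a small sketch. This is not the check in opal; it is a hypothetical Python illustration that assumes a Linux-style `/proc/mounts` table, and the set of filesystem types treated as "network" is an assumption made for the example. The function names are invented for illustration.

```python
import posixpath

# Filesystem types treated as "network" in this sketch.
# This set is an assumption for illustration, not Open MPI's list.
NETWORK_FS_TYPES = {"nfs", "nfs4", "lustre", "gpfs", "panfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device mount_point fs_type options dump pass
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for_path(path, mounts_text):
    """Return the fs type of the longest mount point containing an absolute path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, fs_type in parse_mounts(mounts_text):
        mp = posixpath.normpath(mount_point)
        contains = path == mp or mp == "/" or path.startswith(mp + "/")
        # ">=" lets a later mount on the same mount point shadow an earlier one.
        if contains and (best is None or len(mp) >= len(best[0])):
            best = (mp, fs_type)
    return best[1] if best else None


def is_on_network_fs(path, mounts_text):
    """True if the path resolves to one of the assumed network filesystem types."""
    return fs_type_for_path(path, mounts_text) in NETWORK_FS_TYPES
```

On a Linux host, the mount table could be read with `open("/proc/mounts").read()` and passed in as `mounts_text`. Keeping the parsing separate from the file read means the logic can be exercised without touching the real filesystem, which is the kind of portability the discussion above asks for. The sketch ignores escaped characters in mount points (such as `\040` for spaces) and does not resolve symlinks.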

This week Discussion Points.

  • Schedule for NEXT v3.1 release (Branch and Ship)

    • Yes, there are features in master that warrant a fall release.
    • If we shoot for mid-October, branching now would give 6 weeks.
    • It might look better if we branch chronologically right after we post v3.0.0.
    • Go/NoGo on making a branch.
      • If you need something for v3.1.0 please bring a list of potential white-list items next week.
      • We'd like to get rid of the concept of white-listing features.
      • v3.0.0 was exceptional since we needed some PMIx items.
    • Brian will send email out to devel about this proposal.
  • We would like to enable -Werror. We've gotten sloppy.

    • Could target this for v3.1.0. (or later, need to discuss).
    • It would be tricky to do this for v3.1 if we're targeting branching next week.
    • Keep on agenda for next face to face, but that's not until 2018, so might be good to discuss before then.
  • Running Open MPI in a Container Discussion

    • Brian is going to write up a proposal and send it around to discuss.
  • Next face-to-face meeting

    • Jan / Feb
    • Possible locations: San Jose, Portland, Albuquerque, Dallas

Status Updates:

Status Update Rotation

  1. Mellanox, Sandia, Intel
  2. LANL, Houston, IBM, Fujitsu
  3. Amazon,
  4. Cisco, ORNL, UTK, NVIDIA
