WeeklyTelcon_20220315

Geoffrey Paulsen edited this page Mar 22, 2022 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Austen Lauria (IBM)
  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (AWS)
  • Christoph Niethammer (HLRS)
  • David Bernhold (ORNL)
  • Geoffrey Paulsen (IBM)
  • Harumi Kuno (HPE)
  • Hessam Mirsadeghi (UCX/nVidia)
  • Howard Pritchard (LANL)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart
  • Matthew Dosanjh (Sandia)
  • Todd Kordenbrock (Sandia)
  • Tomislav Janjusic (nVidia)
  • William Zhang (AWS)

not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (nVidia)
  • Aurelien Bouteiller (UTK)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Edgar Gabriel (UoH)
  • Erik Zeiske
  • Geoffroy Vallee (ARM)
  • George Bosilca (UTK)
  • Josh Hursey (IBM)
  • Joshua Ladd (nVidia)
  • Marisa Roman (Cornelis Networks)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Michael Heinz (Cornelis Networks)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Raghu Raja (AWS)
  • Ralph Castain (Intel)
  • Sam Gutierrez (LLNL)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Thomas Naughton (ORNL)
  • Xin Zhao (nVidia)

4.0.x

  • Schedule: No schedule for v4.0.8 yet
  • Winding down v4.0.x; releases will stop after v5.0.x is out.
  • Really only want small changes reported by users.
  • New docs landing page on Read the Docs will say that the docs for this series are NOT hosted there.

v4.1.x

  • Schedule: Shooting for v4.1.3 end of March/Q1.
    • Goal: v4.1.3rc2 today.
  • No other update.
  • New docs landing page on Read the Docs will say that the docs for this series are NOT hosted there.
  • Fortran Elemental - https://github.com/open-mpi/ompi/pull/10119
    • New MPI Errata
    • Better for users, and does NOT affect ABI.
    • Only in v4.1.3 and v5.0.0+

Read the Docs

  • Merged to master last night.

  • Jeff sent an email, and will put its contents either into the wiki or into the docs themselves.

  • Jeff shared https://docs.open-mpi.org/

    • This will be for v5.0.0 and later
    • Links to older docs for v4.1.x and earlier
  • Also a mobile rendering

  • Think the docs configury is done.

    • If issues, slack or email devel
  • Tons of content is ready, but lots of places still need work.

  • Thanks to Harumi for converting all of the man-pages

    • They look great!
    • They are cross-referenced now.
  • For developers: when you git clone, you'll now get a docs/ directory.

    • There's an RST guide under the developer's guide.
  • Now when you push a PR, there's a Details link next to the Read the Docs CI check on your PR, which lets you preview the docs built from that PR.

  • This is true for master and will be in a few weeks for v5.0.x release branches.

  • Going to let this soak a while on master, but hope to bring to v5.0.x after a bit more testing on master.

  • Read the Docs USES branch names, so this may be a driver to change master to main.

    • The branch name is in the URL, so we might want to do this sooner, before others cache URLs.
  • Official tarballs will have HTML and man pages pre-built.

  • Developers will need to install sphinx to generate html and man-pages.

    • Open MPI Developers guide has a page on how to install sphinx.
    • Uses sphinx-build, which (like Make) is stateful and only rebuilds what changed.
  • Just open build/index.html locally in a browser.

  • When you git clone, there is NO build directory. It can take 3-5 minutes to generate the build directory.

    • But if you want you can just rm -rf build.
  • When opening a code-change PR against master, it is good to include both doc updates AND code changes in the same PR,

    • But if you
  • What's the behavior if you don't have sphinx installed?

    • Configure will just skip building the docs.
    • BUT in this state, you won't be able to do make dist
  • Amazon is calling make dist in CI.

    • so CI should be covered.
    • This will test and fail CI if error in docs
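
The notes above say that configure skips the docs when Sphinx is missing, and that make dist does not work in that state. As a rough developer-side pre-check (not part of Open MPI's configure; the function name and messages are hypothetical), one could test whether sphinx-build is on PATH before attempting make dist:

```python
import shutil


def docs_build_status(tool="sphinx-build"):
    """Report whether the docs tool is on PATH.

    Hypothetical helper for illustration only; Open MPI's configure
    performs its own detection and this does not replicate it.
    Returns a (found, message) tuple.
    """
    path = shutil.which(tool)  # None when the executable is not on PATH
    if path is None:
        return (False, f"{tool} not found: docs are skipped and 'make dist' will not work")
    return (True, f"{tool} found at {path}: docs can be built")


if __name__ == "__main__":
    found, message = docs_build_status()
    print(message)
```

Running the script prints one line describing the current environment; it only inspects PATH and does not check the Sphinx version or any extensions the docs may need.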

v5.0.x

  • Schedule: v5.0.0rc4 Next week
  • Issues with --version and help file.
  • PBS build failure (also needs to be pulled into)
  • Need submodule pointer updates.
  • UCX/OSC close to PRing issue.
  • Could do cherry-pick of all the docs to v5.0.x
  • Need to continue working on docs
  • POSIX command line options with double-dashes, but also single dash -np for -n.
    • There's a PR in PRRTE to silently convert all of the single dash options to double dashes.
    • We do this conversion, then just call getopt_long().
      • No need for warning, since we don't think we want to drop single dash options EVER.
    • All of this code lives in the ompi schizo, so we can do this.
    • Do we document these single dash options?
      • Need to document some of them, because it'd be very weird to not document the MPI specified ones.
    • Could just have a single line saying we do this, since we hope to maintain this long term
    • If we do mpirun -n in docs... but then also say -np is a synonym for -n.
  • Brian's trying to get to the point where static builds work correctly with prte and pmix.
    • Had to rewrite the check_package macro (will port to OMPI) to use pkg-config properly.
    • If we have a pkg-config file, we will use that rather than grepping around for the information.
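
The single-dash discussion above (silently rewriting options such as -np to their double-dash form, then handing the result to a standard long-option parser) can be sketched as follows. This is only an illustration in Python, not the PRRTE or ompi schizo code, which is written in C; the function name and the set of option names are assumptions, and the sketch does not try to tell launcher options apart from application arguments beyond honoring a literal `--`.

```python
import getopt


def normalize_dashes(argv, known_long_options):
    """Rewrite single-dash multi-character options to double-dash form.

    Hypothetical helper: only names listed in known_long_options are
    rewritten, and everything after a literal "--" is left untouched.
    """
    result = []
    passthrough = False
    for arg in argv:
        if passthrough:
            result.append(arg)
        elif arg == "--":
            passthrough = True
            result.append(arg)
        elif arg.startswith("-") and not arg.startswith("--") and len(arg) > 2:
            name = arg[1:].split("=", 1)[0]  # strip the dash and any =value
            result.append("-" + arg if name in known_long_options else arg)
        else:
            result.append(arg)
    return result


# Illustrative option set; "np" comes from the notes, "host" is an assumption.
KNOWN = {"np", "host"}

args = normalize_dashes(["-np", "4", "./a.out"], KNOWN)
# Python's getopt stands in here for the C getopt_long() call.
opts, rest = getopt.getopt(args, "n:", ["np=", "host="])
# opts == [("--np", "4")], rest == ["./a.out"]
```

Because the rewrite is silent, users who type -np never see a warning, which matches the stated intent of keeping the single-dash forms indefinitely.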

Master

  • If you go to https://docs.open-mpi.org/ you'll notice that the name of the branch is in the URL.

    • We've long wanted to change our primary branch name to something less regressive.
    • We just haven't due to a number of logistical issues / timing.
    • But going forward, with the branch name in our URLs, other documents that link to us will also contain the branch name, so a later rename will be painful for others as well.
    • In the past, we could have added redirects... but with Read the Docs, redirects wouldn't be possible.
  • Today we just wanted to bring it up.

  • We will start talking about the mechanics of how to do this.

  • Earliest by next week for plan.

  • Approximate schedule: Either in conjunction with or before v5.0.0 release. Perhaps by next Friday.

    • People's MTTs (How many people's MTTs running against checkouts vs tarball)?
      • But the tarball URL has master in its URL.
        • But we're planning to redirect it.
    • Jenkins CI
    • Coverity
  • No new Gnus

MTT

  • Some Cisco Build Failures, haven't looked at yet.
  • A fix pending to workaround the IBM XL MTT build failure (compiler abort)
  • Issue 9919 - Thinks this common component should still be built.
    • Commons get built when it's likely there is a dependency.
    • Commons self-select if they should be built or not.