ess:lsf configure fails with duplicate nsl libraries #12457

Closed
gregfi opened this issue Apr 9, 2024 · 18 comments

Comments

@gregfi

gregfi commented Apr 9, 2024

I'm attempting to compile OpenMPI 2.1.6 on SuSE Linux 15-SP5, and I'm getting a failure during the ess:lsf configure step. I've attached the config.log.

This system has the following libnsl libraries, and I wonder if they're causing some confusion:

fischega@ptest15sp5:/local/fischega/scale/openmpi/build> rpm -qf /lib64/libnsl.so.1 
glibc-2.31-150300.58.1.x86_64
fischega@ptest15sp5:/local/fischega/scale/openmpi/build> rpm -qf /usr/lib64/libnsl.so.2.0.0 
libnsl2-1.2.0-2.44.x86_64

Is there a way I could work around this and get OpenMPI to build with LSF support?

config.log

@jsquyres
Member

I'm afraid that Open MPI v2.1.x is ancient. Can you upgrade to a newer version?

@gregfi
Author

gregfi commented Apr 15, 2024 via email

@jsquyres
Member

jsquyres commented Apr 16, 2024

Bummer. I don't know if we can help much; v2.1.6 is from January 2019; we haven't touched that release branch in years. That being said, I don't know if this is an Open MPI error. I see in the config.log:

/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `yp_get_default_domain'
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `yp_all'
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `ypprot_err'
collect2: error: ld returned 1 exit status

Which means that the LSF library is looking for the yellow pages ("yp") libraries, and it doesn't seem to be finding them. Open MPI's configure therefore rightfully fails (because you asked for LSF support, but Open MPI cannot link against the LSF libraries because they are missing a dependency).
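To see at a glance which symbols the linker could not resolve, the "undefined reference" lines can be pulled out of config.log. This is a minimal sketch that assumes the GNU ld message format shown above; the function name `undefined_refs` is made up for illustration.

```shell
# List the unique symbol names from "undefined reference to" linker errors.
# Assumes GNU ld's format: undefined reference to `symbol'
# Usage: undefined_refs < config.log
undefined_refs() {
    sed -n "s/.*undefined reference to [\`']\([A-Za-z_][A-Za-z0-9_]*\)'.*/\1/p" |
        LC_ALL=C sort -u
}
```

For the log excerpt above this would list the three yp symbols, which points at a missing NIS/YP client library rather than at Open MPI itself.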

@hammett999

Which means that the LSF library is looking for the yellow pages ("yp") libraries, and it doesn't seem to be finding them. Open MPI's configure therefore rightfully fails (because you asked for LSF support, but Open MPI cannot link against the LSF libraries because they are missing a dependency).

I think I found my way here because I'm hitting the same or a similar issue, but I'm trying to build version 4.1.6 with LSF support on Rocky Linux 9.3.
checking for library containing yp_all... no
I did see this: https://www.open-mpi.org/faq/?category=building#build-rte-lsf
I have libnsl and libnsl2 installed, but there is no libnsl-devel in 9. Then I found this, which I think is the real problem:
https://access.redhat.com/solutions/5991271
I don't know anything about SUSE, but if Red Hat dropped yp, maybe other distros are doing so as well.

@rhc54
Contributor

rhc54 commented Apr 17, 2024

The problem is that LSF may still require the yp library - i.e., maybe LSF doesn't support the newer kernels if they are dropping it. Honestly don't know. However, if you have an LSF license, your best option is to check with IBM and see what they say about the requirement.

Changing the config isn't too hard, though it won't help with an already-released version. The yp check is there, however, because up until now at least, LSF required it.

@hammett999

I have a rocky 9 client joined to our cluster (IBM Spectrum LSF Standard 10.1.0.13). No problem running jobs on it. In fact we don't have libnsl* installed on any of our hosts. Not the schedulers, not any nodes.

@rhc54
Contributor

rhc54 commented Apr 17, 2024

🤷‍♂️ So I guess you're saying that LSF 10.1.0.13 no longer requires yp?? News to us, unfortunately - wonder when that changed? Not sure what to say about it - can't go back to existing releases, and it doesn't seem that IBM is maintaining the LSF launch integration if that's the case. Suspect the OMPI folks will need to ponder how to resolve it.

Thanks for the update and info!

@jsquyres
Member

We're (very) unlikely to go change something in the v2.1.x series. If this is still an issue in the v5.0.x series, we might be able to have a look, but since IBM isn't maintaining the LSF integration any more, a patch would likely be greatly appreciated.

@hammett999

I've been digging through IBM's LSF documentation out of curiosity. This is the only mention of "libnsl" I found.
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=notes-known-issues

On RHEL 8, the LSF cluster cannot start up due to a missing libnsl.so.1 file. To resolve this issue, install the libnsl package to ensure that the libnsl.so.1 exists.
yum install libnsl

We're still running 7 so this hasn't been an issue for us. Yet.

And these are the two mentions of "yp" I found, neither of which is necessary to run LSF.
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=linux-registering-service-ports
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=hosts-host-names

I downloaded 5.0.3 and got the same error during configure.

@jsquyres
Member

The config.log clearly shows that the LSF library requires the YP library:

/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `yp_get_default_domain'
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `yp_all'
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `ypprot_err'
collect2: error: ld returned 1 exit status

There's a dependency that the LSF library needs, and it isn't being found. Open MPI is simply the canary in the coal mine here -- it's reporting the problem with the LSF library, and that problem has to be solved before you can build Open MPI successfully.

@hammett999

Apparently the OP and I are having different issues, since I don't have those errors in my config.log. I'll stop adding to the confusion here and raise a new issue if I can't figure it out. @gregfi I'm sorry for hijacking your issue. Good luck.

@jsquyres
Member

@hammett999 Sorry, didn't even realize you were a different poster. Yes, your issue sounds different. Open a new issue if it persists.

@gregfi My above answers pertain to your original question.

@gregfi
Author

gregfi commented Apr 29, 2024

Response from IBM:

The libnsl library is required for LSF as it provides the Yellow Pages (YP) client functionality. Typically, the YP client library is included with libnsl.
Starting from RHEL 8, libnsl.so.1 is not installed by default. Since SLES 15 aligns with RHEL 8 in terms of release timing and shared features, this change may also apply to SLES 15.
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tools/lsf_GF/10.1/linux3.10-glibc2.17-x86_64/lib/liblsf.so: undefined reference to `ypprot_err'
The above error indicates that libnsl is not present on your system; you may need to install it using "sudo zypper install libnsl" on SLES 15.
The integration of LSF with OpenMPI provides improved performance for task launching and resource usage accounting. It's important to note that while LSF support covers these aspects, any compiling issues related to third-party programs fall outside the scope of LSF support.

The libnsl library is clearly installed on the system (as per the first post on this thread).

I have confirmed that setting LDFLAGS="-l:libnsl.so.2" allows the LSF component to successfully compile. For some reason, LDFLAGS="-l:libnsl.so.1" does not seem to work.

Note that this issue is not limited to OpenMPI 2.1.6; I have the same failure when configuring OpenMPI-5.0.2 on SLES15.

@hammett999

Note that this issue is not limited to OpenMPI 2.1.6; I have the same failure when configuring OpenMPI-5.0.2 on SLES15.

And this fixes my issue as well on Rocky 9. EL 9 doesn't have a libnsl.so.2, but setting LDFLAGS="-l:libnsl.so.3" allowed configure to complete.
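The workaround in the two comments above depends on which versioned libnsl a given distribution ships. The sketch below picks the first one present and prints the matching LDFLAGS value. The helper name `nsl_ldflags` is hypothetical, the candidate list only covers the two file names reported to work in this thread, and it assumes the directory passed in is already on the linker's default search path.

```shell
# Print an LDFLAGS value naming the first versioned libnsl found in a directory.
# Only libnsl.so.3 and libnsl.so.2 are tried; libnsl.so.1 was reported not to
# work in this thread, so it is deliberately left out.
# Usage: nsl_ldflags /usr/lib64
nsl_ldflags() {
    for name in libnsl.so.3 libnsl.so.2; do
        if [ -e "$1/$name" ]; then
            printf '%s\n' "-l:$name"
            return 0
        fi
    done
    return 1    # nothing suitable found in that directory
}
```

The result could then be passed to configure, for example `./configure --with-lsf=<lsf-dir> LDFLAGS="$(nsl_ldflags /usr/lib64)"`, where `<lsf-dir>` stands for the local LSF installation.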

@rhc54
Contributor

rhc54 commented Apr 29, 2024

I believe you are not supposed to put the suffix on that option - i.e., just set LDFLAGS="-l:libnsl", yes? That way the system will pick up the most current version of the library.

@gregfi
Author

gregfi commented Apr 29, 2024

I tried with LDFLAGS="-lnsl" and LDFLAGS="-l:libnsl", and I get:

/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lnsl: No such file or directory

and

/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -l:libnsl: No such file or directory
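Both errors follow from how GNU ld resolves the argument to `-l`: a plain name such as `nsl` is expanded to `libnsl.so` (falling back to `libnsl.a`), while a name with a leading colon is looked up as that exact file name with no expansion. The sketch below only illustrates that naming rule; `ld_lookup_names` is a made-up helper and does not search any directories.

```shell
# Print the file names GNU ld looks for, given the argument that follows -l.
# Usage: ld_lookup_names nsl            (plain name: lib<name>.so, then .a)
#        ld_lookup_names :libnsl.so.2   (leading colon: exact file name)
ld_lookup_names() {
    case $1 in
        :*) printf '%s\n' "${1#:}" ;;
        *)  printf 'lib%s.so\nlib%s.a\n' "$1" "$1" ;;
    esac
}
```

On a system with only `libnsl.so.1` and `libnsl.so.2.0.0` installed, neither `libnsl.so` nor a file literally named `libnsl` exists, which is consistent with the two "cannot find" messages above.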

@jsquyres
Member

jsquyres commented May 2, 2024

Looks like this was not an Open MPI problem, and the reporting user is now able to build. I'll close the issue.

@jsquyres jsquyres closed this as completed May 2, 2024
@gregfi
Author

gregfi commented May 7, 2024

In case anyone else falls victim to this and ends up here: the underlying issue was that the libnsl-devel package was not installed. Inside the libnsl-devel package on SLES15 is:

lrwxrwxrwx 1 root root 15 May 25 2018 /usr/lib64/libnsl.so -> libnsl.so.2.0.0

Our admins were reluctant to modify their configuration by installing the package, but simply setting the above link manually resolves the issue and allows OpenMPI to build with LSF support.
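For anyone who cannot touch /usr/lib64 at all, one possible variation is to create the same link in a user-owned directory and point the linker at it. This is an untested sketch, not what was done above: the helper name `link_libnsl` and the directory `$HOME/nsl-link` are assumptions for illustration.

```shell
# Create DIR/libnsl.so as a symlink to an existing versioned libnsl,
# mirroring the link that libnsl-devel would normally provide.
# Usage: link_libnsl /usr/lib64/libnsl.so.2.0.0 "$HOME/nsl-link"
link_libnsl() {
    target=$1
    dir=$2
    if [ ! -e "$target" ]; then
        echo "missing: $target" >&2
        return 1
    fi
    mkdir -p "$dir" && ln -sfn "$target" "$dir/libnsl.so"
}
```

Adding `LDFLAGS="-L$HOME/nsl-link"` to the configure command line should then let `-lnsl` resolve, assuming configure's LSF check links with `-lnsl` once it can find the library.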
