Problem building AMUSE from source #1007

Open
Spijkerberg opened this issue Nov 24, 2023 · 21 comments

@Spijkerberg

Hello,

I am trying to build AMUSE from source on a sterrewacht computer.
I am running into issues with certain packages not being built.

Here is the buildlog of the building process.

Thanks in advance,
Menno

@LourensVeen
Collaborator

Could you paste the output of module list for the shell in which you're building?

@LourensVeen
Collaborator

Looks like Steven is onto something:

make: Entering directory '/data2/vandereerden/amuse/src/amuse/community/capreole'
make -C build_mpi amuse_interface_mpi  VPATH=../src F90FLAGS1="-g -O2 -DNOMPI -fPIC  " FC="/usr/bin/gfortran -fallow-argument-mismatch" MPIFC="/usr/bin/gfortran -fallow-argument-mismatch"
make[1]: Entering directory '/data2/vandereerden/amuse/src/amuse/community/capreole/build_mpi'
/usr/bin/gfortran -fallow-argument-mismatch -c -g -O2 -DNOMPI -fPIC   -DMPI  -I/data2/vandereerden/amuse/lib/stopcond ../src/amuse_mpi.F90
../src/amuse_mpi.F90:54:0:

   54 |   include 'mpif.h'
      | 
Fatal Error: Cannot open included file ‘mpif.h’
compilation terminated.

That's definitely not right. The question is why this is happening. I should be getting access to these machines, so then I can experiment a little.

@rieder
Member

rieder commented Nov 24, 2023

It seems to use gfortran; I would expect an MPI Fortran compiler wrapper (such as mpifort) to be used here instead.
The argument MPIFC="/usr/bin/gfortran -fallow-argument-mismatch" is probably causing this.

@rieder
Member

rieder commented Nov 24, 2023

MPIFC should not be set to gfortran; keep it set to the MPI Fortran wrapper (such as mpifort) or leave it undefined.
You can set FC to gfortran -fallow-argument-mismatch if this is important, but redefining an MPI compiler to a non-MPI one is probably what's causing the issue.

@rieder
Member

rieder commented Nov 24, 2023

... thinking about it further, I guess this may follow from AMUSE being configured without MPI in the first place. So a first step would be to re-run configure in the AMUSE dir and check config.mk for the definition of MPIFC there.
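
For anyone following along, the config.mk check can be scripted. The sketch below is only an illustration: it assumes config.mk holds simple Makefile-style `NAME = value` assignments, and the helper names `read_make_vars` and `mpi_problems` are hypothetical, not part of AMUSE. The sample values are the ones quoted in this thread (`MPI_ENABLED=no` and the `MPIFC` from the build log). The "wrapper name contains mpi" test is a rough heuristic, not a rule.

```python
import os
import re


def read_make_vars(text):
    """Collect simple `NAME = value` assignments from Makefile-style text."""
    assignments = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*)\s*[:?]?=\s*(.*)$", line)
        if match:
            assignments[match.group(1)] = match.group(2).strip()
    return assignments


def mpi_problems(assignments):
    """Return a list of human-readable MPI configuration problems."""
    problems = []
    if assignments.get("MPI_ENABLED", "").lower() != "yes":
        problems.append("MPI_ENABLED is not 'yes'")
    mpifc = assignments.get("MPIFC", "")
    if not mpifc:
        problems.append("MPIFC is not set")
    elif "mpi" not in os.path.basename(mpifc.split()[0]).lower():
        # Heuristic: MPI wrappers are usually named mpifort, mpif90, ...
        problems.append(f"MPIFC looks like a plain compiler: {mpifc}")
    return problems


if __name__ == "__main__":
    # Values taken from this thread; a real run would read config.mk instead.
    sample = "MPI_ENABLED=no\nMPIFC=/usr/bin/gfortran -fallow-argument-mismatch\n"
    for problem in mpi_problems(read_make_vars(sample)):
        print(problem)
```

To check a real installation, replace `sample` with the contents of config.mk from the AMUSE root directory.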

@Spijkerberg
Author

Spijkerberg commented Nov 24, 2023 via email

@rieder
Member

rieder commented Nov 24, 2023

I'd like to see the config.mk file (in the AMUSE root dir) of the sterrewacht installation of AMUSE, can you find this @Spijkerberg?

@Spijkerberg
Author

I found the file in the AMUSE root dir on the sterrewacht machine. I turned it into a txt file to share it.

@rieder
Member

rieder commented Nov 24, 2023

Thanks.
This shows AMUSE was configured without MPI support, so no wonder stuff that requires MPI breaks...
This will require fixing on the module level.

@Spijkerberg
Author

Spijkerberg commented Nov 24, 2023

Looking at the config.mk that was produced by my own installation from source, I see that MPI_ENABLED=no, even though I loaded the MPI module before installing. I could try to install AMUSE from source again, but I do not know if that would help.

@rieder
Member

rieder commented Nov 24, 2023 via email

@Spijkerberg
Author

Looking at the packages enabled in my environment I see that mpi4py is already installed.
Here is the output of pip freeze:
-e git+https://github.com/amusecode/amuse.git@72c4a3c32c21e48f3a823af9f742c7de2684138b#egg=amuse_devel
docutils==0.20.1
h5py==3.10.0
iniconfig==2.0.0
mpi4py==3.1.5
numpy==1.26.2
packaging==23.2
pluggy==1.3.0
pytest==7.4.3
setuptools-scm==8.0.4
typing_extensions==4.8.0

@rieder
Member

rieder commented Nov 24, 2023 via email

@Spijkerberg
Author

This did change the config.mk file. I can see that MPI_ENABLED=yes is set correctly now.
I will try rebuilding AMUSE to see what the result is.

@Spijkerberg
Author

It seems that some more of the community codes have been built, but there are still some errors when building.
I have provided the buildlogs again for you to inspect.

Testing to see if the community codes work results in UNPACK-OPAL-VALUE: UNSUPPORTED TYPE 33 FOR KEY .

@LourensVeen
Collaborator

OPAL is OpenMPI's utility library. This error sounds like some kind of data format mismatch, which suggests there are different versions of MPI in use. Possibly mpi4py got compiled against a different version of MPI than AMUSE? Or the version you have active when running your script doesn't match the one that was loaded when mpi4py was installed and/or when AMUSE was built?
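
One way to compare the MPI that mpi4py was built against with the one active in the current shell is to print both. This is a minimal sketch, assuming mpi4py is importable in the environment being checked; `mpi_report` is a hypothetical helper name. It sets `mpi4py.rc.initialize = False` so that importing `mpi4py.MPI` does not call `MPI_Init`, which should avoid triggering the runtime error while only version information is requested.

```python
import shutil


def mpi_report():
    """Gather the MPI tools on PATH and what mpi4py reports about its MPI."""
    # MPI launcher and compiler wrapper visible in the current shell, if any.
    report = {name: shutil.which(name) for name in ("mpiexec", "mpicc")}
    try:
        import mpi4py
        mpi4py.rc.initialize = False  # do not call MPI_Init on import
        mpi4py.rc.finalize = False
        from mpi4py import MPI
    except ImportError as exc:
        report["mpi4py"] = f"not importable ({exc})"
        return report
    report["mpi4py"] = mpi4py.__version__
    report["build_config"] = mpi4py.get_config()  # settings recorded at build time
    report["vendor"] = MPI.get_vendor()  # e.g. ('Open MPI', (4, 1, 5))
    report["library"] = MPI.Get_library_version().strip()
    return report


if __name__ == "__main__":
    for key, value in mpi_report().items():
        print(f"{key}: {value}")
```

If the compiler recorded in `build_config` or the reported library lives in a different installation than the `mpiexec` found on PATH, that would be consistent with the mismatch described above.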

@LourensVeen
Collaborator

I had a go at building AMUSE on a Sterrenwacht machine. Progress so far:

(amuse-env) <user>@<host>:~/amuse$ python setup.py develop_build
Illegal instruction (core dumped)

Looks like some kind of numpy build issue. Time to start digging...

@rieder
Member

rieder commented Dec 6, 2023

If you run module load AMUSE, you should now get the correct prerequisites. There was an issue with mpi4py installing the wrong MPI...

@rieder
Member

rieder commented Dec 6, 2023

> OPAL is OpenMPI's utility library. This error sounds like some kind of data format mismatch, which suggests there are different versions of MPI in use. Possibly mpi4py got compiled against a different version of MPI than AMUSE? Or the version you have active when running your script doesn't match the one that was loaded when mpi4py was installed and/or when AMUSE was built?

This was exactly the issue. mpi4py was installed incorrectly: it was built against the wrong (conda) OpenMPI library, which then clashed with the correct one. This also caused the wrong configuration of the AMUSE module.
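
A quick way to spot this kind of clash is to see whether the MPI tools on PATH resolve into an active conda environment. The sketch below is an illustration under assumptions: it relies on the `CONDA_PREFIX` environment variable that conda sets on activation, and the helper names `inside_prefix` and `conda_mpi_tools` are hypothetical.

```python
import os
import shutil
from pathlib import Path


def inside_prefix(executable, prefix):
    """True if `executable` resolves to a location under `prefix`."""
    if not executable or not prefix:
        return False
    exe = Path(executable).resolve()
    root = Path(prefix).resolve()
    return root in exe.parents


def conda_mpi_tools(names=("mpicc", "mpifort", "mpiexec")):
    """Return the MPI tools on PATH that live inside the active conda env."""
    prefix = os.environ.get("CONDA_PREFIX")  # unset outside conda environments
    found = {name: shutil.which(name) for name in names}
    return {name: path for name, path in found.items()
            if inside_prefix(path, prefix)}


if __name__ == "__main__":
    tools = conda_mpi_tools()
    if tools:
        print("MPI tools provided by the conda environment:", tools)
    else:
        print("No MPI tools from a conda environment found on PATH.")
```

An empty result only means the wrappers on PATH are not from conda; a package such as mpi4py could still have been built earlier against a different MPI.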

@LourensVeen
Collaborator

What a mess. Actually, when I try to module load AMUSE I get this:

Lmod has detected the following error:  Unable to load module because of error when evaluating modulefile:
    /easybuild/easybuild/el8_8/modules/all/AMUSE/2023.10.0.lua: Empty or non-existent file
    Please check the modulefile and especially if there is a line number specified in the above message  
While processing the following module(s):
   Module fullname  Module Filename
   ---------------  ---------------
   AMUSE/2023.10.0  /easybuild/easybuild/el8_8/modules/all/AMUSE/2023.10.0.lua

That Lua file exists, but it has permissions 600, so Lmod (running as my user) can't read it...
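
The permission problem can be confirmed without opening the file, since reading a file's mode only needs access to its directory. A minimal sketch follows; `world_readable` and `describe` are hypothetical helper names, and the path is the one from the Lmod message above.

```python
import os
import stat


def world_readable(path):
    """True if users other than the owner and group may read `path`."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def describe(path):
    """Return an `ls -l` style mode string, e.g. '-rw-------' for mode 600."""
    return stat.filemode(os.stat(path).st_mode)


if __name__ == "__main__":
    modulefile = "/easybuild/easybuild/el8_8/modules/all/AMUSE/2023.10.0.lua"
    if os.path.exists(modulefile):
        print(describe(modulefile), "world-readable:", world_readable(modulefile))
    else:
        print("modulefile not found on this machine")
```

With mode 600 only the owner has read access, so a modulefile meant for all users would normally need something like 644.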

Seems like I should put a working EasyBuild configuration for AMUSE on my to-do list, after Conda packages and a new build system.

@rieder
Member

rieder commented Dec 6, 2023 via email


3 participants