Enable trilinos explicit instantiations #231

Open
wants to merge 1 commit into
base: master

gassmoeller
Member

What about this change to solve #219? All explicit instantiations in all trilinos subpackages are disabled by default. Switching this variable enables them for all active packages (I checked the cmake variables in my build, off before, on now). Memory requirements dropped for me, and compile time seemed somewhat faster as well.
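For readers who have not seen the diff, a minimal sketch of what such a switch could look like in a shell-based build script. `Trilinos_ENABLE_EXPLICIT_INSTANTIATION` is the global Trilinos CMake option for this; the function name `trilinos_instantiation_flag` and the `CONFOPTS` variable are made up for illustration and are not necessarily what this PR changes.

```shell
# Sketch only: the function and variable names are hypothetical.
# Emits the CMake definition that turns explicit template instantiation
# on (default) or off for all enabled Trilinos packages.
trilinos_instantiation_flag() {
  printf '%s' "-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=${1:-ON}"
}

# Append the flag to a (hypothetical) list of configure options.
CONFOPTS=""
CONFOPTS="${CONFOPTS} $(trilinos_instantiation_flag ON)"
echo "cmake${CONFOPTS} <path-to-trilinos-source>"
```

Passing `OFF` instead of `ON` would restore the behavior described above as the previous default.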

The largest compilation unit I have seen was 3.5GB (down from 8GB) and most are now much smaller than that. At the moment I am compiling without complex instantiations though (can try again tomorrow with default if you want).

I also found this: trilinos/Trilinos#8130, so maybe for future trilinos versions we do not need this anymore because trilinos changed its default for new versions (probably the next one after 13.0?)

@koecher
Contributor

koecher commented Jul 1, 2021

@gassmoeller Thank you. I like this! Maybe we have overlooked this simple solution to the memory issue.

@gassmoeller
Member Author

Tomorrow I also want to retry a build with the default values that failed for me before, to see whether we still need to think about the complex instantiations. I will report back once that has worked.

@gassmoeller
Member Author

Ok, do not merge this right now. If complex support in Trilinos is enabled, I get a lot of the following:

In file included from /home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/kokkos/core/src/Kokkos_Serial.hpp:55,
                 from /home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/kokkos/core/src/Kokkos_Core.hpp:53,
                 from /home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/teuchos/kokkoscompat/src/KokkosCompat_ClassicNodeAPI_Wrapper.hpp:6,
                 from /home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/tpetra/classic/NodeAPI/Kokkos_DefaultNode.hpp:47,
                 from /home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/muelu/src/Headers/MueLu_ConfigDefs.hpp:54,
                 from /home/rene/software/trilinos-tmp/tmp/build/trilinos-release-12-18-1/packages/muelu/src/Utils/ExplicitInstantiation/MueLu_CoalesceDropFactory_kokkos.cpp:49:
/home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/kokkos/core/src/Kokkos_Parallel.hpp: At global scope:
/home/rene/software/trilinos-tmp/tmp/unpack/Trilinos-trilinos-release-12-18-1/packages/kokkos/core/src/Kokkos_Parallel.hpp:235:6: error: ‘void Kokkos::parallel_for(const string&, const ExecPolicy&, const FunctorType&) [with ExecPolicy = Kokkos::RangePolicy<int, Kokkos::Serial>; FunctorType = MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Build(MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Level&) const [with Scalar = std::complex<float>; LocalOrdinal = int; GlobalOrdinal = int; DeviceType = Kokkos::Serial; typename DeviceType::memory_space = Kokkos::HostSpace; MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Level = MueLu::Level]::<lambda(MueLu::CoalesceDropFactory_kokkos<std::complex<float>, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial> >::LO)>; std::string = std::__cxx11::basic_string<char>]’, declared using local type ‘const MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Build(MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Level&) const [with Scalar = std::complex<float>; LocalOrdinal = int; GlobalOrdinal = int; DeviceType = Kokkos::Serial; typename DeviceType::memory_space = Kokkos::HostSpace; MueLu::CoalesceDropFactory_kokkos<Scalar, LocalOrdinal, GlobalOrdinal, Kokkos::Compat::KokkosDeviceWrapperNode<DeviceType> >::Level = MueLu::Level]::<lambda(MueLu::CoalesceDropFactory_kokkos<std::complex<float>, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial> >::LO)>’, is used but never defined [-fpermissive]
  235 | void parallel_for( const std::string & str
      |      ^~~~~~~~~~~~

It seems some instantiations for std::complex<float> are missing. Will need to take a closer look tomorrow.

@bangerth
Member

bangerth commented Jul 2, 2021

Is that the only error, up to uniqueness? I wonder if that's something the Trilinos people would be willing to look at and fix.
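One way to check this from a saved build log could be to count the distinct error locations. The following is only a sketch: the function name `unique_error_locations` is made up, and it assumes GCC-style `file:line:col: error:` lines like the one quoted above.

```shell
# Sketch only: the function name is hypothetical.
# Prints each distinct "file:line:col: error:" prefix found in the log
# given as $1, preceded by how often it occurs, most frequent first.
unique_error_locations() {
  grep -E -o '^[^ ]+:[0-9]+:[0-9]+: error:' "$1" | sort | uniq -c | sort -rn
}
```

If the output has a single line, every error in the log originates at the same source location, although the template arguments in the full messages may still differ.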

@koecher
Contributor

koecher commented Jul 6, 2021


We had some recent changes related to this topic. Test it and let us know when you think this is ready to go.

@bangerth
Member

bangerth commented Jul 6, 2021

Could I suggest looking at how xSDK builds Trilinos? deal.II is part of xSDK, and so is Trilinos, and they seem to be working together well within xSDK. So however xSDK builds Trilinos seems to be a winning formula :-)

@koecher
Contributor

koecher commented Jul 6, 2021

@bangerth This would be great. I'm not familiar with xSDK, but it sounds good. A new PR for this would be welcome if you find the time.

@bangerth
Member

bangerth commented Jul 6, 2021

I was going to say that I don't actually know what xSDK does, but that's not true. It just uses spack. The deal.II recipe is here: https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/dealii/package.py It seems to me that it requires certain packages in Trilinos to be enabled, but otherwise just goes with whatever spack wants to do about Trilinos. In turn, spack builds Trilinos in the following way: https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/trilinos/package.py It seems to me that it enables both complex and explicit instantiations. There are a bunch of additional defines here: https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/trilinos/package.py#L872-L901

I don't actually know enough about how either candi or spack do things, so I'll leave it to you all to come up with the requisite patches (and maybe this one is already that patch). I just wanted to point out that others have already figured this out.

@tjhei tjhei mentioned this pull request Mar 26, 2022
@tjhei tjhei mentioned this pull request Jun 3, 2022