Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add Tpetra tests. #16611

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

bangerth
Copy link
Member

@bangerth bangerth commented Feb 8, 2024

This is definitely a WIP patch: 284 tests for the Tpetra wrappers, or which currently 41 succeed. I created these tests by cloning the existing tests/trilinos/ directory and replacing TrilinosWrappers:: by LinearAlgebra::TpetraWrappers:: plus some minor adjustments such as adding template arguments. I'm sure many tests need additional adjustments.

Many (=the vast majority) of these tests clearly do not currently work. But the patch may help us isolate what needs to be fixed to make them work (such as the issue with VectorReference::operator Number() mentioned in #16610) and I will rebase this branch frequently to see how things proceed.

I'll also keep a list of things that need to be fixed for a substantial number of tests, and that I'll keep updated as I see things. Here's a start:

  • We need an implementation of VectorReference::operator Number()
  • SparseMatrix::operator() (int,int) does not exist
  • SparseMatrix::el(int,int) does not exist
  • Vector::operator== is missing.

@kinnewig @jpthiele FYI

@jpthiele
Copy link
Contributor

jpthiele commented Feb 9, 2024

Very nice!

A few questions for the direct solver tests:

  • will we be able to add a full setup to the test server, so we can test whether all the solvers supported by Amesos2 work?
    That would possibly be its own test that is only run there as it would issue various Exceptions when trying
    to create a solver that is not configured in Trilinos
  • should we also have an exception test to see that all the warnings are correctly raised, so only KLU2 works as
    that is the only one bundled with Amesos2
  • Alternative option: is there a nice way to decide on the file the output should be compared with based on the CMake configuration? Possibly with a direct_solver.output.in that is configured by CMake based on what it can find?
  • We should still try to add our subset of tests in our PRs according to this scheme here, right?

@bangerth
Copy link
Member Author

bangerth commented Feb 9, 2024

Yes, with have a number of tests that have outputs like tests/trilinos/precondition_muelu_smoother.with_trilinos_with_muelu=on.with_64bit_indices=off.output that express requirements. Whether it's worth testing that we get errors on unsupported configurations is perhaps debatable, but at least we can limit running tests if they require a configuration not currently available.

@bangerth
Copy link
Member Author

bangerth commented Feb 9, 2024

With #16613, we are now at 55/284 tests succeeding.

@bangerth
Copy link
Member Author

And with #16616, we're now at 71/284 tests succeeding. There's progress :-)

@jpthiele
Copy link
Contributor

Yes, with have a number of tests that have outputs like tests/trilinos/precondition_muelu_smoother.with_trilinos_with_muelu=on.with_64bit_indices=off.output that express requirements. Whether it's worth testing that we get errors on unsupported configurations is perhaps debatable, but at least we can limit running tests if they require a configuration not currently available.

Okay good to know!
Then the question would be how the current setups on the tester look like
and whether we want to have a 'feature complete' Amesos2 test setup.

@jpthiele
Copy link
Contributor

jpthiele commented Feb 10, 2024

With the swap function implemented in #16602
at least the SolverCG test should pass since I was able to compute the reference solution for direct_solver_2.cc with it.
If you want I can make a dedicated PR to bring the deal_solver_*.cc test to passing.
Or two, one for swap alone and one for the tests.

Update: On it and almost completed. Currently recompiling as norm_sqr() was missing and needed bei Minres.

@jpthiele
Copy link
Contributor

We now have #16625

@bangerth
Copy link
Member Author

bangerth commented Feb 10, 2024 via email

@kinnewig
Copy link
Contributor

I like to keep using this PR to keep track of the work we have done so far.

  • We need an implementation of VectorReference::operator Number()

Implemented by: #16613

  • SparseMatrix::operator() (int,int) does not exist
  • SparseMatrix::el(int,int) does not exist

Implemented by: #16616

  • Vector::operator== is missing.

Implemented by: #16664

The current master passes 93 tests out of 284.

@bangerth
Copy link
Member Author

Now at 114 out of 288 tests succeeding :-)

@kinnewig
Copy link
Contributor

kinnewig commented Feb 22, 2024

With #16663, we can now convert vectors between deal.II and tpetra vectors. So, a few more tests should succeed now.

But the test tests/trilinos_tpetra/vector_local_copy_01.cc will now produce a deadlock (see #16663), so maybe we should remove that test.

@bangerth
Copy link
Member Author

I have to admit that I don't understand the deadlock (nor perhaps what the test does). I think what the test does is create a Tpetra vector locally on a process and copy it to a deal.II vector:

if (n_mpi_processes > 1)
    MPI_Comm_split(mpi_communicator, colour, key, &serial_communicator);
  else
    serial_communicator = mpi_communicator;

  if (this_mpi_process == 0)
    {
      const LinearAlgebra::TpetraWrappers::Vector<double> tril_vec(
        complete_index_set(10), serial_communicator);

      // Check copy constructor
      const Vector<double> local_vector_1(tril_vec);

      // Check equality operator
      Vector<double> local_vector_2;
      local_vector_2 = tril_vec;
    }

I believe that the serial_communicator we get is really just MPI_COMM_SELF. So we create a non-parallel vector, and then we copy it. What could actually go wrong here? We never use any MPI communicators with more than one process. I have to admit that I don't even know what this is supposed to test for the Epetra wrappers, nor do I understand how that could deadlock with the new Tpetra wrappers. Can you explain what I'm (clearly) missing?

@masterleinad
Copy link
Member

I believe that the serial_communicator we get is really just MPI_COMM_SELF. So we create a non-parallel vector, and then we copy it. What could actually go wrong here? We never use any MPI communicators with more than one process. I have to admit that I don't even know what this is supposed to test for the Epetra wrappers, nor do I understand how that could deadlock with the new Tpetra wrappers. Can you explain what I'm (clearly) missing?

The test was introduced in #5794. Before that patch we were doing

TrilinosWrappers::MPI::Vector localized_vector;
      localized_vector.reinit(complete_index_set(vec_size));

when copying to a deal::Vector which creates the temporary TrilinosWrapprea::MPI::Vector with an MPI_COMM_GLOBAL communicator instead of using the communicator of the input TrilinosWrapprea::MPI::Vector. This led to a deadlock if the input vector used an MPI communicator that only included the processes that were involved in making a copy.
I would assume this to work correctly for TpetraWrappers::Vector, though.

@bangerth
Copy link
Member Author

bangerth commented Mar 7, 2024

We're now at 141/288 succeeding tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants