
BoomerAMG performance if some basis functions have large support #968

Open
craffael opened this issue Sep 22, 2023 · 12 comments

@craffael

Dear all

I'm trying to use BoomerAMG as a preconditioner to solve a 3D Laplace problem assembled by a finite element code.
My particular difficulty is that a small number of the basis functions have very large support (the support is essentially a 2D surface). The number of these "large-support" basis functions is independent of the overall mesh size and depends only on the topology.

For small problems BoomerAMG performs quite reasonably, but as soon as I run it with a few million dofs, the number of CG iterations increases drastically, so BoomerAMG no longer scales well.

Do you have any idea how I could tune BoomerAMG for such a problem?

At the moment I use mostly the default options, but with Chebyshev relaxation and one level of aggressive coarsening.

I would be very grateful for some input.

Thanks
Raffael

@victorapm
Contributor

Hi @craffael, this sounds like an interesting application. Is your code open-source or would you be able to share some matrices? I could take a closer look.

@craffael
Author

craffael commented Sep 29, 2023

@victorapm I've set up a small test program which shows the issue. Along with the test program I provide matrices for four different mesh sizes (the mesh size is halved from one mesh to the next). Each mesh has 5 "special basis functions" with large support. If we leave out the special basis functions by removing the last 5 rows and columns (files ending in _without), the number of CG iterations decreases drastically.

Here's the number of CG iterations that I measured:

| Mesh | #unknowns (incl. special basis fcn) | #PCG iterations with | #PCG iterations without |
| ---- | ----------------------------------- | -------------------- | ----------------------- |
| lhs3 | 10'743    | 40  | 12 |
| lhs4 | 69'970    | 41  | 13 |
| lhs5 | 497'920   | 76  | 14 |
| lhs6 | 3'700'686 | 109 | 16 |

We can see that there are always more PCG iterations with the special basis functions than without, and that the iteration count increases drastically once the number of unknowns exceeds a threshold.

If we increase the number of special basis functions to 81 (another case), it's much worse:

| Mesh    | #unknowns | #PCG iterations with | #PCG iterations without |
| ------- | --------- | -------------------- | ----------------------- |
| case2_3 | 13'581    | 757   | 12 |
| case2_4 | 80'139    | 769   | 13 |
| case2_5 | 562'570   | 1'277 | 14 |

Here's the code for the program which showcases this:

#include <mpi.h>
#include <assert.h>
#include <stdio.h>

#include <HYPRE.h>
#include <HYPRE_IJ_mv.h>
#include <HYPRE_parcsr_mv.h>
#include <HYPRE_krylov.h>
#include <HYPRE_parcsr_ls.h>

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);
  HYPRE_Init();

  char buffer[256];

  // Read matrix:
  HYPRE_IJMatrix matrixij = NULL;
  assert(argc == 2);
  HYPRE_IJMatrixRead(argv[1], MPI_COMM_WORLD, HYPRE_PARCSR, &matrixij);
  HYPRE_DescribeError(HYPRE_GetError(), buffer);
  printf("Error: %s\n", buffer);
  assert(HYPRE_GetError() == 0);

  // Get the underlying ParCSR matrix:
  HYPRE_ParCSRMatrix parcsr = NULL;
  HYPRE_IJMatrixGetObject(matrixij, (void **)&parcsr);
  assert(HYPRE_GetError() == 0);

  // Global problem size via the public API (single-rank run assumed,
  // so the local row range spans the whole matrix):
  HYPRE_BigInt ilower, iupper, jlower, jupper;
  HYPRE_IJMatrixGetLocalRange(matrixij, &ilower, &iupper, &jlower, &jupper);
  HYPRE_BigInt n = iupper + 1;
  printf("N = %lld\n", (long long)n);

  // Set up rhs (vector of ones):
  HYPRE_ParVector rhs = NULL;
  HYPRE_ParVectorCreate(MPI_COMM_WORLD, n, NULL, &rhs);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParVectorInitialize(rhs);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParVectorSetConstantValues(rhs, 1.0);
  assert(HYPRE_GetError() == 0);

  // Set up solution vector (zero initial guess):
  HYPRE_ParVector x = NULL;
  HYPRE_ParVectorCreate(MPI_COMM_WORLD, n, NULL, &x);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParVectorInitialize(x);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParVectorSetConstantValues(x, 0.0);
  assert(HYPRE_GetError() == 0);

  // Create BoomerAMG:
  HYPRE_Solver boomeramg = NULL;
  HYPRE_BoomerAMGCreate(&boomeramg);
  assert(HYPRE_GetError() == 0);

  // Set relaxation type to Chebyshev:
  HYPRE_BoomerAMGSetRelaxType(boomeramg, 16);
  assert(HYPRE_GetError() == 0);

  // One level of aggressive coarsening:
  HYPRE_BoomerAMGSetAggNumLevels(boomeramg, 1);
  assert(HYPRE_GetError() == 0);

  //HYPRE_BoomerAMGSetPrintLevel(boomeramg, 3);

  HYPRE_BoomerAMGSetStrongThreshold(boomeramg, 0.5);
  assert(HYPRE_GetError() == 0);

  // One V-cycle per preconditioner application:
  HYPRE_BoomerAMGSetTol(boomeramg, 0.0);
  assert(HYPRE_GetError() == 0);
  HYPRE_BoomerAMGSetMaxIter(boomeramg, 1);
  assert(HYPRE_GetError() == 0);

  // Setup BoomerAMG:
  HYPRE_BoomerAMGSetup(boomeramg, parcsr, NULL, NULL);
  assert(HYPRE_GetError() == 0);

  // Setup PCG:
  HYPRE_Solver pcg;
  HYPRE_ParCSRPCGCreate(MPI_COMM_WORLD, &pcg);
  assert(HYPRE_GetError() == 0);

  HYPRE_ParCSRPCGSetTol(pcg, 1e-6);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParCSRPCGSetMaxIter(pcg, 1000);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParCSRPCGSetLogging(pcg, 2);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParCSRPCGSetPrintLevel(pcg, 2);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParCSRPCGSetPrecond(pcg, HYPRE_BoomerAMGSolve, HYPRE_BoomerAMGSetup, boomeramg);
  assert(HYPRE_GetError() == 0);

  HYPRE_ParCSRPCGSetup(pcg, parcsr, rhs, x);
  assert(HYPRE_GetError() == 0);
  HYPRE_ParCSRPCGSolve(pcg, parcsr, rhs, x);
  assert(HYPRE_GetError() == 0);

  HYPRE_DescribeError(HYPRE_GetError(), buffer);
  printf("Error: %s\n", buffer);

  // Clean up:
  HYPRE_ParCSRPCGDestroy(pcg);
  HYPRE_BoomerAMGDestroy(boomeramg);
  HYPRE_ParVectorDestroy(rhs);
  HYPRE_ParVectorDestroy(x);
  HYPRE_IJMatrixDestroy(matrixij);

  HYPRE_Finalize();
  MPI_Finalize();
  return 0;
}

And here are the mesh files (I've omitted lhs6 because it's very large)

@victorapm
Contributor

@craffael Thanks for the quick response and for providing the data! Please allow me a couple of weeks to take a look.

Lastly, are there any publications detailing the model that you are solving, or is this part of ongoing work?

Thanks!

@craffael
Author

@victorapm It would be great if you could have a look. I'm still a bit puzzled by the fact that a few additional basis functions can trouble AMG that much.

I've not published anything about this problem, but it is essentially a scalar-valued magnetostatic problem in a topologically non-trivial domain without any source current. Our equations are as follows:

curl H = 0,  div(\mu H) = 0

Because of the first equation, we know that H is a gradient field + some "cohomology basis functions". For example, if our domain was a donut, we would introduce a cut of the donut. The corresponding cohomology basis function would just be a scalar field but with a uniform jump of 1 across the cut surface:
[figure: the cohomology basis function, a scalar field with a uniform jump of 1 across the cut surface]

Therefore we can write

H = \tilde{\nabla} ( \sum_i \alpha_i \phi_i + \sum_j \beta_j T_j )

with T_j being the cohomology basis functions, \phi_i being the usual Lagrangian hat functions, and \alpha_i, \beta_j being the coefficients. (The gradient with the tilde is only well defined inside the elements because of the jump; strictly speaking it acts on functions from the H1 space broken along the cut surface.) We then use this representation of H in a standard finite element Galerkin formulation to solve the second equation, div(\mu H) = 0.
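For concreteness, the Galerkin system this produces has a natural 2x2 block structure. The following is a sketch based on my reading of the description above; the bilinear form and symbol names are inferred, not quoted from the code:

```latex
% Bilinear form induced by div(\mu H) = 0 with the broken-gradient ansatz:
a(u, v) = \int_{\Omega} \mu \, \tilde{\nabla} u \cdot \tilde{\nabla} v \,\mathrm{d}x
% Ordering the hat functions \phi_i before the cohomology functions T_j
% gives the symmetric block system
\begin{pmatrix} A & B \\ B^{T} & C \end{pmatrix}
\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = b,
\qquad A_{ik} = a(\phi_i, \phi_k), \quad
       B_{ij} = a(\phi_i, T_j), \quad
       C_{jl} = a(T_j, T_l)
```

The small block C couples the large-support functions; this is the same block structure used later in the thread when discussing Schur complement approaches.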

@ulrikeyang
Contributor

ulrikeyang commented Sep 30, 2023 via email

@victorapm
Contributor

victorapm commented Sep 30, 2023

Hi @craffael,

Thanks for providing more details! This is helpful! We might have to develop or update a preconditioner in hypre for solving a problem like yours, so I see a nice research opportunity here. We could have a meeting after I take an initial look at your problem.

@ulrikeyang, BoomerAMG-PCG was being used here and led to poor convergence. I might be wrong, but it seems AMS is not suitable here, since the discretization strategy is continuous Galerkin (CG) rather than the Nédélec FEM expected by AMS.

@liruipeng
Contributor

I was wondering if we should keep those 5 or 81 rows/columns of the matrix from being coarsened in AMG. Or we could do a Schur complement approach with respect to them. I believe in hypre we can mark them so that they are not coarsened.
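For reference, hypre exposes `HYPRE_BoomerAMGSetCPoints` to force selected rows to remain C-points down to a given level. A minimal, untested sketch of what marking the special rows could look like in the reproducer above (`num_special`, `keep_levels`, `n`, and `boomeramg` are placeholders/assumed from that code; check the signature against your hypre version):

```c
/* Sketch: keep the last num_special rows (the large-support basis
 * functions) as C-points on the first keep_levels AMG levels.
 * Must be called before HYPRE_BoomerAMGSetup. */
HYPRE_Int     num_special = 5;   /* placeholder: 5 or 81 in the test cases */
HYPRE_Int     keep_levels = 25;  /* placeholder: effectively all levels */
HYPRE_BigInt *special = malloc(num_special * sizeof(HYPRE_BigInt));
for (HYPRE_Int i = 0; i < num_special; i++) {
   special[i] = n - num_special + i;  /* global indices of the special rows */
}
HYPRE_BoomerAMGSetCPoints(boomeramg, keep_levels, num_special, special);
```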

@victorapm
Contributor

That's one of the things I'm testing

@victorapm victorapm self-assigned this Sep 30, 2023
@craffael
Author

craffael commented Oct 2, 2023

Hi all, a few comments:

  • AMS/ADS are not applicable; it is a scalar-valued Laplace problem.
  • I've already tried various (in)complete Schur complement approaches, but they are quite expensive and I only managed to get small improvements.

@liruipeng
Contributor

> Hi all, a few comments:
>
>   • AMS/ADS are not applicable; it is a scalar-valued Laplace problem.
>   • I've already tried various (in)complete Schur complement approaches, but they are quite expensive and I only managed to get small improvements.

Thanks for the updates. Can you provide the details of the Schur complement approach that you tried? Thanks!

@craffael
Author

craffael commented Oct 3, 2023

@liruipeng Let me summarize what I've tried. For the discussion, let's assume the system matrix has the following block structure:

M = [ A    B
      B^T  C ]

where C is a small n × n matrix built from the n "special basis functions". The standard Schur complement is then S = C - B^T A^{-1} B, and S can be inverted with a direct solver. Of course you can form this Schur complement explicitly, but then you have to solve n systems with A, which is not very efficient.

So the next thing I tried was to form an approximate Schur complement

\tilde{S} = C - B^T \tilde{A}^{-1} B

where \tilde{A}^{-1} is an approximate inverse of A, e.g. one or more cycles of BoomerAMG, or CG iterations with BoomerAMG as a preconditioner. The approximate Schur complement \tilde{S} can then be inverted with a direct solver and used as a preconditioner for the overall system. This was not very successful (I don't recall the exact numbers).

Alternatively, we can also swap the roles of A and C and consider the Schur complement S2 = A - B C^{-1} B^T. The idea is not to form S2 explicitly, but to use it in a PCG iteration with some preconditioner. To precondition S2, I just used BoomerAMG applied to A. The number of PCG iterations still increases, similarly to when BoomerAMG is applied to the whole system. I think the problem here is that BoomerAMG of A is not a good preconditioner for S2.

Finally, I've also tried block Jacobi approaches, i.e. a preconditioner built from the diagonal blocks:

P = [ A    0
      0    C ]

@victorapm
Contributor

@craffael Thanks for the additional info! I'm planning to try other alternatives as well and will let you know once I do (in my free time).

@liruipeng Thanks for the ideas! Let me know if you want to look at this as well so we don't duplicate efforts.
