Direct solver for non-symmetric matrices #48

mrp089 · 2021-09-08T13:46:11Z

Part of #32. I'm running into an issue with the linear solver. (using quasi-static "time integration" #47). There are two time steps in this simple pressurized tube example:

Hyperelastic prestressing (symmetric stiffness matrix)
G&R (non-symmetric stiffness matrix). Related to Problems with only minor symmetry stiffness matrix? #39.

My linear solver settings are

LS type: BICG
{
   Tolerance:           1e-16
   Absolute tolerance:  1e-16
   Max iterations:      100000
}

Here is the output of the G&R simulation:

---------------------------------------------------------------------
 Eq     N-i     T       dB  Ri/R1   Ri/R0    R/Ri     lsIt   dB  %t
---------------------------------------------------------------------
 New time step size: 1.000000
 ST     1-1  1.5219  [   0 1.00000 1.00000 1.6E-14]  [5200  -25  93]
 ST     1-2  2.6789  [ -64 6.27E-4 6.27E-4 1.8E-11]  [3767    0  93]
 ST     1-3  3.5479  [-156 1.55E-8 1.55E-8 2.52E-7]  [2988    0  91]
 ST     1-4s 3.6899  [-226 4.6E-12 4.6E-12 5.92E-3]  [ 138    0  26]
 ST     2-1  3.12E1  [   0 1.00000 4.60453 5.848E2]  !100000    0  99!
 ST     2-2  5.68E1  !  47 2.320E2 1.068E3 2.888E1!  !100000    0 100!
 ST     2-3  8.07E1  !  66 2.215E3 1.020E4     NaN!  !100000    0 100!

The problem takes many linear iterations to converge in step 1 and doesn't converge at all in step 2, causing the nonlinear solver to diverge. (Yes, I use "strict" solver tolerances. However, step 2 also doesn't converge for relaxed tolerances.)

The biconjugate gradient method BICG should be able to handle non-symmetric matrices, so that shouldn't be the problem. And maybe I can get it to work by playing around with preconditioners. But to get G&R running, I think a robust direct solver (that can handle non-symmetric matrices) might be easier for now. (PARDISO in FEBio performs great for this example.)

@vvedula22's help is again much appreciated :-)

Can you think of a better solution than a direct solver?
Is there a direct linear solver in svFSI that I didn't see?
Do you think adding one (e.g. with the Trilinos interface) would be somewhat easily doable?

The text was updated successfully, but these errors were encountered:

vvedula22 · 2021-09-09T09:15:44Z

Try GMRES with a Trilinos-based preconditioner. In my experience, GMRES outperforms BICG for complex solid mechanics problems.
We don't have an option for a direct solver in svFSI hitherto. It is on my TODO list but am currently working on updating the shell formulation. I might revisit the direct solver issue later in Oct or Nov.
You are welcome to implement one using Trilinos Amesos package. Amesos offers easy coupling to several standard libraries including PARDISO used by FEBio. Moreover it uses Epetra data structures that are already used in our current Trilinos interface. PARDISO is limited to use OpenMP type parallelization. As far as I know, MUMPS is the only direct solver library that supports MPI and Schur-complement type systems. You could first start with a simple case through Amesos and run it in serial.

Note that currently, we use a full Newton method with iterative solvers. However, this may be prohibitively expensive with a direct solver as you have to keep inverting the matrix for every Newton iteration. You may follow a simpler update strategy such as the BFGS method that saves you a lot of computational effort. According to Gerard Ateshian, who co-leads FEBio, BFGS method allows you to retain the inverse for several time steps (~25-30, but of course depends on the problem) before recomputing it. The BFGS method (along with other interesting update strategies) is described very elegantly in the Nonlinear FEM book by Peter Wriggers.

mrp089 · 2021-09-14T09:41:11Z

Thanks for the overview @vvedula22!

Quick update: I was able to compile Trilinos with Pardiso and create direct linear solvers with Amesos in svFSI. I'm a little bit stuck (as expected) with some of the Trilinos objects, but I think it's doable. I'll open a pull request (separate from G&R) once I have a running version.

vvedula22 · 2021-09-14T12:37:38Z

Awesome @mrp089 You may follow some examples in the amesos source if it helps. A separate pull request is a great idea. However, note that MUMPS would be ideal for svFSI as it not only handles MPI but also computes Schur complement for mixed problems (fluid, stokes, ustruct, FSI). The Amesos interface allows switching to MUMPS in a straightforward fashion but not all parameters (IPAR, RPAR) in mumps can be controlled through Amesos - this would still be a good starting point. We will also have to add BFGS method for direct solvers instead of the current full Newton approach.

mrp089 · 2021-09-28T21:44:08Z

I got the first version of this working. Here's a quick summary:

The current stiffness matrix K (used for iterative solvers within AztecOO) is stored in an Epetra_FEVbrMatrix that inherits from Epetra_VbrMatrix. It uses a dof x dof dense matrix block structure (not entirely sure if this is advantageous over just putting everything in an Epetra_CrsMatrix with a corresponding map). Consequently, X and F were also built as Epetra_Vector with an Epetra_BlockMap structure, making it a de facto Epetra_MultiVector (but not quite).

Unfortunately, after some debugging, it turned out that none of the linear solvers in Amesos supports the Epetra_VbrMatrix format (see Amesos 2.0 Reference Guide):

I spent most of the time implementing the conversion from the current structure to Epetra_CrsMatrix for K and Epetra_MultiVector for X and F. We can now choose between all Amesos supported linear solvers (that are installed on the user's system and have been compiled with Trilinos). I still need to implement passing that information in the svFSI input file and switching between iterative and direct solvers.

mrp089 · 2021-09-28T21:54:24Z

I'm slightly confused about how we apply Dirichlet boundary conditions (DBCs). I see they are passed to trilinos_global_solve_ via dirW. It looks like the DBCs are actually applied when calling constructJacobiScaling. Here, we assemble a diagonal matrix from the diagonal of the stiffness matrix K and the dirW vector.

What confuses me is that if I set zero DBCs, those DOFs are zeroed in dirW. When multiplying diagonal with K this causes zero rows wherever I want zero DBCs. If I understand that correctly, that doesn't seem like a robust way of applying DBCs. In fact, all direct linear solvers throw an error that the matrix is singular (except Pardiso).

@vvedula22 What is your take on this? Thank you!!

vvedula22 · 2021-09-29T01:02:03Z

@mrp089 That's great! Have you checked if changing the matrix structure impacts other linear solvers and preconditioners in Trilinos (AztecOO, Ifpack, ML?). Please also try different problems (fluid, FSI, with resistance bcs etc.) to make sure these changes don't break other implementations.

Regarding Dirichlet BCs, we apply them strongly. As we are solving for the increments in the solution, we zero-out the entire row corresponding to a Dirichlet node of the matrix but reset the diagonal entry to be equal to 1. The right-hand side residual vector is also zeroed-out for the corresponding row. As a result, the increment in the solution will be 0. Before entering the Newton iteration and calling the linear solver, we set Dirichlet BCs on that node to make sure the value is retained throughout the time step and its effect is included in the residue.

The matrix is not singular but zeroing-out a row might pose issues for some direct solvers that depend on pivoting. Usually, reordering is performed to make sure these issues are resolved and don't affect the pivoting process, but I am not sure how this is done for different direct solvers. You may have to go into their manual. E.g., MUMPS offers a variety of options in the preprocessing stage to analyze the matrix structure. Might be useful.

Alternately, you may apply the Dirichlet boundary condition as a forcing term integrated on that particular face. I believe svSolver already does that - at least based on Alberto Figueroa's thesis work (see Eqs. 3.8-3.11, 3.33-3.34 from the thesis). Please check.

You could also try applying Dirichlet BCs weakly and see if it helps. This is done using Nitsche's method and we already have it implemented for the fluid problem. However, this involves computing Cauchy stress using virtual displacements which is doable for fluids problems but may not be straightforward for nonlinear solid mechanics problems.

mrp089 · 2021-09-29T17:11:11Z

Thanks for the overview @vvedula22! I think there are two different places in svFSI where we apply DBCs: SETBCDIR* in SETBC.f and constructJacobiScaling in trilinos_linear_solver.cpp.

It's hard for me to understand what is happening in SETBC.f. All I know is when I call TRILINOS_GLOBAL_SOLVE from LS.f, we provide the stiffness matrix without DBCs in Val and the DBCs in tls%W:

svFSI/Code/Source/svFSI/LS.f

Lines 90 to 94 in 9c30521

    
               CALL TRILINOS_GLOBAL_SOLVE(Val, R, tls%R, tls%W, 
        
           2      lEq%FSILS%RI%fNorm, lEq%FSILS%RI%iNorm, lEq%FSILS%RI%itr, 
        
           3      lEq%FSILS%RI%callD, lEq%FSILS%RI%dB, lEq%FSILS%RI%suc, 
        
           4      lEq%ls%LS_type, lEq%FSILS%RI%reltol, lEq%FSILS%RI%mItr, 
        
           5      lEq%FSILS%RI%sD, lEq%ls%PREC_Type)

The DBCs are passed to trilinos_global_solve_ in trilinos_linear_solver.cpp as dirW:

svFSI/Code/Source/svFSI/trilinos_linear_solver.cpp

Lines 405 to 408 in 9c30521

    
           void trilinos_global_solve_(const double *Val, const double *RHS, double *x, 
        
                   const double *dirW, double &resNorm, double &initNorm, int &numIters, 
        
                   double &solverTime, double &dB, bool &converged, int &lsType, 
        
                   double &relTol, int &maxIters, int &kspace, int &precondType)

We then put the DBCs in dirW into a diagonal matrix in constructJacobiScaling. So if there's a zero DBC in dirW, there will be a zero in diagonal :

svFSI/Code/Source/svFSI/trilinos_linear_solver.cpp

Lines 840 to 843 in 9c30521

    
           int error = diagonal.ReplaceGlobalValue(localToGlobalSorted[i], 
        
                               j, //block offset 
        
                               0, //multivector of 1 vector 
        
                               dirW[i*dof+j]); //value to insert

This diagonal matrix is then used to scale the system:

svFSI/Code/Source/svFSI/trilinos_linear_solver.cpp

Line 865 in 9c30521

//Let diag = W, solving W*K*W*y = W*F, where X = W*y

So zero DBCs -> zero rows/columns in stiffness matrix. I think this went unnoticed for a while because trilinos_linear_solver.cpp only had iterative solvers that might be "immune" to a singular system (to a certain extent). However, when using direct solvers now, they immediately crash.

My solution would be to introduce the same DBC-handling in SETBC.f to trilinos_linear_solver.cpp: zero row, one on diagonal. I'll have to see if this has implications on the "down-stream" FORTRAN part of the solution process.

vvedula22 · 2021-10-02T00:40:41Z

Why do you have to modify SETBC.f? The left-hand side matrix is not assembled entirely in SETBC.f. And this may have disastrous implications in the svFSILS code for other different types of problems. Note that the DBCs are not handled in svFSI but in svFSILS.

I'd recommend you changing the diagonal element to 1 after pre-multiplying and post-multiplying the matrix with the diagonal scaling. That should avoid any error with the direct solver and the matrix is no longer singular.

mrp089 · 2021-10-02T01:07:09Z

I don't want to touch SETBC.f.

I discovered that constructJacobiScaling in trilinos_linear_solver.cpp has "disastrous implications" for the conditioning of my matrix :-) I fixed that by putting a 1 on the diagonal and 0 in the rest of the row.

I'm just confused how svFSI with iterative solvers ever worked robustly with a bunch of zero rows...

vvedula22 · 2021-10-02T10:50:58Z

@mrp089 haha :) sorry, I was slow on catching up with GitHub issues - should have warned you earlier.

I should have given you more information about DBCs but my other post was already becoming long.
SETBC.f doesn't handle DBCs directly. We only prescribe the values (An, Yn, or Dn) in SETBC.f. E.g., below is where we apply DBCs on the state variable (lY) and its time-derivative (lA): SETBC.f

DO a=1, msh(iM)%fa(iFa)%nNo AC = msh(iM)%fa(iFa)%gN(a) lA(s:e,Ac) = tmpA(:,a) lY(s:e,Ac) = tmpY(:,a) END DO
The actual treatment of the Dirichlet BCs is done during Jacobi preconditioning of the matrix in svFSILS, Here is the code for that:

`! Accounting for Dirichlet BC and inversing W = W^{-1/2}
DO Ac=1, nNo
d = diagPtr(Ac)
DO i=1, dof
IF (W(i,Ac) .EQ. 0._LSRP) W(i,Ac) = 1._LSRP
END DO
END DO

  W = 1._LSRP/SQRT(ABS(W))
  DO faIn=1, lhs%nFaces
     IF (.NOT.lhs%face(faIn)%incFlag) CYCLE
     i = MIN(lhs%face(faIn)%dof,dof)
     IF (lhs%face(faIn)%bGrp .EQ. BC_TYPE_Dir) THEN
        DO a=1, lhs%face(faIn)%nNo
           Ac = lhs%face(faIn)%glob(a)
           W(1:i,Ac) = W(1:i,Ac)*lhs%face(faIn)%val(1:i,a)
        END DO
     END IF
  END DO

! Pre-multipling K with W: K = W*K
CALL PREMUL(rowPtr, lhs%nNo, lhs%nnz, dof, Val, W)

! Multipling R with W: R = WR
R = WR

! Now post-multipling K by W: K = K*W
CALL POSMUL(rowPtr, colPtr, lhs%nNo, lhs%nnz, dof, Val, W)
The arrayWis initialized to the diagonal element of the matrix. If any of the elements is 0, it will be reset to 1. On the Dirichlet nodes, the value ofWis changed to 0W(1:i,Ac) = W(1:i,Ac)*lhs%face(faIn)%val(1:i,a). Later, when the matrix is premultiplied with W`, the entire row on the LHS matrix will be zeroed out. This step is then done for the corresponding row in the RHS residue.

This step doesn't pose an issue with iterative solvers because iterative solvers mostly perform matrix-vector products and norms and don't do any 'division by diagonal' on an already preconditioned matrix. We do see some issues due to this step while using ILU or ILUT, IC or ICT, preconditioners in Trilinos as these preconditioners perform pivoting in the LU decomposition step. It will be interesting to see how resetting the diagonal element to 1 affects iterative solvers because these elements now add some finite contribution to the matrix-vector products, although their RHS counterparts are zeroed out.

One of the advantages of treating DBCs the way it is done now is that it will allow us to treat other types of problems without explicitly applying any Dirichlet BC. E.g., when solving the mesh equation in FSI simulations where the entire solid domain is assumed to be fixed while the fluid mesh is deformed. Likewise, the presence of immersed bodies, edge rings formed at the inflow or outflow faces for CMM equation, etc. are handled easily this way.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Direct solver for non-symmetric matrices #48

Direct solver for non-symmetric matrices #48

mrp089 commented Sep 8, 2021

vvedula22 commented Sep 9, 2021 •

edited

mrp089 commented Sep 14, 2021

vvedula22 commented Sep 14, 2021

mrp089 commented Sep 28, 2021

mrp089 commented Sep 28, 2021

vvedula22 commented Sep 29, 2021

mrp089 commented Sep 29, 2021

vvedula22 commented Oct 2, 2021

mrp089 commented Oct 2, 2021

vvedula22 commented Oct 2, 2021

Direct solver for non-symmetric matrices #48

Direct solver for non-symmetric matrices #48

Comments

mrp089 commented Sep 8, 2021

vvedula22 commented Sep 9, 2021 • edited

mrp089 commented Sep 14, 2021

vvedula22 commented Sep 14, 2021

mrp089 commented Sep 28, 2021

mrp089 commented Sep 28, 2021

vvedula22 commented Sep 29, 2021

mrp089 commented Sep 29, 2021

vvedula22 commented Oct 2, 2021

mrp089 commented Oct 2, 2021

vvedula22 commented Oct 2, 2021

vvedula22 commented Sep 9, 2021 •

edited