Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute a better alpha factor in the implementation of the SPD-ness of the Newton method. #5555

Open
bangerth opened this issue Jan 30, 2024 · 6 comments

Comments

@bangerth
Copy link
Contributor

@YiminJin pointed out in an email to @MFraters that we have a mistake in the paper that describes the Newton method. Specifically, in the derivation of eq. (18), we introduce a vector

  v = cos(theta) a/||a|| + sin(theta) b/||b||

that for different values must include the two eigenvectors of the operator E^sym = a \otimes b + b \otimes a. But this is not correct, as he points out: The formula for v is only true if a and b are parallel.

For vectors (actually, symmetric tensors) a, b that are not perpendicular, one should replace b by c = b - b:a/(||a|| ||b||), i.e., the component of b perpendicular to a. Assuming c is not (numerically) zero, the formulas below remain true. For c=0, i.e., if b is parallel to a, E^sym has one zero eigenvalue and one that is easily computable as 2 a:b (which can be both positive and negative).

All of this could probably be used to compute an alpha factor in the Newton implementation that is closer to one and that perhaps leads to faster convergence.

@bangerth bangerth changed the title Computer a better alpha factor in the implementation of the SPD-ness of the Newton method. Compute a better alpha factor in the implementation of the SPD-ness of the Newton method. Jan 30, 2024
@YiminJin
Copy link
Contributor

Thank @bangerth for raising this issue. It may not be a good time to discuss about it at this busy moment, but I would be grateful if someone can take a look and check my derivations in his spare time.

My point is that the top-left block of the Jacobian matrix for most of the homogeneous rheology models we use are non-negative definite. The detailed derivations are presented in the following 4-page pdf:
pd_jacobian.pdf

If my conclusions are correct, then we can simply scale the Newton step by a user-specified number $\alpha \in (0,1)$, which can be very close to 1 in order to gain faster convergence rate.

I have implemented a test code in https://github.com/YiminJin/aspect/tree/newton-solver. In this branch, I set $\alpha = 0.8$ when "Stabilization preconditioner" and "Stabilization velocity block" are set to "none", and fixed some defects in source/material_model/rheology/visco_plastic.cc and source/simulator/assemblers/newton_stokes.cc. The resulting code can significantly improve the convergence of benchmark/viscoelastic_plastic_shear_bands/gerya_2019/gerya_2019_vp and benchmark/viscoelastic_plastic_shear_bands/gerya_2019/gerya_2019_vep (the prm files are also modified).

The reason that the Newton solver is unstable without stabilization in the original code is complex. To make a proper modification, one may have to change the structure of MaterialModel. I think it would be more efficient to have an online meeting about this issue when most of us are free.

@YiminJin
Copy link
Contributor

I found some mistakes in the pdf. Here is the revised version:
pd_jacobian.pdf

@bangerth
Copy link
Contributor Author

For reference, the analysis in @YiminJin's paper only applies if you have that eta(eps) is a function of only ||eps||. In that case, it is true that d eta / d eps is a vector (tensor) parallel to eps itself, and it is correct that one gets one zero eigenvalue and one eigenvalue that is negative for strain rate weakening. For the rheologies considered on pages 3,4, it is true that the resulting matrix is always positive semidefinite even without stabilization, but that is not the case if one had a material that is more strongly strain rate weakening.

@bangerth
Copy link
Contributor Author

Also for reference, if we wanted to address this, the function that does the work in question is in base/utilities.cc:

    template <int dim>
    double compute_spd_factor(const double eta,
                              const SymmetricTensor<2,dim> &strain_rate,
                              const SymmetricTensor<2,dim> &dviscosities_dstrain_rate,
                              const double SPD_safety_factor)
    {
      // if the strain rate is zero, or the derivative is zero, then
      // the exact choice of alpha factor does not matter because the
      // factor that it multiplies is zero -- so return the best value
      // (i.e., one)
      if ((strain_rate.norm() == 0) || (dviscosities_dstrain_rate.norm() == 0))
        return 1;

      const double norm_a_b = std::sqrt((strain_rate*strain_rate)*(dviscosities_dstrain_rate*dviscosities_dstrain_rate));//std::sqrt((deviator(strain_rate)*deviator(strain_rate))*(dviscosities_dstrain_rate*dviscosities_dstrain_rate));
      const double contract_b_a = (dviscosities_dstrain_rate*strain_rate);
      const double one_minus_part = 1 - (contract_b_a / norm_a_b);
      const double denom = one_minus_part * one_minus_part * norm_a_b;

      // the case denom == 0 (smallest eigenvalue is zero), should return one,
      // and it does here, because C_safety * 2.0 * eta is always larger then zero.
      if (denom <= SPD_safety_factor * 2.0 * eta)
        return 1.0;
      else
        return std::max(0.0, SPD_safety_factor * ((2.0 * eta) / denom));
    }

@YiminJin
Copy link
Contributor

YiminJin commented Jan 31, 2024

I agree with @bangerth that my conclusion only applies to homogeneous models with moderate strain weakening. If the strain weakening is extremely strong (for example, the stress exponent in the power-law model is $&lt;$ 0), or if the material model is inhomogeneous, the SPD stabilization would be necessary.

The biggest trouble in modifying the code is in the elastic rheology. For viscoelastic models, we have
$$\boldsymbol{\tau} = 2\eta^{ve}\dot{\boldsymbol{\varepsilon}}^{ve},$$
where
$$\eta^{ve} \equiv \left(\frac{1}{\eta} + \frac{1}{G\Delta t}\right)^{-1}, \quad \dot{\boldsymbol{\varepsilon}}^{ve} \equiv \dot{\boldsymbol{\varepsilon}} + \frac{\boldsymbol{\tau}^{old}}{2G\Delta t}.$$
Consequently, the fourth-order tensor $\boldsymbol{a}\otimes\boldsymbol{b}$ for viscoelastic or viscoelastic-plastic models should be
$$\frac{\partial\eta^{ve(p)}}{\partial\dot{\boldsymbol{\varepsilon}}^{ve}}\otimes\dot{\boldsymbol{\varepsilon}}^{ve} = \frac{\partial\eta^{ve(p)}}{\partial\dot{\boldsymbol{\varepsilon}}}\otimes\dot{\boldsymbol{\varepsilon}}^{ve},$$
not
$$\frac{\partial\eta^{ve(p)}}{\partial\dot{\boldsymbol{\varepsilon}}}\otimes\dot{\boldsymbol{\varepsilon}}$$
as in functions NewtonStokesPreconditioner::execute() and NewtonStokesIncompressibleTerms::execute() in source/simulator/assembers/newton_stokes.cc.
To calculate $\dot{\boldsymbol{\varepsilon}}^{ve}$, we need to call function rheology::Elasticity::calculate_viscoelastic_strain_rate(). However, in the current code rheology::Elasticity is inside the material model, and only class ViscoPlastic provides access to Elasticity (in class Viscoelastic the member Elasticity is private). Therefore, to fix this issue we may have to change the structure of the material model.

@gassmoeller
Copy link
Member

Do I understand correctly that this was fixed by the linked PR? If so, feel free to close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants