-
Notifications
You must be signed in to change notification settings - Fork 707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matrix-free MG transfer: Switch interpolation matrices to double type #16721
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check if we have a test with float?
We have the performance test
Regarding correctness, I see it in dealii/tests/multigrid-global-coarsening/parallel_multigrid_mf_02.cc Lines 270 to 274 in 329fee6
vcvtsd2ss for converting double -> float follow by vbroadcastss between registers) but I believe we have many more.
|
Thanks for checking! |
template <int dim, typename Number> | ||
template <int dim, typename Number, typename Number2> | ||
class CellProlongator | ||
{ | ||
public: | ||
CellProlongator(const AlignedVector<Number> &prolongation_matrix, | ||
const AlignedVector<Number> &prolongation_matrix_1d, | ||
const Number *evaluation_data_coarse, | ||
Number *evaluation_data_fine) | ||
const Number2 *evaluation_data_coarse, | ||
Number2 *evaluation_data_fine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually need both Number
and Number2
as template arguments? At least at the call sites you modify below, Number
is always double
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My hope is to later use it with either float
or double
, depending on VectorType::value_type
and with the right choice for complex numbers. I can change it for now if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you've got plans for it, let's leave it as is.
As discussed in #16719, the extension of the matrix-free multigrid infrastructure to complex numbers (which is a desirable goal) necessitates some different choices. Thanks to #15207 and #15217, that switch is rather easy from a software point of view, as the tensor product evaluators already support that setting.
This pull request might not be the final step, because storing the tabulated shape functions as
double
entries is not the right choice when the work arrays are of typeVectorizedArray<float>
, but the compiler generates surprisingly good code nonetheless, keeping it to a single conversiondouble
->float
in reading the array. On AMD CPUs I tested, I can hardly see performance impact of the additional instructions, albeit I believe that Intel CPUs (up to Ice Lake at least) will lead to half the throughput on the inner kernel because the conversion/broadcast from register is on the same execution units as the arithmetic work; on the other hand, the vector access is a much larger share of the instructions in this scenario. So given the background, I am happy to pull this to our main branch right away, and find a better solution later when we have made a step towards complex numbers elsewhere. The idea for the better solution is to selectfloat
as the number type when there isVectorizedArray<float>
orVectorizedArray<std::complex<float>>
orstd::complex<VectorizedArray<float>>
or something like that, anddouble
in the other cases. But that can wait, as surprising as this might sound coming from me.