So basically it needs to be a memcpy, and we need to recognize this case for these data types to avoid the unpacking/packing.
There is also the scenario where indices are not contiguous, in which case gather/scatter operations are more efficient with primitive types such as int64. For this test on AVX2, there are 291 instructions using `Dest[i] = Src[i];` compared to 38 instructions using `((uniform int64 * uniform)Dest)[i] = ((uniform int64 * uniform)Src)[i];`.
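To make the reinterpretation idea concrete, here is a minimal sketch in plain C rather than ISPC. The `Pair` struct and the `copy_as_words` function are hypothetical names, since the real type is not shown here; the sketch assumes the struct is exactly 8 bytes and uses `memcpy` to move each element as a single 64-bit word instead of field by field.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte struct; the actual type from the issue is not shown. */
typedef struct { float x; float y; } Pair;

/* The word-copy trick is only valid if the struct is exactly one 64-bit word. */
_Static_assert(sizeof(Pair) == sizeof(int64_t), "Pair must be 8 bytes");

/* Copy n structs, treating each one as a single 64-bit word. */
static void copy_as_words(Pair *dest, const Pair *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int64_t word;
        memcpy(&word, &src[i], sizeof word);   /* load the struct as one word */
        memcpy(&dest[i], &word, sizeof word);  /* store it without touching fields */
    }
}
```

The compile-time size assertion is one way to guard the cast; it is an assumption of this sketch, not necessarily the check used in the original code.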
A more efficient gather/scatter could also be devised for the case where indices are not contiguous but the source and destination have the same type: for each index, copy the whole struct from Src to Dest.
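The per-index copy can be sketched in scalar C as follows. This only shows the intended semantics, not the vector code a compiler would emit; `Vec2`, `copy_indexed`, and the `indices` array are hypothetical names introduced for this example.

```c
#include <stddef.h>

/* Hypothetical struct type shared by source and destination. */
typedef struct { float x; float y; } Vec2;

/* For each (possibly non-contiguous) index, copy the whole struct
 * from src to dest in one assignment, with no per-field unpacking. */
static void copy_indexed(Vec2 *dest, const Vec2 *src,
                         const int *indices, size_t count) {
    for (size_t k = 0; k < count; k++) {
        int i = indices[k];
        dest[i] = src[i];  /* whole-struct copy at the same index */
    }
}
```

Elements whose index does not appear in `indices` are left untouched in the destination.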
Here is a very simplified example:
This could be avoided with:
Right now I'm using this optimization, but to be safe I added this check:
But I think it would be nice if the compiler could optimize for this case, and I believe the approach can be adapted to structs of any size.