Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT ARM64-SVE: Add CreateWhileLessThan* #100949

Merged
merged 8 commits into from Apr 25, 2024

Conversation

a74nh
Copy link
Contributor

@a74nh a74nh commented Apr 12, 2024

Support for all CreateWhileLessThanMask and CreateWhileLessThanOrEqualMask APIs.

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 12, 2024
@a74nh a74nh force-pushed the CreateWhileLessThan_github branch from e2c5d99 to 944467a Compare April 12, 2024 10:33
@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

The instruction emitted is dependent on if the inputs are signed or unsigned. This information is lost when the nodes are created (an input with CORINFO_TYPE_ULONG gets created as a node with TYP_LONG due to the conversion in JITtype2varType()).

The easiest way to mark this is to use GTF_UNSIGNED. But that is only used elsewhere by GT_CAST nodes, so I'm a little concerned this may break something.

@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

❯ $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_CreateWhileLessThan
10:42:42.647 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_Int32()
Supported ISAs:
  AdvSimd:   True
  Aes:       True
  ArmBase:   True
  Crc32:     True
  Dp:        True
  Rdm:       True
  Sha1:      True
  Sha256:    True
  Sve:       True

Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.734 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_Int32()
10:42:42.750 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.760 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_Int64()
10:42:42.762 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.770 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_UInt32()
10:42:42.772 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.780 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask16Bit_UInt64()
10:42:42.781 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.789 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_Int32()
10:42:42.791 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.799 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_Int64()
10:42:42.800 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.807 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_UInt32()
10:42:42.810 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.817 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask32Bit_UInt64()
10:42:42.819 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.826 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_Int32()
10:42:42.828 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.835 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_Int64()
10:42:42.837 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.844 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_UInt32()
10:42:42.846 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.853 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask64Bit_UInt64()
10:42:42.855 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.862 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_Int32()
10:42:42.864 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.871 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_Int64()
10:42:42.873 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.880 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_UInt32()
10:42:42.882 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.889 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanMask8Bit_UInt64()
10:42:42.891 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.899 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_Int32()
10:42:42.900 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.908 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_Int64()
10:42:42.909 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.917 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_UInt32()
10:42:42.919 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.926 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask16Bit_UInt64()
10:42:42.928 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.935 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_Int32()
10:42:42.937 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.944 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_Int64()
10:42:42.946 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.953 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_UInt32()
10:42:42.955 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.962 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask32Bit_UInt64()
10:42:42.964 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.971 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_Int32()
10:42:42.973 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.980 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_Int64()
10:42:42.982 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.989 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_UInt32()
10:42:42.991 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:42.998 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask64Bit_UInt64()
10:42:43.000 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_Int32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:43.008 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_Int32()
10:42:43.009 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_Int64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:43.016 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_Int64()
10:42:43.018 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_UInt32()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:43.026 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_UInt32()
10:42:43.028 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_UInt64()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
10:42:43.035 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_CreateWhileLessThanOrEqualMask8Bit_UInt64()

@a74nh a74nh marked this pull request as ready for review April 12, 2024 10:45
@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

@kunalspathak @dotnet/arm64-contrib

@tannergooding
Copy link
Member

This information is lost when the nodes are created (an input with CORINFO_TYPE_ULONG gets created as a node with TYP_LONG due to the conversion in JITtype2varType()).

This information should be tracked by simdBaseJitType, which is the entire point of that field and why all HWIntrinsic nodes require it to be specified. It tracks the true type of the operation regardless of the type of the inputs (which may be reinterpret casted, be typed according to the JIT requirements, or some other consideration)

@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

This information is lost when the nodes are created (an input with CORINFO_TYPE_ULONG gets created as a node with TYP_LONG due to the conversion in JITtype2varType()).

This information should be tracked by simdBaseJitType, which is the entire point of that field and why all HWIntrinsic nodes require it to be specified. It tracks the true type of the operation regardless of the type of the inputs (which may be reinterpret casted, be typed according to the JIT requirements, or some other consideration)

Problem here is that, for example:

public static unsafe Vector<ushort> CreateWhileLessThanOrEqualMask16Bit(int left, int right);
public static unsafe Vector<ushort> CreateWhileLessThanOrEqualMask16Bit(uint left, uint right);

For both of these, the simdBaseJitType is ushort. Should it be set to int or uint then? If so, I'd can do that at the same place I currently set GTF_UNSIGNED

@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

Problem here is that, for example:

public static unsafe Vector<ushort> CreateWhileLessThanOrEqualMask16Bit(int left, int right);
public static unsafe Vector<ushort> CreateWhileLessThanOrEqualMask16Bit(uint left, uint right);

For both of these, the simdBaseJitType is ushort. Should it be set to int or uint then? If so, I'd can do that at the same place I currently set GTF_UNSIGNED

I've pushed a version that does this. But we still don't have enough information as there are two distinct types - in the example above - the input type (int or uint) is used to select the instruction , and the return type (ushort) is the size used within the instruction encoding.

With the latest code....

  • simdBaseJitType is now set to CORINFO_TYPE_UINT or CORINFO_TYPE_INT.
  • The correct whilelt/whilelo instruction is picked up from the table in hwintrinsiclistarm64sve.h
  • When calling emitIns_R_R_R In hwintrinsiccodegenarm64.cpp, opt needs to be set to the size of elements in the output vector - INS_OPTS_SCALABLE_H
    • The generic code will call optGetSveInsOpt() using baseType, but this is CORINFO_TYPE_UINT or CORINFO_TYPE_INT.
    • The specialised code instead calls optGetSveInsOpt() using the type of node, but this is TYP_MASK

@tannergooding
Copy link
Member

tannergooding commented Apr 12, 2024

But we still don't have enough information as there are two distinct types

The return type is implicit based on the intrinsicId (in this case CreateWhileLessThanOrEqualMask16Bit, which explicitly indicates it is 16-bit and therefore we know it must be ushort).

In general, simdBaseJitType is meant to be the type that disambiguates the overload and therefore is one of the input types. This defaults to returnType under the presumption that it is sufficient (getting the return type is slightly cheaper and is sufficient for the vast majority of intrinsics), but we have HW_Flag_BaseTypeFromFirstArg and HW_Flag_BaseTypeFromSecondArg if it needs to be driven off either of the main input parameters instead.

The returnType then either matches simdBaseJitType or is implicitly known based on the intrinsicId, since we don't overload by return type.

In the incredibly rare case there needs to be a 3rd type present (such as is done for NI_AdvSimd_AddWideningUpper), we then have the gtAuxiliaryJitType field which can track an additional CorJitType to help with directing general handling. This additional field should really only be needed when overload resolution is chosen based on 2 different input types.

@a74nh
Copy link
Contributor Author

a74nh commented Apr 12, 2024

The return type is implicit based on the intrinsicId (in this case CreateWhileLessThanOrEqualMask16Bit, which explicitly indicates it is 16-bit and therefore we know it must be ushort).

Yes! That's a simple fix then

but we have HW_Flag_BaseTypeFromFirstArg and HW_Flag_BaseTypeFromSecondArg if it needs to be driven off either of the main input parameters instead.

Looks like using this requires arg1/arg2 to be a SIMD type, whereas here we just have a scalar. Is it ok to extend these flags to support scalar args too?

@tannergooding
Copy link
Member

Is it ok to extend these flags to support scalar args too?

Yep, that should be fine!

The parameter is named simdBaseJitType but that's more a historical artifact. It would be better named something else that makes it more clear its just the actual type required to disambiguate the overload, but that's a very messy change to make at this point.

@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label Apr 12, 2024
@a74nh
Copy link
Contributor Author

a74nh commented Apr 15, 2024

Is it ok to extend these flags to support scalar args too?

Yep, that should be fine!

The parameter is named simdBaseJitType but that's more a historical artifact. It would be better named something else that makes it more clear its just the actual type required to disambiguate the overload, but that's a very messy change to make at this point.

Done. Now supports simdBaseJitType allowing a scalar - and if so the simdSize is 0.

Also had to make changes to convert-mask-to-vector code. Need to ensure it always uses the simdBaseJitType and simdSize of the return node (otherwise when using HW_Flag_BaseTypeFromFirstArg it ends up converting using the wrong type)

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak
Copy link
Member

superpmi-replay failure is #101070

Comment on lines 1303 to 1304
case NI_Sve_CreateWhileLessThanMask8Bit:
case NI_Sve_CreateWhileLessThanOrEqualMask8Bit:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to be a more common pattern? Should we have a way to make it more table driven?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is we need to know the name of the intrinsic in order to know the opt value (eg INS_OPTS_SCALABLE_H).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we not tracking the vector type anywhere? We have a couple fields (simdBaseJitType and altType for example) so we should be able to track both the overload type (int vs uint vs long vs ulong) and the vector base type/size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the code so that it uses the return vector type as the basetype, and set auxiliary type to arg1 type. That simplifies the code a lot and removes many of my changes.

@@ -1515,6 +1515,17 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
}
break;

case NI_Sve_CreateWhileLessThanMask8Bit:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how we have these target specific switch statements in hwintrinsic.cpp, it makes the code hard to follow. I highly expect the number of these special cases to grow too.

I'd much prefer it if each of these instrinsics (including all the neon and X86 ones) were marked with SpecialImport. The special import cases would then have to duplicate the common get args code, but it's only a few lines (which could go into a helper). Not going to do it for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, we've definitely done that for almost all the x86/x64 code, there's only a couple special handlers left for that platform (like Sse42.Crc32). It would be nice to get both platforms doing this overall consistently here.

@a74nh
Copy link
Contributor Author

a74nh commented Apr 25, 2024

Merged to main. Results using the new stress tester:

❯ ~/dotnet/sve_api_runtime/stress_tester.py $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_CreateWhile
===================Running default===================
{}
===================Running jitstress===================
{'JitMinOpts': '1'}
{'JitStress': '1'}
{'JitStress': '2'}
{'JitStress': '1', 'TieredCompilation': '1'}
{'JitStress': '2', 'TieredCompilation': '1'}
{'TailcallStress': '1'}
{'ReadyToRun': '0'}
===================Running jitstressregs===================
{'JitStressRegs': '1'}
{'JitStressRegs': '2'}
{'JitStressRegs': '3'}
{'JitStressRegs': '4'}
{'JitStressRegs': '8'}
{'JitStressRegs': '0x10'}
{'JitStressRegs': '0x80'}
{'JitStressRegs': '0x1000'}
{'JitStressRegs': '0x2000'}
===================Running jitstress2-jitstressregs===================
{'JitStress': '2', 'JitStressRegs': '1'}
{'JitStress': '2', 'JitStressRegs': '2'}
{'JitStress': '2', 'JitStressRegs': '3'}
{'JitStress': '2', 'JitStressRegs': '4'}
{'JitStress': '2', 'JitStressRegs': '8'}
{'JitStress': '2', 'JitStressRegs': '0x10'}
{'JitStress': '2', 'JitStressRegs': '0x80'}
{'JitStress': '2', 'JitStressRegs': '0x1000'}
{'JitStress': '2', 'JitStressRegs': '0x2000'}

@kunalspathak
Copy link
Member

/ba-g Failure is #101559

@kunalspathak kunalspathak merged commit a8e74e3 into dotnet:main Apr 25, 2024
160 of 168 checks passed
@a74nh a74nh deleted the CreateWhileLessThan_github branch April 26, 2024 09:13
matouskozak pushed a commit to matouskozak/runtime that referenced this pull request Apr 30, 2024
* JIT ARM64-SVE: Add CreateWhileLessThan*

* Set simdBaseJitType to type of input args

* Hardcode opt in codegen

* Fix gtNewSimdConvertMaskToVectorNode types

* Use HW_Flag_BaseTypeFromFirstArg

* Set base type to return type and auxiliary type to input type
michaelgsharp pushed a commit to michaelgsharp/runtime that referenced this pull request May 9, 2024
* JIT ARM64-SVE: Add CreateWhileLessThan*

* Set simdBaseJitType to type of input args

* Hardcode opt in codegen

* Fix gtNewSimdConvertMaskToVectorNode types

* Use HW_Flag_BaseTypeFromFirstArg

* Set base type to return type and auxiliary type to input type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Runtime.Intrinsics arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants