Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LoongArch support #4652

Merged
merged 4 commits into from May 23, 2024
Merged

Fix LoongArch support #4652

merged 4 commits into from May 23, 2024

Conversation

heiher
Copy link
Contributor

@heiher heiher commented May 9, 2024

This PR fixes support for LoongArch so that it can run on LoongArch systems.

@heiher heiher marked this pull request as draft May 10, 2024 01:40
@heiher heiher marked this pull request as ready for review May 11, 2024 09:28
@heiher heiher requested a review from kinke May 11, 2024 09:28
runtime/druntime/src/core/threadasm.S Outdated Show resolved Hide resolved
runtime/druntime/src/core/threadasm.S Outdated Show resolved Hide resolved
@heiher heiher force-pushed the master branch 3 times, most recently from 969ccb2 to fae65b3 Compare May 14, 2024 07:38

Type *ty = arg.type->toBasetype();
if (ty->isintegral() && (ty->ty == TY::Tint32 || ty->ty == TY::Tuns32)) {
arg.attrs.addAttribute(LLAttribute::SExt);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer: https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc

One exception to the above rule is that in the LP64D ABI, unsigned words, such as those representing unsigned int in C, are stored in general-purpose registers as proper sign extensions of their 32-bit values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a source comment, I was pretty sure this has got to be wrong.

Copy link
Contributor Author

@heiher heiher May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a source comment and describe more here.

For example:

C

int32.c: (https://godbolt.org/z/vcjErxj76)

int check_int(int v)
{
	return v;
}

unsigned int check_uint(unsigned int v)
{
	return v;
}

llvm ir:

define dso_local noundef signext i32 @check_int(i32 noundef returned signext %0) local_unnamed_addr #0 !dbg !9 {
  ...
}

define dso_local noundef signext i32 @check_uint(i32 noundef returned signext %0) local_unnamed_addr #0 !dbg !18 {
  ...
}

D

int32.d:

extern(C) int check_int(int v)
{
	return v;
}

extern(C) uint check_uint(uint v)
{
	return v;
}

llvm ir without this patch:

define i32 @check_int(i32 returned %v_arg) local_unnamed_addr #0 {
  ...
}

define i32 @check_uint(i32 returned %v_arg) local_unnamed_addr #0 {
  ...
}

llvm ir with this patch:

define signext i32 @check_int(i32 returned signext %v_arg) local_unnamed_addr #0 {
  ...
}

define signext i32 @check_uint(i32 returned signext %v_arg) local_unnamed_addr #0 {
  ...
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that there's also TY::Tdchar, which is a 32-bit integer too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really. The corresponding type char32_t in C/C++ is also sign-extended. Thanks.

@heiher
Copy link
Contributor Author

heiher commented May 20, 2024

@kinke gentle ping.

// recursively visit a POD struct to flatten it
// FIXME: may cause low performance
// dmd may cache argtypes in some other architectures as a TypeTuple, but we
// need to additionally store field offsets to realign later
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please give an example of such a problematic struct, where the layout doesn't match?

Also note that a union would have overlapping fields. Official ABI docs often don't explicitly detail how unions are to be dealt with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And stylistically, it looks like you never need the original ty, only ty->toBasetype(), so flattening it once at the beginning might make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example:

C

f32x2.c: (https://godbolt.org/z/vK599hrq7)

struct f32x2_s
{
	float a;
	float b;
};

struct f32x2_s f_f32x2_s(struct f32x2_s x)
{
	x.a += 1.0;
	x.b += 1.0;
	return x;
}

llvm ir:

define dso_local { float, float } @f_f32x2_s(float %0, float %1) local_unnamed_addr #0 !dbg !9 {
    ...
}

D

f32x2.d:

extern(C) struct f32x2_s
{
	float a;
	float b;
}

extern(C) f32x2_s f_f32x2_s(f32x2_s x)
{
	x.a += 1.0;
	x.b += 1.0;
	return x;
}

llvm ir without this patch:

define i64 @f_f32x2_s(i64 %x_arg) local_unnamed_addr #0 {
  ...
}

llvm ir with this patch:

define { float, float } @f_f32x2_s({ float, float } %x_arg) local_unnamed_addr #0 {
  ...
}

Copy link
Member

@kinke kinke May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant an example where the struct needing a rewrite has a different layout than the rewritten one. Where you need to store the field offsets and do the piecewise memcpy's, instead of a trivial bitcast. This struct here wouldn't need any rewrite at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is copied from RISC-V, and I don't know which struct need to be rewritten to use temporary memory space to realign. Let's simplify it unless there are new tips.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is copied from RISC-V

Oh I see! Well if there'd be no need to ever 're-align', only bitcast, then a frontend toArgTypes() implementation would be much preferred, see e.g. https://github.com/ldc-developers/ldc/blob/master/dmd/argtypes_aarch64.d. That takes care of caching this info per type, makes it available for the va_arg implementation in druntime too, and also makes it usable for other compilers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review and help. I agree that the front-end implementation would be a much better approach. I'm a newbie in D, so this might take some time for me. It would be best if we could move forward with the current patch to make things better, and I will continue to improve it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I did one more review round, just some little nits.


bool requireHardfloatRewrite(Type *ty) {
if (!ty->toBasetype()->isTypeStruct())
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This excludes complex types and static arrays.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I suggest adding the isPOD check here (and probably a size check to avoid costly flattening for larger types), instead of doing that at every call site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller has done an isPOD check before calling requireHardfloatRewrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I'm saying, it's ugly to do that at every call site, instead of once here and having the guarantee for the following logic then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -46,11 +188,23 @@ struct LoongArch64TargetABI : TargetABI {

void rewriteFunctionType(IrFuncTy &fty) override {
if (!fty.ret->byref) {
rewriteArgument(fty, *fty.ret);
if (!skipReturnValueRewrite(fty)) {
if (!fty.ret->byref && isPOD(fty.ret->type) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!fty.ret->byref is checked 3 times here; skipReturnValueRewrite() checks it too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor Author

@heiher heiher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.


bool requireHardfloatRewrite(Type *ty) {
if (!ty->toBasetype()->isTypeStruct())
return false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller has done an isPOD check before calling requireHardfloatRewrite.

// recursively visit a POD struct to flatten it
// FIXME: may cause low performance
// dmd may cache argtypes in some other architectures as a TypeTuple, but we
// need to additionally store field offsets to realign later
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example:

C

f32x2.c: (https://godbolt.org/z/vK599hrq7)

struct f32x2_s
{
	float a;
	float b;
};

struct f32x2_s f_f32x2_s(struct f32x2_s x)
{
	x.a += 1.0;
	x.b += 1.0;
	return x;
}

llvm ir:

define dso_local { float, float } @f_f32x2_s(float %0, float %1) local_unnamed_addr #0 !dbg !9 {
    ...
}

D

f32x2.d:

extern(C) struct f32x2_s
{
	float a;
	float b;
}

extern(C) f32x2_s f_f32x2_s(f32x2_s x)
{
	x.a += 1.0;
	x.b += 1.0;
	return x;
}

llvm ir without this patch:

define i64 @f_f32x2_s(i64 %x_arg) local_unnamed_addr #0 {
  ...
}

llvm ir with this patch:

define { float, float } @f_f32x2_s({ float, float } %x_arg) local_unnamed_addr #0 {
  ...
}

@@ -46,11 +188,23 @@ struct LoongArch64TargetABI : TargetABI {

void rewriteFunctionType(IrFuncTy &fty) override {
if (!fty.ret->byref) {
rewriteArgument(fty, *fty.ret);
if (!skipReturnValueRewrite(fty)) {
if (!fty.ret->byref && isPOD(fty.ret->type) &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


Type *ty = arg.type->toBasetype();
if (ty->isintegral() && (ty->ty == TY::Tint32 || ty->ty == TY::Tuns32)) {
arg.attrs.addAttribute(LLAttribute::SExt);
Copy link
Contributor Author

@heiher heiher May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a source comment and describe more here.

For example:

C

int32.c: (https://godbolt.org/z/vcjErxj76)

int check_int(int v)
{
	return v;
}

unsigned int check_uint(unsigned int v)
{
	return v;
}

llvm ir:

define dso_local noundef signext i32 @check_int(i32 noundef returned signext %0) local_unnamed_addr #0 !dbg !9 {
  ...
}

define dso_local noundef signext i32 @check_uint(i32 noundef returned signext %0) local_unnamed_addr #0 !dbg !18 {
  ...
}

D

int32.d:

extern(C) int check_int(int v)
{
	return v;
}

extern(C) uint check_uint(uint v)
{
	return v;
}

llvm ir without this patch:

define i32 @check_int(i32 returned %v_arg) local_unnamed_addr #0 {
  ...
}

define i32 @check_uint(i32 returned %v_arg) local_unnamed_addr #0 {
  ...
}

llvm ir with this patch:

define signext i32 @check_int(i32 returned signext %v_arg) local_unnamed_addr #0 {
  ...
}

define signext i32 @check_uint(i32 returned signext %v_arg) local_unnamed_addr #0 {
  ...
}

@heiher heiher force-pushed the master branch 2 times, most recently from 0dd59cc to 2f9b60c Compare May 22, 2024 12:37

public:
auto returnInArg(TypeFunction *tf, bool) -> bool override {
if (tf->isref()) {
return false;
}
Type *rt = tf->next->toBasetype();
if (!rt->size()) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really required. Every non-POD has a non-empty size (empty structs have a size of 1 - it's only void, noreturn and 0-length static arrays that have a size of 0).

@@ -38,19 +133,30 @@ struct LoongArch64TargetABI : TargetABI {
}

auto passByVal(TypeFunction *, Type *t) -> bool override {
if (!t->size()) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

}

for (auto arg : fty.args) {
if (!arg->byref) {
if (!arg->byref && requireHardfloatRewrite(arg->type)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now means that if arg->byref is true, rewriteArgument() is called, which is wrong.

@kinke kinke merged commit c91e199 into ldc-developers:master May 23, 2024
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants