Returning multiple arguments via struct #495

Open
hugomg opened this issue Oct 15, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@hugomg
Member

hugomg commented Oct 15, 2021

I wonder if it would be faster to return multiple values via a struct, instead of passing one output pointer for each return value.

typedef struct {
    lua_Number out1;
    lua_Integer out2;
} R1;

R1 function_02(void)
{
    R1 ret;
    ret.out1 = 3.14;
    ret.out2 = 42;
    return ret;
}

Possible advantages:

  • On some architectures, a struct that is small enough is returned via CPU registers.
  • If the struct is passed by reference, it needs a single pointer instead of one pointer per return value.
  • Taking the address of one of the x1 variables may stop it from being stored in a register; returning a struct avoids taking that address.

Possible disadvantages:

  • The code would be more complicated
  • We need to test it to see if it is actually faster
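
To make the comparison concrete, here is a minimal sketch of the two calling conventions side by side. It is only an illustration, not the code the compiler generates: `number_t` and `integer_t` are stand-ins for `lua_Number` and `lua_Integer` so that it compiles without the Lua headers, and the function names are hypothetical.

```c
#include <stdint.h>

/* Stand-ins for lua_Number and lua_Integer (assumption: double and int64_t). */
typedef double  number_t;
typedef int64_t integer_t;

typedef struct {
    number_t  out1;
    integer_t out2;
} R1;

/* Pointer style: one output pointer per return value. */
void by_pointers(number_t *out1, integer_t *out2)
{
    *out1 = 3.14;
    *out2 = 42;
}

/* Struct style: both values come back in a single struct, returned by value. */
R1 by_struct(void)
{
    R1 ret;
    ret.out1 = 3.14;
    ret.out2 = 42;
    return ret;
}

/* Caller for the pointer style: a and b have their addresses taken. */
integer_t call_by_pointers(void)
{
    number_t a;
    integer_t b;
    by_pointers(&a, &b);
    return (integer_t)a + b;
}

/* Caller for the struct style: a and b are plain copies of the fields,
 * so their addresses are never taken. */
integer_t call_by_struct(void)
{
    R1 r = by_struct();
    number_t a = r.out1;
    integer_t b = r.out2;
    return (integer_t)a + b;
}
```

Whether `R1` actually travels in registers depends on the platform ABI and the compiler, so the generated assembly would need to be inspected on each target.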
@hugomg hugomg added the enhancement New feature or request label Oct 15, 2021
@srijan-paul
Member

Would it be useful to benchmark something like this before making any changes?
If the implementation turns out to be simple enough, we could perhaps even run the benchmarks after we already have it working, instead of hand-editing the C code like we did when testing out upvalue box merging.

@hugomg
Member Author

hugomg commented Oct 15, 2021

I agree that this needs to be benchmarked first, before we merge it.

I expect that the implementation will be complex enough that editing the generated code by hand may still be worth it. That said, if someone wants to try implementing the full thing straight away then I won't stop them.

@hugomg
Member Author

hugomg commented Oct 16, 2021

I ran some tests for the N=2 case on some artificial microbenchmarks, with three variants:

  1. returning a struct
size_t s = 0;
for (size_t i = 0; i < N; i++) {
    Ret ret = foo(x);
    s += ret.a;
    s += ret.b;
    bar();
    s += ret.a;
    s += ret.b;
}
  2. returning via pointers, assigning to "x1" variables
size_t s = 0;
size_t a, b;
for (size_t i = 0; i < N; i++) {
    foo(x, &a, &b);
    s += a;
    s += b;
    bar();
    s += a;
    s += b;
}
  3. returning via pointers, but assigning to temporary variables
size_t s = 0;
size_t a,b;
for (size_t i = 0; i < N; i++) {
    {
        size_t c, d;
        foo(x, &c, &d);
        a = c; b = d;
    }
    s += a;
    s += b;
    bar();
    s += a;
    s += b;
}

In this extremely artificial microbenchmark, options 1 and 3 took about 6% less time than option 2. This is an extreme example, and the bodies of the foo and bar functions are extremely simple (they just return the argument they receive). I would expect the performance improvement to be smaller if the bodies of foo and bar were larger.

Based on this, the performance gain doesn't seem very impressive for N = 2. However, we may want to at least consider returning to a temporary variable, to avoid taking the address of an x1 variable.

I still haven't tested what happens with N >= 3.
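
For anyone who wants to repeat the experiment, the three variants can be written as self-contained functions roughly like the sketch below. This is a reconstruction under assumptions, not the exact code that was measured: the bodies of `foo_struct`, `foo_ptrs` and `bar`, the `NOINLINE` macro, and the `sink` variable are all hypothetical, chosen only so that the compiler cannot inline or delete the calls.

```c
#include <stddef.h>

/* Assumption: GCC/Clang noinline attribute; expands to nothing elsewhere. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

typedef struct { size_t a, b; } Ret;

/* Volatile counter so that calls to bar() cannot be optimized away. */
static volatile size_t sink;

NOINLINE Ret  foo_struct(size_t x) { Ret r = { x, x }; return r; }
NOINLINE void foo_ptrs(size_t x, size_t *a, size_t *b) { *a = x; *b = x; }
NOINLINE void bar(void) { sink = sink + 1; }

/* Variant 1: returning a struct. */
size_t variant_struct(size_t n, size_t x)
{
    size_t s = 0;
    for (size_t i = 0; i < n; i++) {
        Ret ret = foo_struct(x);
        s += ret.a; s += ret.b;
        bar();
        s += ret.a; s += ret.b;
    }
    return s;
}

/* Variant 2: returning via pointers straight into the long-lived variables. */
size_t variant_ptrs(size_t n, size_t x)
{
    size_t s = 0;
    size_t a, b;
    for (size_t i = 0; i < n; i++) {
        foo_ptrs(x, &a, &b);
        s += a; s += b;
        bar();
        s += a; s += b;
    }
    return s;
}

/* Variant 3: returning via pointers into short-lived temporaries,
 * then copying, so the addresses of a and b are never taken. */
size_t variant_temps(size_t n, size_t x)
{
    size_t s = 0;
    size_t a, b;
    for (size_t i = 0; i < n; i++) {
        {
            size_t c, d;
            foo_ptrs(x, &c, &d);
            a = c; b = d;
        }
        s += a; s += b;
        bar();
        s += a; s += b;
    }
    return s;
}
```

Each function can be timed by wrapping a call with a large `n` in `clock()` from `<time.h>`. The same layout should extend to N >= 3 by adding fields to `Ret` and extra output pointers to `foo_ptrs`.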
