Restrict pointers, GCC & Clang #443

Open
hugomg opened this issue Jul 31, 2021 · 2 comments
Labels
discussion Just for discussion

hugomg commented Jul 31, 2021

This issue collects information about what the C compiler can and cannot optimize when we use restrict pointers.

The motivation is that there are some pointers whose targets we know Lua won't modify behind our back, such as the stack pointer or the upvalues array of a closure. However, the C compiler doesn't know that and may generate suboptimal assembly code that accesses this data more than once. In theory, using restrict might help, because it promises the compiler that the data reached through the pointer is not modified through any other, unrelated pointer. However, C compilers don't always take advantage of restrict annotations. Current versions of GCC in particular have some important limitations:

GCC and Clang only care about restrict on function parameters

The C language allows local variables to be declared with restrict, but from what I can gather, both GCC and Clang only take advantage of the restrict qualifier if it is used on function parameters.

GCC Bug: 60712
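
To make the two spellings concrete, here is a minimal sketch. The function names are made up for illustration and are not from Pallene. Both functions are valid C99 and return the same value when the pointers don't alias; the only question, per the claim above, is whether the compiler uses the annotation to fold the final load into the constant 1. Comparing the output of `-O2 -S` for the two functions is one way to check a given compiler version.

```c
/* restrict on function parameters: the form compilers are reported to exploit.
 * If a and b cannot alias, the final load of *a can be replaced by 1. */
int store_then_load_param(int *restrict a, int *restrict b)
{
    *a = 1;
    *b = 2;
    return *a;
}

/* restrict on block-scope variables: legal C, but the compiler may ignore it
 * and reload *a after the store to *b. */
int store_then_load_local(int *a_in, int *b_in)
{
    int *restrict a = a_in;
    int *restrict b = b_in;
    *a = 1;
    *b = 2;
    return *a;
}
```

Calling either function with two pointers to the same object would be undefined behavior, so any test has to pass distinct objects.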

GCC does not optimize restrict across function calls

In the following example, GCC dereferences the pointer twice. Clang dereferences it only once.

void g(void);  /* defined elsewhere, so the compiler cannot see its body */

int f(int *restrict x)
{
    int a = *x;
    g();
    int b = *x;
    return a + b;
}

It is not clear to me if this is simply a matter of this optimization not being implemented yet, or if GCC takes a more conservative interpretation of the C standard compared to Clang.

GCC Bugs: 81008, 81009, 89479


The following GCC bug is a "meta" bug collecting all the open issues related to restrict pointers: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49774

hugomg added the discussion label Jul 31, 2021
hugomg commented Aug 3, 2021

After looking a bit more into this problem, I found another limitation: when the restrict pointer is passed as an argument to the function call, neither GCC nor Clang optimizes away the second memory read. The called function is allowed to modify the memory the pointer points to, because the pointer it receives is "derived from" the restrict pointer.

This is likely to affect the base and the G/K pointers, because they are passed as arguments when we call other Pallene functions.

// Neither gcc nor clang optimize this
void g(int *p);  /* defined elsewhere; may write through p */

int f(int *restrict x)
{
    int a = *x;
    g(x);
    int b = *x;
    return a + b;
}
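
A small sketch of why the reload is required here, not merely a missed optimization. The callee `bump` is a hypothetical stand-in for `g`; it writes through the pointer it receives, which is allowed because that pointer is based on the restrict-qualified one.

```c
/* Hypothetical callee: modifies the object through the pointer it is given. */
static void bump(int *p)
{
    *p += 1;
}

int read_around_call(int *restrict x)
{
    int a = *x;   /* first read */
    bump(x);      /* the write goes through a pointer based on x, so it is legal */
    int b = *x;   /* must observe the value written by bump */
    return a + b;
}
```

If the compiler reused `a` for `b`, a call that starts with `*x == 1` would return 2 instead of the correct 3.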

However, maybe it's not so bad that the compiler doesn't optimize the restrict pointers across a function call. In order to keep a local variable "alive" through a function call, the compiler must store it in a precious callee-saved register, of which there aren't that many.

Consider the following program, where we try to avoid a second dereference by saving the value in a local variable. In theory, this is only useful if the a variable is not spilled. If it is spilled, then reloading it also costs a memory access, this time through the stack pointer. Lazily recomputing it is likewise just one memory dereference, but through the x pointer.

void g(int *p);  /* defined elsewhere */

int f(int *restrict x)
{
    int a = *x;
    g(x);
    int b = a;  /* reuse the saved value instead of reading *x again */
    return a + b;
}

I don't know if there is a performance difference between dereferencing the stack pointer and dereferencing K or U, because I haven't measured it. However, if there isn't a big difference, that would be an argument in favor of keeping the current system of lazily dereferencing K and U.

hugomg commented Aug 18, 2021

Another thing to add to the discussion: only a limited number of function arguments can be passed in registers. Are we sure that we want to spend two of those registers on K and U?
