Controlling the C compiler for the last bit of performance and cross compiler portability #515

Open
SamirDroubi opened this issue Oct 5, 2023 · 0 comments
Labels
C: Codegen The final C code generation S: Needs Discussion This needs discussion to decide if important to work

Comments

@SamirDroubi
Collaborator

SamirDroubi commented Oct 5, 2023

So far, we have relied on the C compiler as-is to compile Exo programs, with some optimization flag set (e.g. simply -O3) and the target architecture identified (-march).

This has quickly become insufficient, as we care about getting the best possible performance on our kernels across all inputs. Currently, for example, I disable some loop optimizations because they work against what I am trying to do whenever the compiler's cost model decides otherwise.

It might make sense to start thinking about how to control C compilers so that they do what is useful for Exo programs, rather than treating Exo programs as generic programs. At the end of the day, Exo programs are 1) very specific semantically, 2) very specific in the way the code is represented, and 3) already optimized.

We might want to recommend a general set of flags to use when compiling our programs. A friend of mine worked on a tool (https://github.com/ethanlabelle/compiler_tuner) that tunes the compiler parameters for a given program. It might be a fun exercise to use it to tune the C compiler on each kernel we have implemented so far and see whether there is a shared set of flags across all kernels. The alternative would be to go through all the compiler flags and decide which do or don't make sense, but that may require far too much time.
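To make the per-kernel tuning idea concrete, here is a minimal sketch of a brute-force search over a handful of flag toggles, followed by an intersection step to find flags shared by every kernel's best configuration. This is not how compiler_tuner works; it is only an illustration. The toggles are real GCC flags, but the choice of toggles, the function names, and the assumption that each kernel ships as a single C file with its own benchmark `main` are all hypothetical.

```python
import itertools
import subprocess
import time

BASE_FLAGS = ["-O3", "-march=native"]
# Assumed candidates: these GCC flags exist, but whether they help Exo kernels is untested.
TOGGLES = ["-fno-unroll-loops", "-fno-tree-vectorize", "-fno-peel-loops"]


def candidate_flag_sets(base, toggles):
    """Yield base flags combined with every subset of the toggles."""
    for r in range(len(toggles) + 1):
        for combo in itertools.combinations(toggles, r):
            yield list(base) + list(combo)


def build_command(cc, source, output, flags):
    """Return the argv list that compiles `source` into `output`."""
    return [cc, *flags, source, "-o", output]


def time_run(binary, repeats=5):
    """Run the benchmark binary several times and keep the fastest wall time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([binary], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def tune_kernel(cc, source, binary="./kernel_bench"):
    """Compile and time every candidate flag set; return the fastest one."""
    results = {}
    for flags in candidate_flag_sets(BASE_FLAGS, TOGGLES):
        subprocess.run(build_command(cc, source, binary, flags), check=True)
        results[tuple(flags)] = time_run(binary)
    return min(results, key=results.get)


def shared_flags(best_per_kernel):
    """Intersect the winning flag sets of all kernels."""
    sets = [set(flags) for flags in best_per_kernel.values()]
    return set.intersection(*sets) if sets else set()
```

Calling `tune_kernel("gcc", "some_kernel_bench.c")` for each kernel and passing the results to `shared_flags` would show whether any flags win everywhere. Exhaustive search only scales to a few toggles, which is why a dedicated tuner is attractive for the full flag space.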

Another idea is to add tooling so that, if we go through LLVM, we can emit the specific optimization passes we want applied to Exo programs, which would give us even more granular control over the C compilation process. There has also been recent work (https://arxiv.org/pdf/2309.07062.pdf) on using LLMs to generate the proper LLVM flags to tune a given piece of LLVM-IR. You could imagine training something like this on a set of LLVM-IR generated by compiling the C code generated from Exo programs.
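As a rough sketch of what going through LLVM could look like, the helpers below build the three command lines involved: have clang emit IR without running its own pass pipeline, run an explicit pass list with `opt`, then lower the result with `llc`. The pass list and file names are placeholders I picked for illustration, not a recommendation, and nothing like this exists in Exo today.

```python
import subprocess

# Placeholder pipeline: real LLVM pass names, but an arbitrary selection.
EXAMPLE_PASSES = ["mem2reg", "instcombine", "simplifycfg", "loop-vectorize"]


def emit_ir_cmd(source, ir_path, march="native"):
    """clang command that emits textual IR and skips LLVM's default passes.

    -O3 is kept so functions are not marked optnone, which would make
    later passes skip them.
    """
    return ["clang", "-O3", f"-march={march}", "-S", "-emit-llvm",
            "-Xclang", "-disable-llvm-passes", source, "-o", ir_path]


def opt_cmd(ir_in, ir_out, passes):
    """opt command that runs exactly the given passes (new pass manager syntax)."""
    return ["opt", "-S", f"-passes={','.join(passes)}", ir_in, "-o", ir_out]


def llc_cmd(ir_in, obj_out):
    """llc command that lowers IR to an object file without re-running the middle end."""
    return ["llc", "-O3", "-filetype=obj", ir_in, "-o", obj_out]


def compile_with_passes(source, obj_out, passes=EXAMPLE_PASSES):
    """Run the three stages in order; raises if any tool fails."""
    raw_ir, tuned_ir = "kernel.raw.ll", "kernel.tuned.ll"
    for cmd in (emit_ir_cmd(source, raw_ir),
                opt_cmd(raw_ir, tuned_ir, passes),
                llc_cmd(tuned_ir, obj_out)):
        subprocess.run(cmd, check=True)
```

An Exo-side tool could then emit a pass list per kernel alongside the generated C and feed it to something like `compile_with_passes`.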

Another concern: getting more control over the C compiler might help Exo programs achieve better performance portability across compilers. Currently, I see some non-trivial degradation as I move between compiler vendors and compiler versions; this is mostly on smaller sizes, where the code around the loops can have a visible impact on performance.
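One way to track this kind of degradation is to normalize each compiler's time against the fastest compiler at every problem size. The sketch below assumes timings have already been collected into a nested dict; the data layout, function names, and the 5% threshold are my own choices for illustration.

```python
def relative_slowdown(timings):
    """Map {compiler: {size: seconds}} to {size: {compiler: ratio to fastest}}."""
    sizes = set()
    for per_size in timings.values():
        sizes.update(per_size)
    report = {}
    for size in sorted(sizes):
        # Only compare compilers that were actually measured at this size.
        measured = {cc: t[size] for cc, t in timings.items() if size in t}
        fastest = min(measured.values())
        report[size] = {cc: secs / fastest for cc, secs in measured.items()}
    return report


def flag_regressions(report, threshold=1.05):
    """List (size, compiler, ratio) entries slower than the fastest by more than the threshold."""
    return [(size, cc, ratio)
            for size, row in report.items()
            for cc, ratio in sorted(row.items())
            if ratio > threshold]
```

Running `flag_regressions(relative_slowdown(timings))` over results from several compiler vendors and versions would show whether the gaps really cluster at the small sizes.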

@SamirDroubi SamirDroubi added C: Codegen The final C code generation S: Needs Discussion This needs discussion to decide if important to work labels Oct 5, 2023