make fails with ccache-wrapped gcc #58

Open
garrison opened this issue Sep 18, 2021 · 16 comments

Comments

@garrison

garrison commented Sep 18, 2021

I've noticed that ekam fails to build when ccache is installed on Fedora 34, with a bunch of errors like ccache: error: Failed to create directory /var/cache/ccache/a/d: Permission denied. On Fedora, ccache seems to use a shared /var/cache/ccache directory, unlike my machines running Debian derivatives, which use ~/.ccache. Unfortunately, Fedora 34 also adds ccache (edit: more precisely, the directory containing ccache-wrapped compilers) to the PATH by default if it is installed, so it's quite easy to run into this on Fedora.

@kentonv
Member

kentonv commented Sep 18, 2021

Hmm, I'm confused. AFAICT Ekam doesn't use ccache unless you tell it to. ccache isn't even mentioned anywhere in the repository. To enable it you'd normally have to set the environment variable CXX_WRAPPER=ccache.

Does Fedora somehow cause the compiler to automatically use ccache?

@garrison
Author

garrison commented Sep 18, 2021

$ which g++
/usr/lib64/ccache/g++

Yes. The directory containing ccache-wrapped versions of compilers is added to the PATH in /etc/profile.d/ccache.sh.

@kentonv
Member

kentonv commented Sep 18, 2021

So Fedora seems to have installed a compiler that doesn't work? That seems surprising but doesn't seem like an Ekam bug. I don't think there's anything in Ekam specifically that would conflict with the use of /var/cache/ccache as the cache directory.

@garrison
Author

garrison commented Sep 19, 2021

Well, I'm not certain that there is an Ekam bug, but something about it is interacting strangely with ccache when it is installed on Fedora. If I boot the following Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "generic/fedora34"
  config.vm.provision "shell", inline: <<-SHELL
    dnf -y install git gcc-c++ make ccache findutils diffutils automake nano
  SHELL
end

and attempt to build capnproto from source using autotools, it succeeds. But if I then try to build ekam from source using make, it fails (log here). If I uninstall ccache, everything works. Curiously, when run under ccache, there are plenty of No such file or directory errors followed by some linking errors well before ccache is even named in the Permission denied errors that I mentioned.

This is all pretty easy to work around -- just remove ccache -- but I am entertaining the possibility that the root cause here might be related to the build errors under nix-build, which apparently (and unfortunately) were not fixed by #59 (see zenhack/ekam-nix#3). [EDIT: My speculation here was wrong; the build errors under nix turned out to be unrelated.]

@garrison
Author

Fedora's /var/cache/ccache has permissions rwxrws---. Is there any possibility that ekam's seccomp filter could be preventing subprocesses from writing there? (That was my first suspicion, anyway, when I filed this bug.)

@zenhack
Contributor

zenhack commented Sep 19, 2021

What group owns that directory, and are you a member of that group?

@garrison
Author

ccache owns the directory. I am a member of the group on my local machine, but I suspect vagrant may not be a member on the virtual one. I will dig into this tomorrow.

@kentonv
Member

kentonv commented Sep 19, 2021

Ekam doesn't use a seccomp filter; rather, it uses LD_PRELOAD to load a library that intercepts many syscalls. While that library could certainly lead to bugs, it generally doesn't touch syscalls accessing files outside of the build directory. So, I'd be surprised if it is causing EPERM errors when opening files under some cache directory. I think a regular user/group permissions issue seems more likely.

@garrison
Author

I had indeed forgotten to run usermod -a -G ccache vagrant on the VM. But, as on my local machine, the error remains even after I run that command and log in (i.e., vagrant ssh) again.

Actually, as a precondition of Fedora enabling the shared cache in /etc/profile.d/ccache.sh, it first tests if that directory is writable by the user. Otherwise, it defaults to ~/.cache/ccache. This can be confirmed in the output of ccache -s both when vagrant is and is not a member of the group ccache. It turns out, when vagrant is not a member of ccache, the errors I get are actually of the following variety: ccache: error: Failed to create directory /home/vagrant/.cache/ccache/f/d: Permission denied. So this whole thing can be reproduced at the single-user level, even without the shared cache directory. As I said previously, I've been unable to reproduce any of this on Debian, though I did wrongly conclude before that my inability to reproduce there had something to do with there not being a shared cache directory.

@kentonv
Member

kentonv commented Sep 20, 2021

Oh, maybe it has to do with this code from intercept.c:

      /* Absolute path or under `deps`.  Note the access but don't remap. */
      if (usage == WRITE) {
        /* Cannot write to absolute paths. */
        funlockfile(ekam_call_stream);
        errno = EACCES;
        if (debug) fprintf(stderr, "  absolute path, can't write\n");
        return NULL;
      }

Ekam doesn't want build actions writing outside of the build directory since it can't track what they did...
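
The rule in that snippet can be restated as a small standalone function. This is only a sketch for illustration: `sketch_usage_t` and `sketch_access_allowed` are hypothetical names, not part of intercept.c, and it ignores the special cases the real code handles (paths under the current directory, /tmp, /var/tmp, /proc). It shows why a write to an absolute path such as /var/cache/ccache/a/d surfaces in the build action as "Permission denied".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names for illustration only; the real intercept.c differs. */
typedef enum { SKETCH_READ, SKETCH_WRITE } sketch_usage_t;

/*
 * Returns true if the access would be allowed under the quoted rule.
 * On denial, sets errno to EACCES, which the calling program typically
 * reports as "Permission denied".
 */
static bool sketch_access_allowed(const char *pathname, sketch_usage_t usage) {
  assert(pathname != NULL);

  /* Absolute paths and paths under deps/ are not remapped. */
  bool not_remapped =
      pathname[0] == '/' || strncmp(pathname, "deps/", 5) == 0;

  if (not_remapped && usage == SKETCH_WRITE) {
    errno = EACCES; /* writes to such paths are refused */
    return false;
  }
  return true; /* reads, and writes to other relative paths, pass */
}
```

Under this sketch, reading /usr/include/stdio.h is allowed, while writing anywhere under an absolute cache directory is refused regardless of the directory's actual Unix permissions.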

@garrison
Author

garrison commented Sep 20, 2021

In support of that hypothesis, I managed to reproduce on Debian bullseye, which has the same version of ccache (version 4.2) as Fedora 34. Previously, I had been unable to reproduce using ccache 3.4.1 (on an Ubuntu 18.04 machine, not in a virtual environment).

FROM debian:bullseye
RUN apt-get update && apt-get install -y git build-essential automake ccache
RUN git clone https://github.com/capnproto/ekam.git
WORKDIR /ekam
RUN PATH=/usr/lib/ccache:$PATH make

EDIT: The same Dockerfile fails if the base image is changed to ubuntu:18.04. I am not sure why I was unable to reproduce on my local machine originally.

garrison changed the title from "make fails with ccache installed under Fedora" to "make fails with ccache-wrapped gcc" on Sep 20, 2021
@garrison
Author

garrison commented Oct 7, 2021

I experimented with the following patch, which explicitly allows reads and writes under /var/cache/ccache to bypass special handling. With it, everything works with ccache. Ideally, this allowlist should be configurable, perhaps via an environment variable. But a reasonable default would be to include $HOME/.cache/ccache, $HOME/.ccache, $XDG_CACHE_HOME/ccache, and $CCACHE_DIR, assuming each of these environment variables is set. This list follows https://ccache.dev/manual/4.3.html#config_cache_dir.

diff --git a/src/ekam/rules/intercept.c b/src/ekam/rules/intercept.c
index 3fc5a3a..fd618b8 100644
--- a/src/ekam/rules/intercept.c
+++ b/src/ekam/rules/intercept.c
@@ -273,6 +273,7 @@ static const char VAR_TMP[] = "/var/tmp";
 static const char TMP_PREFIX[] = "/tmp/";
 static const char VAR_TMP_PREFIX[] = "/var/tmp/";
 static const char PROC_PREFIX[] = "/proc/";
+static const char VAR_CACHE_CCACHE_PREFIX[] = "/var/cache/ccache/";
 
 static usage_t last_usage = READ;
 static char last_path[PATH_MAX] = "";
@@ -542,6 +543,12 @@ static const char* remap_file(const char* syscall_name, const char* pathname,
       pathname = pathname + strlen(current_dir);
     } else if (pathname[0] == '/' ||
                strncmp(pathname, "deps/", 5) == 0) {
+      /* First, allow ccache to break the rules. */
+      if (strncmp(pathname, VAR_CACHE_CCACHE_PREFIX, strlen(VAR_CACHE_CCACHE_PREFIX)) == 0) {
+        funlockfile(ekam_call_stream);
+        /*if (debug)*/ fprintf(stderr, "  allowing ccache access: %s %s\n", (usage == WRITE) ? "write" : "read ", pathname);
+        return pathname;
+      }
       /* Absolute path or under `deps`.  Note the access but don't remap. */
       if (usage == WRITE) {
         /* Cannot write to absolute paths. */

First build, with an empty cache:

real	2m0.932s
user	12m52.643s
sys	0m40.038s

Second build, with a warm cache:

real	1m1.668s
user	4m36.971s
sys	0m19.542s

I was actually hoping for a more dramatic improvement, but this does provide some progress toward an alternative, partial solution to #24 and #28. In particular, tests need to run again regardless, but I am not sure what fraction of the warm-cache time is spent running tests vs. compiling.


Somewhat independent thought: I think that intercept.c should really send a message to stderr when it denies a write with EACCES, regardless of the value of debug. Is there any reason to think doing so might make the output too noisy in an undesirable way?
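
The default allowlist proposed earlier in this comment ($HOME/.cache/ccache, $HOME/.ccache, $XDG_CACHE_HOME/ccache, $CCACHE_DIR, each only if the variable is set) could be built roughly as follows. This is a sketch under stated assumptions, not a tested patch: all `sketch_*` names and the fixed-size table are hypothetical, and the environment values are passed in as parameters so the logic can be exercised without touching the real environment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PREFIXES 4
#define SKETCH_PREFIX_LEN 4096

/* Appends base+suffix (ending in exactly one '/') if base is set and
 * non-empty. Returns the new number of entries in the list. */
static size_t sketch_add_prefix(char list[][SKETCH_PREFIX_LEN], size_t count,
                                const char *base, const char *suffix) {
  assert(list != NULL && suffix != NULL);
  if (base == NULL || base[0] == '\0' || count >= SKETCH_MAX_PREFIXES)
    return count;
  char *dst = list[count];
  int n = snprintf(dst, SKETCH_PREFIX_LEN, "%s%s", base, suffix);
  if (n <= 0 || n + 1 >= SKETCH_PREFIX_LEN) return count; /* too long */
  if (dst[n - 1] != '/') { /* make sure the prefix ends with a slash */
    dst[n] = '/';
    dst[n + 1] = '\0';
  }
  return count + 1;
}

/* Builds the candidate list; any argument may be NULL (variable unset). */
static size_t sketch_default_ccache_prefixes(char list[][SKETCH_PREFIX_LEN],
                                             const char *home,
                                             const char *xdg_cache_home,
                                             const char *ccache_dir) {
  size_t count = 0;
  count = sketch_add_prefix(list, count, home, "/.cache/ccache");
  count = sketch_add_prefix(list, count, home, "/.ccache");
  count = sketch_add_prefix(list, count, xdg_cache_home, "/ccache");
  count = sketch_add_prefix(list, count, ccache_dir, "");
  return count;
}

/* Returns 1 if pathname lies under any listed prefix, else 0. */
static int sketch_is_allowlisted(char list[][SKETCH_PREFIX_LEN], size_t count,
                                 const char *pathname) {
  for (size_t i = 0; i < count; i++) {
    if (strncmp(pathname, list[i], strlen(list[i])) == 0) return 1;
  }
  return 0;
}
```

A caller would pass getenv("HOME"), getenv("XDG_CACHE_HOME"), and getenv("CCACHE_DIR"). Note that this covers only the environment-variable cases; a cache_dir set in a ccache configuration file would not be picked up.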

@kentonv
Member

kentonv commented Oct 8, 2021

@vlovich has somehow made ccache work with Ekam without changes. Maybe he can advise on the right answer here?

@vlovich
Contributor

vlovich commented Oct 10, 2021

The Ekam changes are already landed but the user has to do some integration work currently. Basically:

EKAM_REMAP_BYPASS_DIR=$(ccache -k cache_dir)/

The trailing slash is important.
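
One plausible reason the slash matters, assuming the bypass value is compared as a plain string prefix the way the experimental patch earlier in this thread compares its prefix (an assumption; the thread does not show how Ekam compares it): without the slash, sibling directories that merely share the leading characters would match as well. `sketch_has_prefix` is a hypothetical helper used only to show the difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if pathname starts with prefix, byte for byte. */
static bool sketch_has_prefix(const char *pathname, const char *prefix) {
  assert(pathname != NULL && prefix != NULL);
  return strncmp(pathname, prefix, strlen(prefix)) == 0;
}
```

With the prefix "/var/cache/ccache/", the path /var/cache/ccache/a/d matches and /var/cache/ccache-old/x does not. With "/var/cache/ccache" (no slash), both match, which is broader than intended.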

Arguably maybe Ekam should auto add the ccache directory for the user if ccache is detected as installed, but doing so within C felt annoying so I went with higher-level tooling providing this info instead.

@garrison
Author

Thanks. That works, and performs still better than the change I made:

real	0m46.119s
user	3m30.541s
sys	0m16.794s

Before filing this issue, I had run git grep ccache, git log -Sccache, and searched open and closed issues. With there being zero open pull requests, I did not think to also search closed pull requests. I just discovered git log --grep=ccache -- that would have given me some information too, had I tried it.

But even if I had found #44, I wouldn't have known to use a trailing slash.

> Arguably maybe Ekam should auto add the ccache directory for the user if ccache is detected as installed, but doing so within C felt annoying so I went with higher-level tooling providing this info instead.

A fork/exec to run ccache -k cache_dir would be, as you point out, the most direct way. As I implied above, an alternative would be to try to figure out the "right" path by looking only at environment variables, following the procedure ccache uses to set the default cache_dir, whether or not ccache is actually installed. I'm not sure this would be much simpler though, and it seems a bit brittle too.
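
For the fork/exec route, a minimal sketch using POSIX popen might look like the following. It assumes nothing about where in Ekam such a call would live; `sketch_query_dir` is a hypothetical helper, and the command is a parameter so that it can be tried with any command that prints a directory. Running a subprocess from inside an LD_PRELOAD interceptor could itself be intercepted, so this would more naturally sit in a wrapper or in the main process.

```c
#define _POSIX_C_SOURCE 200809L /* for popen/pclose */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Runs `command` through the shell, takes the first line of its output as a
 * directory path, and makes sure the result ends with '/'.
 * Returns 0 on success, -1 if the command fails or prints nothing.
 */
static int sketch_query_dir(const char *command, char *out, size_t out_size) {
  assert(command != NULL && out != NULL && out_size >= 3);

  FILE *pipe = popen(command, "r");
  if (pipe == NULL) return -1;

  /* Read at most out_size - 2 characters, leaving room for a '/'. */
  char *line = fgets(out, (int)out_size - 1, pipe);
  int status = pclose(pipe);
  if (line == NULL || status != 0) return -1;

  size_t len = strcspn(out, "\r\n"); /* strip the trailing newline */
  out[len] = '\0';
  if (len == 0) return -1;

  if (out[len - 1] != '/') { /* append the trailing slash */
    out[len] = '/';
    out[len + 1] = '\0';
  }
  return 0;
}
```

Called as sketch_query_dir("ccache -k cache_dir", buf, sizeof buf), this would yield the same value as the shell expression $(ccache -k cache_dir)/ shown above, provided ccache is on the PATH.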

@vaci

vaci commented Nov 2, 2023

Just to note that, despite setting the EKAM_REMAP_BYPASS_DIRS env var, I get the following error from ccache (version 4.8.3) when attempting to run it as the CXX_WRAPPER in ekam:

ccache: Fd.hpp:76: int Fd::operator*() const: failed assertion: m_fd != -1

I haven't yet been able to track down exactly which file ccache is presumably failing to access...

5 participants