Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating symbolic links to toolchains #605

Open
AtomicOperation opened this issue Mar 15, 2024 · 10 comments
Open

Creating symbolic links to toolchains #605

AtomicOperation opened this issue Mar 15, 2024 · 10 comments

Comments

@AtomicOperation
Copy link

I'm trying to set up a rule that will create a symbolic link inside the buck-out folder to a toolchain that is stored elsewhere. I have a package management system that will ensure that the contents inside the symlinked folder won't change. The rule has an attribute which is the toolchain version, and when the rule/script is run with a different version, it will adjust the symlink to the right target path.

 def _install_toolchain_impl(ctx = "context"):
    base_folder = ctx.actions.declare_output("out", dir = True)
    toolchain_folder = base_folder.project(ctx.attrs.toolchain_name)

    command = cmd_args(["install_toolchain.py", 
        "--toolchain", ctx.attrs.toolchain_name,
        "--version",  ctx.attrs.version,
        "--destination", toolchain_folder.as_output(),
        ] 
    )

    sub_targets = {}
    for sub_target in ctx.attrs.sub_targets:
        output = package_folder.project(sub_target)
        command.hidden(output.as_output())
        sub_targets[sub_target] = [DefaultInfo(default_outputs = [output])]

The result of the script is to have out/toolchain_name be a symlink to the target directory.

However, buck is strict about what it considers to be a directory.

If I use declare_output with dir = True, I get something like Required outputs are missing: Action didn’t produce output of the right type. Expected buck-out/v2/gen/root/folder/toolchain_name to be Directoryreal File.

If I use base_folder.project() (like in the example above) then I get an error like Find would traverse a leaf at path: buck-out/v2/gen/root/folder/out/include.

It makes sense that buck is trying to stop me from declaring an artifact that's nested below a regular file. Is there any way to convince it to allow this? I'm suspecting that the best thing to do would be to find the place where the "Action didn't produce output of the right type" error is returned and change it to allow symlinks to folders. Any other thoughts?

@JakobDegen
Copy link
Contributor

If you intend for the out artifact to be a directory, then you definitely need to declare it with dir = True, and buck is right to make you do that.

I suspect the reason that things go wrong if you do that is because you're trying to use a projected artifact as an output. It's not really clear what the behavior of that is supposed to be to me, I wouldn't be surprised if no one ever really thought about that case. If you switch to something like "--destination", cmd_args(base_folder.as_output(), ctx.attrs.toolchain_name, delimiter = "/"), it'll work better.

I would have to dig into this a bit more to confirm though. I also realize that won't work on windows as written too, which is a bit meh. May think about this more

@zjturner
Copy link
Contributor

Isn’t http_archive using a projected artifact as an output?

path: [DefaultInfo(default_output = output.project(path))]

@AtomicOperation
Copy link
Author

I've tried changing the python script to implicitly create the directory (so the --destination arg is passing in just the base folder) and that has the same problem. It seems to be related to the output = package_folder.project(sub_target) call when an intermediate folder on-disk is a symlink. I'd have to go back and verify, but I'm pretty sure the same rule works if the python script creates actual folders without symlinks.

@JakobDegen
Copy link
Contributor

@zjturner output of an action, not output of analysis.

@AtomicOperation can you confirm, is the python script trying to make the output dir a symlink or the toolchain_name in the output dir?

@AtomicOperation
Copy link
Author

I'm trying to make an output dir (base_folder in the snippet above) and a symlink to a folder underneath that (toolchain_folder in the snippet).

@AtomicOperation
Copy link
Author

I suspect the reason that things go wrong if you do that is because you're trying to use a projected artifact as an output. It's not really clear what the behavior of that is supposed to be to me, I wouldn't be surprised if no one ever really thought about that case

@JakobDegen can you explain this a little more? Isn't the entire point of projecting an artifact to create an output relative to some other output?

@JakobDegen
Copy link
Contributor

can you explain this a little more? Isn't the entire point of projecting an artifact to create an output relative to some other output?

Right, but the typical expectation is that you first declare the artifact, use .as_output() to make it the output of some action, and only then project it - in other words, projection is meant for an already bound artifact.

So concretely, I would expect that the version of the code that works looks something like this:

def _install_toolchain_impl(ctx = "context"):
    base_folder = ctx.actions.declare_output("out", dir = True)

    command = cmd_args(["install_toolchain.py", 
        "--toolchain", ctx.attrs.toolchain_name,
        "--version",  ctx.attrs.version,
        "--destination", base_folder.as_output(),
        ] 
    )

    sub_targets = {}
    for sub_target in ctx.attrs.sub_targets:
        output = base_folder.project(ctx.attrs.toolchain_name).project(sub_target)
        command.hidden(output.as_output())
        sub_targets[sub_target] = [DefaultInfo(default_outputs = [output])]

Note that I'm passing the entire base_folder.as_output() to the action, and then only calling .project() afterwards. Obviously this would need adjustments to the python script

@AtomicOperation
Copy link
Author

Thanks, but unfortunately doing it the way you suggest with the .project() call happening after .as_output() results in the same Find would traverse a leaf at path... error.

@zjturner
Copy link
Contributor

I looked into this some and I can reproduce this. Here's a minimal reproducer:

# root//bucklink:BUCK
load(":rule.bzl", "link_dir")

link_dir(
    name = "makelink",
    link_name = "hello",
    link_target = "D:\\src\\buck2",
    script = "makelink.py",
    sub_targets = [
        "buck2",
        "prelude"
    ]
)
# root//bucklink:makelink.py

import os
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--link_name", type=str)
parser.add_argument("--link_target", type=str)
parser.add_argument("--destination", type=str)

args = parser.parse_args()

os.makedirs(args.destination, exist_ok=True)

symlink_path = os.path.join(args.destination, args.link_name)
link_location = args.destination
link_target = args.link_target
print(f"Creating symlink from {symlink_path} to {link_target}")
os.symlink(link_target, symlink_path, target_is_directory=True)
# root//bucklink:rule.bzl

load("@prelude//python:toolchain.bzl", "PythonToolchainInfo")


def _link_dir_impl(ctx:AnalysisContext) -> list[Provider]:
    base_folder = ctx.actions.declare_output("out", dir = True)

    python = ctx.attrs.python[PythonToolchainInfo]
    args = python.interpreter.args.copy()
    args.add(ctx.attrs.script)
    args.add("--link_name", ctx.attrs.link_name)
    args.add("--link_target", ctx.attrs.link_target)
    args.add("--destination", base_folder.as_output())

    ctx.actions.run(
        args,
        category = "genrule",
    )

    sub_targets = {}
    for sub_target in ctx.attrs.sub_targets:
        output = base_folder.project(ctx.attrs.link_name).project(sub_target)
        sub_targets[sub_target] = [DefaultInfo(default_outputs = [output])]

    return [DefaultInfo(default_outputs=[base_folder], sub_targets=sub_targets)]

link_dir = rule(
    impl = _link_dir_impl,
    attrs = {
        "python": attrs.default_only(attrs.toolchain_dep(providers = [PythonToolchainInfo], default = "toolchains//:python")),
        "script": attrs.source(),
        "link_name": attrs.string(),
        "link_target": attrs.string(),
        "sub_targets": attrs.list(attrs.string()),
    }
)

Just replace the link_target in bucklink/BUCK with a real path to your buck2 source folder.

If you run buck2 build //bucklink:makelink it works. The link is created on disk. No errors are generated. But if you instead run buck2 build //bucklink:makelink[prelude] you get an error like this:

$ buck2 build //bucklink:makelink[prelude]
File changed: root//bucklink/rule.bzl
Build ID: e9cb1937-0626-466a-a1e3-db73effdfb13
Jobs completed: 6. Time elapsed: 0.1s.
Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
BUILD FAILED
The path `hello/prelude` cannot be projected in the artifact ``(root//bucklink:makelink (foo//platforms:native#ae24adfe62876596))/out`, action: (target: `root//bucklink:makelink (foo//platforms:native#ae24adfe62876596)`, id: `0`)`

Caused by:
    Find would traverse a leaf at path: `buck-out/v2/gen/root/ae24adfe62876596/bucklink/__makelink__/out/prelude`

@JakobDegen
Copy link
Contributor

Back from travel.

I asked about this, and it would indeed be possible to support projecting through symlinks, at least external symlinks. However, it's a bit of a refactor because you end up with artifacts that aren't really at the place they claim to be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants