Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

github_repo will fail confusingly if the given revision downloads a file with a different prefix. #3000

Open
cemeceme opened this issue Dec 19, 2023 · 3 comments

Comments

@cemeceme
Copy link
Contributor

If a github_repo rule is used for a repo and the revision argument is set to something that doesnt exist, please will still download whatever the main branch is and try to run arcat on it with the -s {expected but different prefix} flag set.

However, this makes arcat fail silently and some subsequent rule will then fail stating the directory that should have been given by arcat was not found as it wasnt created in the first place.

Ideally either arcat should report that the prefix didnt exist, or there should be a check to see if downloaded file name doesnt match what was expected.

@chrisnovakovic
Copy link
Contributor

I wasn't able to reproduce this with the following MWE:

github_repo(
    name = "repo",
    repo = "thought-machine/please",
    revision = "0123456789abcdef0123456789abcdef01234567",
)

because GitHub (correctly) returns a 404 instead of an archive:

$ plz-out/bin/src/please build //tests:repo
Build stopped after 1.38s. 1 target failed:
    //tests:_repo#download
1 error occurred:
        * Error retrieving https://github.com/thought-machine/please/archive/0123456789abcdef0123456789abcdef01234567.zi
p: 404 Not Found

The same happens when revision is set to something that doesn't look like a commit hash (e.g. a non-existent branch name).

The thing about arcat not erroring if the value of -s isn't present at the front of any of the archive member names is indeed a known problem, though - I ran into it in #2829.

@cemeceme
Copy link
Contributor Author

cemeceme commented Jan 3, 2024

Ok, so I looked into it a bit more.
The issue I was facing was with mysql-connector-cpp.
I had incorrectly set please to use the master branch which does not exist (or no longer exists?) as they use trunk instead.
So it was trying to fetch https://github.com/mysql/mysql-connector-cpp/archive/master.zip, which appears to fetch the correct trunk branch from github instead, everything named with the trunk prefix.
For the record, the 'actual' url is supposed to be https://github.com/mysql/mysql-connector-cpp/archive/trunk.zip as expected, which works fine in please.
I also checked some other common branch names, such as main, which shows the correct 404 response as you said.

I would still say that please should confirm if the downloaded file actually matches, but this seems to be more related with some (intentional?) inconsistencies with github.

@cemeceme cemeceme changed the title github_repo will fail confusingly if the given revision doesn't exist. github_repo will fail confusingly if the given revision downloads a file with a different prefix. Jan 3, 2024
@chrisnovakovic
Copy link
Contributor

chrisnovakovic commented Jan 3, 2024

Right - I suspect what happened here is that the owner(s) of mysql-connector-cpp renamed the default branch, which - as you suggested - will set up a redirection of requests for the original branch name to the new branch name (although this happens for all renamed branches, not just the default branch).

The downloading of the archive in github_repo is ultimately handed off to remote_file, so this is really outside the scope of the subrepo rules. Perhaps it makes sense to add a new follow_redirects parameter to remote_file, defaulting to True for backwards compatibility but allowing for problems like this to be surfaced in a more obvious way by setting it to False. I think this is a better solution than trying to hack in GitHub-specific workarounds in github_repo (especially because any hashes specified on the github_repo target will no longer be valid for the downloaded archive after the branch is renamed, which IMO dooms any attempt to handle the branch rename automatically and transparently). I'll have a chat with the Please folks tomorrow and see what they think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants