Git repositories with spaces in their name are not recognized #1211

Open
ardislu opened this issue Feb 1, 2023 · 4 comments · May be fixed by #1214
@ardislu
Contributor

ardislu commented Feb 1, 2023

Description

The git side panel does not recognize a git repo exists when you open a freshly-cloned repo that has spaces in its name.

For example: a repo named "repo with spaces in its name" is cloned to a folder named "repo%20with%20spaces%20in%20its%20name". All the files inside the folder are cloned as expected. However, opening the folder does not trigger the git side panel to recognize any git repo.

Reproduce

  1. Create a new git repo that contains spaces in its name (NOTE: GitHub does not allow this, but other hosts such as Azure DevOps do).
  2. In Jupyter, clone the repo.
  3. Open the folder and try to use the git side panel.

Expected behavior

The repo is cloned successfully and the git side panel works as expected.

Actual behavior

The repo is cloned successfully with the spaces URL encoded in the folder name (i.e. "%20" instead of spaces). However, the git side panel does not detect any git repo inside the folder (it shows the default "You are not currently in a Git repository" page).
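The folder name described above matches what standard URL percent-encoding produces for the repo name. A minimal illustration, assuming the clone step uses an encoding equivalent to Python's urllib.parse.quote (the issue does not say which code path produces the name):

```python
from urllib.parse import quote

# Repo name as it appears on the git host.
repo_name = "repo with spaces in its name"

# Percent-encoding replaces each space with "%20".
folder_name = quote(repo_name)
print(folder_name)
```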

Workarounds

Workaround 1: Manually rename the folder to replace the "%20" encoding with spaces. After renaming the folder, the side panel works as expected.
Workaround 2: Open a new terminal and manually use the git CLI.

Context

  • Python package version: 0.41.0
  • Extension version: 0.41.0
  • Git version: 2.34.1
@ardislu ardislu added the bug label Feb 1, 2023
@welcome

welcome bot commented Feb 1, 2023

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@ardislu ardislu changed the title from "Git repositories with spaces in their name are not recognized by the extension" to "Git repositories with spaces in their name are not recognized" Feb 1, 2023
@fcollonval
Member

Thanks @ardislu

Would you be willing to contribute a fix for this? I can provide pointers.

@ardislu
Contributor Author

ardislu commented Feb 2, 2023

@fcollonval Sure, I gave it a shot.

I believe the issue is that the path ends up unescaped when it gets passed to various functions in git.py.

But in this case, we actually want spaces to remain escaped because that's how the folder name is cloned. So my first pass to get it working was to update each execute call to re-escape the spaces in the directory path.

Either like this:

async def show_prefix(self, path):
    cmd = ["git", "rev-parse", "--show-prefix"]
    code, my_output, my_error = await execute(
        cmd,
-       cwd=path,
+       cwd=quote(path, safe=":/\\"),
    )

Or like this:

async def branch(self, path):
+   path = quote(path, safe=":/\\")
    heads = await self.branch_heads(path)

This solution fixes this issue, but I don't think it's optimal:

  • If the user manually renames the folder to replace %20 with spaces, as suggested in workaround 1, then this solution will break it.
  • There are 53 execute calls in git.py that would need to get updated, plus possibly others in other files I haven't looked at.
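The first drawback can be reproduced in isolation. The sketch below assumes quote in the snippets above is urllib.parse.quote. The requote helper and the temporary directory are hypothetical and are not part of git.py.

```python
import os
import tempfile
from urllib.parse import quote

SAFE = ":/\\"  # same safe characters as in the snippets above


def requote(path: str) -> str:
    """Re-escape a decoded path the way the proposed change would (hypothetical helper)."""
    return quote(path, safe=SAFE)


with tempfile.TemporaryDirectory() as parent:
    # Simulate workaround 1: the folder on disk now contains real spaces.
    renamed = os.path.join(parent, "repo with spaces in its name")
    os.mkdir(renamed)

    # Unconditional re-quoting points at the "%20" name, which no longer exists.
    requoted = requote(renamed)
    renamed_still_found = os.path.isdir(requoted)

print(renamed_still_found)
```

Because the re-quoting is unconditional, a folder whose name really contains spaces can no longer be used as cwd.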

Any suggestions? Thank you for your help on this.

@ardislu ardislu linked a pull request Feb 4, 2023 that will close this issue
@ardislu
Contributor Author

ardislu commented Feb 4, 2023

Thought about it some more and realized it'd be much simpler to move the same quote logic directly into execute. Created #1214 to do that. Confirmed that folders named "repo with spaces in its name" and "repo%20with%20spaces%20in%20its%20name" now both work as expected.
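One way to centralize the logic so that both folder names work is to resolve the working directory once, before the command runs. This is only a sketch under assumptions and is not the code from #1214: resolve_cwd is a hypothetical helper, and the fallback order (decoded name first, then the re-quoted name) is an illustrative choice.

```python
import os
from urllib.parse import quote


def resolve_cwd(path: str) -> str:
    """Return the directory that exists on disk (hypothetical helper).

    Prefers the path as given; falls back to the percent-encoded form
    when only that one exists.
    """
    if os.path.isdir(path):
        return path
    quoted = quote(path, safe=":/\\")
    if os.path.isdir(quoted):
        return quoted
    # Neither exists: return the input unchanged so the caller's error is untouched.
    return path
```

A single call such as cwd=resolve_cwd(cwd) inside execute would then cover every git command without touching each call site.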

Edit: if two folders named "repo with spaces in its name" and "repo%20with%20spaces%20in%20its%20name" both exist in the same parent folder, then my update will cause git commands from either folder to go only to "repo with spaces in its name". Note that this matches the current behavior (maybe another issue should be raised for this). So I think the root cause is not yet fixed.

I believe the root cause is that this URL:

/repo%2520with%2520spaces%2520in%2520its%2520name

Should only be decoded once to:

/repo%20with%20spaces%20in%20its%20name

But there is some logic which decodes it again, so the actual string passed to the git.py functions is:

/repo with spaces in its name
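The three strings above can be reproduced with two successive percent-decoding passes. This uses urllib.parse.unquote purely as an illustration and does not identify where the second decode happens in the extension.

```python
from urllib.parse import unquote

url_path = "/repo%2520with%2520spaces%2520in%2520its%2520name"

# First decode: "%25" becomes "%", giving the real folder name on disk.
once = unquote(url_path)

# Second (unwanted) decode: "%20" becomes a space.
twice = unquote(once)

print(once)
print(twice)
```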

However, I'm having a hard time finding where/how the URL is getting decoded twice. In handlers.py I see the path is decoded in url2localpath, via url2path:

local_path = os.path.join(os.path.expanduser(cm.root_dir), url2path(path))

But I can't see where it's getting decoded again before getting passed to git.py. @fcollonval Any ideas?
