Fix filename decoding issue for zip files archived by macOS #75

Closed

Wh1isper

This is how I fixed #74.

Member

@fcollonval fcollonval left a comment

Thanks @Wh1isper

I have a couple of suggestions. Also, could you update the workflow .github/workflows/build.yml to run on all OSes, so that the tests cover this use case?

with reader as f:
    for fn in f.namelist():
        extreact_path = pathlib.Path(f.extract(fn, path=destination))
        correct_filename = try_macos_decode(fn) or try_windows_chinese_decode(fn) or fn
Member

Could you change this a bit to use a single function that has the following structure:

def decode_filename(fn):
    try:
        encoded_filename = fn.encode('cp437')
        try:
            # MacOS encoding
            return encoded_filename.decode('utf-8')
        except UnicodeError:
            # Windows encoding
            return encoded_filename.decode('gbk')
    except UnicodeError:
        return fn
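
For illustration, here is a minimal sketch of how a function with this structure behaves. It assumes that `zipfile` has decoded the raw name bytes as cp437, which is what Python does when the archive does not set the UTF-8 flag. The sample file name is made up, and the "as seen" strings are simulated rather than read from a real archive.

```python
def decode_filename(fn):
    # Same structure as the suggestion above.
    try:
        encoded_filename = fn.encode("cp437")
        try:
            # Names written as UTF-8 bytes without the UTF-8 flag
            return encoded_filename.decode("utf-8")
        except UnicodeError:
            # Names written as GBK bytes
            return encoded_filename.decode("gbk")
    except UnicodeError:
        # Not representable in cp437 (already proper Unicode) or not
        # decodable with either codec: keep the name unchanged
        return fn


original = "中文.txt"  # hypothetical file name

# Simulate what zipfile reports when the UTF-8 flag is missing
as_seen_from_utf8_bytes = original.encode("utf-8").decode("cp437")
as_seen_from_gbk_bytes = original.encode("gbk").decode("cp437")

recovered_from_utf8 = decode_filename(as_seen_from_utf8_bytes)
recovered_from_gbk = decode_filename(as_seen_from_gbk_bytes)
already_unicode = decode_filename(original)
plain_ascii = decode_filename("readme.txt")
```

The order of the attempts is a heuristic: bytes that happen to be valid UTF-8 are never tried as GBK.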

for fn in f.namelist():
    extreact_path = pathlib.Path(f.extract(fn, path=destination))
    correct_filename = try_macos_decode(fn) or try_windows_chinese_decode(fn) or fn
    extreact_path.rename(os.path.join(destination, correct_filename))
Member

destination is a Path, so you can write:

Suggested change
- extreact_path.rename(os.path.join(destination, correct_filename))
+ extracted_path.rename(destination / correct_filename)
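
As a quick check of why the `/` form works, the sketch below compares the two ways of building the target path. The directory and file names are hypothetical; only `pathlib.Path` and `os.path.join` come from the code under review.

```python
import os
import pathlib

destination = pathlib.Path("extracted")  # hypothetical directory
correct_filename = "中文.txt"             # hypothetical file name

# os.path.join accepts a Path but returns a plain string
joined_with_os = os.path.join(destination, correct_filename)

# The / operator on a Path returns a Path directly
joined_with_slash = destination / correct_filename
```

Both forms point at the same location; the `/` form simply avoids the round trip through a string.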

# *nix zip utilities silently uses system encoding(utf-8 generally)
with reader as f:
    for fn in f.namelist():
        extreact_path = pathlib.Path(f.extract(fn, path=destination))
Member

Fix wording

Suggested change
- extreact_path = pathlib.Path(f.extract(fn, path=destination))
+ extracted_path = pathlib.Path(f.extract(fn, path=destination))
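
Putting the suggestions in this review together, the extraction loop might look roughly like the sketch below. This is not the code that was merged (the PR was closed): `extract_with_decoded_names` is a hypothetical helper, the archive is a throwaway file built only to mimic the mis-decoded name that `namelist()` would report, and only flat file names are handled (no nested directories).

```python
import pathlib
import tempfile
import zipfile


def decode_filename(fn):
    # Single decoding function, structured as suggested in the review
    try:
        encoded_filename = fn.encode("cp437")
        try:
            return encoded_filename.decode("utf-8")
        except UnicodeError:
            return encoded_filename.decode("gbk")
    except UnicodeError:
        return fn


def extract_with_decoded_names(archive_path, destination):
    """Extract every member, then rename it to its decoded file name."""
    destination = pathlib.Path(destination)
    results = []
    with zipfile.ZipFile(archive_path) as f:
        for fn in f.namelist():
            extracted_path = pathlib.Path(f.extract(fn, path=destination))
            target = destination / decode_filename(fn)
            if target != extracted_path:
                extracted_path.rename(target)
            results.append(target)
    return results


with tempfile.TemporaryDirectory() as tmp:
    tmp = pathlib.Path(tmp)
    archive = tmp / "sample.zip"
    # Mimic the name zipfile would report for UTF-8 bytes read as cp437
    mojibake = "中文.txt".encode("utf-8").decode("cp437")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(mojibake, b"hello")

    result = extract_with_decoded_names(archive, tmp / "out")
    result_names = [p.name for p in result]
    result_contents = [p.read_bytes() for p in result]
```

A test along these lines would still need a real archive produced on macOS to cover the use case end to end.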


self.log.info("Finished extracting {} to {}.".format(archive_path, archive_destination))



Member

Suggested change

@Wh1isper
Author

Wh1isper commented Dec 1, 2021

Well, I found that this is not a good way to solve this problem,
but anyone who needs this temporary solution can refer to my fork.

@Wh1isper Wh1isper closed this Dec 1, 2021
Development

Successfully merging this pull request may close these issues.

Filename decoding issue for zip files archived by macOS
2 participants