Fix: Submissions with control character in title cannot be downloaded #776

Open
wants to merge 1 commit into
base: development

thomas694

Running bdfr archive --subreddit Unicode --sort new Z:/Reddit fails on a Windows system.

If a submission contains an invalid character in a field that is used for the filename, the file cannot be saved and an exception occurs:

[2023-02-12 17:10:13,830 - root - ERROR] - Archiver exited unexpectedly
Traceback (most recent call last):
  File "Z:\bulk-downloader-for-reddit\bdfr\__main__.py", line 143, in cli_archive
    reddit_archiver.download()
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 60, in download
    self.write_entry(submission)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 120, in write_entry
    self._write_entry_json(entry, content, hash)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 132, in _write_entry_json
    self._write_content_to_disk(resource, content, hash)
  File "Z:\bulk-downloader-for-reddit\bdfr\archiver.py", line 167, in _write_content_to_disk
    with Path(file_path).open(mode="w", encoding="utf-8") as file:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\Programs\Python\Python\Lib\pathlib.py", line 1044, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: "Z:\\Reddit\\Unicode\\Expert-Fun-2444_Unicode that doesn't exist \x03_vm3p1h.json"

The submission's title contains a control character (\x03 in the path above).
It needs to be either replaced or, in keeping with the existing style, removed.

The provided fix does the latter.
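
For illustration, removing control characters from a title before it is used in a file name could look roughly like the sketch below. This is not the diff from this pull request: the function name strip_control_characters is hypothetical, and the sketch assumes that "control character" means any character in Unicode general category Cc.

```python
import unicodedata


def strip_control_characters(name: str) -> str:
    """Return name with every Unicode control character (category Cc) removed.

    Hypothetical helper for illustration only; it is not part of bdfr.
    """
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cc")


# The title from the traceback above ends with the control character \x03.
title = "Unicode that doesn't exist \x03"
cleaned = strip_control_characters(title)
print(repr(cleaned))
```

Removing the character, rather than substituting a placeholder, leaves the rest of the title untouched. Note that category Cc also covers tab and newline, so those would be dropped as well.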
