Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filename sanitization improvement in download_file function #4964

Merged

Conversation

jacopotediosi
Copy link
Contributor

@jacopotediosi jacopotediosi commented Mar 2, 2024

  • Your changes are not possible to do through a plugin and relevant
    to a large audience (ideally all users of OctoPrint)
  • If your changes are large or otherwise disruptive: You have
    made sure your changes don't interfere with current development by
    talking it through with the maintainers, e.g. through a
    Brainstorming ticket
  • Your PR targets OctoPrint's devel branch if it's a completely
    new feature, or maintenance if it's a bug fix or improvement of
    existing functionality for the current stable version (no PRs
    against master or anything else please)
  • Your PR was opened from a custom branch on your repository
    (no PRs from your version of master, maintenance, or devel
    please), e.g. dev/my_new_feature or fix/my_bugfix
  • Your PR only contains relevant changes: no unrelated files,
    no dead code, ideally only one commit - rebase and squash your PR
    if necessary!
  • Your changes follow the existing coding style
  • If your changes include style sheets: You have modified the
    .less source files, not the .css files (those are generated
    with lessc)
  • You have tested your changes (please state how!) - ideally you
    have added unit tests
  • You have run the existing unit tests against your changes and
    nothing broke
  • You have added yourself to the AUTHORS.md file :)

What does this PR do and why is it necessary?

The download_file function in /octoprint/OctoPrint/src/octoprint/util/net.py suffers from a quasi - Path Traversal vulnerability:

def download_file(url, folder, max_length=None, connect_timeout=3.05, read_timeout=7):
with requests.get(url, stream=True, timeout=(connect_timeout, read_timeout)) as r:
r.raise_for_status()
filename = None
if "Content-Disposition" in r.headers.keys():
_, options = werkzeug.http.parse_options_header(
r.headers["Content-Disposition"]
)
filename = options.get("filename")
if filename is None:
filename = url.split("/")[-1]
assert len(filename) > 0
# TODO check content-length against safety limit
path = os.path.abspath(os.path.join(folder, filename))
assert path.startswith(folder)
with open(path, "wb") as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)
return path

As visible on line 277, the filename of the file to be downloaded, taken from the Content-Disposition header received from the remote server during file download, is not sanitized and is appended directly to the folder name.

The presence of the assert path.startswith(folder) instruction on line 278 prevents Path Traversal towards the parent of the folder, but not within subfolders. Taking for example a case in which the function was called with the intention of storing the file inside /, if the server responded with the header Content-Disposition: attachment; filename=etc/passwd, the downloaded file would overwrite /etc/passwd.

I'm sending a PR instead of using the GitHub Advisory tool because in the current codebase the download_file function is only called by specifying temporary folders as the folder, so this vulnerability is not exploitable. However, I believe that improving filename sanitization in this case is easy and prevents any vulnerabilities that might arise by reusing the download_file function differently in the future.

How was it tested? How can it be tested by the reviewer?

I have verified that the unit tests pass.

I don't seem to see any situations where this change would break anything.

The function I added, secure_filename, takes care of sanitizing file names and is specifically designed to pass the sanitized filename to os.path.join(), as documented in the Flask documentation.

However, I'm open to discussion if you notice any edge cases that I missed.

@github-actions github-actions bot added targets maintenance The PR targets the maintenance branch approved Issue has been approved by the bot or manually for further processing labels Mar 2, 2024
@foosel
Copy link
Member

foosel commented Mar 6, 2024

FYI, I've seen this, I'm just still out of commission after a surgery last week. Will look into this once fully recovered.

@jacopotediosi
Copy link
Contributor Author

FYI, I've seen this, I'm just still out of commission after a surgery last week. Will look into this once fully recovered.

Thanks Gina, I already knew that. Take your time and stay safe out there.

@foosel foosel merged commit 83f0cea into OctoPrint:maintenance Mar 12, 2024
27 checks passed
@foosel
Copy link
Member

foosel commented Mar 12, 2024

Merged, and I'll also cherry-pick this into the upcoming 1.10.0rc3 so that this goes out ASAP. Thank you, for both the patch and your patience!

@jacopotediosi jacopotediosi deleted the download_file_sanitization branch March 12, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Issue has been approved by the bot or manually for further processing targets maintenance The PR targets the maintenance branch
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants