Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL Safety Validation Function #8301

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

URL Safety Validation Function #8301

wants to merge 2 commits into from

Conversation

mvlttt
Copy link

@mvlttt mvlttt commented May 16, 2024

Description

This pull request introduces a new function is_safe_url() in the client_utils module to validate the safety of URLs before processing them further. Here's an overview of the changes:

  • Added client_utils.is_safe_url() function to check if a URL is safe for processing.
  • The function uses URL parsing (urlparse), DNS lookup (socket.gethostbyname), and IP address validation (ipaddress.ip_address) to determine the safety of the URL.
  • If the URL's scheme is not 'http' or 'https', or if the resolved IP address is private or multicast, the function returns False.
  • Error handling is implemented to catch invalid URL formats, DNS resolution failures, and other potential issues.

This enhancement ensures that URLs are properly validated before any further actions are taken, improving the security and reliability of URL processing within the application.

…tocol verification.

* Add URL safety check function "is_safe_url"
* Add URL safety check with client_utils.is_safe_url
* Removed follow_redirects parameter of httpx.stream function
@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented May 16, 2024

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview
Website failed! Details

Install Gradio from this PR

pip install https://gradio-builds.s3.amazonaws.com/c9bbf8a1cd786a9881a8e493d5665badfc2b0e7a/gradio-4.31.3-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@c9bbf8a1cd786a9881a8e493d5665badfc2b0e7a#subdirectory=client/python"

@abidlabs
Copy link
Member

Thanks @mvlttt I'm not totally sure if we want to introduce all these checks as they will likely add some latency any time URLs are being handled. Can you explain the motivation for this a little more?

@abidlabs
Copy link
Member

I'm mainly concerned about this DNS lookup:

        # Resolve the domain to an IP address using DNS lookup
        resolve = socket.gethostbyname(domain)

Would the following simpler approach work:

  1. Check to see if its a url string or an ip address
  2. If its a url string, it should not start with localhost
  3. If its an ip address, it should not be ip.is_private or ip.is_multicast

@mvlttt
Copy link
Author

mvlttt commented May 17, 2024

The reason I do DNS lookup is to prevent the DNS rebinding vulnerability. I'm not sure how to optimize this specifically, but I'm considering implementing a control to only perform DNS lookup if the domain is different from the IP.

The solutions you suggest have many bypass methods. I'm adding a few references for you.

https://payatu.com/blog/dns-rebinding/
https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html
https://github.com/swisskyrepo/PayloadsAllTheThings/blob/master/Server%20Side%20Request%20Forgery/README.md

@missionfloyd
Copy link

This is going to break LAN and localhost addresses. Maybe it should be optional.

Or perhaps it should only apply to domain names, not IPs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants