boltons.urlutils.URL cannot handle some characters in credentials #309

Open
arossert opened this issue May 16, 2022 · 0 comments

arossert commented May 16, 2022

When parsing a URL with special characters in the credentials part, boltons.urlutils.URL does not behave as expected. I found problems with ? and /.

If the character is in the password, an exception is raised:

In [52]: URL("http://username:password?@www.proxy.com:443")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/tests/venv/lib/python3.7/site-packages/boltons/urlutils.py in parse_url(url_text)
    938             try:
--> 939                 port = int(port_str)
    940             except ValueError:

ValueError: invalid literal for int() with base 10: 'password'

During handling of the above exception, another exception occurred:

URLParseError                             Traceback (most recent call last)
<ipython-input-52-4078cba2692c> in <module>
----> 1 URL("http://username:password?@www.proxy.com:443")

~/tests/venv/lib/python3.7/site-packages/boltons/urlutils.py in __init__(self, url)
    496                                         ' passing the result. (got: %s)'
    497                                         % (DEFAULT_ENCODING, ude))
--> 498             ud = parse_url(url)
    499
    500         _e = u''

~/tests/venv/lib/python3.7/site-packages/boltons/urlutils.py in parse_url(url_text)
    941                 if port_str:  # empty ports ok according to RFC 3986 6.2.3
    942                     raise URLParseError('expected integer for port, not %r'
--> 943                                         % port_str)
    944                 port = None
    945

URLParseError: expected integer for port, not 'password'

If it is in the username, no exception is raised, but the URL is parsed incorrectly:

In [56]: print(URL("http://username?:password@www.proxy.com:443").port)
None

This seems to be an issue with urlparse, so I'm not sure boltons is to blame.

Using a regex pattern works for me:
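For comparison, here is a minimal sketch of what the standard library's `urllib.parse.urlsplit` does with the same two URLs. It only illustrates the stdlib behaviour and assumes nothing about boltons internals: an unencoded `?` ends the authority component, so everything after it lands in the query.

```python
from urllib.parse import urlsplit

# "?" in the password: the authority is cut off at the "?".
in_password = urlsplit("http://username:password?@www.proxy.com:443")
print(in_password.netloc)  # username:password
print(in_password.query)   # @www.proxy.com:443

# "?" in the username: no error, but host and port are lost.
in_username = urlsplit("http://username?:password@www.proxy.com:443")
print(in_username.netloc)  # username
print(in_username.query)   # :password@www.proxy.com:443
print(in_username.port)    # None
```

This matches the symptoms above: in the first case the text after the colon ("password") sits where a port would be, and in the second the real port never reaches the authority at all.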

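import re

# Verbose-mode pattern: scheme, optional username:password@, then optional host and port.
pattern = re.compile(
    r"""
    (?P<schema>[\w\+]+)://
    (?:
        (?P<username>[^:/]*)
        (?::(?P<password>.*))?
    @)?
    (?:
        (?:
            (?P<host>[^/:]+)
        )?
        (?::(?P<port>[^/]*))?
    )?
    """,
    re.X,
)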

(This still does not work if the username contains /.)
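A possible workaround, sketched here with the standard library only, is to percent-encode the credentials before building the URL, since characters such as `?` and `/` are delimiters in a URL. `build_url` is a hypothetical helper introduced for illustration, not part of boltons, and this assumes you control how the URL string is assembled.

```python
from urllib.parse import quote, unquote, urlsplit


def build_url(scheme, username, password, host, port):
    """Hypothetical helper: percent-encode the credentials, then assemble the URL."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{userinfo}@{host}:{port}"


url = build_url("http", "user/name", "pass?word", "www.proxy.com", 443)
parts = urlsplit(url)

# urlsplit returns the credentials still encoded; decode them explicitly.
username = unquote(parts.username)
password = unquote(parts.password)
print(url, username, password, parts.hostname, parts.port)
```

With the credentials encoded, both `/` in the username and `?` in the password survive the round trip through `urlsplit`.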
