Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port normalization #921

Open
1 task done
Julian opened this issue Aug 28, 2023 · 3 comments
Open
1 task done

Port normalization #921

Julian opened this issue Aug 28, 2023 · 3 comments
Labels

Comments

@Julian
Copy link

Julian commented Aug 28, 2023

Is your feature request related to a problem?

I'm planning on including yarl as a dependency in the referencing project, which essentially needs to maintain an object of type Mapping[URL, Foo].

As part of doing so, I want/need URLs to be normalized according to RFC3986 so that users looking up "equivalent" URLs get the same Foo.

This works perfectly for everything except port normalization. In other words, if a URL contains a default port, the following returns False: URL('http://example.com:80/default/port') == URL('http://example.com/default/port') even though these URLs are equivalent, and even though it seems yarl knows about some default ports.

This means that when doing lookups or insertions, I have to first manually do:

        if url.explicit_port and url.is_default_port():
            url = url.with_port(None)

Describe the solution you'd like

Strip default ports on creation, similar to what hyperlink.URL does, where:

>>> hyperlink.URL.from_text('http://example.com:80/default/port') == hyperlink.URL.from_text('http://example.com/default/port')

returns True.

Describe alternatives you've considered

Provide a .normalize method, though this seems odd given that all other normalization is done already for me (e.g. percentage escapes, case sensitivity or insensitivity, etc.)

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@Julian Julian mentioned this issue Aug 28, 2023
1 task
@Dreamsorcerer
Copy link
Member

Related issue that should also be fixed with this one: aio-libs/aiohttp#7381
i.e. This should pass: URL("https://foo.com").origin() == URL("https://foo.com:443").origin()

@commonism
Copy link
Contributor

commonism commented Nov 24, 2023

Had a look, I'd consider 2 scenarios.

Dropping the port in __eq__ allows keeping the url "as is" but does not really play nice with __lt__/__gt__.
Having normalize_default_port=False as additional parameter to __new__()/build() allows dropping the port early on, __eq__ just works and it does not create confusion with __lt__/__gt__

What would you prefer?

@Dreamsorcerer
Copy link
Member

Dropping the port in __eq__ allows keeping the url "as is" but does not really play nice with __lt__/__gt__.

Well, I don't think you want to drop it (URL("https://foo.com:8000") != URL("https://foo.com:2020"), but do something more like replacing it with URL.port, which will include the default value, in order to compare them. Or something like that. Maybe the other comparison operators need to use that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants