Multiple subdomains wrongly identified as domain #111

Open
jonathanvandriessen opened this issue Oct 6, 2017 · 7 comments

@jonathanvandriessen

Hi,

First of all, thank you for tldjs, great library!

When I call tldjs.getDomain() on the following URLs, tldjs identifies the entire hostname (the URL without the protocol) as the domain and finds no subdomain (tldjs.getSubdomain() returns empty).
These are the examples I have that cause the issue:

http://bntp-assets.global.ssl.fastly.net/
http://cres-wpstatic.freetls.fastly.net/
http://r3engage-live.global.ssl.fastly.net/
http://ticket-magic-ember-herokuapp-com.global.ssl.fastly.net/

Thank you!

@ctavan

ctavan commented Oct 6, 2017

@jonathanvandriessen, this is expected behavior since both of the following are public suffixes:

  • global.ssl.fastly.net
  • freetls.fastly.net

See https://publicsuffix.org/list/ for reference

The domain returned by getDomain() is the public suffix plus one more label to its left. This is why you get the full hostname back.
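
The rule described above can be illustrated with a minimal, self-contained sketch. It does not use tldjs itself: the suffix set is a tiny hand-picked excerpt, and `longestPublicSuffix` and `sketchGetDomain` are hypothetical names chosen for illustration. The real Public Suffix List also contains wildcard and exception rules, which are ignored here.

```javascript
// Tiny hand-picked excerpt of the Public Suffix List (assumption: the real
// list is far larger and also has wildcard and exception rules).
const SUFFIXES = new Set(['net', 'global.ssl.fastly.net', 'freetls.fastly.net']);

// Return the longest suffix of `hostname` that appears in SUFFIXES, or null.
function longestPublicSuffix(hostname) {
  const labels = hostname.split('.');
  // Try the longest candidate first, then drop labels from the left.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (SUFFIXES.has(candidate)) return candidate;
  }
  return null;
}

// Domain = public suffix plus one more label to its left.
function sketchGetDomain(hostname) {
  const suffix = longestPublicSuffix(hostname);
  // No known suffix, or the hostname is itself a suffix: no domain.
  if (suffix === null || suffix === hostname) return null;
  const suffixLabelCount = suffix.split('.').length;
  return hostname.split('.').slice(-(suffixLabelCount + 1)).join('.');
}

console.log(sketchGetDomain('bntp-assets.global.ssl.fastly.net'));
console.log(sketchGetDomain('www.example.net'));
```

Because `global.ssl.fastly.net` is itself a suffix in this sketch, the first call returns the whole hostname, while the second returns `example.net`.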

@jonathanvandriessen
Author

Thanks for your reply @ctavan !

@remusao
Collaborator

remusao commented Oct 8, 2017

@jonathanvandriessen To add to the answer from @ctavan: if this is blocking your use of the library, it could be solved by offering a way to match only ICANN rules and ignore the PRIVATE section of the suffix list (the rules for fastly.net are in the PRIVATE section).

I already have a proof of concept on one of my branches that does something like that. If there is interest in such a feature, we could try to integrate it.
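
As a rough illustration of the idea (not the proof of concept mentioned above, and not the tldjs API), ICANN-only matching could look like the sketch below. The rule table, the `icannOnly` option, and the function name `sketchDomain` are assumptions made for this example.

```javascript
// Hypothetical rule table: each rule is tagged with the section of the
// Public Suffix List it comes from. Only a tiny excerpt is shown.
const RULES = [
  { suffix: 'net', section: 'ICANN' },
  { suffix: 'global.ssl.fastly.net', section: 'PRIVATE' },
  { suffix: 'freetls.fastly.net', section: 'PRIVATE' },
];

// Return the public suffix plus one label, optionally ignoring PRIVATE rules.
function sketchDomain(hostname, { icannOnly = false } = {}) {
  const allowed = new Set(
    RULES.filter((rule) => !icannOnly || rule.section === 'ICANN')
         .map((rule) => rule.suffix)
  );
  const labels = hostname.split('.');
  // Longest candidate first, so the most specific rule wins.
  for (let i = 0; i < labels.length; i++) {
    if (allowed.has(labels.slice(i).join('.'))) {
      // i === 0 means the hostname is itself a suffix: no domain.
      return i === 0 ? null : labels.slice(i - 1).join('.');
    }
  }
  return null;
}

const host = 'r3engage-live.global.ssl.fastly.net';
console.log(sketchDomain(host));                      // all rules
console.log(sketchDomain(host, { icannOnly: true })); // ICANN rules only
```

With all rules the sketch returns the full hostname. With `icannOnly: true` only the `net` rule applies, so it returns `fastly.net` and the remaining labels would count as the subdomain.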

@jonathanvandriessen
Author

@remusao it is indeed blocking my use of the library.

Currently, I'm considering the following workaround:
IF I don't get a subdomain AND the domain I get back has more than 2 dots in it, THEN I consider the URL to have a subdomain. I know this is probably not perfect, but it seems to do the trick for the exceptions I know of.

FYI: My use case is the following: given a domain (which could include subdomains), I'm trying to determine whether I need to add www in front of it to make it a valid URL. I add www. if the domain does not seem to have a subdomain.
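
A minimal sketch of that workaround, assuming the `domain` and `subdomain` arguments are the values already obtained from tldjs.getDomain() and tldjs.getSubdomain(). The function names `seemsToHaveSubdomain` and `withWww` are hypothetical, and the heuristic is as imperfect as described above.

```javascript
// Heuristic: treat the hostname as having a subdomain if the library
// reported one, or if the returned domain contains more than 2 dots.
function seemsToHaveSubdomain(domain, subdomain) {
  if (subdomain) return true;
  const dotCount = (domain.match(/\./g) || []).length;
  return dotCount > 2;
}

// Prefix "www." only when the hostname does not seem to have a subdomain.
function withWww(hostname, domain, subdomain) {
  return seemsToHaveSubdomain(domain, subdomain) ? hostname : 'www.' + hostname;
}

// Values as described in this issue: the full hostname comes back as the
// domain and the subdomain is empty.
console.log(withWww(
  'cres-wpstatic.freetls.fastly.net',
  'cres-wpstatic.freetls.fastly.net',
  ''
));
console.log(withWww('example.com', 'example.com', ''));
```

The first call leaves the hostname unchanged because its domain has three dots, and the second call yields `www.example.com`.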

I'd be happy to test drive your solution!

@ctavan

ctavan commented Oct 9, 2017

@remusao how about simply adding an additional getIcannDomain()? I believe that could be useful, and changing the behavior of the current getDomain() is probably too big of a breaking change.

@remusao
Collaborator

remusao commented Oct 9, 2017

@ctavan Agreed about not introducing a breaking change. Maybe we could, as you suggest, add a new method to the API, plus a new attribute in the result of parse that indicates which kind of rule matched?

@ZLightning

ZLightning commented Jan 27, 2019

Maybe something like the isRegisterable()/getRegisterableDomain() methods in this Java library would be a good way to solve this issue. Those functions would be quite useful for security researchers. Public suffix domains in the private section, as returned by getDomain(), do not have whois data, and the registerable domain's info does not apply to the private subdomains, since they are likely not under the direct control of the owner of the registerable domain.

This issue is duplicated in #120, #117, and #78, and appears to be the main part of the v3 API considerations in #124.
