Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API design considerations for tld.js@3.0 #124

Open
thom4parisot opened this issue Mar 19, 2018 · 8 comments
Open

API design considerations for tld.js@3.0 #124

thom4parisot opened this issue Mar 19, 2018 · 8 comments

Comments

@thom4parisot
Copy link
Owner

thom4parisot commented Mar 19, 2018

There are some recurring issues going on. I feel they are more related to naming/design rather than because of technical issues.

The intent of this issue is to open a discussion about the painpoints and what we think could help addressing them.

From what I read, the biggest source of pain confusion is getDomain does not return the expected domain • #78 #117 #73

From what I understand, this library can be used:

  1. as a JavaScript API for public-suffix.org
  2. to validate user inputs (primary use of this library when it was created). Is the domain associated with toto@foo.abcdefg email address okay?

I was thinking of making the context of each function explicit.

Via a prefix

const { isValidHostname, getDomain, getSubdomain } = require('tldjs');
const { getPublisSuffixDomain, getPublicSuffixSubdomain } = require('tldjs');

// ...

Via a dedicated module

const { isValidHostname, getDomain, getSubdomain } = require('tldjs');
const publicSuffix = require('tldjs/public-suffix');

publicSuffix.getDomain('...');
publicSuffix.getSubdomain('...');
@remusao
Copy link
Collaborator

remusao commented Apr 14, 2018

Hi @oncletom, thanks for initiating the discussion, and sorry for taking so long to give feedback on this!

I really like that you identify the different use-cases for tld.js. In fact, it made me realize that people do not use this library for the same purposes! I was naively thinking that my use-case was representative, but it might not be! Do you think it would make sense to start looking at how other libraries use tld.js and ask feedback (or look at their source code and check why and how it is used?). Maybe this could be a starting point: https://www.npmjs.com/browse/depended/tldjs

As you suggest, identifying use-cases would allow us to think better about how to structure the API. I do not have clear ideas about what the best solution would be; so far both of your suggestions (dedicated modules or prefix) make sense. I have a few general questions/remarks though:

  • Do we still want to provide a default behavior for tld.js? Or should the user have to choose which behavior he needs? Forcing a choice could put people off and give a feeling of complexity. One of the strengths of tld.js is that it works out of the box.
  • It is not obvious to me how the naming should look like (both for dedicated modules and prefixes for names). How do we make sure that the subtle changes in behavior between different "modes" are understood by the user of the library? At which level of detail should the message be communicated? Should it be: "you have the choice between TLD extraction and public suffix extraction", or more use-case oriented: as you mentioned with email validation, domain validation, checking if cookies can be set, etc. which would allow us to direct users to the right part of the API given some use-case? In one hand, being completely accurate about the technical differences might put people off, and staying at a higher level of abstraction might lead people to surprising results from time to time (such as the "what's a domain" issue we currently have).

As usual, I feel giving more power/control to users will have the cost of making the API more complex, and keeping the simplicity of tld.js could prove to be challenging!

What are your thoughts on this @oncletom?

@thom4parisot
Copy link
Owner Author

thom4parisot commented Apr 14, 2018

Do you think it would make sense to start looking at how other libraries use tld.js and ask feedback (or look at their source code and check why and how it is used?)

@remusao yes, I think it's worthwile otherwise the work might not be as useful as it could be.

I browsed the dependents a few weeks ago and noticed most of them were still depending in the 1.x branch.

I was thinking of either contacting owners directly or by opening Pull Requests to bump to tld@2 and collect feedback there. The later feels a bit heavier to implement (and I wouldn't be able to commit to it at the moment).

It's slower than rewriting code but I totally feel it's worth asking.

@remusao
Copy link
Collaborator

remusao commented Apr 14, 2018

@oncletom I'm willing to help here. If we put up a list of projects in some meta-issue, we can share the work. I think it's a good idea! Should "collecting feedback" and "updating to tld@2" be done simultaneously?

@thom4parisot
Copy link
Owner Author

thom4parisot commented Apr 14, 2018

I'm willing to help here.

Thanks 😊 I find your help on this project to be singular and very helpful.

What do you think about opening a new issue with a checklist of npm dependencies and their tldjs version range?

I have opened #125 to shape out something. Is it something relatable to what you had in mind? Do you think something else should be part of it?

Also, I realised there is no API to get package dependants. module-dependants looks like a good one to programmatically fetch them with a few lines of code.

@remusao
Copy link
Collaborator

remusao commented Apr 15, 2018

Thanks 😊 I find your help on this project to be singular and very helpful.

I'm happy to make myself useful :) I like the project and I also find working with you to be a rewarding experience.

Also, I realised there is no API to get package dependants. module-dependants looks like a good one to programmatically fetch them with a few lines of code.

Indeed, for some reason different libraries did not get all the dependencies, so I combined several of them. We got ~52 deps. What's weird is that on npm there are more listed... We can always check manually at some point. Should not take too long!

@thom4parisot
Copy link
Owner Author

The more I think about it and the more I feel the removal of fromUserSettings is a way to avoid the pitfall of "let's make a global configuration".

Personnalisation comes around "validity" (strict or lenient, additional hosts or not).

My hunch is it's more about strictness (require('tldjs') and require('tldjs/strict') then personnalisation (additional argument object).

Maybe there is something about the naming getBrowserDomain, getFTPDomain, getDNSDomain etc. as hostname rules seem to be context-dependent — cf. recent conversations in #73

Also, I don't mind publishing alpha versions to flesh this out by doing.

What do you think?

@remusao
Copy link
Collaborator

remusao commented May 15, 2018

Good points! I agree regarding the removal of fromUserSettings. I was looking at tldextract which is a pretty popular library doing public suffix extraction in Python and I quite like their API. IMO, we should provide a very simple API with meaningful (and non-surprising defaults) for most use-cases + ways to customize some aspects of the library via options. This could look like:

  1. We continue to expose the parse function which would be the main entry point of the API. By default, it could do lenient hostname validation, and only consider the ICANN section of the list. So most users who only care about getting general domain/tld could just do tld.parse(url) and get it would "just work".
  2. This parse function would take an optional second argument containing different options to customize the behavior: includePslPrivateDomains, strictHostnameValidation, customExtractHostname, validHosts.
  3. We could still expose individual methods for convenience: isValid, getPublicSuffix, etc. which could also take the exact same second optional argument for consistency.
  4. We could also use this occasion to modernize the code-base and use some es6 features? I was thinking about giving another shot at the lazy properties (which were too slow last time I checked, but maybe get attributes for ES6 classes are implemented differently).

So not sure 100% about how all this would look like, but maybe I can play a bit around that and propose some implementation as a starting point.

Some examples:

import tldjs from 'tldjs';

tldjs.parse(url);
tldjs.parse(url, { includePslPrivateDomains: true });
tldjs.isValid(url, { strictHostnameValidation: true });
...

So in the end, I would really like to have a super simple default experience with the library, with potentially opinionated defaults (which should still be as less surprising as possible for people using the library), but one consistent way to customize the behavior.

@remusao
Copy link
Collaborator

remusao commented May 15, 2018

And in the end, it does not look so different from tldjs 2.x, but we could make the API even simpler, more consistent and change some of the defaults to fit better what people expect from the library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants