[FireMonkey][Feature Request] Expose WebExtension API browser.dns.getPublicSuffix(url) #503

Open
kekkc opened this issue Sep 7, 2022 · 10 comments

@kekkc

kekkc commented Sep 7, 2022

Hi Erosman,

There seems to be some progress on https://bugzilla.mozilla.org/show_bug.cgi?id=1315558, which would let WebExtensions use the FF internal publicSuffixList. It would be awesome if FM exposed this so that userscripts can make use of it:

Exposes a call to Services.eTLD.getPublicSuffix(url) as a new
webextensions API method: browser.dns.getPublicSuffix(url)

E.g. getPublicSuffix("https://www.mozilla.co.uk:80") => "co.uk"

The method also takes an optional "additionalParts" integer parameter:

E.g. getPublicSuffix("https://www.mozilla.co.uk:80", 1) => "mozilla.co.uk"

This will save Addon creators from having to reinvent the wheel.
The Mozilla Multi-Account Containers Addon team intends to use this functionality
for a Wildcard Subdomains feature that is currently in development, see PR:
mozilla/multi-account-containers#2352

It seems the patch is currently under review, so this might become available in upcoming FF versions. It would greatly ease writing userscripts that use the PublicSuffixList (which currently always has to be included separately). Using the FF internal list might also improve performance.
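To make the quoted semantics concrete, here is a minimal sketch of what "public suffix plus `additionalParts` labels" means. It is not the Firefox implementation: `getPublicSuffixSketch` and `SAMPLE_SUFFIXES` are hypothetical names, and the suffix set is a tiny hand-picked sample rather than the real Public Suffix List.

```javascript
// Illustrative only: a tiny hand-picked suffix set, NOT the real Public Suffix List.
const SAMPLE_SUFFIXES = new Set(['com', 'org', 'uk', 'co.uk']);

// Hypothetical helper mimicking the proposed getPublicSuffix(url, additionalParts).
function getPublicSuffixSketch(url, additionalParts = 0) {
  const labels = new URL(url).hostname.split('.');
  // Scan from the longest candidate to the shortest, so the first hit
  // is the longest suffix present in the sample set.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (SAMPLE_SUFFIXES.has(candidate)) {
      // Step back by additionalParts labels, without going past the start.
      const start = Math.max(0, i - additionalParts);
      return labels.slice(start).join('.');
    }
  }
  return null; // no suffix from the sample set matched
}

console.log(getPublicSuffixSketch('https://www.mozilla.co.uk:80'));    // "co.uk"
console.log(getPublicSuffixSketch('https://www.mozilla.co.uk:80', 1)); // "mozilla.co.uk"
```

The real list also contains wildcard and exception rules, which this sketch ignores.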

@erosman
Owner

erosman commented Sep 7, 2022

Sure ......

erosman added the "feature request" label Sep 7, 2022
@leonidborisenko

Just a note. Once getPublicSuffix is exposed to WebExtensions, it will be useful as part of resolving #431.

browser.dns.getPublicSuffix(url, 1) will return firstPartyDomain (where url is the URL in the location bar of the tab in which the userscript performs the fetch).
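A rough sketch of how that could be wired together, under stated assumptions: `getCookiesForFirstParty` is a hypothetical helper, and the eTLD+1 resolver is passed in as a parameter because `browser.dns.getPublicSuffix` is only a proposal at the time of this thread. The `url` and `firstPartyDomain` fields are the ones `cookies.getAll()` accepts in Firefox.

```javascript
// Hypothetical helper: looks up cookies for requestUrl under the first-party
// domain of the tab that issued the request.
// - cookiesApi:  an object with getAll(), e.g. browser.cookies in an extension
// - etldPlusOne: any function mapping a URL to its eTLD+1 (an assumption here;
//                it stands in for the proposed getPublicSuffix(url, 1))
async function getCookiesForFirstParty(cookiesApi, tabUrl, requestUrl, etldPlusOne) {
  const firstPartyDomain = await etldPlusOne(tabUrl);
  return cookiesApi.getAll({ url: requestUrl, firstPartyDomain });
}
```

If the proposed method lands as described, the resolver argument could be something like `(u) => browser.dns.getPublicSuffix(u, 1)`; until then it has to be a re-implementation.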


Quoting from #431 (comment):

I have been thinking about this. TBH, it should not be necessary to use an algorithm to deduce values for firstPartyDomain in cookies.getAll(). Since it is a setting in cookies.getAll(), there should also be a standard method to get a value for it, and there should also be guidance on it.

A setting that no one knows what value it should have, is no help.

Well, there is a standard method to get a value for firstPartyDomain: a function in the Firefox C++ code. But it's not exposed in the WebExtension API, so for now it must be re-implemented.

Comment by Firefox contributor from ten months ago (https://bugzilla.mozilla.org/show_bug.cgi?id=1669716#c10):

  1. Parsing firstPartyDomain value from extension
    [...] Another issue is that it's not obvious to extensions that they have to compute the eTLD+1 of a tab's URL to use as the firstPartyDomain.

@erosman
Owner

erosman commented May 15, 2023

JavaScript PSL parsing module

I have written a PSL parsing module (check for details) for the new feature in v2.68 (used in toolbar popup).

PSL Userscript API

I can make the module available to userscripts.

I have not decided on userscript API naming.

  • it can be made available as a GM API e.g. GM.PSL.parse()
  • it can be made available as a stand-alone API e.g. PSL.parse()

Providing a sync API would entail injecting the module (100 KB) into every page, often needlessly, which would be very inefficient.
An async API would mean injecting the module only when needed.

const psl = await PSL.parse('example.com');
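The async trade-off can be sketched as a lazy loader: nothing is loaded until the first call, and later calls reuse the same module. This is only an illustration of the pattern, not FireMonkey's code; `LazyPSL`, `loadPslModule`, and the toy parser inside it are all hypothetical.

```javascript
// Counts how often the (pretend) module is loaded, for demonstration.
let loadCount = 0;

// Hypothetical loader standing in for "inject the 100 KB module on demand".
async function loadPslModule() {
  loadCount++;
  // Toy parser: the real module would consult the full Public Suffix List.
  return {
    parse(hostname) {
      const labels = hostname.split('.');
      return { tld: labels.slice(-1).join('.'), domain: labels.slice(-2).join('.') };
    },
  };
}

const LazyPSL = {
  modulePromise: null,
  async parse(hostname) {
    // Start loading on first use only; concurrent callers share one promise.
    if (!this.modulePromise) this.modulePromise = loadPslModule();
    const mod = await this.modulePromise;
    return mod.parse(hostname);
  },
};

LazyPSL.parse('mail.example.com').then((result) => console.log(result.domain));
```

Pages that never call `parse()` pay no loading cost, which is the point of the async variant.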

Comments are welcomed.

PS

ATM, I am having problems exporting a module class to userscript. I have sent a message to the Mozilla engineers for guidance.

@kekkc
Author

kekkc commented May 15, 2023

Would be cool to have some standardized way. Had the hope that we get it in FF directly, but seems they need some time and kind of overengineer this with a joint API ( w3c/webextensions#231 (comment) ).

I have written a PSL parsing module (check for details) for the new feature in v2.68 (used in toolbar popup).
Looks efficient, but parsing the PSL as list "const list =' com.ac edu'" might be slow.

Quickest approach I've seen so far is the implementation of Adblock:

For my userscripts I copied the JSON into a separate JS file, which I included as a userscript in FireMonkey:

// @require		publicSuffixList.js
// getBaseDomain: base domain of a hostname according to the PSL (e.g. 'yahoo.co.uk' for 'www.yahoo.co.uk'), reference from Adblock
let domainSuffixes = function* domainSuffixes(domain, includeBlank = false) {
	if (domain[domain.length - 1] === '.') domain = domain.substring(0, domain.length - 1);
	while (domain !== '') {
		yield domain;
		let dotIndex = domain.indexOf('.');
		domain = dotIndex === -1 ? '' : domain.substring(dotIndex + 1);
	}
	if (includeBlank) yield '';
};
let getBaseDomain = function getBaseDomain(hostname) {
	let slices = [];
	let cutoff = NaN;
	for (let suffix of domainSuffixes(hostname)) {
		slices.push(suffix);
		let offset = publicSuffixes[suffix];
		if (typeof offset === 'number') {
			cutoff = slices.length - 1 - offset;
			break;
		}
	}
	if (isNaN(cutoff)) return slices.length > 2 ? slices[slices.length - 2] : hostname;
	if (cutoff <= 0) return hostname;
	return slices[cutoff];
};

In the same userscript I can utilize it via "getBaseDomain('www.yahoo.co.uk')". Currently I'm executing this on every single link on any web page and had no noticeable performance impact for the overall page loading. However, I don't know how to measure the performance objectively apart from overall page loading and monitoring CPU usage.
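For readers without the separate publicSuffixList.js file, the following self-contained sketch shows how the lookup walks a hostname. It restates the same logic in compact form with a tiny sample map; the three entries are illustrative assumptions, not the real data. As the code shows, an offset of 1 selects the entry one label longer than the matched suffix.

```javascript
// Tiny illustrative sample; the real map comes from publicSuffixList.js.
const publicSuffixes = { 'com': 1, 'uk': 1, 'co.uk': 1 };

// Yields the hostname, then each shorter suffix: a.b.c -> a.b.c, b.c, c
function* domainSuffixes(domain) {
  while (domain !== '') {
    yield domain;
    const dotIndex = domain.indexOf('.');
    domain = dotIndex === -1 ? '' : domain.substring(dotIndex + 1);
  }
}

function getBaseDomain(hostname) {
  const slices = [];
  let cutoff = NaN;
  for (const suffix of domainSuffixes(hostname)) {
    slices.push(suffix);
    const offset = publicSuffixes[suffix];
    if (typeof offset === 'number') {
      // Index of the slice that is `offset` labels longer than the match.
      cutoff = slices.length - 1 - offset;
      break;
    }
  }
  // No known suffix: fall back to the last two labels when there are enough.
  if (isNaN(cutoff)) return slices.length > 2 ? slices[slices.length - 2] : hostname;
  if (cutoff <= 0) return hostname;
  return slices[cutoff];
}

console.log(getBaseDomain('www.yahoo.co.uk')); // "yahoo.co.uk"
console.log(getBaseDomain('example.com'));     // "example.com"
```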

@erosman
Owner

erosman commented May 15, 2023

I don't know how to measure the performance objectively apart from overall page loading and monitoring CPU usage.

I regularly use a loop to test performance.

I tested it just now.

(() => {

  let t;
  // adjust the number based on function 
  const n = 1000;

  // prepare the fixed data first
  const hostname = 'mail.yahoo.co.uk';
  console.log(PSL.parse(hostname));
  // Object { subdomain: "mail", domain: "yahoo.co.uk", sld: "yahoo", tld: "co.uk" }

  t = performance.now();
  for (let i = 0; i < n; i++) {
    // run the function
    PSL.parse(hostname);
  }
  console.log(`Operation took ${performance.now() - t} milliseconds`);
  // Operation took 46 milliseconds

})();

@kekkc
Author

kekkc commented May 15, 2023

Awesome, many thanks. Will definitely use that in the future ;)

Operation took 2 milliseconds
asdf.html.txt

(html file, JS code is at the bottom)

@erosman
Owner

erosman commented May 15, 2023

Operation took 2 milliseconds

There is no doubt that object property lookup is extremely fast.
However, a large object takes a lot more memory than a string.

Furthermore, in normal operations, where the parsing will be performed once (or a few times), the difference in speed is insignificant e.g. for one parsing, 0.046 millisecond vs 0.002 millisecond.

I will do some more testing.

Test Result for 1000

  • string method: parse 46 milliseconds
  • array method: parse 15 milliseconds
  • object method: parse 1 millisecond

Test Result for 1000 including conversion

For this test, string is converted to array and object.
Note: JavaScript parser still needs to convert an object in code to an actual object.

  • string method: 46 milliseconds
  • array method: convert to array 0.8 + parse 15 = 15.8 milliseconds
  • object method: convert to object 1.9 + parse 1 = 2.9 milliseconds

When parsing many times from a single import, the difference increases, i.e. 46 vs 2.9 milliseconds for 1000 parses.
However, the difference is still insignificant even for 1000 parses.

Test Result for 1 including conversion

  • string method: 0.046 milliseconds
  • array method: convert to array 0.8 + parse 0.015 = 0.815 milliseconds
  • object method: convert to object 1.9 + parse 0.001 = 1.901 milliseconds

For a single parse from a single import, the result favours the string method, i.e. 0.046 vs 1.901 milliseconds.
However, the difference is truly insignificant for 1 parse.
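The thread does not show how the three methods are implemented, so the following is only a guess at their general shape for anyone who wants to repeat the comparison: a delimited-string search, an array search, and an object property lookup over the same sample data. The names `lookups` and `timeIt` and the four-entry sample are assumptions, and timings will vary by machine and list size.

```javascript
// Illustrative sample only, not the real Public Suffix List.
const suffixString = ' com org uk co.uk ';
const suffixArray = suffixString.trim().split(' ');
const suffixObject = Object.create(null);
for (const s of suffixArray) suffixObject[s] = true;

// Three ways to ask "is s a known suffix?"
const lookups = {
  string: (s) => suffixString.includes(` ${s} `), // substring search with delimiters
  array: (s) => suffixArray.includes(s),          // linear scan
  object: (s) => suffixObject[s] === true,        // property lookup
};

// Times n calls of fn(arg) and returns the elapsed milliseconds.
function timeIt(fn, arg, n = 1000) {
  const t = performance.now();
  for (let i = 0; i < n; i++) fn(arg);
  return performance.now() - t;
}

for (const [name, fn] of Object.entries(lookups)) {
  const ms = timeIt(fn, 'co.uk').toFixed(3);
  console.log(`${name}: found=${fn('co.uk')}, ${ms} ms per 1000 calls`);
}
```

With only four entries the three methods will look nearly identical; the gap reported above only appears with a list the size of the real PSL.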

@erosman
Owner

erosman commented May 16, 2023

asdf.html.txt

BTW, GM getResourceText has been updated in v2.68.
You can use it to get publicSuffixList.json.
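A hedged sketch of that idea: `loadSuffixMap` is a hypothetical helper, and the resource getter is passed in as a parameter so that the sketch does not assume the exact signature or return type of the userscript manager's function. It also assumes the userscript declares a resource (called `psl` here purely for illustration) pointing at publicSuffixList.json.

```javascript
// Hypothetical helper: getText is whatever resource getter the userscript
// manager provides; resourceName is the name declared in the script metadata.
async function loadSuffixMap(getText, resourceName) {
  // await works whether getText returns a string or a Promise of one.
  const text = await getText(resourceName);
  return JSON.parse(text);
}

// Demo with a stub getter standing in for the real API.
const stubGetText = (name) => (name === 'psl' ? '{"co.uk": 1, "com": 1}' : '{}');
loadSuffixMap(stubGetText, 'psl').then((map) => console.log(map['co.uk']));
```

Parsing the JSON once and keeping the resulting object around matches the "object method" measured above.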

@erosman
Owner

erosman commented May 16, 2023

ATM, I am having problems exporting a module class to userscript. I have sent a message to the Mozilla engineers for guidance.

I got a reply saying "Classes /inheritance is not supported".

Initially, I was planning to create a GM.import() API so that in future, other modules could also be imported e.g.

const PSL = await GM.import('PSL');

Sadly, it is not possible in userScript API.
I may work on it for MV3 scripting API once it is finalised for userscript support.

@erosman
Owner

erosman commented May 17, 2023

I re-wrote the new APIs again. Check Help for info.
