adblock: Support cosmetic filtering (element hiding) and scriptlets #6480

Open
The-Compiler opened this issue May 26, 2021 · 28 comments · May be fixed by #7629
Labels
priority: 1 - middle Issues which should be done at some point, but aren't that important.

Comments

@The-Compiler
Member

Splitting this off from #5754 since it's clearly the most important part of it, and also requires some more text.

The underlying Brave adblocking library seems to already support both element hiding and Scriptlets, but qutebrowser doesn't support them so far. Not sure if they're supported in the python-adblock library or if some more work is required there.

Original discussion here: #5317 (comment).
Related: #6460

Scriptlets seem to be required to properly block YouTube ads after the latest changes, see e.g. this post by the easylist maintainer:

If you don't like youtube ads, switching browsers (Chrome, Firefox, Brave) will be the only option. Basic network blocking won't be enough to stop youtube ads, unfortunately.

and this post:

Thanks, implementing override-property-read, json-prune, and using those extra two rules did the job!
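
For readers unfamiliar with scriptlets: they are small JavaScript snippets that a filter rule injects into a page to change how the page's own scripts behave. The following is only a minimal sketch of the idea behind json-prune, not uBlock Origin's actual implementation. The helper name `makePruningParse` and the property names in the usage note are assumptions chosen for illustration.

```javascript
// Illustrative only: wrap a JSON.parse-like function so that selected
// properties are deleted from every parsed result.
function makePruningParse(parse, paths) {
  // Each path is a dot-separated property chain, e.g. "playerResponse.adPlacements".
  const chains = paths.map((path) => path.split('.'));

  function prune(obj, chain) {
    let node = obj;
    // Walk down to the parent of the property that should be removed.
    for (let i = 0; i < chain.length - 1; i++) {
      if (node === null || typeof node !== 'object') {
        return;
      }
      node = node[chain[i]];
    }
    if (node !== null && typeof node === 'object') {
      delete node[chain[chain.length - 1]];
    }
  }

  return function pruningParse(text, reviver) {
    const result = parse(text, reviver);
    for (const chain of chains) {
      prune(result, chain);
    }
    return result;
  };
}
```

Inside a page, a scriptlet of this kind would replace the global function before any page script runs, along the lines of `JSON.parse = makePruningParse(JSON.parse, ['adPlacements'])`. That timing requirement is why plain network blocking cannot do the same job.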

Some more resources:

cc @ArniDagur

@The-Compiler The-Compiler added the priority: 1 - middle Issues which should be done at some point, but aren't that important. label May 26, 2021
@ArniDagur
Contributor

The underlying Brave adblocking library seems to already support both element hiding and Scriptlets, but qutebrowser doesn't support them so far. Not sure if they're supported in the python-adblock library or if some more work is required there.

There are some things that are supported in python-adblock, and some that are missing, e.g. resources. It shouldn't be too much effort to support everything.

I don't have as much time for new open source contributions as I used to, but I will add any functionality to python-adblock that is needed. Additionally, if someone can provide a guide on how to

  • Get all of the classes and ids of the web page's elements from qutebrowser.
  • Inject CSS and JavaScript into a web page from qutebrowser.

then I can take a stab at cosmetic blocking.

@sverona

sverona commented May 30, 2021

Additionally, if someone can provide a guide on how to

  • Get all of the classes and ids of the web page's elements from qutebrowser.
  • Inject CSS and JavaScript into a web page from qutebrowser.

then I can take a stab at cosmetic blocking.

I couldn't find an entry in the QB API for doing either of these, so you may have to go through WebEngine. This looks like the primary resource. However:

Qt WebEngine does not allow direct access to the document object model (DOM) of a page. However, the DOM can be inspected and adapted by injecting scripts.

So my guess is you'd need to run some JS through a QWebChannel to pull classes and IDs. Here is the best reference I could find describing how to do this.
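
Whichever transport ends up being used, the page-side part of pulling classes and IDs could look roughly like the sketch below. The function name `collectClassesAndIds` is hypothetical, and the sketch assumes the caller can receive a plain object with two arrays as the script's return value.

```javascript
// Collect the distinct class names and ids below `root`.
// `root` only needs a querySelectorAll() method, so `document` works in a page.
function collectClassesAndIds(root) {
  const classes = new Set();
  const ids = new Set();
  for (const element of root.querySelectorAll('[class], [id]')) {
    // getAttribute() is used instead of className, which is not a string
    // for SVG elements.
    const classAttr = element.getAttribute('class') || '';
    for (const name of classAttr.split(/\s+/)) {
      if (name) {
        classes.add(name);
      }
    }
    const id = element.getAttribute('id');
    if (id) {
      ids.add(id);
    }
  }
  // Plain arrays of strings are easy to hand back to the caller.
  return { classes: [...classes], ids: [...ids] };
}
```

In a page this would be called as `collectClassesAndIds(document)`. The cost grows with the number of elements, so the performance question raised here still applies.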

Here is a reference that describes how to inject JS into a page.

I don't know enough about the internals of WebEngine or Brave's adblock to say how this will impact performance.

@The-Compiler
Member Author

Hopefully, QtWebChannel won't be needed, and any implementation using it will be heavily scrutinized due to its security impact (if you set up a QtWebChannel in the same context web pages run, it follows that web pages can use that QtWebChannel as well, and call things which might be qutebrowser-internal).

There are two ways to run JS with QtWebEngine:

  • QWebEnginePage::runJavaScript, exposed in qutebrowser's tab API via tab.run_js_async(). It takes a callback which gets whatever was returned from JS, so this should be enough to find the elements. It's also what's used to implement hints and other features already, via the tab.elements API.
  • QWebEngineScript which is more like a Greasemonkey script (and in fact supports a sub-set of Greasemonkey comments), i.e. you "install" a script and QtWebEngine/Chromium will take care of automatically injecting it on every page load (or all page loads matching a pattern). This isn't exposed via the tab API so far (and also only available with QtWebEngine), but it's used internally by webenginetab.py in _WebEngineScripts for various built-in functionality, Greasemonkey support, etc. etc.

Unfortunately, there's no way to inject style sheets directly - it all needs to be done via JS. See the Greasemonkey wrapper for a simple example of how to do this, and stylesheet.js for a more complex one supporting live updates (used for content.user_stylesheets).
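
To illustrate what "done via JS" means here, a minimal sketch follows. It is not the code from qutebrowser's Greasemonkey wrapper or stylesheet.js. The function name `injectStyle` is an assumption, and `doc` stands for the page's `document`.

```javascript
// Append a <style> element containing `cssText` to the given document.
// Returns the created element so the caller can update or remove it later.
function injectStyle(doc, cssText) {
  const style = doc.createElement('style');
  style.textContent = cssText;
  // Early during page load <head> may not exist yet, so fall back to the
  // root element in that case.
  const parent = doc.head || doc.documentElement;
  parent.appendChild(style);
  return style;
}
```

A call such as `injectStyle(document, '.ad { display: none !important; }')` would hide matching elements. Note that a page script could still find and remove the element, which is one reason a more careful implementation needs extra work.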

So, yeah, some of the infrastructure is probably already in place (after all, there's jhide using the Greasemonkey support for cosmetic adblock filters already), but getting this all to work properly is probably still not trivial.

@ArniDagur can you elaborate a bit more about what kind of APIs you'd need, and how those cosmetic rules and the adblock rust API work? Why do you need a list of all classes/ids, for example?

@ArniDagur
Contributor

@ArniDagur can you elaborate a bit more about what kind of APIs you'd need, and how those cosmetic rules and the adblock rust API work? Why do you need a list of all classes/ids, for example?

The best resource to understand this is the following GitHub issue: brave/adblock-rust#152.

But roughly, first you'd want to call url_cosmetic_resources, of which three fields are relevant: hide_selectors, style_selectors, and exceptions. Then you'll want to gather all of the CSS classes and ids that occur on the page, and call hidden_class_id_selectors, being sure to pass the exceptions as the third argument.

  • hide_selectors is a list of CSS selectors to which display: none !important should be applied.
  • The result from hidden_class_id_selectors is another list of simple CSS selectors to which display: none !important should be applied.
  • style_selectors maps CSS selectors on the page to corresponding rules other than display: none !important.
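
As a rough sketch of the last step, the three results described above could be combined into one stylesheet like this. The function name `buildCosmeticCss` is hypothetical, and the sketch assumes that style_selectors arrives as an object mapping each selector to an array of CSS declarations.

```javascript
// Turn the three kinds of results into one stylesheet string.
//   hideSelectors, classIdSelectors: arrays of CSS selectors to hide
//   styleSelectors: object mapping a selector to an array of CSS declarations
function buildCosmeticCss(hideSelectors, classIdSelectors, styleSelectors) {
  const rules = [];
  // One rule per selector: a single invalid selector in a comma-separated
  // list would make the browser drop the whole rule.
  const hidden = new Set([...hideSelectors, ...classIdSelectors]);
  for (const selector of hidden) {
    rules.push(`${selector} { display: none !important; }`);
  }
  for (const [selector, declarations] of Object.entries(styleSelectors)) {
    // Normalize trailing semicolons so the output is uniform.
    const body = declarations.map((d) => d.trim().replace(/;+$/, '')).join('; ');
    rules.push(`${selector} { ${body}; }`);
  }
  return rules.join('\n');
}
```

The resulting string would then have to be injected into the page through whatever mechanism qutebrowser provides for that.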

@The-Compiler
Member Author

Thanks! That together with brave/adblock-rust#152 (comment) and the docs added in brave/adblock-rust@445b633 gives me a rough idea of how that'd work - and it's much more complex than I would've thought...

Especially reacting to new elements appearing on a page dynamically (which is somewhat common nowadays) could be tricky. They say:

we only inject rules based on classes and ids that actually appear on the page (in practice, we use a MutationObserver to identify those elements)

but something like that would indeed require a QWebChannel - something I'm still quite wary about, given the security implications of letting JS running in the page's context talk back to Python.
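
To make the dynamic part more concrete, here is a minimal sketch of the page side only. It is not Brave's implementation, the name `collectNewTokens` is hypothetical, and it deliberately leaves open how the collected values would get back to Python, which is the security-sensitive part.

```javascript
// Extract class names and ids not seen before from a batch of mutation records.
// `seen` is a Set shared across calls; entries are prefixed with "." or "#".
function collectNewTokens(mutations, seen) {
  const fresh = { classes: [], ids: [] };
  const visit = (element) => {
    // Text and comment nodes have no attributes.
    if (typeof element.getAttribute !== 'function') {
      return;
    }
    for (const name of (element.getAttribute('class') || '').split(/\s+/)) {
      if (name && !seen.has('.' + name)) {
        seen.add('.' + name);
        fresh.classes.push(name);
      }
    }
    const id = element.getAttribute('id');
    if (id && !seen.has('#' + id)) {
      seen.add('#' + id);
      fresh.ids.push(id);
    }
  };
  for (const mutation of mutations) {
    if (mutation.type === 'attributes') {
      visit(mutation.target);
    }
    for (const node of mutation.addedNodes || []) {
      visit(node);
      // Added subtrees can contain further elements with classes and ids.
      if (typeof node.querySelectorAll === 'function') {
        node.querySelectorAll('[class], [id]').forEach(visit);
      }
    }
  }
  return fresh;
}

// In a page, this could be wired up as follows (skipped outside a browser).
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  const seen = new Set();
  const observer = new MutationObserver((mutations) => {
    const fresh = collectNewTokens(mutations, seen);
    if (fresh.classes.length || fresh.ids.length) {
      // How `fresh` reaches the adblock engine is the open question here.
      console.debug('new classes/ids', fresh);
    }
  });
  observer.observe(document.documentElement, {
    childList: true,
    subtree: true,
    attributes: true,
    attributeFilter: ['class', 'id'],
  });
}
```

Only previously unseen classes and ids are reported, so the amount of data that would need to cross the JS/Python boundary stays small after the initial page load.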

From a quick look at how Falkon solves this, it looks like they do indeed just inject all rules into every page...

https://github.com/KDE/falkon/blob/8abf9d5cf0e682e9c16fa8604d41299d8327f9ad/src/lib/adblock/adblockplugin.cpp#L80-L89

https://github.com/KDE/falkon/blob/8abf9d5cf0e682e9c16fa8604d41299d8327f9ad/src/lib/adblock/adblockmatcher.cpp#L179-L206

@The-Compiler
Member Author

As an aside, for people looking for a workaround for YouTube: Someone on Reddit recently wrote a Greasemonkey script to do so. I haven't tested it myself yet. Another approach is to use an external player like mpv instead, see 10. in the FAQ.

@sverona

sverona commented Jul 8, 2021

As an aside, for people looking for a workaround for YouTube: Someone on Reddit recently wrote a Greasemonkey script to do so. I haven't tested it myself yet. Another approach is to use an external player like mpv instead, see 10. in the FAQ.

The Greasemonkey script is a great stopgap.

Here is a mirror in case something happens to the Reddit post. I have this in ~/.local/share/qutebrowser/greasemonkey/youtube-adblock.user.js.

// ==UserScript==
// @name         Auto Skip YouTube Ads 
// @version      1.0.1
// @description  Speed up and skip YouTube ads automatically 
// @author       jso8910
// @match        *://*.youtube.com/*
// @exclude      *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
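// Run the check on every DOM change.
const main = new MutationObserver(() => {
    // An element with .ad-showing indicates that an ad is currently playing.
    const ad = document.querySelector('.ad-showing');
    if (ad) {
        const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button');
        if (btn) {
            btn.click();
        }
    }
});

// Observe the whole document: the skip button may not exist yet when the
// script runs, and observe() throws a TypeError when given null.
main.observe(document.documentElement, {attributes: true, characterData: true, childList: true, subtree: true});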

If that doesn't work, try:

// ==UserScript==
// @name         Auto Skip YouTube Ads 
// @version      1.0.0
// @description  Speed up and skip YouTube ads automatically 
// @author       jso8910
// @match        *://*.youtube.com/*
// @exclude      *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
setInterval(() => {
    const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
    if (btn) {
        btn.click()
    }
    const ad = [...document.querySelectorAll('.ad-showing')][0];
    if (ad) {
        document.querySelector('video').playbackRate = 10;
    }
}, 50)

@crocket

crocket commented Sep 7, 2021

This issue should be prioritized. Unless qutebrowser can be configured to automatically replace www.youtube.com with yewtu.be, qutebrowser is painful on youtube.

@The-Compiler
Member Author

@crocket There are various workarounds for Youtube specifically, see e.g. the comment right above yours.

@crocket

crocket commented Sep 22, 2021

My workaround is to make qutebrowser delegate URLs to a URL handler userscript that turns www.youtube.com and youtube.com into yewtu.be

I just haven't figured out how to make :hint delegate URLs to a userscript. Without :hint, I can't click buttons with hints.
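
For anyone wanting to try the same approach, the rewriting step itself is small. The sketch below is not my actual userscript, only an illustration of the URL mapping. The function name `rewriteYoutubeUrl` is made up for this example, and how qutebrowser hands URLs to it is not covered.

```javascript
// Map YouTube URLs to the same path on an alternative front end.
// Anything that is not a YouTube URL (or not a valid URL) is returned unchanged.
function rewriteYoutubeUrl(url, targetHost = 'yewtu.be') {
  let parsed;
  try {
    parsed = new URL(url);
  } catch (error) {
    return url;
  }
  if (parsed.hostname === 'youtube.com' || parsed.hostname === 'www.youtube.com') {
    // Only the host changes; path and query string are kept.
    parsed.hostname = targetHost;
    return parsed.toString();
  }
  return url;
}
```

Keeping the path and query string intact means video links keep pointing at the same video on the other host.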

@metov

metov commented Oct 20, 2021

@The-Compiler I think this feature would be amazing and I really hope it can be added to qutebrowser soon. Qutebrowser is already a great piece of software and is a far superior experience to even chromium with vimium and the like - just native support for keyboard navigation alone already puts it miles ahead of the competition. Unfortunately the web these days is basically unusable without sophisticated request blocking and element hiding - between all the cookie banners, ads, analytics scripts, javascript bloat it's very difficult to actually use most websites "as intended". This feature, if implemented properly, would allow me to finally switch to qb as my main browser for good. And I suspect I'm not the only one.

I'd love to help out with the development of this. However, the qb codebase, currently at 75k lines, isn't all that straightforward to just dive into (and in this case, the real problem is probably the complexity of Brave's codebase rather than qb). Is there any chance we could break this issue down into smaller chunks that are more self-contained and realistic for a new contributor to attack without having to understand the entire codebase?

For example, are there any limited-scope MVPs or proofs of concept that people could contribute to move the effort forward? If I wanted to help with this, where would I start? I've read through the issue and I have a general idea, but it's not really clear to me what the steps and priorities are.

I realize that this breaking down into subtasks will require work on your part as well, probably quite substantial work in fact. But it would also enable other people to contribute and so might be less work than just implementing the whole thing yourself. So could we maybe have something like a roadmap for how we would be getting from here to having fully functional element hiding in qb?

@The-Compiler
Member Author

@metov As much as I like mentoring people to get started with qutebrowser (or with Python elsewhere), I'm afraid this really isn't a good fit for it. Right now, there are too many "unknown unknowns", and a proof of concept I'd have to write to get an idea of what's required for a full implementation would already be very close to that full implementation (since the heavy lifting is done by the Brave library, after all).

Additionally, one step that might be required (a QtWebChannel, as outlined above) is very security sensitive. Working on that is quite a responsibility - I don't feel comfortable with delegating to someone who is just getting started with contributing.

However, a first good step for this would probably be a proof of concept isolated from qutebrowser, so that we could understand what exactly would be required for the integration. In other words, a small project using QtWebEngine to display a single tab (like the testbrowser) and integrating element hiding via the adblock Python library.

FWIW, a full in-depth understanding of the qutebrowser code base is not required in any case - nobody has that with a project of this size, not even me 😉

@crocket

crocket commented Oct 20, 2021

https://github.com/dudik/blockit is a webkit extension that supports Brave adblock's cosmetic filter.

@The-Compiler
Member Author

Ah, that's actually quite promising, because:

  • They use capabilities similar to the ones we have (only running JS on the page and getting the result)
  • They don't seem to support scriptlets, only element hiding
  • They only do the filtering once when the page is loaded, rather than doing it dynamically when new elements appear

I still feel like the latter would be quite a bit of a limitation, given how many web pages load elements dynamically nowadays. But perhaps it's still a major improvement over the status quo.

Relevant code:

https://github.com/dudik/blockit/blob/be185b454798afd5542f08cf51774b72ee7b01d4/blockit.c#L60-L105

Does anyone have any experience with blockit, or is willing to find out how much of an improvement "static" cosmetic filtering alone is? In other words, what are some (ideally common) real-world ads which blockit can block, but qutebrowser (with adblock) can't?

@The-Compiler
Member Author

Perhaps relevant for people looking for a YouTube workaround: Updated Auto Skip Youtube Ads : qutebrowser

@herrsimon

Maybe this comment is very naive, but how about https://github.com/ghostery/adblocker? Apparently a pure javascript library (so could be run as a simple background script) and claimed to support '99% of all filters from the Easylist and uBlock Origin project'. At least as an interim solution, it seems to me that this should fit the bill, or is there something I am overlooking?

@The-Compiler
Member Author

@herrsimon We don't have anything like extension background scripts. Maybe with some effort it could be made into a Greasemonkey script somehow, but I'd rather put effort into properly integrating the existing adblocker library than into integrating a second one.

@herrsimon

I completely agree that properly integrating the existing library is a better solution and was thinking of a quick and easy interim fix by loading the ghostery library as userscript in the background. I just looked into it a bit more and apparently it is not that easy (could also be my lack of knowledge though).

@alkim0

alkim0 commented Jul 12, 2022

I took a first stab at implementing cosmetic filtering and scriptlet injection.

There is a pending pull request: #7312

To try it out for yourself:

  1. I highly recommend adding uBlock's filters.txt to your content.blocking.adblock.lists, since EasyList doesn't have any scriptlet injection rules:
     "https://github.com/uBlockOrigin/uAssets/raw/master/filters/filters.txt"
  2. Update python-adblock to at least 0.6.0:
     pip install -U adblock
  3. Download qutebrowser with:
     git clone https://github.com/alkim0/qutebrowser
  4. In the base directory, run python qutebrowser.py
  5. In qutebrowser, run:
    1. :adblock-update (if you updated your adblock lists)
    2. :adblock-update-resources

At the very least, it should block YouTube ads...

EDIT: Updated for the latest version of python-adblock.

@ArniDagur
Contributor

ArniDagur commented Jul 12, 2022

That is crazy cool! I will review your python-adblock pull request soon™

@herrsimon

Thanks a LOT! Do I understand correctly that there's still no dynamic scriptlet injection (just once after page load)? This could explain why ads were not completely blocked on some pages, such as distrowatch.org, heise.de and reddit.com (sponsored posts), when I just tried it out. It could also be that the corresponding blocking rules are on a different filter list (I just used easylist and the default ublock list). Still, your work is a giant leap forward!

I will take a closer look at the code and also at how ublock and other browsers solve dynamic injection and hope that I can contribute in some way. Probably, a QtWebChannel can't be avoided, but as the tasks of the interfacing code are very narrowly defined and it could hence be restricted accordingly, I don't see much of a security issue here (assuming proper implementation of course).

PS: Proper adblocking is nowadays one of the most important features, and in my opinion its absence keeps many users from switching to qutebrowser. The priority of this issue should therefore be high instead of just middle.

@alkim0

alkim0 commented Jul 13, 2022

Thanks a LOT! Do I understand correctly that there's still no dynamic scriptlet injection (just once after page load)? This could explain why ads were not completely blocked on some pages, such as distrowatch.org, heise.de and reddit.com (sponsored posts), when I just tried it out. It could also be that the corresponding blocking rules are on a different filter list (I just used easylist and the default ublock list). Still, your work is a giant leap forward!

No, scriptlet injection should be working (before page load) with the bleeding edge version of python-adblock (that's how we skip youtube ads). The asynchronous code after the page load is for dealing with generic (not site-specific) css filters.

I looked at the websites you mentioned. First, with heise.de, I get the same ads on firefox with ublock + umatrix, so those might not be blocked under the default filters.

Regarding distrowatch.org and reddit.com, I noticed that the underlying engine doesn't seem to be returning some of the more complex css filters. It was the case with these websites as well when I checked them out. I've filed it as an issue with python-adblock (ArniDagur/python-adblock#71), but I think the problem is with the underlying engine itself not supporting certain procedural filters (brave/adblock-rust#145). To confirm this, I downloaded and tried these pages with the Brave web browser, and these ads do show up in it.

@herrsimon

herrsimon commented Jul 13, 2022

Alright, I did some further testing. First of all, I was missing some blocklists for qutebrowser (my apologies), namely

https://pgl.yoyo.org/adservers
https://github.com/uBlockOrigin/uAssets/raw/master/filters/privacy.txt
https://www.i-dont-care-about-cookies.eu/abp/

Now filtering is as follows (compared to firefox with ublock origin):

reddit.com

Promoted posts are still not blocked (they don't appear on firefox)

heise.de

Ads are blocked (finally!), but in one place a grey box is left, which is filtered on firefox (see attached pictures).

distrowatch.org

Due to the added blocklists, most ads are now filtered, except for the small box to the lower right at the very bottom of the page. However, I observed the following:

  • filtering only works when directly opening distrowatch.com. If instead browsing to distrowatch.org (which is then redirected to distrowatch.com), filtering does not happen and yanking the url still yields distrowatch.org (so apparently the filter rules, which are written for distrowatch.com don't trigger). This seems like an issue which has nothing to do with the adblock plugin. Just for comparison: Firefox blocks in both cases.
  • I tried refreshing the page a couple of times while bypassing the cache (using R), sometimes in rapid succession. It regularly happened that the banner at the top slipped through the filter. The same is true for Firefox. Could this be fixed by preventing any JS execution before all filters have been applied?

These issues - even though they should be fixed at some point - are all bearable though. I just used your extension for a bit of browsing and on most pages it worked flawlessly. Thanks again for finally implementing this crucial feature!

Here is the logger output from ublock origin, maybe this helps:

Blocked elements (www.heise.de)
+1 ||upscore.com^$3p -- www.heise.de 3 script https://files.upscore.com/async/upScore.js
+1 ##a[href^="https://pubads.g.doubleclick.net/"] www.heise.de dom https://www.heise.de/
+1 ##a-ad www.heise.de dom https://www.heise.de/
+1 ##.us_ad www.heise.de dom https://www.heise.de/
+1 ##.top-topics__ad www.heise.de dom https://www.heise.de/
+1 ##.ad-microsites www.heise.de dom https://www.heise.de/
+1 ##.a-ad--wide www.heise.de dom https://www.heise.de/
+1 ##.a-ad--skyscraper www.heise.de dom https://www.heise.de/
+1 ##.a-ad--leaderboard www.heise.de dom https://www.heise.de/
+1 ##.a-ad www.heise.de dom https://www.heise.de/
+1 ||cleverpush.com^$3p -- www.heise.de 3 script https://static-eu.cleverpush.com/channel/loader/2et4HQsqBnH6ZMRXr.js?v=2
+1 /base/es6/bundle.js -- www.heise.de 1 script https://data-fb7f8b3ae8.heise.de/iomm/latest/manager/base/es6/bundle.js
+1 googletagservices_gpt.js << www.heise.de 3 script https://securepubads.g.doubleclick.net/tag/js/gpt.js
+1 ||securepubads.g.doubleclick.net/tag/js/gpt.js$script,redirect-rule=googletagservices_gpt.js:5 -- www.heise.de 3 script https://securepubads.g.doubleclick.net/tag/js/gpt.js
+1 ||doubleclick.net^ -- www.heise.de 3 script https://securepubads.g.doubleclick.net/tag/js/gpt.js
+1 /px.js?ch=$script -- www.heise.de 1 script https://www.heise.de/assets/akwa/v24/js/px.js?ch=2
+1 /px.js?ch=$script -- www.heise.de 1 script https://www.heise.de/assets/akwa/v24/js/px.js?ch=1
+1 /prebid.$domain=~prebid.org -- www.heise.de 1 script https://www.heise.de/assets/akwa/v24/js/prebid.e7e419.ltc.js
+1 ##+js(no-setTimeout-if, .call(null), 10) www.heise.de dom https://www.heise.de/
+1 ||responder.wt.heise.de^ -- www.heise.de 1 script https://responder.wt.heise.de/resp/api/get/288689636920174?url=https%3A%2F%2Fwww.heise.de%2F&v=5
+0 ||heise.de/ivw-bin/ivw/cp/ -- www.heise.de 1 image https://www.heise.de/ivw-bin/ivw/CP/
+0 ||upscore.com^$3p -- www.heise.de 3 script https://hl.upscore.com/config/heise.de.js
+0 ##+js(no-setTimeout-if, .call(null), 10) www.heise.de dom https://www.heise.de/
+0 ||kameleoon.eu^ -- www.heise.de 3 script https://yxsu5ufd2m.kameleoon.eu/kameleoon.js
+0 ||heise.de/ivw-bin/ivw/cp/ -- www.heise.de 1 image https://www.heise.de/ivw-bin/ivw/CP/
+0 ||upscore.com^$3p -- www.heise.de 3 script https://hl.upscore.com/config/heise.de.js
blocked elements (distrowatch.org)
+1 ##.cc_banner.cc_container--open distrowatch.com dom https://distrowatch.com/
+1 ##.cc_banner-wrapper distrowatch.com dom https://distrowatch.com/
+0 ##a[href*="3cx.com"] distrowatch.com dom https://distrowatch.com/
+0 ##[href*="utm"]:upward(tbody) distrowatch.com dom https://distrowatch.com/
+0 ||unblockia.com^$3p -- distrowatch.com 3 script https://cdn.unblockia.com/h.js
+0 ||s-onetag.com^ -- distrowatch.com 3 script https://get.s-onetag.com/f31b0467-497f-4b91-86b5-6219f9a80f5e/tag.min.js
+0 ||media.net^ -- distrowatch.com 3 script https://hbx.media.net/bidexchange.js?cid=8CU7XD8P5&version=5.1&dn=distrowatch.com
+0 ||media.net^ -- distrowatch.com 3 script https://contextual.media.net/dmedianet.js?cid=8CUK7Z630
+0 ||unblockia.com^$3p -- distrowatch.com 3 script https://cdn.unblockia.com/h.js
+0 ||s-onetag.com^ -- distrowatch.com 3 script https://get.s-onetag.com/f31b0467-497f-4b91-86b5-6219f9a80f5e/tag.min.js
+0 googlesyndication_adsbygoogle.js << distrowatch.com 3 script https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-2079470915748165
+0 ||pagead2.googlesyndication.com/pagead/js/adsbygoogle.js$script,redirect-rule=googlesyndication_adsbygoogle.js:5,domain=~zipextractor.app -- distrowatch.com 3 script https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-2079470915748165
+0 ||googlesyndication.com^ -- distrowatch.com 3 script https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-2079470915748165
+0 ||media.net^ -- distrowatch.com 3 script https://contextual.media.net/dmedianet.js?cid=8CUK7Z630
blocked elements (reddit.com)
+6 ||alb.reddit.com^ -- www.reddit.com 1 image https://alb.reddit.com/i.gif?z=gAAAAABizz7J8SzJyslYvEgrwMmg7uTNpHoygXZWGwiqEjPb7dAbHXJ3Tv5NVDgiPhwYY_ZuGUKbH8JEg3SC00a1wumVviWh1oY05cdnvSeAtdgTtLkp-bWPcW4PIapFKfJlEe0GjhD8sZGLs2e2lmsQQiDP3e-4IUoPNb1JkeaYb2Gx4-m1OpE=
+5 ##[id^="t3"].promotedlink:upward(.rpBJOHq2PR60pnwJlUyP0 > div) www.reddit.com dom https://www.reddit.com/
+5 ##.promotedlink www.reddit.com dom https://www.reddit.com/
+5 ##.openx www.reddit.com dom https://www.reddit.com/
+5 ##.googad www.reddit.com dom https://www.reddit.com/
+5 ##.adsense-ad www.reddit.com dom https://www.reddit.com/
+5 ##.ads-area www.reddit.com dom https://www.reddit.com/
+5 ##.adbar www.reddit.com dom https://www.reddit.com/
+5 ##.ad-banner:not([style="height: 5px; width: 5px; position: absolute; top: 0;"]):not(.blocker-tester + .ad-banner) www.reddit.com dom https://www.reddit.com/
+5 ##.ad-300-250 www.reddit.com dom https://www.reddit.com/
+5 ##.LeftAd www.reddit.com dom https://www.reddit.com/
+5 ##.GoogleAd www.reddit.com dom https://www.reddit.com/
+5 /rendertimingpixel. -- www.reddit.com 3 image https://www.redditstatic.com/desktop2x/img/renderTimingPixel.png
+3 ##+js(set-constant, Object.prototype.allowClickTracking, false) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ##+js(no-xhr-if, method:POST url:/^https://www.reddit.com$/) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ##+js(no-fetch-if, url:/^https://www.reddit.com$/ method:post) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ##+js(set-constant, Object.prototype.allowClickTracking, false) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ##+js(no-xhr-if, method:POST url:/^https://www.reddit.com$/) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ##+js(no-fetch-if, url:/^https://www.reddit.com$/ method:post) www.reddit.com dom https://www.reddit.com/account/sso/one_tap/?experiment_d2x_2020ify_buttons=enabled&experiment_d2x_sso_login_link=enabled&experiment_d2x_google_sso_gis_parity=enabled&experiment_d2x_onboarding=enabled
+3 ||reddit.com/timings/ -- www.reddit.com 1 xhr https://www.reddit.com/timings/rum
+3 /rendertimingpixel. -- www.reddit.com 3 image https://www.redditstatic.com/desktop2x/img/renderTimingPixel.png
+3 ||reddit.com/counters/$xmlhttprequest -- www.reddit.com 1 xhr https://www.reddit.com/counters/client-screenview
+3 ||redditmedia.com/gtm/jail? -- www.reddit.com 3 frame https://www.redditmedia.com/gtm/jail?id=GTM-5XVNS82
+1 /rendertimingpixel. -- www.reddit.com 3 image https://www.redditstatic.com/desktop2x/img/renderTimingPixel.png
+1 /rendertimingpixel. -- www.reddit.com 3 image https://www.redditstatic.com/desktop2x/img/renderTimingPixel.png
+1 /rendertimingpixel. -- www.reddit.com 3 image https://www.redditstatic.com/desktop2x/img/renderTimingPixel.png
+1 ##+js(set-constant, Object.prototype.allowClickTracking, false) www.reddit.com dom https://www.reddit.com/
+1 ##+js(no-xhr-if, method:POST url:/^https://www.reddit.com$/) www.reddit.com dom https://www.reddit.com/
+1 ##+js(no-fetch-if, url:/^https://www.reddit.com$/ method:post) www.reddit.com dom https://www.reddit.com/

heise.de on qutebrowser (ad is filtered, css element remains)

heise_qute

heise.de on firefox/ublock origin (css is filtered as well)

heise_ff

@alkim0

alkim0 commented Jul 14, 2022

Wow, thanks for the detailed debug info.

Just to reiterate, css filters like [id^="t3"].promotedlink:upward(.rpBJOHq2PR60pnwJlUyP0 > div) (reddit's promoted posts) are not supported by the underlying Brave ad block engine (brave/adblock-rust/issues/145), so nothing can really be done about them at the moment.
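
To see which of the logged cosmetic rules fall into this unsupported category, a rough check like the following could be used. It is only a sketch: the operator list is taken from uBlock Origin's procedural filter syntax and is not exhaustive, and the function names are made up for this example.

```javascript
// Some operators from uBlock Origin's procedural cosmetic filter syntax.
// Selectors using them are not plain CSS and need special engine support.
const PROCEDURAL_OPERATORS = [
  ':has-text(',
  ':upward(',
  ':xpath(',
  ':matches-css(',
  ':min-text-length(',
  ':watch-attr(',
];

function isProceduralSelector(selector) {
  return PROCEDURAL_OPERATORS.some((operator) => selector.includes(operator));
}

// Split a list of selectors into plain CSS ones and procedural ones.
function partitionSelectors(selectors) {
  const plain = [];
  const procedural = [];
  for (const selector of selectors) {
    (isProceduralSelector(selector) ? procedural : plain).push(selector);
  }
  return { plain, procedural };
}
```

Applied to the log above, `.promotedlink` ends up in `plain`, while the `:upward(...)` rules for reddit.com and distrowatch.com end up in `procedural`.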

@herrsimon

@alkim0 brave with their “shields up” and “strict” mode (which as far as I understand essentially just activates some more blocking lists) actually blocks the reddit promoted ads and gets the css on heise.de right, i.e. removes the gray empty rectangle, but it also doesn't filter the small ad on the bottom right of distrowatch.com. I just tried to find the reason for this but wasn't successful. Anyway, in order to keep this thread focused on the core functionality itself, I will investigate the site-specific problems further and then open separate issues if necessary.

Relevant for this thread however should be the following two things:

  1. (quoted from above)

filtering only works when directly opening distrowatch.com. If instead browsing to distrowatch.org (which is then redirected to distrowatch.com), filtering does not happen and yanking the url still yields distrowatch.org (so apparently the filter rules, which are written for distrowatch.com don't trigger). This seems like an issue which has nothing to do with the adblock plugin. Just for comparison: Brave and Firefox block in both cases.

  2. I noticed by chance that Brave and Firefox with uBlock installed also apply their filter rules to POST requests (which, for example, are regularly sent by embedded JS code without the user's consent). However, when running qutebrowser with --debug on a page where such a request is issued and should be blocked according to the currently loaded filter rules (one example is again reddit.com posting to ingest.sentry.io, which is blocked via the easyprivacy list), no log entry shows up. Again, this sounds more like a generic issue with the adblock plugin.

Could you please comment on these two issues? If they are not related to your cosmetic filtering addition and instead need to be addressed separately, let me know and I will open separate threads as well.

@alkim0

alkim0 commented Jul 15, 2022

@herrsimon, Just to verify, is this what you are talking about?
2022-07-15-105620_1916x1023_scrot
If so, I still seem to be getting ads with it:
2022-07-15-105502_1916x1023_scrot

  1. qutebrowser doesn't seem to be dealing with distrowatch.org's 301 properly at the moment, which is why the URL doesn't change to distrowatch.com (and why the adblocker won't pick up on it). I think this needs to be managed as a separate issue.

  2. These are network requests and should be blocked by the existing adblocker already in qutebrowser. So, if there is a problem with this, that should be raised as a separate issue as well.
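
Regarding the first point, the effect can be illustrated with a simplified model of how site-specific rules are selected by hostname. This is not adblock-rust's actual matching logic. The function names, the rule shape and the `.example-ad` selector are assumptions for illustration only.

```javascript
// True if `hostname` equals `domain` or is a subdomain of it.
function hostnameMatchesDomain(hostname, domain) {
  return hostname === domain || hostname.endsWith('.' + domain);
}

// Return the selectors of the site-specific rules that apply to `pageUrl`.
// `rules` is an array of { domains: string[], selector: string }.
function selectorsForUrl(pageUrl, rules) {
  const hostname = new URL(pageUrl).hostname;
  return rules
    .filter((rule) => rule.domains.some((domain) => hostnameMatchesDomain(hostname, domain)))
    .map((rule) => rule.selector);
}
```

With a rule written for distrowatch.com, `selectorsForUrl('https://distrowatch.org/', rules)` returns nothing, so as long as the reported URL stays at distrowatch.org, no site-specific filter is applied.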

@Nicholas42
Contributor

As an aside, for people looking for a workaround for YouTube: Someone on Reddit recently wrote a Greasemonkey script to do so. I haven't tested it myself yet. Another approach is to use an external player like mpv instead, see 10. in the FAQ.

The Greasemonkey script is a great stopgap.

Here is a mirror in case something happens to the Reddit post. I have this in ~/.local/share/qutebrowser/greasemonkey/youtube-adblock.user.js.

// ==UserScript==
// @name         Auto Skip YouTube Ads 
// @version      1.0.1
// @description  Speed up and skip YouTube ads automatically 
// @author       jso8910
// @match        *://*.youtube.com/*
// @exclude      *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
let main = new MutationObserver(() => {
    let ad = [...document.querySelectorAll('.ad-showing')][0];
    if (ad) {
        let btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
        if (btn) {
            btn.click()
        }
    }
})

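// Observe the whole page: the skip button usually doesn't exist yet at load time,
// and observe() throws a TypeError when it is given null.
main.observe(document.body, {attributes: true, characterData: true, childList: true, subtree: true})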

If that doesn't work, try:

// ==UserScript==
// @name         Auto Skip YouTube Ads 
// @version      1.0.0
// @description  Speed up and skip YouTube ads automatically 
// @author       jso8910
// @match        *://*.youtube.com/*
// @exclude      *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
setInterval(() => {
    const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
    if (btn) {
        btn.click()
    }
    const ad = [...document.querySelectorAll('.ad-showing')][0];
    if (ad) {
        document.querySelector('video').playbackRate = 10;
    }
}, 50)

That works like a charm for me. I reworked it a bit to be faster on unskippable ads (those seem to be the most prevalent today). It seeks right to the end of the ad video, so you are done with them faster. I also tried to hide the video as much as possible:

// ==UserScript==
// @name         Auto Skip YouTube Ads
// @version      1.1.0
// @description  Speed up and skip YouTube ads automatically
// @author       jso8910
// @match        *://*.youtube.com/*
// @exclude      *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
setInterval(() => {
    const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
    if (btn) {
        btn.click()
    }
    const ad = [...document.querySelectorAll('.ad-showing')][0];
    if (ad) {
        const video = document.querySelector('video')
        video.muted = true;
        video.hidden = true;

        // The duration is NaN until metadata has loaded. `!= NaN` is always
        // true, so test with Number.isFinite instead.
        if (Number.isFinite(video.duration)) {
            video.currentTime = video.duration;
        }

        // 16 seems to be the highest rate that works, mostly this isn't needed
        video.playbackRate = 16;
    }
}, 50)
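
A note on the duration check in scripts like this: in JavaScript, `NaN` is not equal to anything, including itself, so comparing against it with `!=` always yields true and cannot serve as a guard. Below is a small sketch of a guard based on `Number.isFinite`. `seekTarget` is a hypothetical helper name, and treating an infinite duration (e.g. a live stream) as not seekable is an assumption:

```javascript
// Hypothetical helper: return the time to seek to, or null if the duration
// is not known yet (NaN) or unbounded (Infinity).
function seekTarget(duration) {
  return Number.isFinite(duration) && duration > 0 ? duration : null;
}

console.log(NaN != NaN);       // true: this comparison never filters anything
console.log(seekTarget(NaN));  // null
console.log(seekTarget(12.5)); // 12.5
```

In the interval callback this could be used as `const t = seekTarget(video.duration); if (t !== null) video.currentTime = t;`.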

@hakan-demirli

I block everything I don't use on YouTube, for example the Home and Shorts buttons in the sidebar, the recommendations on the right, etc. Since the PR hasn't been merged yet, I use Greasemonkey scripts to block those elements.

Here is a simple script that removes the "Home" and "Shorts" buttons from the YouTube sidebar.

// ==UserScript==
// @name         Remove Elements with Text (MutationObserver)
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  Remove elements within #items if they contain specified text(s)
// @author       You
// @match        *://*/*
// @grant        none
// ==/UserScript==

(function () {
  "use strict";

  // Function to remove elements within #items if they contain specified text
  function removeElementsWithText(texts) {
    var elementsToRemove = document.querySelectorAll(
      "#items > ytd-guide-entry-renderer",
    );

    Array.from(elementsToRemove).forEach(function (element) {
      texts.forEach(function (text) {
        if (element.innerText.includes(text)) {
          element.remove();
        }
      });
    });
  }

  let blist = ["Home", "Shorts"];

  // Function to be called when mutations are observed
  function handleMutations(mutationsList, observer) {
    removeElementsWithText(blist); // Adjust the texts as needed
  }

  // Options for the observer (which mutations to observe)
  const observerConfig = { childList: true, subtree: true };

  // Create an observer instance linked to the callback function
  const observer = new MutationObserver(handleMutations);

  // Start observing the target node for configured mutations
  observer.observe(document.body, observerConfig);

  // Perform initial removal on page load
  removeElementsWithText(blist); // Adjust the texts as needed
})();

To block any other element, copy its JS path from the inspector (:devtools, then right click -> inspect) and modify the script accordingly.
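
For elements that can be addressed by a selector alone, an alternative to removing nodes is to inject a stylesheet, which is roughly what element-hiding rules do. Below is a minimal sketch. `buildHideCss` is a hypothetical helper and both selectors are made-up placeholders; the injection part only runs where a `document` exists. Text-based matching (such as the "Home" entry above) still needs a script like the one shown.

```javascript
// Hypothetical helper: turn selectors copied from devtools into one CSS rule.
function buildHideCss(selectors) {
  const cleaned = selectors.map((s) => s.trim()).filter((s) => s.length > 0);
  if (cleaned.length === 0) {
    return "";
  }
  return cleaned.join(",\n") + " {\n  display: none !important;\n}";
}

// Placeholder selectors; replace them with the ones copied from devtools.
const css = buildHideCss(["#example-banner", ".example-promo"]);

// Inject the rule when running in a page (skipped outside a browser).
if (typeof document !== "undefined" && css !== "") {
  const style = document.createElement("style");
  style.textContent = css;
  (document.head || document.documentElement).appendChild(style);
}
```

Because the rule lives in a stylesheet, it also applies to matching elements the page adds later, so no observer is needed for this case.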
