adblock: Support cosmetic filtering (element hiding) and scriptlets #6480
Comments
There are some things that are supported in I don't have as much time for new open source contributions as I used to, but I will add any functionality to
then I can take a stab at cosmetic blocking. |
I couldn't find an entry in the QB API for doing either of these, so you may have to go through WebEngine. This looks like the primary resource. However:
So my guess is you'd need to run some JS through a QWebChannel to pull classes and IDs. Here is the best reference I could find describing how to do this. Here is a reference that describes how to inject JS into a page. I don't know enough about the internals of WebEngine or Brave's adblock to say how this will impact performance. |
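A rough sketch of the "pull classes and IDs" step, assuming the script runs inside the page and the result is handed back to the browser by some transport that is left out here. The helper name `collectClassesAndIds` is made up for illustration; only standard DOM calls are used.

```javascript
// Hypothetical helper: gather every class name and id used under `root`.
// `root` is anything that offers querySelectorAll (e.g. `document`).
function collectClassesAndIds(root) {
  const classes = new Set();
  const ids = new Set();
  for (const el of root.querySelectorAll('[class], [id]')) {
    if (el.id) {
      ids.add(el.id);
    }
    // classList is iterable in browsers (DOMTokenList).
    for (const cls of el.classList) {
      classes.add(cls);
    }
  }
  return { classes: [...classes].sort(), ids: [...ids].sort() };
}

// Only run automatically when a DOM is present (i.e. inside a page).
if (typeof document !== 'undefined') {
  console.log(JSON.stringify(collectClassesAndIds(document)));
}
```

This walks the whole document once, so the cost grows with page size; how that compares to the rest of the page load would need measuring.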
Hopefully, QtWebChannel won't be needed, and any implementation using it will be heavily scrutinized due to its security impact (if you set up a QtWebChannel in the same context web pages run, it follows that web pages can use that QtWebChannel as well, and call things which might be qutebrowser-internal). There are two ways to run JS with QtWebEngine:
Unfortunately, there's no way to inject style sheets - it all needs to be done via JS. See the Greasemonkey wrapper for a simple example of how to do this, and stylesheet.js for a more complex one supporting live updates (used for So, yeah, some of the infrastructure probably is already in place (after all, there's jhide using the Greasemonkey support for cosmetic adblock filters already), but getting this all to work properly is probably still not too trivial. @ArniDagur can you elaborate a bit more about what kind of APIs you'd need, and how those cosmetic rules and the adblock rust API work? Why do you need a list of all classes/ids, for example? |
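To make "it all needs to be done via JS" concrete, here is a minimal sketch of hiding elements by appending a `<style>` element from a script. It is not the code from the Greasemonkey wrapper or stylesheet.js; `buildHideCss` and `injectCss` are hypothetical names, and the selectors are placeholders.

```javascript
// Hypothetical helper: turn a list of CSS selectors into one hiding rule.
function buildHideCss(selectors) {
  const cleaned = selectors.map((s) => s.trim()).filter((s) => s.length > 0);
  if (cleaned.length === 0) {
    return '';
  }
  return cleaned.join(',\n') + ' { display: none !important; }';
}

// Append the rule to the page via a <style> element created from JS.
// `doc` is a Document (or anything with the same three members used here).
function injectCss(doc, css) {
  if (!css) {
    return null;
  }
  const style = doc.createElement('style');
  style.textContent = css;
  (doc.head || doc.documentElement).appendChild(style);
  return style;
}

// Only run automatically when a DOM is present (i.e. inside a page).
if (typeof document !== 'undefined') {
  // Example selectors only; real ones would come from the filter lists.
  injectCss(document, buildHideCss(['.ad-banner', '#sponsored']));
}
```

One caveat with a single comma-separated rule: if any selector in the list is invalid, the browser drops the whole rule, so a real implementation may prefer one rule per selector.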
The best resource to understand this is the following GitHub issue: brave/adblock-rust#152.
|
Thanks! That together with brave/adblock-rust#152 (comment) and the docs added in brave/adblock-rust@445b633 gives me a rough idea of how that'd work - and it's much more complex than I would've thought... Especially reacting to new elements appearing on a page dynamically (which is somewhat common nowadays) could be tricky. They say:
but something like that would indeed require a From a quick look how Falkon solves this, it looks like they do indeed just inject all rules into every page... |
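For the dynamic case, one possible shape is a `MutationObserver` that reports only class names it has not reported before, so each name needs to be looked up once. This is a sketch under assumptions, not how adblock-rust or Brave do it: `SeenTokens`, `watchNewClasses` and the `onNewClasses` callback are hypothetical, and how the new names reach the adblock engine is left open.

```javascript
// Hypothetical tracker: remembers which class names were already reported.
class SeenTokens {
  constructor() {
    this.seen = new Set();
  }

  // Returns the entries of `tokens` that have not been seen before.
  addAll(tokens) {
    const fresh = [];
    for (const token of tokens) {
      if (!this.seen.has(token)) {
        this.seen.add(token);
        fresh.push(token);
      }
    }
    return fresh;
  }
}

// Watch `root` for added elements and report previously unseen class names.
function watchNewClasses(root, onNewClasses) {
  const tracker = new SeenTokens();
  const observer = new MutationObserver((mutations) => {
    const found = [];
    for (const mutation of mutations) {
      for (const node of mutation.addedNodes) {
        if (node.nodeType !== 1) {
          continue; // skip text and comment nodes, keep elements
        }
        found.push(...node.classList);
        for (const child of node.querySelectorAll('[class]')) {
          found.push(...child.classList);
        }
      }
    }
    const fresh = tracker.addAll(found);
    if (fresh.length > 0) {
      onNewClasses(fresh);
    }
  });
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}

// Only run automatically inside a page.
if (typeof document !== 'undefined' && typeof MutationObserver !== 'undefined') {
  watchNewClasses(document.documentElement, (fresh) => console.log(fresh));
}
```

The sketch ignores ids and class changes on existing elements (that would need `attributes: true` with an attribute filter), which a full implementation would probably have to cover.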
As an aside, for people looking for a workaround for YouTube: Someone on Reddit recently wrote a Greasemonkey script to do so. I haven't tested it myself yet. Another approach is to use an external player like |
The Greasemonkey script is a great stopgap. Here is a mirror in case something happens to the Reddit post. I have this in
// ==UserScript==
// @name Auto Skip YouTube Ads
// @version 1.0.1
// @description Speed up and skip YouTube ads automatically
// @author jso8910
// @match *://*.youtube.com/*
// @exclude *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
const main = new MutationObserver(() => {
  const ad = document.querySelector('.ad-showing');
  if (ad) {
    const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button');
    if (btn) {
      btn.click();
    }
  }
});
// Observe the whole document rather than the skip button itself: the button
// does not exist until an ad plays, so observing it directly would pass null
// to observe() and throw.
main.observe(document.documentElement, {attributes: true, childList: true, subtree: true});
If that doesn't work, try:
// ==UserScript==
// @name Auto Skip YouTube Ads
// @version 1.0.0
// @description Speed up and skip YouTube ads automatically
// @author jso8910
// @match *://*.youtube.com/*
// @exclude *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
setInterval(() => {
const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
if (btn) {
btn.click()
}
const ad = [...document.querySelectorAll('.ad-showing')][0];
if (ad) {
document.querySelector('video').playbackRate = 10;
}
}, 50) |
This issue should be prioritized. Unless qutebrowser can be configured to automatically replace www.youtube.com with yewtu.be, qutebrowser is painful on youtube. |
@crocket There are various workarounds for Youtube specifically, see e.g. the comment right above yours. |
My workaround is to make qutebrowser delegate URLs to a URL handler userscript that turns www.youtube.com and youtube.com into yewtu.be. I just haven't figured out how to make |
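The rewrite itself is small. Below is a sketch of it as a plain function, not the commenter's actual userscript; `toInvidiousUrl` is a hypothetical name, and it assumes the yewtu.be instance accepts the same paths and query strings as YouTube.

```javascript
// Hypothetical helper: point youtube.com URLs at an Invidious instance.
// Any other URL is returned unchanged. Throws if `href` is not a valid URL.
function toInvidiousUrl(href, instance = 'yewtu.be') {
  const url = new URL(href);
  const youtubeHosts = ['youtube.com', 'www.youtube.com'];
  if (!youtubeHosts.includes(url.hostname)) {
    return href;
  }
  url.hostname = instance;
  return url.toString();
}

// Inside a page (e.g. as a Greasemonkey script matching youtube.com),
// redirect right away. Skipped when there is no `location` global.
if (typeof location !== 'undefined') {
  const target = toInvidiousUrl(location.href);
  if (target !== location.href) {
    location.replace(target);
  }
}
```

Run as a Greasemonkey script this only redirects after navigation to YouTube has started, so it is a cruder variant of rewriting the URL before the request is made.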
@The-Compiler I think this feature would be amazing and I really hope it can be added to qutebrowser soon. Qutebrowser is already a great piece of software and is a far superior experience to even chromium with vimium and the like - just native support for keyboard navigation alone already puts it miles ahead of the competition. Unfortunately the web these days is basically unusable without sophisticated request blocking and element hiding - between all the cookie banners, ads, analytics scripts and javascript bloat, it's very difficult to actually use most websites "as intended". This feature, if implemented properly, would allow me to finally switch to qb as my main browser for good. And I suspect I'm not the only one. I'd love to help out with the development of this. However the qb codebase, currently at 75k lines, isn't that straightforward to just dive into (and in this case, the real problem is probably the complexity of Brave's codebase rather than qb). Is there any chance we could break this issue down into smaller chunks that are more self-contained and realistic for a new contributor to try and attack without having to understand the entire codebase? For example, are there any limited-scope MVPs or proofs of concept you can see that people could contribute to move the effort forward? If I wanted to help with this, where would I start? I've read through the issue and I have a general idea, but it's not really clear to me what the steps and priorities are. I realize that breaking this down into subtasks will require work on your part as well, probably quite substantial work in fact. But it would also enable other people to contribute and so might be less work than just implementing the whole thing yourself. So could we maybe have something like a roadmap for how we would be getting from here to having fully functional element hiding in qb? |
@metov As much as I like mentoring people to get started with qutebrowser (or with Python elsewhere), I'm afraid this really isn't a good fit for it. Right now, there are too many "unknown unknowns", and a proof of concept I'd have to write to get an idea of what's required for a full implementation would already be very close to that full implementation (since the heavy lifting is done by the Brave library, after all). Additionally, one step that might be required (a However, a first good step for this would probably be a proof of concept isolated from qutebrowser, so that we could understand what exactly would be required for the integration. In other words, a small project using QtWebEngine to display a single tab (like the testbrowser) and integrating element hiding via the FWIW, a full in-depth understanding of the qutebrowser code base is not required in any case - nobody has that with a project of this size, not even me 😉 |
https://github.com/dudik/blockit is a webkit extension that supports Brave adblock's cosmetic filter. |
Ah, that's actually quite promising, because:
I still feel like the latter would be quite a bit of a limitation, given how many web pages load elements dynamically nowadays. But perhaps it's still a major improvement over the status quo. Relevant code: https://github.com/dudik/blockit/blob/be185b454798afd5542f08cf51774b72ee7b01d4/blockit.c#L60-L105 Does anyone have any experience with blockit, or is willing to find out how much of an improvement "static" cosmetic filtering alone is? In other words, what are some (ideally common) real-world ads which blockit can block, but qutebrowser (with |
Perhaps relevant for people looking for a YouTube workaround: Updated Auto Skip Youtube Ads : qutebrowser |
Maybe this comment is very naive, but how about https://github.com/ghostery/adblocker? Apparently a pure javascript library (so could be run as a simple background script) and claimed to support '99% of all filters from the Easylist and uBlock Origin project'. At least as an interim solution, it seems to me that this should fit the bill, or is there something I am overlooking? |
@herrsimon We don't have anything like extension background scripts. Maybe with some effort it could be made into a greasemonkey script somehow, but I'd rather put effort into properly integrating the existing adblocker library rather than integrating a second one. |
I completely agree that properly integrating the existing library is a better solution and was thinking of a quick and easy interim fix by loading the ghostery library as userscript in the background. I just looked into it a bit more and apparently it is not that easy (could also be my lack of knowledge though). |
I took a first stab at implementing cosmetic filtering and scriptlet injection. There is a pending pull request: #7312 To try it out for yourself:
At the very least, it should block YouTube ads... EDIT: Updated for the latest version of python-adblock. |
That is crazy cool! I will review your |
Thanks a LOT, do I understand correctly that there's still no dynamic scriptlet injection (just once after page load)? This could explain why ads were not completely blocked on some pages, such as distrowatch.org, heise.de and reddit.com (sponsored posts), when I just tried it out. It could also be that the corresponding blocking rules are on a different filter list (I just used easylist and the default ublock list). Still, your work is a giant leap forward! I will take a closer look at the code and also at how ublock and other browsers solve dynamic injection and hope that I can contribute in some way. Probably, a QtWebChannel can't be avoided, but as the tasks of the interfacing code are very narrowly defined and it could hence be restricted accordingly, I don't see much of a security issue here (assuming proper implementation of course). PS: proper adblocking is nowadays one of the most important features, and its absence prevents many users from switching to qutebrowser in my opinion. The priority of this issue should therefore be high instead of just middle. |
No, scriptlet injection should be working (before page load) with the bleeding edge version of python-adblock (that's how we skip youtube ads). The asynchronous code after the page load is for dealing with generic (not site-specific) css filters. I looked at the websites you mentioned. First, with heise.de, I get the same ads on firefox with ublock + umatrix, so those might not be blocked under the default filters. Regarding distrowatch.org and reddit.com, I noticed that the underlying engine doesn't seem to be returning some of the more complex css filters. It was the case with these websites as well when I checked them out. I've filed it as an issue with python-adblock (ArniDagur/python-adblock#71), but I think the problem is with the underlying engine itself not supporting certain procedural filters (brave/adblock-rust#145). To confirm this, I downloaded and tried these pages with the Brave web browser, and these ads do show up in it. |
Alright, I did some further testing. First of all, I was missing some blocklists for qutebrowser (my apologies), namely
Now filtering is as follows (compared to firefox with ublock origin):
reddit.com: Promoted posts are still not blocked (they don't appear on firefox).
heise.de: Ads are blocked (finally!), but in one place a grey box is left, which is filtered on firefox (see attached pictures).
distrowatch.org: Due to the added blocklist, most ads are now filtered, except for the small box to the lower right at the very bottom of the page.
However, I observed the following:
These issues - even though they should be fixed at some point - are all bearable though. I just used your extension for a bit of browsing and on most pages it worked flawlessly. Thanks again for finally implementing this crucial feature! Here is the logger output from ublock origin, maybe this helps: Blocked elements (www.heise.de)
blocked elements (distrowatch.org)
blocked elements (reddit.com)
heise.de on qutebrowser (ad is filtered, css element remains)
heise.de on firefox/ublock origin (css is filtered as well) |
Wow, thanks for the detailed debug info. Just to reiterate, css filters like |
@alkim0 brave with their “shields up” and “strict” mode (which as far as I understand essentially just activates some more blocking lists) actually blocks the reddit promoted ads and gets the css on heise.de right, i.e. removes the gray empty rectangle, but it also doesn't filter the small ad on the bottom right of distrowatch.com. I just tried to find the reason for this but wasn't successful. Anyway, in order to keep this thread focused on the core functionality itself, I will investigate the site-specific problems further and then open separate issues if necessary. Relevant for this thread however should be the following two things:
Could you please comment on these two issues? If they are not related to your cosmetic filtering addition and instead need to be addressed separately, let me know and I will open separate threads as well. |
@herrsimon, Just to verify, is this what you are talking about?
|
That works like a charm for me. I reworked it a bit to be faster on unskippable ads (those seem to be the most prevalent today). It seeks right to the end of the ad video, so you are done with them faster. I also tried to hide the video as much as possible:
// ==UserScript==
// @name Auto Skip YouTube Ads
// @version 1.1.0
// @description Speed up and skip YouTube ads automatically
// @author jso8910
// @match *://*.youtube.com/*
// @exclude *://*.youtube.com/subscribe_embed?*
// ==/UserScript==
setInterval(() => {
const btn = document.querySelector('.videoAdUiSkipButton,.ytp-ad-skip-button')
if (btn) {
btn.click()
}
const ad = [...document.querySelectorAll('.ad-showing')][0];
if (ad) {
const video = document.querySelector('video')
video.muted = true;
video.hidden = true;
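// duration is NaN until the video metadata has loaded. A comparison like
// `duration != NaN` is always true, so test with Number.isFinite instead.
if (Number.isFinite(video.duration)) {
video.currentTime = video.duration;
}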
// 16 seems to be the highest rate that works, mostly this isn't needed
video.playbackRate = 16;
}
}, 50) |
I block everything I don't use on YouTube, for example the Home and Shorts buttons in the sidebar, the recommendations on the right, etc. Since the PR hasn't been merged, I use Greasemonkey scripts to block those elements. Here is a simple script that removes the "Home" button from the YouTube sidebar.
// ==UserScript==
// @name Remove Elements with Text (MutationObserver)
// @namespace http://tampermonkey.net/
// @version 0.1
// @description Remove elements within #items if they contain specified text(s)
// @author You
// @match *://*/*
// @grant none
// ==/UserScript==
(function () {
"use strict";
// Function to remove elements within #items if they contain specified text
function removeElementsWithText(texts) {
var elementsToRemove = document.querySelectorAll(
"#items > ytd-guide-entry-renderer",
);
Array.from(elementsToRemove).forEach(function (element) {
texts.forEach(function (text) {
if (element.innerText.includes(text)) {
element.remove();
}
});
});
}
let blist = ["Home", "Shorts"];
// Function to be called when mutations are observed
function handleMutations(mutationsList, observer) {
removeElementsWithText(blist); // Adjust the texts as needed
}
// Options for the observer (which mutations to observe)
const observerConfig = { childList: true, subtree: true };
// Create an observer instance linked to the callback function
const observer = new MutationObserver(handleMutations);
// Start observing the target node for configured mutations
observer.observe(document.body, observerConfig);
// Perform initial removal on page load
removeElementsWithText(blist); // Adjust the texts as needed
})();
To block any element just copy its JS path using |
Splitting this off from #5754 since it's clearly the most important part of it, and also requires some more text.
The underlying Brave adblocking library seems to already support both element hiding and Scriptlets, but qutebrowser doesn't support them so far. Not sure if they're supported in the python-adblock library or if some more work is required there.
Original discussion here: #5317 (comment).
Related: #6460
Scriptlets seem to be required to properly block YouTube ads after the latest changes, see e.g. this post by the easylist maintainer:
and this post:
Some more resources:
cc @ArniDagur