Optional (But Default On) Tracker URL Removal So That Searches Don't as Easily Fail #76

Open
mollyrealized opened this issue Sep 24, 2023 · 2 comments

Comments

@mollyrealized

Is your feature request related to a problem? Please describe.

It was a judgment call whether to file this as a bug or a feature request, but I think it's much more the latter. When a URL carries Google Analytics tracking parameters (the ubiquitous "utm_source", "utm_medium", and "utm_campaign"), the lookup will usually fail when it is piped over to archive.today (archive.is, archive.ph, etc.). I suspect the same happens with some other trackers as well, but I don't know for sure.

Describe the solution you'd like

It would be useful if, by default, Web Archives stripped tracking parameters from the URL before looking it up on the archives, with perhaps a setting to disable that entirely or to disable it per lookup.

Describe alternatives you've considered

I presently hand-remove the trackers and rerun the request. It works; it's just an annoyance. :) I've also brought it up with archive.today, but it seems to have fallen into their bit bucket of "maybe someday".

@dessant
Owner

dessant commented Feb 6, 2024

We could use an existing filter list, but I'd like to avoid writing our own filter list parser, and from a quick search I couldn't find a compact JS package with a permissive license for parsing static filter lists.

https://github.com/DandelionSprout/adfilt/blob/master/ClearURLs%20for%20uBo/clear_urls_uboified.txt

@mollyrealized
Author

I definitely agree that an existing filter list would be useful, but even as a first step, removing Google tracking parameters would go a long way. Perhaps something like:

[?&]utm_[^&]+
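
One caveat with a plain regex replace: if the first parameter is a tracker, as in `?utm_source=x&id=1`, removing the match leaves `&id=1` with no leading `?`. Below is a minimal sketch of the same first step using the standard `URL` and `URLSearchParams` APIs instead. The function name `stripUtmParams` is hypothetical and not part of Web Archives, and it only handles `utm_*` parameters, not a full filter list.

```javascript
// Hypothetical helper (not from the Web Archives codebase):
// remove every query parameter whose name starts with "utm_".
function stripUtmParams(rawUrl) {
  const url = new URL(rawUrl);

  // Collect the matching names first, so we are not deleting
  // entries while iterating over the live searchParams object.
  const trackingKeys = [...url.searchParams.keys()].filter((key) =>
    key.toLowerCase().startsWith('utm_')
  );

  for (const key of trackingKeys) {
    url.searchParams.delete(key);
  }

  // The path and fragment are left untouched; the query string is
  // re-serialized by URLSearchParams after the deletions.
  return url.toString();
}

console.log(stripUtmParams('https://example.com/a?utm_source=x&id=1'));
console.log(stripUtmParams('https://example.com/a?utm_medium=y#top'));
```

Because `URLSearchParams` re-serializes the query, the encoding of the remaining parameters may be normalized (for example, spaces written as `+`), which may or may not matter for archive lookups.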
