Potential bots #353

Open
19 of 64 tasks
JayBizzle opened this issue Jan 21, 2020 · 8 comments
Comments

@JayBizzle
Owner

JayBizzle commented Jan 21, 2020

  • Filestack
  • Google-Ads-Overview Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36
  • Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 6.0.1; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
  • Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
  • Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1
  • adreview/1.0
  • Mozilla/5.0 (compatible; RyowlEngine/1.0; +https://ryowl.org)
  • Mozilla/5.0 (compatible; RyowlEngine/1.0; +https://ryowl.com)
  • Google-speakr
  • Google-speakr,gzip(gfe)
  • FeedViewer/1.0 (+http://www.feedviewer.net/webmasters; license agreement: http://www.feedviewer.net/license)
  • acebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
  • WhoAPI/1.0 (whoapi.com)
  • Mozilla/5.0 (compatible; BackupLand/1.0; https://go.backupland.com/; Domain check for viruses;)
  • Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) WhatCMS/1.0
  • Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) DownloaderChrome/62.0.3202.75 Safari/537.36
  • iGooglePortal
  • Mozilla/5.0+(compatible; Cula/2.0; https://cula.io/)
  • Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us; rv:1.9.2.3) Gecko/20100401 YFF35 Firefox/3.6.3
  • Owlin - http://www.owlin.com
  • Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)
  • Bublup (+https://www.bublup.com/bublup.html)
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | Hexometer.com - HexAct Inc.
  • Mozilla/5.0/Firefox/42.0 - nbertaupete95(at)gmail.com
  • OpenGraphCheck/2.1 (+https://opengraphcheck.com)
  • donwload_html/2.0 (Linux) aniruddh@umich.edu
  • LinuxGetURL/2.0 noahtutt@umich.edu (Linux)
  • Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Google-AMPHTML)
  • Google-AMPHTML
  • inactive-blog-skipper/1.0 (eng@linqia.com)
  • AWS Network Health / Contact abuse@amazonaws.com with your website URL to stop
  • Corax - support@coraxcyber.com
  • draw.io
  • MindsMediaProxy/3.0 (+http://www.minds.com/)
  • Mozilla/5.0 (w3dt header analysis for httprecon tools; http://w3dt.net/tools/httprecon)
  • Google-Test
  • Mozilla/5.0 (compatible; Google-Test;)
  • Mozilla/5.0 (compatible; RSiteAuditor)
  • Mozilla/5.0 (compatible; WPSec/1.3; +https://wpsec.com)
  • Mozilla/5.0 (compatible; Go-KI; +https://www.gosign.de/)
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome-prerendercloud/66.0.3359.139 Safari/537.36
  • DIGMATO.com web tester
  • Mozilla/5.0 (X11; Linux x86_64; Rigor) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36
  • Mozilla/5.0 Windows NT 10.0; Win64; x64 AppleWebKit/537.36 KHTML, like Gecko Chrome/65.0.3286.0 Safari/537.36 Rigor
  • Mozilla/5.0 (X11; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0 (Research project: Visit PrivacyScore.org for details)
  • veu/1.0 (+http://www.veu.cat)
  • Google-Cloud-ML-Vision
  • FirmoGraph (+https://firmograph.io)
  • Mozilla/5.0 (compatible; 2GDPR/1.2; https://2gdpr.com)
  • CityGridMedia/1.0 (compatible; http://url-validation.citygrid.com/)
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.11 (KHTML, like Gecko)(compatible; http://url-validation.citygrid.com/) Chrome/23.0.1271.95 Safari/537.11
  • https://gdnplus.com:Gather Analyze Provide.
  • northcutt.com SEO tools
  • Burf.co
  • gensun.org
@Abhirup-99

Is this merged?

@JayBizzle
Owner Author

The user agents marked with ✅ have been added; the others still need adding 👍🏻

@newHagen

newHagen commented Jul 1, 2020

This is the user agent of the Google-Weblight bot:

  • Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19
    It should be detectable by matching "googleweblight".
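
A minimal sketch of that check, assuming a case-insensitive substring match is enough. The pattern and the `is_weblight` helper are illustrative names, not the package's actual rule or API:

```python
import re

# Hypothetical rule: match the "googleweblight" token anywhere in the string.
WEBLIGHT = re.compile(r"googleweblight", re.IGNORECASE)

# User agent quoted in the comment above.
WEBLIGHT_UA = (
    "Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) "
    "AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) "
    "Chrome/38.0.1025.166 Mobile Safari/535.19"
)


def is_weblight(user_agent: str) -> bool:
    """Return True if the user agent contains the googleweblight token."""
    return WEBLIGHT.search(user_agent) is not None


if __name__ == "__main__":
    print(is_weblight(WEBLIGHT_UA))
```

The token sits inside the "(KHTML, like Gecko; ...)" group, so the match has to be a search over the whole string rather than a prefix check.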

@clementmas
Contributor

There's also:

Mozilla/5.0 AppleWebKit/537.36 Chrome/114.0.5735.179 Safari/537.36 Google-Ads-Conversions

Should these 2 existing rules be replaced:

  • Google-Ads-Creatives-Assistant
  • Google-Ads-Overview

with a simple "Google-Ads" detection?
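
To illustrate the difference, here is a rough sketch comparing the two specific rules with a single "Google-Ads" rule. The regexes and the `matches` helper are assumptions made for this example, not the package's real rule list:

```python
import re

# The two existing rules named above, combined into one alternation.
OLD_RULES = re.compile(
    r"Google-Ads-Creatives-Assistant|Google-Ads-Overview", re.IGNORECASE
)
# The proposed broader rule.
NEW_RULE = re.compile(r"Google-Ads", re.IGNORECASE)

OVERVIEW_UA = (
    "Google-Ads-Overview Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36"
)
CONVERSIONS_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 Chrome/114.0.5735.179 "
    "Safari/537.36 Google-Ads-Conversions"
)


def matches(pattern: re.Pattern, user_agent: str) -> bool:
    """Return True if the pattern occurs anywhere in the user agent."""
    return pattern.search(user_agent) is not None


if __name__ == "__main__":
    for ua in (OVERVIEW_UA, CONVERSIONS_UA):
        print(matches(OLD_RULES, ua), matches(NEW_RULE, ua))
```

The broader rule picks up Google-Ads-Conversions without a new entry, at the cost of also matching any other string that happens to contain "Google-Ads".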

@JayBizzle
Owner Author

Yeah, go for it 👍

@SoranDK

SoranDK commented Jul 24, 2023

There's probably no way to detect them, but these 2 visit my entirely Danish site every day: the first twice a day from the US, and the second once a day from China. These are the complete user-agent headers, and everything in them seems to be removed via the excludes.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36

@JayBizzle
Owner Author

JayBizzle commented Jul 26, 2023

Yep, bots like this are pretty annoying. There's nothing this package can do about that 🤔
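
A rough sketch of why nothing is left to match, assuming detection works as described in this thread: the excludes strip ordinary browser tokens first, and only the remainder is checked against the bot rules. The exclusion and crawler patterns below are small hypothetical stand-ins, not the package's real lists:

```python
import re

# Hypothetical exclusions covering the ordinary browser tokens in these strings.
EXCLUSIONS = re.compile(
    "|".join(
        [
            r"Mozilla/\d\.\d",
            r"\(X11; Linux x86_64\)",
            r"\(Windows NT 10\.0; Win64; x64\)",
            r"AppleWebKit/[\d.]+",
            r"\(KHTML, like Gecko\)",
            r"Chrome/[\d.]+",
            r"Safari/[\d.]+",
        ]
    ),
    re.IGNORECASE,
)
# Hypothetical bot rules.
CRAWLERS = re.compile(r"Google-Ads|bot", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Strip excluded tokens, then test what remains against the bot rules."""
    remainder = EXCLUSIONS.sub("", user_agent).strip()
    if not remainder:
        # Nothing distinctive is left, so there is nothing to match on.
        return False
    return CRAWLERS.search(remainder) is not None


US_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
)
CHINA_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36"
)

if __name__ == "__main__":
    print(is_crawler(US_UA), is_crawler(CHINA_UA))
```

Both strings reduce to nothing once the browser tokens are stripped, which is why a user-agent-only approach cannot tell them apart from real browsers. Catching them would need a different signal, such as request frequency or source address, which is outside what the user agent carries.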

@SoranDK

SoranDK commented Aug 17, 2023

I found this list, if anyone's interested in going through it ;-P https://user-agents.net/bots

Sadly, I don't have enough experience with regex to do it myself, as my original post showed (I hadn't noticed that the bot I mentioned would already get caught by the "bot" in the regex).
