werdlists/web-sites

Folder Name                     Description of Contents
active-search-engines           list of active general-purpose search engine names from https://wikipedia.org/wiki/Template:Web_search_engines
alexa-top1mil-sites             Alexa list of top 1 million web sites
amazon-aws-namespaces           AWS namespaces (paths found in aws.amazon.com URLs)
amazon-macie-types              Amazon Macie data object content types via https://docs.aws.amazon.com/macie/latest/userguide/macie-classify-objects-content-type.html
censorship-test-urls            URL testing list intended for discovering web site censorship, from https://github.com/citizenlab/test-lists
content-access-guidelines       Web Content Accessibility Guidelines by W3C
free-web-hosts                  list of free web hosting services from https://mirror1.malwaredomains.com/files/freewebhosts.txt
github-dmca-users               links to GitHub accounts that have received DMCA notices, from https://github.com/github/dmca
marketing-tech-landscape        top 5,000 marketing technology web sites
modern-web-history              A History of The Modern Web
phishtank-developers-database   PhishTank downloadable database in CSV format via https://phishtank.com/developer_info.php
piidox-search-sites             list of personally identifiable information search engines
simpl-redir-shortcuts           shortcuts for redirection on simpl.info
sites-using-cloudflare          sites using Cloudflare WAF according to GitHub @pirate
subreddit-list-full             list of subreddits from http://www.reddit.com/r/ListOfSubreddits/wiki/listofsubreddits
subreddit-list-nsfw             WARNING! NSFW: same as above, but with "not-safe-for-work" subreddit materials
tls-scanner-urls                URLs to test TLS scanning on via Botan
top-sites-global                Top 1,000 Internet web sites across the globe by OWASP headers
url-shortener-sites             URL shortener sites taken from http://dns-bh.sagadc.org/url_shorteners.txt