
reksar/web-privacy

Web privacy

Summary of web privacy, based on webbkoll.

Usage

In general, pass a space-separated list of URLs to the summary spider via the urls argument:

scrapy crawl summary -a urls="<URL_1 URL_2 ...>"
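Scrapy hands each -a name=value pair to the spider as a keyword argument, so the urls value arrives as one string. The sketch below shows one way such a string could be turned into a list of start URLs. The split_urls helper is hypothetical and not taken from this repository; the actual spider may parse the argument differently.

```python
def split_urls(urls: str) -> list[str]:
    """Split a whitespace-separated URL string into a list of URLs.

    Hypothetical helper: a spider could call it from __init__, e.g.
    self.start_urls = split_urls(urls), where `urls` is the -a argument.
    """
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and ignores leading/trailing whitespace.
    return urls.split()
```

Because the split is on any whitespace, the same helper handles both a quoted command-line value and the contents of a file with one URL per line.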

Linux

Pass the URL list to run.sh, e.g.:

run.sh `cat urls.txt`

Windows

Just run run.bat.

Unlike on Linux, you have to set up the Python virtual environment, install the required Python packages and edit urls.txt manually.

Use run.sh as an example of the required steps.
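As a platform-independent alternative, the runner logic can be sketched in Python. This is a minimal sketch, not the repository's run.sh or run.bat: it assumes the virtual environment is already active, scrapy is on the PATH, and urls.txt holds whitespace-separated URLs. The function names read_urls, build_command and run are hypothetical.

```python
import subprocess
from pathlib import Path


def read_urls(path: str = "urls.txt") -> list[str]:
    """Return the whitespace-separated URLs stored in the given file."""
    return Path(path).read_text(encoding="utf-8").split()


def build_command(urls: list[str], nolog: bool = True) -> list[str]:
    """Build the `scrapy crawl summary` command as an argument list."""
    # The whole URL list is passed as one space-separated -a value,
    # matching the command shown in this README.
    command = ["scrapy", "crawl", "summary", "-a", "urls=" + " ".join(urls)]
    if nolog:
        # Drop --nolog (nolog=False) to see Scrapy's log output when debugging.
        command.append("--nolog")
    return command


def run(path: str = "urls.txt", nolog: bool = True) -> int:
    """Read the URL file and run the spider; return the process exit code."""
    urls = read_urls(path)
    if not urls:
        raise ValueError(f"{path} contains no URLs")
    return subprocess.call(build_command(urls, nolog=nolog))
```

Calling run() from the project directory would start the crawl. Passing the command as an argument list avoids shell quoting problems with the space-separated urls value.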

Note: to debug, remove the --nolog option from the runner scripts (run.sh and run.bat) so that Scrapy prints its log output.