Skip to content

Jertzukka/xenforo-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 

Repository files navigation

xenforo-scraper

Media scraper for XenForo-based forums written in Python. Supports downloading images and videos from single thread or from whole forum category. Includes basic support for printing threads into PDF with pdfkit from wkhtmltopdf.

Usage

$ python xenforo-scraper.py url [optional]


positional arguments:
  url                   URL to a single thread or a forum
                        category.

optional arguments:
  -h, --help            show this help message and exit
  -c COOKIE, --cookie COOKIE
                        Optional cookie for the web request.
  -o OUTPUT, --output OUTPUT
                        Optional download output location, must
                        exist.
  -max MAX_FILESIZE, --max-filesize MAX_FILESIZE
                        Set maximum filesize for downloads.
  -min MIN_FILESIZE, --min-filesize MIN_FILESIZE
                        Set minimum filesize for downloads.
  -i IGNORED [IGNORED ...], --ignored IGNORED [IGNORED ...]
                        Ignore files with this string in URL.
  -e, --external        Follow external files from links.
  -nd, --no-directories
                        Do not create directories for threads.
  -cn, --continue       Skip threads that already have folders
                        for them.
  -p, --pdf             Print pages into PDF.
  -ni, --no-images      Don't download images.
  -nv, --no-videos      Don't download videos.

Installation (Ubuntu)

Required packages:

sudo apt-get install python3 python3-pip
pip3 install bs4

Only needed if printing PDF:

sudo apt-get install wkhtmltopdf
pip3 install pdfkit

Notes

  • Cookie is expected in format: cookie1=value1; cookie2=value2, which is available from browser developer tools.
  • Filesizes supports both SI and IEC base units, ex. 2KB, 5.2MiB.
  • --ignored supports multiple arguments, leave it last.

About

Media scraper for Xenforo-forums written in Python.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages