Vincenz8/GoogleImages_WebScraper


Google's images web-scraper

[Image: bonsai_style]

This web scraper was developed to gather images of bonsai. The resulting dataset is available at:

Kaggle -> https://www.kaggle.com/datasets/vincenzors8/bonsai-styles-images

Setup

Before running the program, you need to install a few requirements, such as:

  • Google Chrome webdriver (for Selenium)

Then install all the required Python libraries with:

pip install -r requirements.txt

I recommend creating a virtual environment first (Anaconda, pipenv, etc.).
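As a quick sanity check before running the scraper, the following minimal sketch looks for the Chrome webdriver on your PATH. It assumes the executable is named "chromedriver", which is an assumption and not something this repository states; the helper name find_driver is hypothetical too.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on PATH.

    The default name "chromedriver" is an assumption; adjust it for your setup.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver was not found on PATH")
    else:
        print(f"chromedriver found at: {path}")
```

If the driver is installed somewhere that is not on PATH, this check reports it as missing even though Selenium may still be able to use it when given an explicit location.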

Configuration

Here is what the configuration file ("data/scraper_config.json") looks like:

[Image: contents of the configuration file]

You can change the number of images and the type of images to search for. Fields like "thumb_class" and "img_class" hold the HTML class names needed to locate the elements on the web page.
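To make the shape of such a file concrete, here is a minimal sketch of writing and loading a JSON configuration in Python. Only the "thumb_class" and "img_class" field names come from this README; the other keys ("query", "n_images"), all the values, and the load_config helper are hypothetical placeholders, not the real contents of "data/scraper_config.json" or the way main.py reads it.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample: only "thumb_class" and "img_class" are named in the README.
SAMPLE_CONFIG = {
    "query": "bonsai",                  # hypothetical key: what to search for
    "n_images": 50,                     # hypothetical key: how many images to fetch
    "thumb_class": "thumb-css-class",   # placeholder value, not a real Google class
    "img_class": "img-css-class",       # placeholder value, not a real Google class
}

REQUIRED_FIELDS = ("thumb_class", "img_class")


def load_config(path):
    """Read a JSON config file and check that the locator fields are present."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    missing = [key for key in REQUIRED_FIELDS if key not in config]
    if missing:
        raise KeyError(f"missing config fields: {missing}")
    return config


def demo():
    """Write the sample config to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "scraper_config.json"
        path.write_text(json.dumps(SAMPLE_CONFIG, indent=2), encoding="utf-8")
        return load_config(path)


if __name__ == "__main__":
    print(demo())
```

Since Google changes its page markup over time, the class names in the real file will likely need updating by inspecting the page in the browser's developer tools.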

Run program

From terminal:

python main.py

Or use the run button of your preferred IDE.