The Image Web Scraper is a Python script that allows you to collect all the images from a website and save them in a designated folder for later use. It utilizes the requests library for sending HTTP requests, the BeautifulSoup library for parsing HTML content, and the os module for file operations.
- Extracts all image URLs from a specified website
- Downloads and saves the images in a local folder
- Handles error cases, such as invalid URLs or inaccessible images
- Automatically skips downloading images if they already exist in the designated folder
- Provides a user-agent header to mimic a regular web browser and bypass certain restrictions
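The first feature, extracting image URLs, can be sketched roughly as follows. This is a minimal illustration and not the script's actual code: the function name `extract_image_urls` and its `base_url` parameter are hypothetical, and it assumes images are referenced through the `src` attribute of `<img>` tags.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_urls(html, base_url):
    """Return absolute URLs for every <img> tag that has a src attribute.

    Hypothetical helper for illustration; the real script may differ.
    """
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img"):
        src = img.get("src")
        if not src:
            continue  # skip <img> tags without a usable src
        absolute = urljoin(base_url, src)  # resolve relative paths
        if absolute not in urls:
            urls.append(absolute)  # keep first occurrence, drop repeats
    return urls
```

Resolving each `src` against the page URL matters because many sites use relative paths such as `/images/logo.png`, which cannot be downloaded as-is.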
- python==3.9.2
- beautifulsoup4==4.12.2
- certifi==2023.5.7
- charset-normalizer==3.1.0
- idna==3.4
- requests==2.31.0
- soupsieve==2.4.1
- urllib3==2.0.3
- Install the required libraries by running the following command:
    pip install requests beautifulsoup4
- Clone the repository or download the Python script.
- In the script, modify the webpage_url variable to specify the URL of the website from which you want to extract images.
- Set the save_folder variable to specify the folder where you want to save the downloaded images.
- Run the script using the following command:
    python image_scraper.py
- The script will extract the image URLs from the specified website and save them in the designated folder. Existing images will be skipped to avoid duplicates.
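The download step, including the skip-if-exists and error-handling behavior, might look something like the sketch below. It is an assumption-laden illustration, not the contents of image_scraper.py: the names `HEADERS`, `local_path_for`, `fetch_bytes`, and `save_images` are hypothetical, the user-agent string is a placeholder, and the file name is assumed to come from the last segment of the URL path.

```python
import os
from urllib.parse import urlparse

import requests

# Assumed header value; any realistic browser string would do.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def local_path_for(image_url, save_folder):
    """Map an image URL to a file path inside save_folder."""
    filename = os.path.basename(urlparse(image_url).path) or "image"
    return os.path.join(save_folder, filename)


def fetch_bytes(image_url):
    """Download one image; raises requests.RequestException on failure."""
    response = requests.get(image_url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.content


def save_images(image_urls, save_folder, fetch=fetch_bytes):
    """Save each image unless a file with the same name already exists."""
    os.makedirs(save_folder, exist_ok=True)
    saved, skipped, failed = [], [], []
    for url in image_urls:
        path = local_path_for(url, save_folder)
        if os.path.exists(path):
            skipped.append(url)  # already downloaded: avoid a duplicate
            continue
        try:
            data = fetch(url)
        except (requests.RequestException, OSError):
            failed.append(url)  # inaccessible image: record it and move on
            continue
        with open(path, "wb") as handle:
            handle.write(data)
        saved.append(url)
    return saved, skipped, failed
```

Here `image_urls` is any list of absolute image URLs, and `save_folder` plays the role of the save_folder variable described above. Passing `fetch` as a parameter is a design choice for this sketch only: it lets the skip and error paths be exercised without network access.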
Contributions are welcome! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
Please use this tool responsibly and respect the terms of service of the websites you scrape. Be aware of any legal restrictions or permissions required before scraping images from a website.
The Image Web Scraper script is built upon the foundation of the requests and BeautifulSoup libraries, which are essential tools for web scraping and HTML parsing in Python.