Elmehdi9/web-scraping-projects

Web Scraping Projects With Python

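This repository contains a collection of tools, scripts and projects that focus on web scraping with Python, along with some exploratory analysis and visualisation of the scraped data (salaries, cars, football, movies, real estate and more).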

Table of Contents
  1. About the Project
  2. Prerequisites
  3. Folder Structure
  4. Projects
    • Scraping salary data from Salary.com
    • Scraping car data and crawling to specific URLs
    • Scraping transfer data
    • Scraping different types of football data from Understat.com
    • Scraping movie data from Cineb.com
    • Scraping real-estate data and crawling to apartment pages
    • Scraping Amazon data by keyword search

About the Project

This repository is a collection of web scraping projects. I scraped a range of websites in order to deal with different page structures and collect different kinds of data (cars, salaries, sports...). Some of these projects also use crawling techniques and include exploratory data visualization. Keep in mind that websites change over time, so the approach I use to scrape a specific site today may stop working in the future.
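To give a feel for the general pattern, here is a minimal sketch of scraping a listing page and collecting the links a crawler would visit next. It is not taken from the notebooks: the sample HTML, the `div.item` and `year` class names, the `parse_listing` function and the `https://example.com` base URL are all invented for illustration. It parses an inline string rather than a live page so that it runs without network access.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Invented sample markup standing in for a downloaded listing page.
SAMPLE_HTML = """
<div class="item"><a href="/movie/1"><h2>First Film</h2></a><span class="year">2021</span></div>
<div class="item"><a href="/movie/2"><h2>Second Film</h2></a></div>
"""


def parse_listing(html, base_url):
    """Extract one record per item and resolve its detail-page link."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for item in soup.select("div.item"):
        title = item.find("h2")
        year = item.find("span", class_="year")
        link = item.find("a", href=True)
        records.append({
            # Missing elements become None instead of raising AttributeError.
            "title": title.get_text(strip=True) if title else None,
            "year": year.get_text(strip=True) if year else None,
            # Absolute URL that a crawler could request next.
            "url": urljoin(base_url, link["href"]) if link else None,
        })
    return records


records = parse_listing(SAMPLE_HTML, "https://example.com")
for record in records:
    print(record)
```

Checking each element before reading it is one way to cope with pages whose structure changes: a missing field shows up as `None` in the data rather than stopping the whole run.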

I recommend starting with the notebook that scrapes movie data from Cineb.com since it provides an understanding of how the scraping is done.

Prerequisites


Made with Python and Jupyter.

The following open source packages are used in this project:

  • Pandas
  • Matplotlib
  • bs4 (Beautiful Soup)
  • requests
  • csv (part of the Python standard library)
  • json (part of the Python standard library)
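As a rough illustration of where `csv` and `json` fit in, the sketch below serialises a list of scraped records to both formats. The records and the `to_csv_text` helper are invented for this example and are not the notebooks' actual code. It writes to an in-memory buffer so it can run anywhere.

```python
import csv
import io
import json

# Invented records standing in for scraped results.
records = [
    {"title": "First Film", "year": "2021"},
    {"title": "Second Film", "year": None},
]


def to_csv_text(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # None values are written as empty fields
    return buffer.getvalue()


csv_text = to_csv_text(records, ["title", "year"])
json_text = json.dumps(records, indent=2)
print(csv_text)
print(json_text)
```

To write a real file instead, pass a file opened with `open(path, "w", newline="")` to `csv.DictWriter` in place of the buffer.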

Folder structure

|-- web-scraping-projects
    |-- README.md
    |-- data-directory
    |   |-- books_data.csv
    |   |-- cars.csv
    |   |-- movies.csv
    |   |-- real_estate.csv
    |   |-- salary_data.csv
    |   |-- transfers_data.csv
    |-- notebooks
        |-- Amazon.ipynb
        |-- Carvago.ipynb
        |-- Cineb_movies.ipynb
        |-- Real estate.ipynb
        |-- Salaries.ipynb
        |-- Transfermarkt.ipynb
        |-- Understat.ipynb
        |-- .ipynb_checkpoints
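
Given this layout, a scraped dataset can be loaded for analysis with Pandas. The helper below is a small sketch under the assumption that it is run from the repository root. `load_dataset` is a hypothetical name, and no column names are assumed because the CSV schemas are not described here.

```python
from pathlib import Path

import pandas as pd


def load_dataset(name, data_dir="data-directory"):
    """Read one CSV from the data folder into a DataFrame."""
    path = Path(data_dir) / name
    if not path.exists():
        raise FileNotFoundError(f"{path} not found; run from the repository root")
    return pd.read_csv(path)


# Example usage from the repository root:
# movies = load_dataset("movies.csv")
# print(movies.head())
```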

About

This repository provides various web scraping projects in Jupyter notebooks for both learning and data-related workshops.
