GitHub - sujoyyyy/Website-Scraping-Template: Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. The term typically refers to automated processes.

The project contains three Python files and three CSV files.

1. Newegg (test.py): generates products.csv, which contains the brand, product name, and shipping details of each product.

2. Flipkart iPhone (test1.py): generates products1.csv, which contains the price, product name, and rating of each product.

3. Flipkart iPhone, page 2 (test2.py): generates products2.csv, which contains the price, product name, and rating of each product.
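As a rough illustration of how scraped records like these could end up in a CSV file, here is a minimal sketch using Python's standard `csv` module. The helper `rows_to_csv`, the sample records, and the column names are assumptions made for this example; they are not taken from the repository's scripts.

```python
import csv
import io

# Hypothetical sample records; a real script would fill these from scraped pages.
SAMPLE_ROWS = [
    {"brand": "ExampleBrand", "productname": "Example Graphics Card, 8GB", "shipping": "Free Shipping"},
    {"brand": "OtherBrand", "productname": "Other Card", "shipping": "$4.99 Shipping"},
]


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text, starting with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the CSV text; fields containing commas are quoted automatically.
    print(rows_to_csv(SAMPLE_ROWS, ["brand", "productname", "shipping"]))
```

To write straight to a file instead, the same `csv.DictWriter` calls work on a file opened with `open(path, "w", newline="", encoding="utf-8")`.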

Web pages are built using text-based markup languages (HTML and XHTML) and frequently contain a wealth of useful data in text form. However, most web pages are designed for human end users, not for automated use. As a result, specialized tools and software have been developed to facilitate scraping. Newer forms of web scraping involve listening to data feeds from web servers.

Beautiful Soup is a Python package for parsing HTML and XML documents, including documents with malformed markup such as unclosed tags (the package is named after "tag soup"). It sits atop an HTML or XML parser and builds a parse tree for each parsed page, providing Pythonic idioms for iterating over, searching, and modifying that tree. This makes it easy to collect and store data from various websites before the data is wrangled and cleansed.
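The parse-tree idea can be sketched with a small, self-contained example. The HTML snippet, the class names (`item`, `title`, `price`, `rating`), and the helper `extract_products` are all hypothetical and do not reflect the markup of any real site or the code in this repository. The first `span` with class `rating` is deliberately left unclosed to show that the parser still recovers its text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the first rating <span> is intentionally not closed.
SAMPLE_HTML = """
<div class="item">
  <a class="title">Example Phone (64 GB)</a>
  <span class="price">$499</span>
  <span class="rating">4.5
</div>
<div class="item">
  <a class="title">Another Phone (128 GB)</a>
  <span class="price">$599</span>
  <span class="rating">4.2</span>
</div>
"""


def extract_products(html):
    """Return one dict per product block found in the given HTML string."""
    # "html.parser" is the parser bundled with Python's standard library.
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.find_all("div", class_="item"):
        products.append({
            "productname": item.find("a", class_="title").get_text(strip=True),
            "price": item.find("span", class_="price").get_text(strip=True),
            "rating": item.find("span", class_="rating").get_text(strip=True),
        })
    return products


if __name__ == "__main__":
    for product in extract_products(SAMPLE_HTML):
        print(product)
```

A real scraper would first download the page (for example with `urllib.request` or a similar HTTP client) and would need selectors that match that page's actual structure, which can change without notice.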



