
rishabhverma17/webCrawler


webCrawler

A Python-based web crawler that indexes keywords from the web pages it crawls.

How to run

Call this procedure:

  print(crawl_web(seed, max_pages))
  # seed: the URL at which crawling starts
  # max_pages: upper limit on the number of pages to crawl; without it a crawl could take a very long time

This is a simple Python 3 based web crawler.

  • It indexes each keyword together with the URL it was found on.
  • It follows only links written as "<a href=" tags.
  • A demo link is provided in the script for crawling.
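
The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: only the `crawl_web(seed, max_pages)` call comes from this README, while the helper names (`get_page`, `get_all_links`, `add_page_to_index`), the optional `fetch` parameter, and the choice to treat every whitespace-separated token as a keyword are assumptions made for the example.

```python
from urllib.request import urlopen


def get_page(url):
    """Fetch a page and return its text, or "" if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # Covers network errors and malformed or relative URLs.
        return ""


def get_all_links(page):
    """Return the target of every '<a href=' occurrence, in page order."""
    links = []
    pos = 0
    while True:
        start = page.find("<a href=", pos)
        if start == -1:
            return links
        open_quote = page.find('"', start)
        close_quote = page.find('"', open_quote + 1)
        if open_quote == -1 or close_quote == -1:
            return links
        links.append(page[open_quote + 1:close_quote])
        pos = close_quote + 1


def add_page_to_index(index, url, content):
    """Map each whitespace-separated token to the URLs that contain it."""
    for word in content.split():
        urls = index.setdefault(word, [])
        if url not in urls:
            urls.append(url)


def crawl_web(seed, max_pages, fetch=get_page):
    """Crawl from seed, visiting at most max_pages pages; return the index."""
    to_crawl = [seed]
    crawled = []
    index = {}
    while to_crawl and len(crawled) < max_pages:
        url = to_crawl.pop()
        if url in crawled:
            continue
        content = fetch(url)
        add_page_to_index(index, url, content)
        for link in get_all_links(content):
            if link not in crawled and link not in to_crawl:
                to_crawl.append(link)
        crawled.append(url)
    return index
```

In this sketch the index is a dictionary from keyword to a list of URLs, so `crawl_web(seed, max_pages).get("python", [])` would list the crawled pages containing that token. Because tokens come from the raw page text, markup fragments end up in the index as well, and relative links are not resolved; a fuller version would strip tags and join links against the page URL.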

Web Crawling Results

WebCrawling

Web Crawling with Indexing Results

Web Crawling and Indexing