Web scraping with Node.js and Cheerio

A CI pipeline for a web scraper built with Node.js and Cheerio.

Install dependencies

npm install

Run scraper

npm start

Then open an HTTP client (such as Postman, Insomnia, or Hoppscotch), send a POST request to the "/scrape" endpoint with a request body like the one below, and run the request. The app only accepts URLs whose base URL is https://www.amazon.com.

{
  "url": "https://www.amazon.com/s?k=all+headphones&crid=2TTXQBOK238J3&qid=1667301526&sprefix=all+headphones%2Caps%2C284&ref=sr%5C_pg%5C_1"
}
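The same request can also be built from a script instead of a GUI client. The sketch below is a minimal illustration, not the app's own code: the helper names isAllowedUrl and buildScrapeRequest are hypothetical, the server address http://localhost:3000 is an assumption based on the port printed in the test log, and the origin check is only one plausible way to apply the https://www.amazon.com restriction on the client side.

```javascript
// Base URL restriction stated in this README.
const ALLOWED_ORIGIN = "https://www.amazon.com";

// Hypothetical helper: true only when the value parses as an absolute URL
// whose origin (scheme + host + port) matches the allowed base URL.
function isAllowedUrl(candidate) {
  try {
    return new URL(candidate).origin === ALLOWED_ORIGIN;
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Hypothetical helper: builds the arguments for fetch() without sending anything.
// serverBase is an assumption (port 3000 appears in the test log).
function buildScrapeRequest(targetUrl, serverBase = "http://localhost:3000") {
  if (!isAllowedUrl(targetUrl)) {
    throw new Error(`URL must have the base URL ${ALLOWED_ORIGIN}`);
  }
  return {
    endpoint: new URL("/scrape", serverBase).href,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}
```

With the server running and Node.js 18 or later (which provides a global fetch), the result can be passed straight through: take endpoint and options from buildScrapeRequest(...) and call fetch(endpoint, options). Checking the origin rather than a string prefix rejects look-alike hosts such as https://www.amazon.com.example.org.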

Expect a response like the one shown in the screenshot below, and a new file in the data folder.
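This README does not show how the file in the data folder is written. The test output further down names two helpers, generateFilename() and saveProductJson(), so the sketch below shows one way such helpers might look. Their signatures, the timestamped filename pattern, and the default directory are all assumptions made for illustration, not the repository's actual implementation.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical signature: builds a filename from a prefix and a timestamp.
function generateFilename(prefix = "products", now = new Date()) {
  // ISO timestamps contain ":" and ".", which are awkward in filenames.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `${prefix}-${stamp}.json`;
}

// Hypothetical signature: writes the scraped products as pretty-printed JSON
// into dir (created if missing) and returns the path of the new file.
function saveProductJson(products, dir = path.join(process.cwd(), "data")) {
  fs.mkdirSync(dir, { recursive: true });
  const filePath = path.join(dir, generateFilename());
  fs.writeFileSync(filePath, JSON.stringify(products, null, 2));
  return filePath;
}
```

Returning the file path makes the helper easy to assert on in a test, and passing the directory as a parameter lets a test write to a temporary folder instead of the real data folder.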

Test the scraper

Run

npm test

Expect results like these:

> test
> jest --detectOpenHandles

  console.log
    Server is running on port 3000

      at Server.log (src/server.js:15:11)

 PASS  __tests__/scraper.test.js
  scraper
    ✓ generateFilename() returns a string (2 ms)
    ✓ saveProductJson() saves a file (1 ms)
    ✓ POST /scrape returns a 200 status code (2688 ms)

Test Suites: 1 passed, 1 total
Tests:       3 passed, 3 total
Snapshots:   0 total
Time:        2.927 s, estimated 3 s
Ran all test suites.

Repository: CIRCLECI-GWP/nodejs-cheerio-web-scraping
