Puppeteer Web Scraping Script

A small project for checking the menu of the Foodoo student restaurant at the University of Oulu: it downloads the menu and takes a screenshot.

This is a Node.js script that uses Puppeteer, a headless browser automation tool, to load a specified webpage, click a specific element on it, capture a snapshot, take a screenshot, and extract the page's text content.

Prerequisites

Before running this script, make sure you have the following installed:

Node.js: You can download it from https://nodejs.org/.
npm (Node Package Manager): npm usually comes with Node.js.

Additionally, you will need to install the Puppeteer package:

npm install puppeteer

Usage

Clone this repository (the script is script.js), or create a new Node.js project.
If you start a new project, create a JavaScript file (e.g., scrape.js) and copy the script into it.

Replace the URL in the script with the webpage you want to scrape:

await page.goto("https://fi.jamix.cloud/apps/menu/?anro=93077&k=48&mt=89");
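If you would rather not edit the script each time, the URL could be read from the command line instead. The following is a minimal sketch under that assumption; `resolveTargetUrl` and `DEFAULT_URL` are hypothetical names that do not appear in the script, and the default is simply the URL shown above.

```javascript
// Hypothetical helper: choose the page to scrape from the command line,
// falling back to the Foodoo menu URL used in this README.
const DEFAULT_URL = "https://fi.jamix.cloud/apps/menu/?anro=93077&k=48&mt=89";

function resolveTargetUrl(argv = process.argv, fallback = DEFAULT_URL) {
  // argv[0] is the node binary and argv[1] the script path,
  // so the first user-supplied argument is argv[2].
  const candidate = argv[2] ?? fallback;

  // The URL constructor throws a TypeError on malformed input.
  const parsed = new URL(candidate);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}

// Inside the script, the navigation step would then become:
//   await page.goto(resolveTargetUrl());
```

With this in place, `node scrape.js https://example.com/menu` would scrape the given page, and `node scrape.js` alone would keep the default.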

Run the script using Node.js (use your own file name; in this repository the file is script.js):
```
node scrape.js
```

The script will perform the following actions:

Launch a headless browser.
Navigate to the specified webpage.
Click on an element containing the text "English" using page.evaluate.
Capture a snapshot of the webpage in MHTML format and save it as page.mhtml.
Take a screenshot of the webpage and save it as screenshot.png.
Extract the text content of the webpage and save it as foodoo_menu.txt.
Close the browser.
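The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's script.js: the function names `clickByText`, `saveMhtml`, and `scrapeMenu`, the `settleMs` pause, the `networkidle2` wait, and the use of the DevTools Protocol `Page.captureSnapshot` command for the MHTML file are all assumptions. Puppeteer is passed in as an argument so the flow can be exercised without launching a real browser.

```javascript
// Sketch of the listed steps. Assumes Puppeteer is installed (npm install puppeteer).
const fs = require("fs/promises");
const path = require("path");

// URL taken from the Usage section of this README.
const MENU_URL = "https://fi.jamix.cloud/apps/menu/?anro=93077&k=48&mt=89";

// Click the first leaf HTML element whose trimmed text equals `label`.
// The callback runs inside the page; resolves to true if something was clicked.
async function clickByText(page, label) {
  return page.evaluate((text) => {
    const match = Array.from(document.querySelectorAll("*")).find(
      (el) =>
        el instanceof HTMLElement &&
        el.children.length === 0 &&
        el.textContent.trim() === text
    );
    if (!match) return false;
    match.click();
    return true;
  }, label);
}

// Capture the page as MHTML through a Chrome DevTools Protocol session.
async function saveMhtml(page, file) {
  const cdp = await page.target().createCDPSession();
  const { data } = await cdp.send("Page.captureSnapshot", { format: "mhtml" });
  await cdp.detach();
  await fs.writeFile(file, data);
}

async function scrapeMenu(puppeteer, { url = MENU_URL, outDir = ".", settleMs = 2000 } = {}) {
  const browser = await puppeteer.launch(); // headless by default
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle2" });

    const clicked = await clickByText(page, "English");
    if (!clicked) console.warn('No element with the text "English" was found.');
    // Assumption: a fixed pause is enough for the page to re-render in English.
    await new Promise((resolve) => setTimeout(resolve, settleMs));

    await saveMhtml(page, path.join(outDir, "page.mhtml"));
    await page.screenshot({ path: path.join(outDir, "screenshot.png") });

    const text = await page.evaluate(() => document.body.innerText);
    await fs.writeFile(path.join(outDir, "foodoo_menu.txt"), text);
    return { clicked, text };
  } finally {
    await browser.close(); // always release the browser, even after an error
  }
}

module.exports = { clickByText, saveMhtml, scrapeMenu };
```

Adding `scrapeMenu(require("puppeteer"))` as the last line of the file would run the whole sequence and write the three output files to the current directory.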

You can customize the script for your specific use case or further process the captured data as needed.

Future enhancements

Add a CSS locator for the English menu and send the menu by email.
Open the calorie estimate for each meal and work out which option is the healthiest.
Any suggestions?

License

This script is provided under the MIT License.
