Goose Database

This repository is for the development of the goose database system. While goose may be an acronym, nobody is exactly sure what it stands for. Frontend code and scraping scripts will live here, at least until they become the most awesome recipe database ever.

Database setup

Import the recipes_schema.sql file into your MySQL server. This can be done with the following command, where user is your MySQL username and a database named recipes already exists:

mysql -u user -p recipes < recipes_schema.sql

To run the frontend

Install Node.js.
Run npm install to download all of the node modules.
Create a config.js file with your connection information (instructions found below).
Run the code with npm start.

Running scrapers

Each data collection scraper comes with three files.

A parser that collects all the recipe URLs so they can be fed directly into the scraper
A scheduler that runs them in sequence so that they can be run for a long period of time
A scraper that scrapes the recipe data from each URL gathered by the parser

The scrapers require a few dependencies:

pip3 install requests
pip3 install beautifulsoup4
pip3 install recipe-scrapers
pip3 install selenium # Selenium also needs ChromeDriver installed for this to work

Each run collects the URLs into recipes.txt. The scraper then works through them 100 at a time, writing each batch to its own threaded file. Once the batches are done, the files can be imported through the API for parsing and insertion into the database.
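The batching step can be illustrated with a short sketch. The scrapers themselves are Python scripts; this is only a JavaScript illustration of the idea, not the repository's code. It assumes recipes.txt holds one URL per line, and the names parseUrlList and toBatches are hypothetical.

```javascript
// Illustration only: the real scrapers are Python scripts in this repository.
// Assumption: recipes.txt contains one URL per line.

// Turn the text of a URL list into an array, skipping blank lines.
function parseUrlList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Split an array of URLs into batches of at most `size` entries.
function toBatches(urls, size = 100) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}
```

With a list of 250 URLs, toBatches returns three batches: two of 100 and a final one of 50.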

Creating a config file

When developing, you will need to put your credentials in a file named config.js. An example file, exampleconfig.js, shows the format to follow. The file should include the following, where user and pass are your username and password:

host: "pma.horine.dev",
user: "user",
password: "pass",
database: "recipes"

You can have multiple configurations in this file. The application decides which one to use by looking at the NODE_ENV environment variable; if NODE_ENV is not set, it defaults to development.
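As a rough sketch of how several configurations could sit side by side, the example below keys each one by environment name and picks one from NODE_ENV. The actual layout is defined by exampleconfig.js, so treat this structure as an assumption: the configs object, the selectConfig function, the production entry, and the localhost host are all hypothetical.

```javascript
// Hypothetical layout: one entry per environment name.
// Check exampleconfig.js for the format the application really expects.
const configs = {
  development: {
    host: "localhost", // placeholder, not taken from the repository
    user: "user",
    password: "pass",
    database: "recipes",
  },
  production: {
    host: "pma.horine.dev",
    user: "user",
    password: "pass",
    database: "recipes",
  },
};

// Pick the configuration named by NODE_ENV, falling back to "development"
// when the variable is unset or empty.
function selectConfig(env = process.env.NODE_ENV) {
  const name = env || "development";
  const config = configs[name];
  if (!config) {
    throw new Error(`No configuration named "${name}"`);
  }
  return config;
}

// A config.js built this way might end with:
//   module.exports = selectConfig();
```

Starting the app with NODE_ENV=production would then select the production entry, while leaving NODE_ENV unset selects development.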

Recipe API Server

See recipe-api/readme.md for details.

Ingredient Phrase Tagger API Server

See https://github.com/jemisonf/ingredient-phrase-tagger for details.
