GitHub - suraj5424/Cuisine-Prediction: An NLP task to predict Cuisine from recipe data, visualize cuisine distribution, train MNB, XGBoost, CNN, and Random Forest models for cuisine prediction, and evaluate performance with accuracy, confusion matrix, and classification report.

Description of Data Management and Model Evaluation Program

The code imports and processes a dataset of recipes labeled with their cuisines, then trains and evaluates several machine learning models that predict a recipe's cuisine from its ingredients.

Program Workflow

Data Import and Preprocessing: The program imports the required libraries and loads the dataset, which comes from Kaggle. It reads the training and test data into Pandas DataFrames for further processing.
Data Exploration and Visualization: The program performs exploratory data analysis (EDA) on the training data, visualizing the distribution of cuisines, the number of ingredients per recipe, and the relationship between the number of ingredients and cuisines.
Model Training and Evaluation:
- Multinomial Naive Bayes (MNB): The program vectorizes the ingredients with CountVectorizer, trains a Multinomial Naive Bayes classifier on the resulting counts, and evaluates it using accuracy score, confusion matrix, and classification report.
- XGBoost Model: The program encodes the cuisine labels, vectorizes the ingredients with CountVectorizer, trains an XGBoost classifier, and evaluates it on the validation set using accuracy score, confusion matrix, and classification report.
- Convolutional Neural Network (CNN): The program tokenizes the ingredients, pads the sequences to a fixed length, builds and compiles a CNN model, and trains it on the training set with early stopping. It then evaluates the model on the test set using accuracy score, confusion matrix, and classification report.
- Random Forest Model: The program vectorizes the ingredients with TfidfVectorizer, trains a Random Forest classifier, and evaluates it on the validation set using accuracy score, confusion matrix, and classification report.
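
The scikit-learn parts of the workflow above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy records, the field names (`cuisine`, `ingredients`), and all variable names are assumptions made for the example. The XGBoost and CNN steps are left out to keep the sketch light on dependencies; only the label-encoding step that would precede XGBoost is shown.

```python
import io
import json

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the Kaggle training file. The field names
# ("cuisine", "ingredients") are assumptions, not taken from the repository.
records = [
    {"cuisine": "italian", "ingredients": ["olive oil", "garlic", "tomato", "basil"]},
    {"cuisine": "italian", "ingredients": ["pasta", "parmesan", "garlic", "basil"]},
    {"cuisine": "mexican", "ingredients": ["tortilla", "black beans", "salsa", "cilantro"]},
    {"cuisine": "mexican", "ingredients": ["avocado", "lime", "cilantro", "tortilla"]},
]

# pd.read_json takes a path or a file-like object; an in-memory buffer
# keeps the example self-contained.
train_df = pd.read_json(io.StringIO(json.dumps(records)))

# Join each ingredient list into one string so a text vectorizer can
# treat a recipe as a small document.
texts = train_df["ingredients"].apply(" ".join)
cuisines = train_df["cuisine"]

# Bag-of-words counts feeding Multinomial Naive Bayes.
mnb = make_pipeline(CountVectorizer(), MultinomialNB())
mnb.fit(texts, cuisines)

# TF-IDF weights feeding a Random Forest; random_state fixes the seed.
forest = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
forest.fit(texts, cuisines)

# Integer label encoding, the kind of step needed by classifiers that
# expect numeric class ids.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(cuisines)

new_recipe = ["garlic basil pasta"]
print("MNB:", mnb.predict(new_recipe)[0])
print("Random Forest:", forest.predict(new_recipe)[0])
print("Encoded labels:", list(y_encoded), "classes:", list(encoder.classes_))
```

Wrapping each vectorizer and classifier in a pipeline keeps the vocabulary learned on the training text tied to the model, so new recipes are transformed the same way at prediction time.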

Features and Outputs

Data Visualization: The program visualizes the distribution of cuisines, the number of ingredients per recipe, and the relationship between the number of ingredients and cuisines through various plots and charts.
Model Evaluation: For each model, the program calculates and displays evaluation metrics such as accuracy score, confusion matrix, and classification report on the validation or test set.
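
As a rough illustration of the two kinds of output listed above, the sketch below computes the summaries that such plots are typically drawn from, and the three evaluation metrics, on made-up data. The DataFrame contents, the label lists, and the variable names are hypothetical; the notebook's actual plots and scores are not reproduced here.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical recipes; the column names are assumptions for this example.
recipes = pd.DataFrame({
    "cuisine": ["italian", "mexican", "italian", "indian"],
    "ingredients": [
        ["olive oil", "garlic", "tomato"],
        ["tortilla", "black beans"],
        ["pasta", "basil", "parmesan", "garlic"],
        ["cumin", "turmeric", "rice", "onion", "ginger"],
    ],
})

# Distribution of cuisines: number of recipes per label.
cuisine_counts = recipes["cuisine"].value_counts()

# Number of ingredients per recipe.
recipes["n_ingredients"] = recipes["ingredients"].apply(len)

# Relationship between cuisine and ingredient count: mean per cuisine.
mean_ingredients = recipes.groupby("cuisine")["n_ingredients"].mean()

# Hypothetical true and predicted labels for six validation recipes.
y_true = ["italian", "italian", "mexican", "mexican", "indian", "indian"]
y_pred = ["italian", "mexican", "mexican", "mexican", "indian", "italian"]
labels = ["indian", "italian", "mexican"]

accuracy = accuracy_score(y_true, y_pred)
# Rows are true labels, columns are predicted labels, both in `labels` order.
matrix = confusion_matrix(y_true, y_pred, labels=labels)
# Per-class precision, recall, and F1 as a formatted text table.
report = classification_report(y_true, y_pred, labels=labels, zero_division=0)

print(cuisine_counts)
print(mean_ingredients)
print(f"accuracy = {accuracy:.3f}")
print(matrix)
print(report)
```

With matplotlib installed, a summary such as `cuisine_counts` can be drawn directly with `cuisine_counts.plot(kind="bar")`.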

Conclusion

Overall, the program loads and explores the recipe data, trains four models (Multinomial Naive Bayes, XGBoost, a CNN, and Random Forest), and evaluates each one with accuracy score, confusion matrix, and classification report, showing how well a recipe's cuisine can be predicted from its ingredients.

Repository Files

- Cuisine_Prediction_Semantic_Solution.ipynb
- README.md
