siliataider/Neural-Network-Analysis-for-Predictive-Modeling

Python TensorFlow Keras Sklearn NLTK Pandas NumPy Matplotlib

🧠 Neural Network Analysis for Predictive Modeling

The project aims to predict the popularity of a movie from its overview text. It covers the analysis of a movie dataset, data preprocessing, model building, training, and evaluation.

📉 Results (don't judge a book by its cover)

This project shows that predicting a movie's popularity from its overview text alone is very hard.

As the confusion matrix in this screenshot shows, the classification results are not very accurate:

[Screenshot of the confusion matrix, 2023-12-13 17:48:38]
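
For readers unfamiliar with confusion matrices, here is a minimal, dependency-free sketch of how one is tallied and how accuracy follows from it. The class names and label lists are hypothetical and made up for illustration; they are not the project's results, and the project itself may well rely on library helpers instead.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correct predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels, for illustration only (not project output).
labels = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "high", "medium", "high"]

matrix = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(matrix)

for label, row in zip(labels, matrix):
    print(f"{label:>6}: {row}")
print(f"accuracy = {acc:.2f}")
```

Large off-diagonal counts, as in the screenshot, mean the model often assigns a movie to the wrong popularity class.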

Approaches

  • Regression => Target: vote_average
  • Classification => Target: popularity_class_label
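
The README does not say how the classification target is built. As a hedged sketch only, one common option is to bin a numeric popularity score into classes by quantile. The `popularity` column, the three-class split, and the sample values below are all assumptions; `vote_average` and `popularity_class_label` are the names used in the Model Building section.

```python
from statistics import quantiles


def popularity_class(popularity, cut_points):
    """Return the index of the bin that `popularity` falls into."""
    return sum(popularity > c for c in cut_points)


# Hypothetical rows: column names are assumed, values are made up.
movies = [
    {"vote_average": 6.1, "popularity": 3.0},
    {"vote_average": 5.4, "popularity": 1.0},
    {"vote_average": 7.8, "popularity": 6.0},
    {"vote_average": 6.9, "popularity": 4.0},
    {"vote_average": 5.0, "popularity": 2.0},
    {"vote_average": 7.2, "popularity": 5.0},
]

popularities = [m["popularity"] for m in movies]
cut_points = quantiles(popularities, n=3)  # two cut points, three classes

# Regression uses the numeric rating directly as its target.
regression_target = [m["vote_average"] for m in movies]
# Classification uses the (assumed) binned popularity as its target.
classification_target = [popularity_class(p, cut_points) for p in popularities]
```

Quantile bins keep the classes roughly balanced, which makes accuracy easier to interpret than it would be with a skewed class distribution.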

Contents

  1. 📚 Preparation

    • Loading the Movies dataset from a GitHub repository.
    • Initial exploration of dataset characteristics, including data types, correlations, and handling missing values.
    • Transforming the data by dropping unnecessary columns, creating new features, and visualizing distributions.
  2. 🔄 Data Preprocessing

    • Feature engineering by converting text data to numerical representations (text vectorization) for the movie overviews and titles.
    • Normalization of numeric features for consistent scaling.
    • Splitting the dataset into training and test sets.
  3. 🏗️ Model Building and Training

    • Construction of various neural network models for both regression (predicting vote_average) and classification (predicting popularity_class_label).
    • Models include simple dense networks, LSTM, embedding layers, and convolutional networks, showcasing a range of deep learning techniques.
    • Training the models using different feature sets: numeric features, overview text, and title text.
  4. 📊 Evaluation and Results Analysis

    • Evaluation of models using R-squared and explained variance for regression, and accuracy, confusion matrix, and classification report for classification.
    • Visualization of training and validation accuracy over epochs.
    • Comparative analysis of different models based on their performance on the test dataset.
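
The preprocessing steps and regression metrics listed above can be sketched without any dependencies. This is an illustration under stated assumptions, not the project's code: the project presumably uses Keras and scikit-learn utilities for these steps, and every function name, sample sentence, and number below is hypothetical.

```python
import random
import re
from statistics import mean, pstdev, pvariance

TOKEN = re.compile(r"[a-z']+")


def build_vocabulary(texts):
    """Map each distinct lowercase word to an integer id (0 is padding)."""
    vocab = {}
    for text in texts:
        for token in TOKEN.findall(text.lower()):
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def vectorize(text, vocab, length):
    """Turn text into a fixed-length list of word ids, cut or zero-padded."""
    ids = [vocab[t] for t in TOKEN.findall(text.lower()) if t in vocab]
    return (ids + [0] * length)[:length]


def zscore(values):
    """Scale values to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def split_rows(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot (undefined when y_true is constant)."""
    mu = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mu) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true); ignores a constant bias."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)


# Made-up inputs, for illustration only.
vocab = build_vocabulary(["A hero saves the city", "The city never sleeps"])
encoded = vectorize("The hero sleeps", vocab, length=5)
scaled = zscore([1.0, 2.0, 3.0])
train, test = split_rows(list(range(10)))

y_true = [6.0, 7.0, 8.0]
r2 = r_squared(y_true, [6.5, 7.0, 7.5])
ev = explained_variance(y_true, [6.5, 7.0, 7.5])
# Predictions that are off by a constant +1 show how the two metrics differ.
r2_biased = r_squared(y_true, [7.0, 8.0, 9.0])
ev_biased = explained_variance(y_true, [7.0, 8.0, 9.0])
```

In a real pipeline the vocabulary and the scaling statistics would be fitted on the training split only, so that no information from the test set leaks into training.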

🌟 Highlights of the Project

  • Comprehensive Data Analysis: Detailed examination and transformation of a complex movie dataset.
  • Diverse Model Architectures: Exploration of various neural network structures tailored for specific types of features.
  • In-depth Model Evaluation: Extensive analysis of model performance, providing insights into their effectiveness in predictive modeling.

✨ Conclusion and Future Work

The analysis provides valuable insights into the performance of different types of neural network architectures for both regression and classification tasks. Future work could explore further refinement of models, incorporation of additional features, and application to other datasets or predictive scenarios.
