Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Updated May 19, 2021 - Scala
This repository contains analysis of IMDb data from multiple sources, covering movies, cast, box office revenues, movie brands, and franchises.
Movie Recommender based on the MovieLens Dataset (ml-100k) using item-item collaborative filtering.
Analysis of MovieLens Dataset in Python
Exploratory Data Analysis (EDA) will be uploaded to this repository. Libraries such as Pandas, Matplotlib, Seaborn, and Plotly will be used for data analysis.
Data analysis on big data. Uses various databases ranging from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using Hadoop.
Building a movie recommender system with factorization machines on Amazon SageMaker.
Data analysis and movie recommendation of OpenMovie dataset by using the shell, Python, Cosine Similarity algorithm, Apache PySpark, and Apache Hadoop.
Data analysis on big data. Uses various databases ranging from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using MongoDB.
Implementation of Spotify's Generalist-Specialist score on the MovieLens dataset.
Project to determine the ratings for a movie using each component of the Spark and Hadoop ecosystems.
Contains my custom implementation of various machine learning models and analysis.
A recommendation algorithm capable of accurately predicting how a user will rate a movie they have not yet viewed, based on their historical preferences. The models and EDA are based on the MovieLens 1M dataset.
Spark MLlib: Collaborative Filtering Movie Recommendation System
Analytics for exploring the data: filter movies in the drama genre, find the most-rated movies, compute the number of users and average rating for each age range, etc. Visualizes the count and age of moviegoers with the Matplotlib library and shows movies, age ranges, and average ratings.
This is a project made as part of my data science master's program to analyze and draw inferences from MovieLens data.
Created visualizations of the MovieLens data set using matrix factorization http://www.yisongyue.com/courses/cs155/2018_winter/assignments/project2.pdf
A Feature Preference based CF Experiment on MovieLens 100K dataset