Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Updated Apr 27, 2024 - Jupyter Notebook
This project uses a decision tree learning model as its main algorithm. Because of the large volume of flight data, the project is implemented with MRJob, PySpark, and Spark's MLlib, and the performance and accuracy of those implementations are then compared.
The largest collection of publicly accessible Progressive Web Apps*
Movie rating prediction application
Information Retrieval, 2023-24 course, EPSEVG
Project that performs dictionary-based sentiment analysis, implemented with MrJob using a map-reduce model. It can be executed locally or in HDFS environments (such as Hadoop or AWS).
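As a rough illustration of the dictionary-based map-reduce approach described above, here is a minimal sketch in plain Python. It is not taken from the repository: the `LEXICON` scores, the sample documents, and the `run_local` helper are all hypothetical, and the shuffle step is simulated in memory so the example runs without Hadoop. In an actual mrjob job, the mapper and reducer would typically be methods on a subclass of `MRJob`.

```python
from collections import defaultdict
import re

# Hypothetical sentiment dictionary (word -> score), assumed for illustration only.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}

WORD_RE = re.compile(r"[a-z']+")


def mapper(doc_id, text):
    """Emit (doc_id, score) for every word that appears in the dictionary."""
    for word in WORD_RE.findall(text.lower()):
        if word in LEXICON:
            yield doc_id, LEXICON[word]


def reducer(doc_id, scores):
    """Sum the scores of one document and attach a sentiment label."""
    total = sum(scores)
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    yield doc_id, (total, label)


def run_local(documents):
    """Simulate map, shuffle, and reduce in memory (hypothetical helper)."""
    grouped = defaultdict(list)
    for doc_id, text in documents:
        for key, value in mapper(doc_id, text):
            grouped[key].append(value)  # shuffle: group values by key
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


# Made-up sample input: (document id, review text) pairs.
docs = [
    ("r1", "Great movie, good cast"),
    ("r2", "Terrible plot, bad pacing, good music"),
]
results = run_local(docs)
for doc_id, (total, label) in results.items():
    print(doc_id, total, label)
```

Note that in this sketch a document containing no dictionary words produces no mapper output and therefore does not appear in the results; a real job might prefer to emit a zero score for every document instead.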
Search engine for movie cast generation.
Samples related to data engineering, e.g. Spark, Embulk, and Airflow.
Analyzes book review data from Amazon and the Amazon Vine program using PySpark and Amazon Web Services' Relational Database Service (AWS RDS).