MLWhiz/Spark_Projects

Spark_Projects

Spark Projects for the Berkeley Data Science Course

  1. Wordcount in Spark - A program that counts the words in all of Shakespeare's plays.

  2. Apache Log File Analysis in Spark - Using Spark to explore a NASA Apache web server log.

  3. Entity Resolution - Entity resolution using TF-IDF approaches in Spark.

  4. Movie Recommendation using ALS - Predicting movie ratings using Spark.

  5. Linear Regression - Predicting song year using linear regression in Spark.

  6. Logistic Regression - Predicting click-through rates using Spark. One-hot encoding and feature hashing explained.

  7. PCA - Running PCA on neuroscience data.
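
Project 6 mentions one-hot encoding and feature hashing. As a rough illustration of how the two differ, here is a minimal plain-Python sketch (no Spark required). It is not taken from the project notebooks: the function names, the toy rows, and the choice of `zlib.crc32` as the hash function are all assumptions made for this example.

```python
import zlib


def one_hot_index(rows):
    """Map every distinct (feature_id, value) pair to its own column index."""
    distinct = sorted({pair for row in rows for pair in row})
    return {pair: idx for idx, pair in enumerate(distinct)}


def one_hot_encode(row, index):
    """Return the sorted column indices set to 1 for this row.

    Pairs that were not seen when the index was built are dropped.
    """
    return sorted(index[pair] for pair in row if pair in index)


def hash_encode(row, num_buckets):
    """Feature hashing: send each pair to one of num_buckets columns.

    No dictionary is needed, but different pairs may collide in a bucket,
    so each bucket stores a count rather than a 0/1 flag.
    """
    counts = {}
    for feature_id, value in row:
        key = f"{feature_id}:{value}".encode("utf-8")
        bucket = zlib.crc32(key) % num_buckets  # assumed hash function
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


# Toy data (made up for illustration): each row is a list of
# (feature_id, categorical_value) pairs.
rows = [
    [(0, "mouse"), (1, "black")],
    [(0, "cat"), (1, "tabby"), (2, "mouse")],
    [(0, "bear"), (1, "black"), (2, "salmon")],
]

index = one_hot_index(rows)
encoded = [one_hot_encode(row, index) for row in rows]
hashed = [hash_encode(row, num_buckets=4) for row in rows]

print(len(index))   # number of one-hot columns
print(encoded)      # active column indices per row
print(hashed)       # bucket -> count per row
```

One-hot encoding needs one column per distinct pair, so the dictionary grows with the data. Feature hashing fixes the number of columns up front and accepts occasional collisions in exchange.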
