This Project focuses on creating a KNN MapReduce program for the Hadoop Framework

vinitS101/knn

This project compares the performance of MapReduce and Spark on a Hadoop cluster, using the same sufficiently large dataset for both: the UCI Poker Hand dataset (https://archive.ics.uci.edu/ml/datasets/Poker+Hand).

MapReduceCode:

- Contains the MapReduce code written in Java.
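
The README does not describe how the MapReduce job is organized, so the following is only a minimal single-machine sketch of the KNN classification that such a job computes. It uses plain Java with no Hadoop API calls. The class name `KnnSketch`, its method names, and the choice of squared Euclidean distance with a simple majority vote are assumptions made for illustration and are not taken from the repository.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the repository's actual code.
class KnnSketch {

    // Squared Euclidean distance between two integer feature vectors
    // (assumed metric; the square root is skipped because it does not change the ranking).
    static long squaredDistance(int[] a, int[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            long d = (long) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Classifies one query record by majority vote among its k nearest training records.
    // Ties are resolved in favor of the label that reaches the top count first.
    static int classify(int[][] trainFeatures, int[] trainLabels, int[] query, int k) {
        List<long[]> pairs = new ArrayList<>(); // each entry is {distance, label}
        for (int i = 0; i < trainFeatures.length; i++) {
            pairs.add(new long[] {squaredDistance(trainFeatures[i], query), trainLabels[i]});
        }
        pairs.sort(Comparator.comparingLong(p -> p[0])); // nearest first

        Map<Integer, Integer> votes = new HashMap<>();
        int best = -1;
        int bestCount = 0;
        for (int i = 0; i < Math.min(k, pairs.size()); i++) {
            int label = (int) pairs.get(i)[1];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny made-up training set, not real Poker Hand records.
        int[][] train = {{1, 1}, {1, 2}, {8, 8}, {9, 9}};
        int[] labels = {0, 0, 1, 1};
        System.out.println(classify(train, labels, new int[] {2, 2}, 3));
    }
}
```

In a distributed version, the distance computation is the part that would typically be spread across the cluster, with the vote performed once the nearest candidates for each test record have been collected.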

SparkAppCode:

- Contains the Spark application, written in Scala, which can be run on a cluster.
  Before running it, add the relevant `hdfs` or `s3` paths for the training and test data.

- The app writes the predicted classes of the test data to a local `.txt` file on the master node.

AccuracyTest:

- Use `accuracyTest.java` to check the accuracy of the predicted classes. 
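
The README does not say how `accuracyTest.java` computes its result. As a rough sketch, assuming accuracy means the fraction of test records whose predicted class equals the true class, the check could look like the following. The class `AccuracySketch` and its method are hypothetical and are not the repository's implementation.

```java
// Hypothetical illustration only; not the repository's accuracyTest.java.
class AccuracySketch {

    // Returns the fraction of positions where the predicted label matches the actual label.
    static double accuracy(int[] predicted, int[] actual) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("label arrays must have the same length");
        }
        if (predicted.length == 0) {
            return 0.0; // assumed convention for an empty test set
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == actual[i]) {
                correct++;
            }
        }
        return (double) correct / predicted.length;
    }

    public static void main(String[] args) {
        // Made-up labels: three of the four predictions match.
        int[] predicted = {0, 1, 2, 9};
        int[] actual = {0, 1, 3, 9};
        System.out.println(accuracy(predicted, actual));
    }
}
```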
