plaa/mongo-spark

mongo-spark

Example application showing how to use the mongo-hadoop connector with Apache Spark.

For more details, see http://codeforhire.com/2014/02/18/using-spark-with-mongodb/

Prerequisites

  • MongoDB installed and running on localhost
  • Scala 2.10 and SBT installed

Running

Import the data into the database, run either JavaWordCount or ScalaWordCount (one of the two is enough), and then print the results:

mongoimport -d beowulf -c input beowulf.json
sbt 'run-main JavaWordCount'
sbt 'run-main ScalaWordCount'
mongo beowulf --eval 'printjson(db.output.find().toArray())' | less
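
For orientation, the counting step that a word-count job performs can be sketched in plain Java, without Spark, MongoDB, or the mongo-hadoop connector. This is not the repository's JavaWordCount; the class name `WordCountSketch`, the `countWords` helper, and the sample lines are hypothetical, and the sketch assumes that each input document contributes one line of text and that words are runs of ASCII letters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical stand-alone sketch; it does not use Spark or mongo-hadoop.
class WordCountSketch {

    // Lowercases each line, splits it on anything that is not an ASCII letter,
    // drops empty tokens, and counts how often each word occurs.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase(Locale.ROOT).split("[^a-z]+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for documents from the input collection.
        List<String> lines = Arrays.asList("The king and the hall", "the Hall");
        countWords(lines).forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

In a Spark job the same split, filter, and count steps would typically run as distributed transformations, with the connector handling reads from and writes to MongoDB.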

License

The code itself is released into the public domain under the Creative Commons CC0 dedication.

The example files are based on Beowulf from Project Gutenberg and are covered by the corresponding Project Gutenberg license.
