Big-Data-Querying-with-Apache-Spark

In this project we experimented with Apache Spark queries on large datasets such as the MovieLens dataset ("https://grouplens.org/datasets/movielens/") and tried to optimise their performance both on a local cluster and in cloud/server scenarios such as an Apache Livy server ("https://livy.apache.org/").
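
As a rough illustration of the server scenario, the sketch below builds the two REST requests that Livy expects for an interactive session: one to open a PySpark session and one to submit a query as a statement. It is not taken from the notebooks in this repository. The host and port (`localhost:8998`), the session id `0`, the `ratings.csv` path and the helper name `livy_request` are all assumptions chosen for the example, and the requests are only constructed, not sent.

```python
import json
from urllib.request import Request

# Assumption: a Livy server reachable on its default port on the local machine.
LIVY_URL = "http://localhost:8998"


def livy_request(path, payload):
    """Build (but do not send) a JSON POST request for the Livy REST API."""
    return Request(
        url=f"{LIVY_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Step 1: open an interactive PySpark session.
create_session = livy_request("/sessions", {"kind": "pyspark"})

# Step 2: submit Spark code as a statement. The code runs on the server, where
# Livy is assumed to expose a ready-made `spark` session. The query averages
# MovieLens ratings per movie; the file location is hypothetical.
QUERY = """
from pyspark.sql import functions as F
ratings = spark.read.csv("ratings.csv", header=True, inferSchema=True)
(ratings.groupBy("movieId")
        .agg(F.avg("rating").alias("avg_rating"))
        .orderBy(F.desc("avg_rating"))
        .show(10))
"""

SESSION_ID = 0  # hypothetical: in practice, read the id from the step 1 response
run_query = livy_request(f"/sessions/{SESSION_ID}/statements", {"code": QUERY})
```

Passing either request to `urllib.request.urlopen` would send it. The session takes some time to start, so a real client would poll the session state before submitting statements.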

ApacheSparkBigDataQueries.ipynb
LICENSE
README.md
analysingBigDataQueries.png
compareBigDataQueryResults.png
connection_to_livy.ipynb

OperaDevelop07/Big-Data-Querying-with-Apache-Spark
