IshmeetKaur/Distributed-Data-Mining-Lab

Distributed Data Mining Lab - TUM SS_2017

Summary

The wiki contains resources on how to set up a distributed environment for data mining and analysis.

The technologies and resources involved are:

  1. HDFS
  2. Hadoop
  3. YARN
  4. Spark
  5. MLlib
  6. Docker
  7. Elasticsearch
  8. Kibana
  9. LocText
  10. Nalaf
  11. String-Tagger
  12. PubMed

Weekly Progress

The wiki for the Distributed Data Mining lab course documents progress week by week.

References

  1. https://linoxide.com/cluster/setup-hadoop-multi-node-cluster-ubuntu/ (Multinode Hadoop Setup)
  2. http://data-flair.training/blogs/apache-spark-installation-on-multi-node-cluster-step-by-step-guide/ (Multinode Spark Setup)
  3. https://spark.apache.org/docs/1.2.0/mllib-guide.html (MLlib Documentation)
  4. http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/ (MapReduce in Python)
  5. http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_1_--_Running_WordCount (MapReduce Tutorial)
  6. https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-getting-started (Docker Installation)
  7. https://github.com/Rostlab/LocText (LocText)
  8. https://github.com/Rostlab/nalaf (Nalaf)
  9. https://github.com/titipata/pubmed_parser (PubMed Parser)
  10. https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html (Elasticsearch)
  11. http://hadoop.apache.org/ (Apache Hadoop Home Page)
  12. https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html (HDFS Design)
  13. https://rubenmiddeljans.files.wordpress.com/2015/08/hadoop-cluster.jpg (Hadoop Cluster Image)
  14. http://spark.apache.org/faq.html (Spark FAQ)
  15. https://spark.apache.org/mllib/ (MLlib Home Page)
  16. https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nfs-mount-on-ubuntu-16-04 (NFS Mount Setup on Ubuntu 16.04)
  17. https://docs.docker.com/engine/installation/linux/ubuntu/ (Docker Installation on Ubuntu)
