bsridatta/Data-Intensive-Computing

Data-Intensive-Computing

ID2221 Data Intensive Computing Platforms @ KTH

Labs

  • Spark Scala: This lab assignment covers the basics of data-intensive programming: setting up HDFS, HBase, Hadoop MapReduce, Spark, and Spark SQL, and implementing simple applications on each of them.

Review Questions

  • Review questions 1: distributed file systems and NoSQL databases
  • Review questions 2: data-parallel processing systems

Reading Assignments

  • TensorFlow: A system for large-scale machine learning
  • MLlib: Fast Training of GLMs using Spark MLlib

Project

YouTube Trends