Note: The basic framework for some of the scripts was adapted from external online sources such as Coursera and StackOverflow.
twitterStream/
- generating Twitter streams
- sentiment analysis
- filtering by region
- top ten hashtags
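
As an illustration of the top-ten-hashtags task, here is a minimal sketch, not the script in this directory. It assumes each tweet is a parsed JSON dict that stores hashtags under `entities.hashtags[].text`; the function name `top_hashtags` is hypothetical.

```python
from collections import Counter


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags as (hashtag, count) pairs.

    Assumption: each tweet is a dict shaped like parsed tweet JSON,
    with hashtags listed under tweet["entities"]["hashtags"][i]["text"].
    """
    counts = Counter()
    for tweet in tweets:
        # Tweets without entities (e.g. delete notices) are skipped.
        entities = tweet.get("entities") or {}
        for tag in entities.get("hashtags") or []:
            text = tag.get("text")
            if text:
                # Count case-insensitively so "Python" and "python" merge.
                counts[text.lower()] += 1
    return counts.most_common(n)
```

Lowercasing before counting is a design choice here; drop it if case should be preserved.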
mapReduce/
- simple implementations of breaking a problem into key-value pairs and using MapReduce to group the values that correspond to each key
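
The key-value idea above can be sketched in plain Python without any MapReduce framework. This is a minimal in-memory illustration using word count as the example problem; all function names (`map_phase`, `shuffle`, `reduce_phase`, `run_word_count`) are hypothetical and are not taken from the scripts in this directory.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    return [(word.lower(), 1) for word in line.split()]


def word_count_reducer(key, values):
    # The total count for a word is the sum of its emitted ones.
    return sum(values)


def run_word_count(lines):
    pairs = map_phase(lines, word_count_mapper)
    return reduce_phase(shuffle(pairs), word_count_reducer)
```

Swapping in a different mapper and reducer while keeping the same three phases is enough to express other problems, such as building an inverted index.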
pig/
- scripts to filter the "billion triple dataset" according to different fields
sql/
- implementation of basic SQL operations
- sparse matrix
- similarity matrix
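
One common way to handle a sparse matrix in SQL is to store only the non-zero cells as (row, column, value) rows and express multiplication as a join plus a grouped sum. The sketch below shows that pattern with Python's built-in `sqlite3` module. The table names `a` and `b`, the column names, and the function `sparse_multiply` are assumptions for illustration, not the schema used by the scripts in this directory.

```python
import sqlite3


def sparse_multiply(a_entries, b_entries):
    """Multiply two sparse matrices given as (row, col, value) triples.

    Assumption: only non-zero cells are listed. Returns the non-empty
    cells of the product as (row, col, value) tuples, sorted by row, col.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE a (row_num INTEGER, col_num INTEGER, value REAL)")
        conn.execute("CREATE TABLE b (row_num INTEGER, col_num INTEGER, value REAL)")
        conn.executemany("INSERT INTO a VALUES (?, ?, ?)", a_entries)
        conn.executemany("INSERT INTO b VALUES (?, ?, ?)", b_entries)
        # Join A's columns to B's rows, then sum the products per output cell.
        rows = conn.execute(
            """
            SELECT a.row_num, b.col_num, SUM(a.value * b.value)
            FROM a JOIN b ON a.col_num = b.row_num
            GROUP BY a.row_num, b.col_num
            ORDER BY a.row_num, b.col_num
            """
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Multiplying a matrix by its own transpose with the same join yields pairwise dot products between rows, which is one simple form of similarity matrix.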