Using Apache Pig and Hadoop to analyze baseball data
For a project in DS730: Big Data: High Performance Computing, we were tasked with analyzing Major League Baseball data and answering some off-the-wall questions about it. For example, question #1: Output the birth city of the player who had the most at bats (AB) in his career.
This project contains the scripts I used to answer the questions.
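
To give a feel for the logic behind question #1, here is a minimal sketch in Python rather than Pig. It is not one of the project's scripts. The records, player IDs, city names, and the function name are all hypothetical, since the actual data files and their columns are not described here. The idea is to sum at bats per player across all seasons, pick the player with the largest career total, and look up that player's birth city.

```python
from collections import defaultdict

# Hypothetical sample data: (player_id, year, at_bats) per season.
# The real data set's layout is an assumption, not taken from this project.
batting = [
    ("p1", 2001, 500),
    ("p1", 2002, 450),
    ("p2", 2001, 600),
    ("p2", 2002, 300),
    ("p3", 2001, 620),
]

# Hypothetical lookup: player_id -> birth city.
birth_city = {"p1": "Springfield", "p2": "Riverton", "p3": "Lakeside"}


def birth_city_of_career_ab_leader(batting_rows, cities):
    """Return the birth city of the player with the most career at bats."""
    totals = defaultdict(int)
    # Group by player and sum at bats over every season.
    for player_id, _year, at_bats in batting_rows:
        totals[player_id] += at_bats
    # Pick the player whose career total is largest.
    leader = max(totals, key=totals.get)
    return cities[leader]


print(birth_city_of_career_ab_leader(batting, birth_city))
```

In this sample, p3 has the best single season but p1 has the highest career total, so the grouping step matters. In Pig the same steps would typically map to a GROUP with SUM, an ORDER with LIMIT, and a JOIN against the player table.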