An Ansible role to configure and set up a Hive data warehouse on a client node.
Updated May 18, 2021
Hadoop-Cluster
Distributed Hadoop- and Spark-based framework for in-memory GIS queries
A basic introductory example of using Hadoop's MapReduce libraries to load and analyse large datasets, in this case a US patent dataset sourced from https://www.nber.org/research/data/us-patents
In this repository I explain all the installation steps for the Hadoop architecture on Windows.
Python Scripts for working with Big Data Files
MapReduce in Cluster.
Titanic data analysis with Hadoop
PageRank algorithm written in Java using the MapReduce framework
Product recommendation system built on an Amazon product dataset using the Apache Spark framework
MapReduce Python Example
The repo contains the steps for setting up a single-node Hadoop 3.2.1 cluster on Ubuntu 20.04 LTS
Set up a Hadoop cluster manually and automatically
WQD7008 Parallel and Distributed Computing Project
EMR 5.25.0 single-node Hadoop cluster Docker image, with Amazon Linux, Hadoop 2.8.5 and Hive 2.3.5
The goal of this project is to identify flood-prone areas, and the probability of flooding in each county on a future date, using Spark MLlib.
Twitter data analysis using Hadoop (HDFS), Flume, MapReduce and Hive. Sentiment analysis is also performed, using the AFINN dictionary, on tweets related to the Indian election.
A reference repository for a comprehensive guide on installing Hadoop on Windows