A collection of example Hadoop jobs written in Python using the MapReduce programming model.
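To give a feel for the programming model, here is a minimal word-count sketch. It is not taken from the jobs in this repository; the function names (`mapper`, `reducer`, `run_local`) are hypothetical, and the whole map, sort, reduce cycle is simulated in one local process rather than on a Hadoop cluster.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Map step: emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(pairs):
    """Reduce step: sum the counts per word. The pairs must be sorted by word."""
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process (assumption:
    on a real cluster the framework performs the sort between the two steps)."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # Read text from standard input and print tab-separated word counts.
    for word, count in run_local(sys.stdin):
        print("%s\t%d" % (word, count))
```

Sorting the mapper output before reducing stands in for the shuffle phase, which is what lets the reducer group consecutive pairs that share a key.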
-
Hadoop
Follow this for initial setup
Follow this for configurations for hadoop single node yarn cluster
Follow this
Follow this for initial setup and installation
Follow this for configurations for hadoop single node yarn cluster
Configuration files used can be found here
-
Install Python (Python 3.6 was used in a virtual environment) and pip (the latest version for Python 3)
-
Luigi
Python 2.7.15 on Ubuntu was used with luigi 2.7.5.
Install it with the command:
pip install luigi
Errors:
- Windows 10: problem when running luigi (view here)
- Python 3.6 on Ubuntu: the mechanizer module is not available; it exists only for Python 2.x versions
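When diagnosing errors like these, it can help to confirm which interpreter is active and whether a module can be imported at all. The following is a minimal sketch, not part of this repository; the helper names `is_importable` and `describe_environment` are hypothetical, and the code sticks to constructs that work on both Python 2.7 and Python 3.

```python
import sys


def is_importable(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


def describe_environment(modules=("luigi",)):
    """Collect the interpreter version and the import status of each module."""
    info = {"python": "%d.%d.%d" % tuple(sys.version_info[:3])}
    for name in modules:
        info[name] = is_importable(name)
    return info


if __name__ == "__main__":
    # Print one "name: value" line per entry, for example the luigi status.
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

Running it inside the virtual environment shows whether the module is missing for that specific interpreter, which is the situation described for Python 3.6 above.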