Collection of examples for jobs in python for Hadoop using MapReduce programming model

omkarprabhu-98/mapreduce-jobs-for-hadoop

MapReduce Jobs for Hadoop

Contains a collection of example jobs in Python for Hadoop using the MapReduce programming model.
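For readers new to the model, the sketch below illustrates the map, shuffle, and reduce phases with a word count in plain Python. It is not taken from this repository and does not use Hadoop; the function names (`mapper`, `shuffle`, `reducer`, `run_job`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all values by key, as the framework would
    do between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce phase: collapse the grouped counts for one word into a total."""
    return word, sum(counts)


def run_job(lines):
    """Run the three phases in-process on an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(pairs)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

On a real cluster the shuffle step is handled by Hadoop itself, so a job typically supplies only the mapper and the reducer.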

Installation

  1. Hadoop

    For Windows 10:

    Follow this for initial setup

    Follow this for the configuration of a Hadoop single-node YARN cluster

    For Ubuntu:

    For Stand-Alone Mode:

    Follow this

    For Pseudo Distributed Mode:

    Follow this for initial setup and installation

    Follow this for the configuration of a Hadoop single-node YARN cluster

    Configuration files used can be found here

  2. Install Python and pip. (I used Python 3.6 in a virtual environment and the latest version of pip for Python 3.)

  3. Luigi

    I used Python 2.7.15 on Ubuntu with luigi 2.7.5.

    Run the command

    pip install luigi
    

    Errors:

    1. Windows 10: A problem occurs when running luigi. View here
    2. Python 3.6 on Ubuntu: The mechanize module is not available; it supports only Python 2.x versions
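Since the steps above depend on the Python version, the virtual environment, and which modules can be imported, a small check like the following may help when diagnosing a setup. It is a minimal sketch using only the standard library; the helper names (`in_virtualenv`, `module_available`, `environment_report`) are hypothetical and not part of luigi or Hadoop.

```python
import importlib
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run in a virtual environment.

    The venv module makes sys.base_prefix differ from sys.prefix, while the
    older virtualenv tool sets sys.real_prefix instead.
    """
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def environment_report(modules=("luigi",)):
    """Summarize the interpreter version, virtualenv status, and module availability."""
    return {
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "virtualenv": in_virtualenv(),
        "modules": dict((name, module_available(name)) for name in modules),
    }


if __name__ == "__main__":
    print(environment_report())
```

Pass other module names through the `modules` argument to check any dependency that fails to import. The sketch avoids version-specific APIs so that it should run under both Python 2.7 and Python 3.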
