xeon123/hadoop-0.20.1-bft

Hadoop MapReduce tolerant to arbitrary faults

MapReduce is often used for critical data processing, e.g., in scientific or financial simulation. However, there is evidence in the literature that arbitrary (or Byzantine) faults may corrupt the results of MapReduce without being detected. We present a Byzantine fault-tolerant MapReduce framework that can run in two modes: non-speculative and speculative.

We experimentally evaluate the performance of both modes of the framework, showing that they use around twice the resources of Hadoop MapReduce, rather than the three times required by alternative solutions. We believe this cost is acceptable for many critical applications.

The prototype of the MapReduce runtime was implemented by modifying the original Hadoop 0.20.0 source code.

This work has been published in [1] and [2].

Configuration of MapReduce BFT

I have configured MapReduce based on the site.

mapred-site.xml has new parameters to configure the platform:

tasktracker.tasks.fault.tolerance -> number of faults to tolerate (2f+1)
mapred.map.tasks.deferred.execution -> true | false, depending on whether the scheduler should run in deferred (non-speculative) or tentative (speculative) mode
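
As an illustration, the two parameters above could be set in mapred-site.xml roughly as follows. This is a minimal sketch, not a tested configuration: the values are assumptions. In particular, the README does not say whether tasktracker.tasks.fault.tolerance expects f itself or the replica count 2f+1, and the mapping of true to deferred (non-speculative) mode is inferred only from the parameter name.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed value: 1 is used here as an example fault bound.
       Check the source to confirm whether this is f or 2f+1. -->
  <property>
    <name>tasktracker.tasks.fault.tolerance</name>
    <value>1</value>
  </property>

  <!-- Assumption: true selects deferred (non-speculative) execution,
       false selects tentative (speculative) execution. -->
  <property>
    <name>mapred.map.tasks.deferred.execution</name>
    <value>true</value>
  </property>
</configuration>
```

If mapred-site.xml already exists, only the two property elements need to be added inside its existing configuration element.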

Wordcount

Wordcount is the usual example to run with the framework. I ran it with the following command:

hadoop jar hadoop-0.20.1-examples.jar wordcount /input /output
