MapReduce in Java for multi core

aburkov edited this page Jun 28, 2015 · 12 revisions

With xpresso you can easily define and run MapReduce jobs on a multi-core machine to parallelize time-consuming crunching.

Let's assume that we have a list of elements we want to process:

list<String> elements = x.list("Map","aNd","ReDuce","arE","aWEsome");

Processing each element takes a long time (say, 10 seconds), so we want to parallelize the work across the cores of our machine. The desired processing of each element is as follows: if the element starts with an "a", convert it to uppercase and join it with the other uppercase elements using "~" as the separator; otherwise, convert it to lowercase and join it with the other lowercase elements, again using "~".

Let's define the Mapper and Reducer:

static Mapper<String,String> mapper = new Mapper<String,String>() {
	public void map(String input) {
		x.Time.sleep(10); //the processing of each element takes a long time :-)
		if (x.String(input).startsWith("a")) {
			yield("upper", input.toUpperCase());				
		} else {
			yield("lower", input.toLowerCase());
		}
	}
};

static Reducer<String,String> reducer = new Reducer<String,String>() {
	public void reduce(tuple2<String,list<String>> input) {
		yield(input.key,x.String("~").join(input.value));
	}
};

The mapper converts the case of each element as described above and emits it under the key "upper" or "lower"; the reducer then joins all the values that share a key, using "~" as the separator.
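To make the map and reduce roles concrete, here is a minimal sketch of the same grouping written with only the JDK streams API. It is not xpresso code and says nothing about how xpresso works internally; the class name `CaseGrouping` and its helper methods are made up for this illustration. It runs sequentially and leaves out the 10-second delay, so it only shows the expected result of the transformation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical plain-JDK illustration; not part of xpresso.
public class CaseGrouping {
    // "Map" step, part 1: choose the key an element is emitted under.
    static String keyOf(String s) {
        return s.startsWith("a") ? "upper" : "lower";
    }

    // "Map" step, part 2: convert the case according to the same rule.
    static String transform(String s) {
        return s.startsWith("a") ? s.toUpperCase() : s.toLowerCase();
    }

    // "Reduce" step: join all values that share a key with "~".
    static Map<String, String> group(List<String> elements) {
        return elements.stream()
            .collect(Collectors.groupingBy(
                CaseGrouping::keyOf,
                TreeMap::new, // sorted keys, so the printed order is stable
                Collectors.mapping(CaseGrouping::transform, Collectors.joining("~"))));
    }

    public static void main(String[] args) {
        System.out.println(group(List.of("Map", "aNd", "ReDuce", "arE", "aWEsome")));
        // prints {lower=map~reduce, upper=AND~ARE~AWESOME}
    }
}
```

Because this sketch is sequential, the joined values keep the input order. In a parallel run the values may reach the reducer in a different order, which would explain an ordering such as the one in the console output on this page.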

Our MapReduce setup is now ready, so let's start crunching:

x.timer.start();
x.print(x.<String,String,String>MapReduce(elements).map(mapper).reduce(reducer), x.timer.stop());

Console:
{upper:AND~AWESOME~ARE, lower:reduce~map}
10.013s

As you can see, processing all 5 elements took only about 10 seconds in total, even though each element takes 10 seconds on its own. Processed one after another, the same 5 elements would have taken about 50 seconds.
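The timing effect can be reproduced with the JDK alone. The following is a rough sketch, assuming one thread per element and a simulated delay; this page does not say how xpresso schedules its mappers, so the sketch may not match it. The names `ParallelTiming`, `slowUpper` and `timeParallel` are invented for the example, and the delay is shortened to 200 ms to keep the run quick.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical plain-JDK illustration; not part of xpresso.
public class ParallelTiming {
    // Stand-in for a slow per-element computation.
    static String slowUpper(String s, long millis) throws InterruptedException {
        Thread.sleep(millis);
        return s.toUpperCase();
    }

    // Runs one task per element, each on its own thread, and returns
    // the elapsed wall-clock time in milliseconds.
    static long timeParallel(List<String> elements, long millisPerElement) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(elements.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String e : elements) {
                tasks.add(() -> slowUpper(e, millisPerElement));
            }
            long start = System.nanoTime();
            for (Future<String> f : pool.invokeAll(tasks)) {
                f.get(); // rethrows if a task failed
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> elements = List.of("Map", "aNd", "ReDuce", "arE", "aWEsome");
        long perElement = 200; // shortened from 10 s for the demo
        long elapsed = timeParallel(elements, perElement);
        System.out.println("parallel: " + elapsed + " ms; sequential estimate: "
            + perElement * elements.size() + " ms");
    }
}
```

The elapsed time should come out close to the delay of a single element rather than the sum of all five. Note that sleeping threads do not occupy a core, so this demo scales on any machine; for genuinely CPU-bound work the speed-up is limited by the number of available cores.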