- the input/output format in each Hadoop task, i.e., the keys for the mappers and reducers
Stripes:
Mapper<LongWritable, Text, Text, Text>
input:
key: the position in the file
value: the line of text
output:
key: a
value: b:1,c:2,d:5,e:3,f:2
Reducer<Text, Text, Text, Text>
input:
key: a
value: b:1,c:2,d:5,e:3,f:2
output:
key: a
value: b
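The Stripes value format above (b:1,c:2,...) can be illustrated with a small plain-Java sketch of what a reducer might do with the stripes for one key: parse each stripe, sum the counts element-wise, and divide by the total. This is not the submitted Hadoop code. The class and method names (StripesSketch, parseStripe, mergeStripes, relativeFrequencies) are hypothetical, and the second stripe in main is made-up sample data. What the actual reducer emits as "b" is not specified here, so the sketch stops at the relative frequencies.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StripesSketch {
    // Parse a stripe such as "b:1,c:2" into a map from neighbor word to count.
    static Map<String, Integer> parseStripe(String stripe) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String entry : stripe.split(",")) {
            String[] parts = entry.split(":");
            counts.merge(parts[0].trim(), Integer.parseInt(parts[1].trim()), Integer::sum);
        }
        return counts;
    }

    // Element-wise sum of all stripes that arrive for one key.
    static Map<String, Integer> mergeStripes(List<String> stripes) {
        Map<String, Integer> merged = new TreeMap<>();
        for (String stripe : stripes) {
            parseStripe(stripe).forEach((word, n) -> merged.merge(word, n, Integer::sum));
        }
        return merged;
    }

    // Divide each count by the stripe total to get the relative frequency f(b | a).
    static Map<String, Double> relativeFrequencies(Map<String, Integer> merged) {
        int total = 0;
        for (int n : merged.values()) {
            total += n;
        }
        Map<String, Double> freqs = new TreeMap<>();
        for (Map.Entry<String, Integer> e : merged.entrySet()) {
            freqs.put(e.getKey(), e.getValue() / (double) total);
        }
        return freqs;
    }

    public static void main(String[] args) {
        // First stripe is the example from this file; the second is assumed sample data.
        Map<String, Integer> merged = mergeStripes(List.of("b:1,c:2,d:5,e:3,f:2", "b:1,c:1"));
        System.out.println(merged);
        System.out.println(relativeFrequencies(merged));
    }
}
```

Because the whole stripe for a key reaches one reduce call, the total is available without a separate marginal key.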
Pairs:
Mapper<LongWritable, Text, Text, Text>
input:
key: the position in the file
value: the line of text
output:
key: a, * or a, b1 or a, b2
value: n
Reducer<Text, Text, Text, Text>
input:
key: a, * or a, b1 or a, b2
value: n
output:
key: a
value: b
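The "a, *" key in the Pairs format suggests the usual order-inversion idea: the marginal count for a arrives before the individual "a, b" counts, so each pair count can be divided by it. Below is a plain-Java sketch of that step, not the submitted Hadoop code. PairsSketch and relativeFrequencies are hypothetical names, the counts in main are made-up sample data, and the sketch assumes the counts have already been summed per key and that "*" sorts before every real word (true for plain string ordering when words start with letters or digits).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PairsSketch {
    // Walk the keys in sorted order, as a single reducer would see them, and
    // divide each "a, b" count by the "a, *" marginal seen just before it.
    static Map<String, Double> relativeFrequencies(TreeMap<String, Integer> summedCounts) {
        Map<String, Double> freqs = new LinkedHashMap<>();
        String currentLeft = null;
        int marginal = 0;
        for (Map.Entry<String, Integer> e : summedCounts.entrySet()) {
            String[] parts = e.getKey().split(",\\s*");
            String left = parts[0];
            String right = parts[1];
            if (right.equals("*")) {
                // Marginal for a new left word: remember it for the pairs that follow.
                currentLeft = left;
                marginal = e.getValue();
            } else if (left.equals(currentLeft) && marginal > 0) {
                freqs.put(e.getKey(), e.getValue() / (double) marginal);
            }
        }
        return freqs;
    }

    public static void main(String[] args) {
        // Assumed sample counts; insertion order does not matter because TreeMap sorts keys.
        TreeMap<String, Integer> counts = new TreeMap<>();
        counts.put("a, b1", 3);
        counts.put("a, b2", 1);
        counts.put("a, *", 4);
        System.out.println(relativeFrequencies(counts));
    }
}
```

With more than one reducer, a custom partitioner on the left word would also be needed so that "a, *" and every "a, b" reach the same reducer; with the single reducer listed in this file that is not an issue.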
- the Hadoop cluster setting you used, i.e., number of mappers and reducers
Stripes:
number of mappers: 8
number of reducers: 1
Pairs:
number of mappers: 8
number of reducers: 1
- the running time for run.sh
2 days, 7 hours