Question about partitioners #51

leo-987 · 2016-06-23T01:26:48Z

I came across the following passage in Learning Spark:

Finally, for binary operations, which partitioner is set on the output depends on the parent RDDs’ partitioners. By default, it is a hash partitioner, with the number of partitions set to the level of parallelism of the operation. However, if one of the parents has a partitioner set, it will be that partitioner; and if both parents have a partitioner set, it will be the partitioner of the first parent.

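According to this, a child RDD's partitioner should be determined by its parent RDDs' partitioners. But in Chapter 2 of SparkInternals, the parent and child RDDs have different partitioners. Why is that? If one of the two parent RDDs uses a hash partitioner, shouldn't the child RDD use a hash partitioner as well?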
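
To make the reading of the quoted passage concrete, here is a minimal sketch in plain Python. It only models the rule as the passage states it; it is not Spark code, and Spark's actual selection logic may differ. The names `Partitioner`, `output_partitioner`, and `parallelism` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partitioner:
    """Hypothetical stand-in for a partitioner: a kind plus a partition count."""
    kind: str
    num_partitions: int


def output_partitioner(
    first: Optional[Partitioner],
    second: Optional[Partitioner],
    parallelism: int,
) -> Partitioner:
    """Pick the output partitioner of a binary operation, per the quoted rule."""
    # Both parents set, or only the first: the first parent's partitioner wins.
    if first is not None:
        return first
    # Only the second parent has a partitioner: use it.
    if second is not None:
        return second
    # Neither parent has one: default to a hash partitioner whose partition
    # count equals the operation's level of parallelism.
    return Partitioner("hash", parallelism)
```

Under this reading, a child with a hash-partitioned first parent would always inherit that hash partitioner, which is the expectation the question is based on.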
