
Stop using Parallel for SparkFeatureUnion #69

Open · wants to merge 1 commit into master
Conversation

@taynaud (Collaborator) commented Aug 30, 2016

See https://issues.apache.org/jira/browse/SPARK-12717
The n_jobs parameter is still kept for the converted to_scikit() object.

I think this explains the flaky test on my previous PR.
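
For reference, here is a minimal sketch of the kind of change the title describes (not the actual SparkFeatureUnion code): a FeatureUnion-style fit_transform that simply loops over its transformer_list instead of dispatching through joblib's thread-based Parallel, which is what trips the thread-safety issue in SPARK-12717. The transformer_list layout and the result handling are simplified placeholders.

```python
# Sketch only, not the sparkit-learn implementation.
class SequentialFeatureUnion(object):
    def __init__(self, transformer_list):
        # transformer_list: [(name, transformer), ...], as in scikit-learn
        self.transformer_list = transformer_list

    def fit_transform(self, Z, **fit_params):
        # Sequential on the driver: each transformer still distributes its
        # own work across the cluster, so dropping n_jobs here loses little.
        results = [
            trans.fit_transform(Z, **fit_params)
            for name, trans in self.transformer_list
        ]
        return results  # real code would column-stack / union the blocks
```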

@fulibacsi (Contributor)

Is this issue still present in Spark 2.0.0?

@taynaud (Collaborator, Author) commented Sep 5, 2016

I do not know; the issue appears randomly and I have not reproduced it on my cluster. I have added Spark 2.0 to the CI in #71, but since the failure is random, I do not know whether that will be enough to conclude.

I also think this driver-side parallelization is not very useful for a Spark computation.

@kszucs (Contributor) commented Nov 1, 2016

Without threading, the pipeline steps are executed sequentially. I think n_jobs makes sense: multiple DAGs are submitted and executed in parallel, so the overall level of parallelization can be increased via n_jobs.
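
A minimal sketch of that argument (not project code): independent transformers submitted from driver-side threads so their Spark jobs can overlap; the transformers and the data Z are assumed to exist.

```python
from multiprocessing.pool import ThreadPool

def fit_transform_all(transformers, Z, n_jobs=2):
    # Each fit_transform call triggers its own Spark job; with several
    # driver threads the jobs can run concurrently, which is the extra
    # parallelism that n_jobs would buy on top of Spark's own.
    pool = ThreadPool(n_jobs)
    try:
        return pool.map(lambda trans: trans.fit_transform(Z), transformers)
    finally:
        pool.close()
```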

Shouldn't we drop support for Spark versions before 2.0.0?

@taynaud (Collaborator, Author) commented Dec 20, 2016

According to the Apache JIRA, this is still an issue in PySpark 2.0.2.
