tent/parallel #109

nno · 2013-02-25T20:09:33Z

added a module misc/parallelization as a wrapper around whatever kind of parallelization implementation might be available.
currently supports pprocess and has a fallback single process implementation that should always work.
refactored Searchlight and surface-based voxel selection code
added parallelization support to RepeatedMeasure
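
A minimal sketch of the wrapper idea described above, for readers who want to see the shape of it. All class and function names here are hypothetical and are not the API of misc/parallelization; a standard-library thread pool stands in for pprocess so the example has no external dependencies.

```python
from concurrent.futures import ThreadPoolExecutor


class SingleProcessParallelizer(object):
    """Fallback that runs every job in the calling process (hypothetical name)."""

    def __init__(self, nproc=1):
        self.nproc = 1

    def map(self, func, items):
        # Plain sequential loop: no serialization, no extra workers.
        return [func(item) for item in items]


class PoolParallelizer(object):
    """Stand-in for a real multi-process backend such as pprocess."""

    def __init__(self, nproc=None):
        self.nproc = nproc

    def map(self, func, items):
        # Executor.map returns results in input order.
        with ThreadPoolExecutor(max_workers=self.nproc) as pool:
            return list(pool.map(func, items))


def get_parallelizer(nproc=None, pool_available=True):
    """Pick the pooled backend when allowed, else the sequential fallback."""
    if nproc == 1 or not pool_available:
        return SingleProcessParallelizer()
    return PoolParallelizer(nproc)


def square(x):
    return x * x


if __name__ == "__main__":
    print(get_parallelizer(nproc=1).map(square, range(5)))
    print(get_parallelizer(nproc=2).map(square, range(5)))
```

Callers only ever use `map`, so a backend can be swapped without touching Searchlight-style code that consumes the results.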

mih · 2013-02-26T11:59:47Z

Does it pass the full test suite for you -- not just the subset that travis runs? In the past I had problems with serialization of one or the other type that is generated by the test suite. It failed for me once it had to go into another process.

nno · 2013-02-26T16:23:26Z

On 26 February 2013 12:59, Michael Hanke notifications@github.com wrote:

Does it pass the full test suite for you -- not just the subset that
travis runs?

I didn't realize that travis only tests a subset... and the answer is no.

(1) Using pprocess and 'pickle' ('native') to store results gives quite a
few errors: FAILED (SKIP=15, errors=5, failures=5)

(2) Using pprocess and 'hdf5' : FAILED (SKIP=15, errors=5, failures=3)

(3) one error when using a single thread process - and this happens not
only in my branch but also in master. Actually a bit worrying - anyone
noticed this one before?

For reference I'm including the errors for each of these cases (note that
these are cumulative - I removed duplicates). Tests were run on neurodebian
and virtualbox with two cores.

For those who want to try out these scenarios, look in
mvpa2/measures/base.py:_call(...).

a) Parallelizer = parallelization.get_best_parallelizer(nproc=1)

This sets the number of processes to use to 1, and thus uses a single
thread and no hdf5 or pickle

b) results_backend = 'native'

This sets the backend to pickle rather than hdf5.

c) results_backend = Parallelizer.get_best_results_backend()

This sets the backend to hdf5 - only effective if nproc!=1 (None or an int
greater than 1) and pprocess available.
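
The three scenarios above boil down to a small decision rule. The following is only a sketch of that rule as described in this comment; the helper name and its arguments are hypothetical, and the preference for hdf5 over pickle when both are usable is an assumption.

```python
def choose_results_backend(nproc, have_pprocess, have_hdf5):
    """Return 'hdf5', 'native' (pickle), or None (hypothetical helper).

    None means results never leave the calling process, so no
    serialization backend is needed at all.
    """
    if nproc == 1 or not have_pprocess:
        # Scenario (a): single process, neither hdf5 nor pickle is used.
        return None
    # Scenarios (b) and (c): assumed preference for hdf5 when available.
    return "hdf5" if have_hdf5 else "native"


if __name__ == "__main__":
    print(choose_results_backend(1, True, True))
    print(choose_results_backend(None, True, True))
    print(choose_results_backend(2, True, False))
```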

In the past I had problems with serialization of one or the other type that
is generated by the test suite. It failed for me once it had to go into
another process.

That seems to be happening here as well. Do you recall or have ideas which
types could be causing this serialization trouble? You and Yarik should be
more familiar with that part of the code as either of you, or both, wrote
it :-)
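
One way to narrow down which types cause trouble is to round-trip candidate objects through pickle in the parent process before any worker is involved. This is a generic sketch using only the standard library; the helper name and the example objects are made up for illustration. It only catches objects that raise during pickling or unpickling, not objects that round-trip but lose state.

```python
import pickle


def find_unpicklable(named_objects):
    """Return {name: error description} for objects that fail a pickle round trip."""
    failures = {}
    for name, obj in named_objects.items():
        try:
            pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
        except Exception as exc:  # pickling errors come in several types
            failures[name] = "%s: %s" % (type(exc).__name__, exc)
    return failures


# Hypothetical candidates: a plain dict pickles fine, a lambda does not.
candidates = {
    "plain_dict": {"a": 1},
    "lambda_func": lambda x: x + 1,
}

if __name__ == "__main__":
    for name, reason in find_unpicklable(candidates).items():
        print(name, "->", reason)
```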

errors using single thread:

FAIL: Test AUC computation

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_transerror.py", line 316, in test_auc
'AUC=%.2g among %s' % (mauc, stats['AUC']))
AssertionError:
Single scenario lead to failures of unittest test_auc:
on
clf=<kNN(k=5)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN(k=5, voting='majority')> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on SMLR(lm=1) non-0> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on 5%(ANOVA)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on 50(ANOVA)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
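
For context, a NaN AUC fails an above-chance check regardless of the classifier, because every ordered comparison with NaN is false. A minimal sketch, assuming the test compares the minimal AUC against a chance level of 0.5 (the actual assertion in test_transerror.py is not shown in this thread):

```python
import math

aucs = [float("nan"), float("nan")]  # values reported in the failure above
chance_level = 0.5                   # assumed threshold, not taken from the test

mauc = min(aucs)
above_chance = mauc > chance_level   # any ordered comparison with NaN is False
has_nan = any(math.isnan(a) for a in aucs)

print(above_chance)  # False
print(has_nan)       # True
```

So the thing to track down is where the NaN comes from, not the threshold itself.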

additional errors caused by pprocess+hdf5

ERROR: Basic tests of metaclass for using regressions as classifiers

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 968, in test_regression_as_classifier
self.assertEqual(clf.ca.distances.shape,
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of distances

ERROR: test_split_classifier (mvpa2.tests.test_clf.ClassifiersTests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 368, in test_split_classifier
tr_cverror = cv.ca.training_stats.error
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

ERROR: Basic tests for TreeClassifier

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/tools.py", line 179, in newfunc
return func(*arg, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 594, in test_tree_classifier
cvtrc = cv.ca.training_stats
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

ERROR: test_values (mvpa2.tests.test_clf.ClassifiersTests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 640, in test_values
self.assertEqual(len(clf.ca.estimates), ds.nsamples/2)
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of estimates

ERROR: mvpa2.tests.test_multiclf.test_multiclass_classifier_cv

Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/lib/python2.7/dist-packages/nose/util.py", line 622, in newfunc
return func(*arg, **kw)
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_multiclf.py", line 152, in test_multiclass_classifier_cv
assert_equal(str(cv.ca.training_stats), str(mcv.ca.training_stats))
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

FAIL: test_james_problem (mvpa2.tests.test_rfe.RFETests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_rfe.py", line 421, in test_james_problem
assert(len(cv_storage.storage) == len(dataset.sa['chunks'].unique))
AssertionError

FAIL: test_james_problem_multiclass (mvpa2.tests.test_rfe.RFETests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_rfe.py", line 478, in test_james_problem_multiclass
assert(len(cv_storage.storage) == len(dataset.sa['chunks'].unique))
AssertionError

additional errors caused by pprocess+pickle

FAIL: Basic testing of DistPValue

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/tools.py", line 179, in newfunc
return func(*arg, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_transformers.py", line 130, in test_dist_p_value
self.assertEqual(distPValue.ca.positives_recovered[1], 0)
AssertionError: 2 != 0

FAIL: Some really basic testing for match_distribution

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_stats_sp.py", line 217, in test_match_distribution
self.assertTrue(inorm <= 30)
AssertionError: False is not true

coveralls · 2013-09-03T23:56:20Z

Changes Unknown when pulling 7bf68d6 on nno:_tent/parallel into * on PyMVPA:master.

mih · 2014-02-27T12:10:05Z

Alright, this is another "zombie conversion" ;-)

@nno where are you on this one? Is it working for you now?

I am increasingly in need of parallelization for RepeatedMeasure, so this would come in very handy.

nno · 2014-02-27T13:58:18Z

@Hanke I just updated this branch with the latest master, and things seem to work. That is, unit tests are not failing.

The use of parallelization for RepeatedMeasures is switched off for now, as that one is behaving naughty with conditional attributes IIRC.

yarikoptic · 2014-02-27T14:45:32Z

I would prefer to leave this one for the next release since new
parallelization would require thorough testing before making it public
and there is no time for that ATM.

On Thu, 27 Feb 2014, Nikolaas N. Oosterhof wrote:

@Hanke I just updated this branch with the latest master, and things
seem to work. That is, unit tests are not failing.

The use of parallelization for RepeatedMeasures is switched off for now,
as that one is behaving naughty with conditional attributes IIRC.

Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik

mih · 2014-02-27T14:49:58Z

@nno I can confirm that it passes for me too.

I guess it is expected that the tests do not benefit from any kind of speed-up due to these changes on a multi-core system, right -- all is refactoring..

Would be good to crack the nut on the lost conditional attributes -- can't even recall properly what the issue was...
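
One plausible mechanism for lost conditional attributes, offered as an illustration and not as a confirmed diagnosis of this branch: a worker operates on a copy of the measure, so anything it stores on that copy never reaches the parent's object unless it is sent back explicitly. In the sketch below the class, the `ca` dictionary, and the helper are toy stand-ins, and `copy.deepcopy` stands in for the copy a child process would receive.

```python
import copy


class ToyMeasure(object):
    """Toy measure that records a by-product of its computation in `ca`."""

    def __init__(self):
        self.ca = {}

    def __call__(self, data):
        # Side effect: state that callers may want to inspect afterwards.
        self.ca["training_stats"] = {"n": len(data)}
        return sum(data)


def run_in_worker(measure, data):
    """Simulate a child process, which only ever sees a copy of the measure."""
    worker_copy = copy.deepcopy(measure)
    result = worker_copy(data)
    # Return the side-effect state together with the result.
    return result, worker_copy.ca


parent = ToyMeasure()
result, worker_ca = run_in_worker(parent, [1, 2, 3])

lost_in_parent = "training_stats" not in parent.ca  # the parent never saw it
parent.ca.update(worker_ca)                         # explicit copy-back
restored = "training_stats" in parent.ca

if __name__ == "__main__":
    print(result, lost_in_parent, restored)
```

If this is indeed what happens, returning the collected state alongside each result and merging it in the parent would be one way to keep RepeatedMeasure parallel without losing it.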

mih · 2014-02-27T14:52:39Z

Assigned milestone 2.4

bpinsard · 2016-04-21T14:14:58Z

I would be greatly interested in parallelization of RepeatedMeasure. What is the status of this PR? Thanks

nno added 21 commits February 23, 2013 16:57

NF: seperate class for modular parallel processing

aa47b0a

RF: use parallelization module

8b9e46a

Merge branch 'master' into _tent/parallel

61f0da3

RF: rewriting of parallelization classes

b9151e0

NF: unit test for parallelization

ff51f69

ENH: better pickling support for VolumeBasedSurface

e6243b9

ENH: support NaN for comparison of surfaces

7a6d163

NF: support for verbose surface based voxel selection

853c6c0

NF: support in base for verbose parallelization

30d0601

RF: minor rewrite of unit test

e8c5870

NF: support to raise an error when different geometry; more informative error messages

0b8b6d3

RF: rewrite voxel selection with parallelization

70c4217

RF: parallelization in searchlight

618ab3a

RF: use two seperate functions for repeated measure

7b38304

PL: minor

819d356

RF: minor code cleanup

b878b4a

BF: make sure proper debug message even if input is generator and thus length unknown

9c45703

NF+RF: parallelization support for repeated measure

b131f4c

BF: use proper function to get best results backend

c34b009

Merge branch 'master' into _tent/parallel

61d37ef

ENH: use a process id for each generated element to make it even surer there are unique file names in case the hdf5 backend is used

53bfe2b

nno added 3 commits February 26, 2013 13:07

Merge branch 'master' into _tent/parallel

7858857

ENH: include parallelization in tests

62b2a81

BF: proper storing results in file when using hdf5 backend

f02fb79

nno added 4 commits September 3, 2013 16:43

RF: resolve merge conflicts coming from earlier parallelization module

6ba9a30

Merge branch 'master' into _tent/parallel

249fdc6

ENH: better init

8b12483

BF: fixed undesired update of __init__

40e7e2e

nno added 3 commits September 3, 2013 18:57

BK: disabled unit test in test projections - XXX have to figure out what is going on

ca1014a

ENH: use single thread for repeated measures

d3fec89

BF: added a comma in list of units to test

7bf68d6

MISC: resolve merge conflict

bcdc483

mih added this to the Release 2.4 milestone Feb 27, 2014

Merge branch 'master' into _tent/parallel

0814fc4

yarikoptic force-pushed the master branch from eaaaf14 to 439e744 September 26, 2014 23:38

Merge branch 'master' into _tent/parallel

1988a6d

mih removed this from the Release 2.4 milestone May 3, 2015
