tent/parallel #109

Open
wants to merge 34 commits into
base: master
from

Conversation

nno
Contributor

@nno nno commented Feb 25, 2013

  • added a module misc/parallelization as a wrapper around whatever kind of parallelization implementation might be available.
  • currently supports pprocess and has a fallback single process implementation that should always work.
  • refactored Searchlight and surface-based voxel selection code
  • added parallelization support to RepeatedMeasure
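
The wrapper-with-fallback idea in the list above could look roughly like the sketch below. This is only an illustration, not the code in this branch: the names `SingleProcessParallelizer`, `PoolParallelizer` and `pick_parallelizer` are hypothetical, and the standard library's `concurrent.futures` stands in for pprocess.

```python
try:
    from concurrent.futures import ProcessPoolExecutor
except ImportError:  # no multi-process backend available
    ProcessPoolExecutor = None


class SingleProcessParallelizer(object):
    """Hypothetical fallback: runs every job in the calling process."""

    def __init__(self):
        self.nproc = 1

    def map(self, func, items):
        return [func(item) for item in items]


class PoolParallelizer(object):
    """Hypothetical multi-process backend (stand-in for pprocess)."""

    def __init__(self, nproc=None):
        self.nproc = nproc  # None lets the executor pick a worker count

    def map(self, func, items):
        # func and items must be picklable to cross the process boundary
        with ProcessPoolExecutor(max_workers=self.nproc) as pool:
            return list(pool.map(func, items))


def pick_parallelizer(nproc=None):
    """Use the fallback for nproc == 1 or when no backend is available."""
    if nproc == 1 or ProcessPoolExecutor is None:
        return SingleProcessParallelizer()
    return PoolParallelizer(nproc)
```

Callers only ever use `.map(func, items)`, so the single-process path and the multi-process path stay interchangeable.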

@mih
Member

mih commented Feb 26, 2013

Does it pass the full test suite for you -- not just the subset that travis runs? In the past I had problems with serialization of one or the other type that is generated by the test suite. It failed for me once it had to go into another process.

@nno
Contributor Author

nno commented Feb 26, 2013

On 26 February 2013 12:59, Michael Hanke notifications@github.com wrote:

Does it pass the full test suite for you -- not just the subset that
travis runs?

I didn't realize that travis only tests a subset... and the answer is no.

(1) Using pprocess and 'pickle' ('native') to store results gives quite a few
errors: FAILED (SKIP=15, errors=5, failures=5)

(2) Using pprocess and 'hdf5' : FAILED (SKIP=15, errors=5, failures=3)

(3) One error when using a single-threaded process, and this happens not
only in my branch but also in master. Actually a bit worrying. Has anyone
noticed this one before?

For reference I'm including the errors for each of these cases (note that
these are cumulative - I removed duplicates). Tests were run on neurodebian
and virtualbox with two cores.

For those who want to try out these scenarios, look in
mvpa2/measures/base.py:_call(...).

a) Parallelizer = parallelization.get_best_parallelizer(nproc=1)

This sets the number of processes to use to 1, and thus uses a single
thread and no hdf5 or pickle

b) results_backend = 'native'

This sets the backend to pickle rather than

c) results_backend = Parallelizer.get_best_results_backend()

This sets the backend to hdf5 - only effective if nproc!=1 (None or an int
greater than 1) and pprocess available.
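
To make the three scenarios concrete, here is a rough sketch of the selection logic as I read it from the description above. The helper name `choose_results_backend` and its flags are hypothetical, and the fallback from hdf5 to pickle when hdf5 is missing is an assumption, not something taken from the branch.

```python
def choose_results_backend(nproc, have_pprocess, have_hdf5):
    """Return None, 'native' or 'hdf5' (hypothetical helper)."""
    if nproc == 1 or not have_pprocess:
        # single process: results stay in memory, nothing is serialized
        return None
    # assumption: prefer hdf5 when available, else fall back to pickle
    return 'hdf5' if have_hdf5 else 'native'
```

With `nproc=1` no backend is needed at all, which matches scenario a); scenarios b) and c) only differ in the value returned for the multi-process case.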

In the past I had problems with serialization of one or the other type that
is generated by the test suite. It failed for me once it had to go into
another process.

That seems to be happening here as well. Do you recall or have ideas which
types could be causing this serialization trouble? You and Yarik should be
more familiar with that part of the code as either of you, or both, wrote
it :-)
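
One way to narrow down which types cause the trouble is to try pickling each attribute of a suspect object separately. The sketch below is only a debugging aid under that assumption; `find_unpicklable` and the `Example` class are hypothetical and not part of PyMVPA.

```python
import pickle


def find_unpicklable(obj):
    """Return names of attributes in obj.__dict__ that fail to pickle."""
    bad = []
    for name, value in sorted(vars(obj).items()):
        try:
            pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
        except Exception:  # pickle raises several exception types
            bad.append(name)
    return bad


class Example(object):
    """Hypothetical object with one attribute that cannot be pickled."""

    def __init__(self):
        self.k = 5
        self.callback = lambda x: x  # lambdas cannot be pickled


print(find_unpicklable(Example()))  # prints ['callback']
```

This only checks the 'native' (pickle) path; an hdf5 backend would need its own round-trip check.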

errors using single thread:

FAIL: Test AUC computation

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_transerror.py", line 316, in test_auc
'AUC=%.2g among %s' % (mauc, stats['AUC']))
AssertionError:
Single scenario lead to failures of unittest test_auc:
on
clf=<kNN(k=5)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN(k=5, voting='majority')> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on SMLR(lm=1) non-0> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on 5%(ANOVA)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
clf=<kNN on 50(ANOVA)> :
All AUCs must be above chance. Got minimal AUC=nan among [nan, nan]
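
A side note on why NaN values trip this check: any ordered comparison involving NaN is False, so an "above chance" test fails even though nothing is actually below chance. A minimal sketch, assuming the test compares each AUC against a chance level of 0.5 (the threshold is my assumption):

```python
import math

aucs = [float('nan'), float('nan')]
chance = 0.5  # assumed chance level for a two-class AUC

# every ordered comparison involving NaN evaluates to False
above_chance = all(auc > chance for auc in aucs)

# an explicit NaN check separates "no valid AUC" from "AUC too low"
has_nan = any(math.isnan(auc) for auc in aucs)
```

So the failure says the AUC could not be computed at all, not that the classifier performed below chance.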

additional errors caused by pprocess+hdf5

ERROR: Basic tests of metaclass for using regressions as classifiers

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 968, in test_regression_as_classifier
self.assertEqual(clf.ca.distances.shape,
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of distances

ERROR: test_split_classifier (mvpa2.tests.test_clf.ClassifiersTests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 368, in test_split_classifier
tr_cverror = cv.ca.training_stats.error
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

ERROR: Basic tests for TreeClassifier

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/tools.py", line 179, in newfunc
return func(*arg, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 594, in test_tree_classifier
cvtrc = cv.ca.training_stats
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

ERROR: test_values (mvpa2.tests.test_clf.ClassifiersTests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_clf.py", line 640, in test_values
self.assertEqual(len(clf.ca.estimates), ds.nsamples/2)
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of estimates

ERROR: mvpa2.tests.test_multiclf.test_multiclass_classifier_cv

Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/lib/python2.7/dist-packages/nose/util.py", line 622, in newfunc
return func(*arg, **kw)
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/sweepargs.py", line 67, in do_sweep
method(*args, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_multiclf.py", line 152, in test_multiclass_classifier_cv
assert_equal(str(cv.ca.training_stats), str(mcv.ca.training_stats))
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 474, in __getattribute__
return self[key].value
File "/media/sf_mac/git/pyMVPA/mvpa2/base/collections.py", line 139, in _get_virtual
return self._get()
File "/media/sf_mac/git/pyMVPA/mvpa2/base/attributes.py", line 185, in _get
raise UnknownStateError("Unknown yet value of %s" % (self.name))
UnknownStateError: Unknown yet value of training_stats

FAIL: test_james_problem (mvpa2.tests.test_rfe.RFETests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_rfe.py", line 421, in test_james_problem
assert(len(cv_storage.storage) == len(dataset.sa['chunks'].unique))
AssertionError

FAIL: test_james_problem_multiclass (mvpa2.tests.test_rfe.RFETests)

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_rfe.py", line 478, in test_james_problem_multiclass
assert(len(cv_storage.storage) == len(dataset.sa['chunks'].unique))
AssertionError

additional errors caused by pprocess+pickle

FAIL: Basic testing of DistPValue

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/testing/tools.py", line 179, in newfunc
return func(*arg, **kwargs)
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_transformers.py", line 130, in test_dist_p_value
self.assertEqual(distPValue.ca.positives_recovered[1], 0)
AssertionError: 2 != 0

FAIL: Some really basic testing for match_distribution

Traceback (most recent call last):
File "/media/sf_mac/git/pyMVPA/mvpa2/tests/test_stats_sp.py", line 217, in test_match_distribution
self.assertTrue(inorm <= 30)
AssertionError: False is not true

@coveralls

Coverage Status

Changes Unknown when pulling 7bf68d6 on nno:_tent/parallel into PyMVPA:master.

@mih
Member

mih commented Feb 27, 2014

Alright, this is another "zombie conversation" ;-)

@nno where are you on this one? Is it working for you now?

I am increasingly in need of parallelization for RepeatedMeasure, so this would come in very handy.

@nno
Contributor Author

nno commented Feb 27, 2014

@Hanke I just updated this branch with the latest master, and things seem to work. That is, unit tests are not failing.

The use of parallelization for RepeatedMeasures is switched off for now, as that one is behaving naughty with conditional attributes IIRC.
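
For illustration, one plausible mechanism (not confirmed in this thread) is that a worker process operates on a copy of the measure, so anything it stores as a side effect never reaches the parent's object. The sketch below emulates that with `copy.deepcopy` in place of real inter-process serialization; the `Measure` class and `run_in_simulated_worker` are hypothetical.

```python
import copy


class Measure(object):
    """Hypothetical stand-in for an object with a conditional attribute."""

    def __init__(self):
        self.training_stats = None  # filled in as a side effect of run()

    def run(self, data):
        self.training_stats = sum(data)
        return len(data)


def run_in_simulated_worker(measure, data):
    # a child process works on its own copy; deepcopy emulates that here
    worker_copy = copy.deepcopy(measure)
    return worker_copy.run(data)


measure = Measure()
result = run_in_simulated_worker(measure, [1, 2, 3])
# the return value arrives, but the side effect stayed on the worker's copy
```

If this is what happens, the fix would be to send the conditional attributes back together with the results and merge them in the parent.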

@yarikoptic
Member

I would prefer to leave this one for the next release since new
parallelization would require thorough testing before making it public
and there is no time for that ATM.

On Thu, 27 Feb 2014, Nikolaas N. Oosterhof wrote:

@Hanke I just updated this branch with the latest master, and things
seem to work. That is, unit tests are not failing.

The use of parallelization for RepeatedMeasures is switched off for now,
as that one is behaving naughty with conditional attributes IIRC.

Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik

@mih
Member

mih commented Feb 27, 2014

@nno I can confirm that it passes for me too.

I guess it is expected that the tests do not benefit from any kind of speed-up due to these changes on a multi-core system, right? It is all refactoring.

It would be good to crack the nut on the lost conditional attributes. I can't even recall properly what the issue was...

@mih mih added this to the Release 2.4 milestone Feb 27, 2014
@mih
Member

mih commented Feb 27, 2014

Assigned milestone 2.4

@mih mih removed this from the Release 2.4 milestone May 3, 2015
@bpinsard

I would be very interested in parallelization of RepeatedMeasure. What is the status of this PR? Thanks


5 participants