
Feature/pubfig83 #46

Open -- wants to merge 23 commits into master
Conversation

yamins81 (Contributor) commented Mar 1, 2012

standardized splits and tests -- @jaberg, @npinto, your comments would be great

Comments on how things are done are in the test_pubfig83 file.

yamins81 (Contributor, Author) commented Mar 1, 2012

@zstone, your comments would be great as well

p = rng.permutation(len(remainder))
if 'Train%d' % _ind not in splits:
    splits['Train%d' % _ind] = []
splits['Train%d' % _ind].extend(remainder[p[:80]].copy())
Collaborator

Any reason to have hardcoded values here (e.g. 90, 10, 80)?

Contributor Author

I originally preferred the variable ntest/ntrain approach, like you suggest. On the other hand, it might be good to have "standard" splits -- I've occasionally heard @zstone and @davidcox talking about standard 90/10 splits. Of course, we COULD make them variables and just set defaults. I think it depends on how the creators of this dataset intended it to be used. @npinto, you and @zstone worked on this together (?), so if you think it's fine to give people guidance suggesting that non-standard-sized splits are "OK", that seems fine to me.
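
To make that concrete, the variables-with-defaults version might look roughly like this (make_splits, ntrain, ntest, and nsplits are placeholder names, with defaults that just mirror the current hardcoded values):

import numpy as np

def make_splits(names, nsplits=10, ntrain=80, ntest=10, seed=0):
    # names: per-subject image identifiers; returns a dict keyed by
    # 'Train0'/'Test0', ..., like the splits dict the current code builds
    names = np.asarray(names)
    rng = np.random.RandomState(seed)
    splits = {}
    for _ind in range(nsplits):
        p = rng.permutation(len(names))
        splits['Test%d' % _ind] = list(names[p[:ntest]])
        splits['Train%d' % _ind] = list(names[p[ntest:ntest + ntrain]])
    return splits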


I support the creation of a set of standard splits of any dataset, but I would favor canonizing a minimal, language-agnostic materialization of said standard splits rather than defining them implicitly by code in a specific language that must be run correctly in a specific environment to regenerate the right splits. Concretely, I would advocate looking for the simplest-possible JSON format that would completely specify, say, ten 90/10 splits with reference to relative image paths within a canonical archive of a dataset.
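
For example, a file as small as this could pin the splits down completely (the field names and paths below are purely illustrative, not a schema proposal; shown here as the Python that would write it):

import json

standard_splits = {
    "dataset": "pubfig83",
    "splits": {
        "Train0": ["person_a/img_0001.jpg", "person_a/img_0002.jpg"],
        "Test0": ["person_a/img_0091.jpg"],
    },
}

with open("pubfig83_splits.json", "w") as f:
    json.dump(standard_splits, f, indent=2, sort_keys=True)

Any implementation, in any language, could then load that file and resolve the relative paths against its own copy of the canonical archive.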

That's just my opinion, though, and it may not fit the way scikit-data already works!

Contributor Author

@zstone I'm not sure it's obvious how to use JSON inside scikit-data in a natural way. However, could you comment on the split scheme currently proposed in the code? I just added support for non-standard splits. I put the split-related data in the dataset __init__, because we might want the split data to travel with the dataset instance. The current code takes the view that to change splits you have to instantiate a new object (or, I guess, monkey around inside the code; the various attributes should probably be private). What do you think?
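
Concretely, the interface is roughly the following (a sketch only -- the class and attribute names are illustrative, and _generate_splits just stands in for the permutation logic quoted above):

import numpy as np

class PubFig83(object):
    # Split-related parameters travel with the instance; to get different
    # splits you construct a new instance rather than mutating this one.
    def __init__(self, ntrain=80, ntest=10, nsplits=10, seed=0):
        self._ntrain = ntrain
        self._ntest = ntest
        self._nsplits = nsplits
        self._rng = np.random.RandomState(seed)
        self.splits = self._generate_splits()

    def _generate_splits(self):
        # placeholder: the real version permutes the per-subject image
        # lists, along the lines of the snippet quoted earlier
        return dict(('Train%d' % i, []) for i in range(self._nsplits))

So PubFig83() would give the standard splits, PubFig83(ntrain=70, ntest=20) would give non-standard ones, and getting the standard splits back means constructing a fresh instance.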

jaberg (Owner) commented Mar 14, 2013

@yamins81 Coming back to this after forever -- is this the branch of pubfig83 that went into our ICML 2013 paper? If not, could you point me to that code? I'd like to merge it in (and I'm sorry for not doing it a long time ago!)
