This repository has been archived by the owner on Dec 19, 2023. It is now read-only.

Benchmark #9

Merged
merged 14 commits into numerai:master on Sep 15, 2017

Conversation


@sovaa sovaa commented Sep 13, 2017

Simple benchmark for original(). More benchmarking could be done if originality.py abstracted away the managers, or by implementing in-memory versions of them.

Requires two new dependencies, multiprocessing and randomstate (a drop-in replacement for numpy's RandomState that plays well with multiprocessing).
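A minimal sketch of the benchmark shape this PR describes, assuming hypothetical gen_submission()/check_original() helpers and sizes. The PR itself uses the randomstate package; plain numpy.random.RandomState stands in here to keep the sketch self-contained, and a cheap correlation replaces the real original() call.

import time
from multiprocessing import Pool

import numpy as np

N_ROWS = 10000   # assumed length of a generated submission
N_RUNS = 1000    # assumed number of benchmark iterations


def gen_submission(seed=None):
    """Generate a fake submission of random probabilities (assumption)."""
    rng = np.random.RandomState(seed)
    return rng.uniform(size=N_ROWS)


def check_original(_):
    """One iteration: generate two submissions and compare them."""
    submission_1, submission_2 = gen_submission(), gen_submission()
    # original(submission_1, submission_2) would be called here; a cheap
    # correlation stands in to keep the sketch runnable on its own.
    return float(np.corrcoef(submission_1, submission_2)[0, 1])


if __name__ == '__main__':
    start = time.time()
    with Pool() as workers:
        workers.map(check_original, range(N_RUNS))
    print('%d runs in %.2fs' % (N_RUNS, time.time() - start))

Pool.map spreads the iterations across cores, which is why the per-process random state matters in the real benchmark.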

@philipcmonk
Contributor

Can you add the dependencies to setup.py and requirements.txt?

@philipcmonk
Contributor

It would also be nice to have this for concordance.



def check_original(_: int):
submission_1, submission_2 = gen_submission(), gen_submission()
Contributor

Can we benchmark each user against, say, 1000 other users rather than generate new pairs each time? There may be some potential gains in pre-processing the data.
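A hedged illustration of this suggestion: pre-generate a fixed pool of submissions and compare each one against the other ~1000, so any per-submission pre-processing happens once. All names and sizes are illustrative, not the repo's API.

import numpy as np

N_USERS = 1000   # pool size suggested in the review
N_ROWS = 10000   # assumed submission length

rng = np.random.RandomState(0)
submissions = [rng.uniform(size=N_ROWS) for _ in range(N_USERS)]


def check_original_against_pool(idx):
    """Compare submission `idx` against every other submission in the pool."""
    candidate = submissions[idx]
    # original(candidate, other) would be called here; because `submissions`
    # is fixed, any pre-processing of it can be done once up front.
    return [float(np.corrcoef(candidate, other)[0, 1])
            for j, other in enumerate(submissions) if j != idx]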

Author

Can you include e.g. 50 user submissions in the repo? Then the test can add random noise to them during setup to create a few thousand more 'almost unique' submissions. It could also be a gzipped file that gets decompressed during setup.
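A rough sketch of that setup step, assuming a hypothetical benchmark_submissions.csv.gz with a probability column; the file name, column, and noise scale are all assumptions.

import gzip

import numpy as np
import pandas as pd

N_COPIES = 100   # 50 seed submissions * 100 noisy copies ~ 5000 submissions


def load_seed_submissions(path='benchmark_submissions.csv.gz'):
    """Decompress the bundled submissions during benchmark setup."""
    with gzip.open(path, 'rt') as f:
        return pd.read_csv(f)


def make_noisy_copies(df, rng, column='probability'):
    """Create 'almost unique' variants by adding small Gaussian noise."""
    copies = []
    for _ in range(N_COPIES):
        noisy = df.copy()
        noisy[column] = (noisy[column]
                         + rng.normal(0, 0.01, size=len(noisy))).clip(0, 1)
        copies.append(noisy)
    return copies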

Contributor

That's a good idea. I'll create an issue for it, but for the time being, generating them in the way you're doing it looks fine (except that we should check each user against 1000 other users).

Contributor

I realized my description above may not be very clear. I mean that the benchmark should capture this sort of optimization: #12

@sovaa
Author

sovaa commented Sep 14, 2017

The concordance benchmark samples in batches from the example data and adds normally distributed noise to create larger benchmark data.
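A hedged sketch of that batched-sampling idea; the actual signatures of get_sorted_split()/has_concordance() in concordance.py may differ, so the call site is left as a comment.

import numpy as np

BATCH_SIZE = 5000   # assumed batch size


def sample_batch(example_data, rng):
    """Draw a batch from the example data and add normally distributed noise."""
    idx = rng.choice(len(example_data), size=BATCH_SIZE, replace=True)
    batch = example_data[idx]
    return batch + rng.normal(0, 0.05, size=batch.shape)


def run_concordance_benchmark(example_data, n_batches=100, seed=0):
    rng = np.random.RandomState(seed)
    for _ in range(n_batches):
        batch = sample_batch(example_data, rng)
        # the real benchmark would feed `batch` through get_sorted_split()
        # and has_concordance(); a no-op keeps this sketch self-contained.
        _ = batch.mean()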

If the PR is accepted for the bounty my numerai account is sovaa.

Contributor

@philipcmonk philipcmonk left a comment

Looks good, except the one issue.

from concordance import has_concordance
from concordance import get_sorted_split

N_SAMPLES = 100_000
Contributor

As much as I like the new number format, we use Python 3.5 internally, so this should be 100000.

Author

Ah, runtime.txt said 3.6.1, so that's why. I'll change the number format.
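For reference, the Python 3.5-compatible spelling is simply the plain literal, since underscore digit separators (PEP 515) only arrived in 3.6:

# underscores in numeric literals require Python 3.6+ (PEP 515); for 3.5:
N_SAMPLES = 100000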

@philipcmonk philipcmonk merged commit af92b69 into numerai:master Sep 15, 2017
@philipcmonk
Contributor

Thanks! I've sent 30 NMR to sovaa. It should be in your account now.
