Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark/Expected time for a N-digits address? #27

Open
stevefan1999-personal opened this issue Dec 21, 2019 · 9 comments
Open

Benchmark/Expected time for a N-digits address? #27

stevefan1999-personal opened this issue Dec 21, 2019 · 9 comments

Comments

@stevefan1999-personal
Copy link

I want to know the average case of getting the desired vanity address on various CPUs, compilers, OS, etc.

I compiled my mkp224o with clang 7 and march=native, in Debian Buster and with 2x Xeon E5-2670

hello:

./mkp224o hello -B -S 5 -j 16 -d keys
>calc/sec:5095952.833017, succ/sec:0.000000, rest/sec:159.888078, elapsed:0.100070sec
hello23twa7k536qggxfeqm4orwohwlca3ln6f43b4splcym57msaxid.onion

hellow:

./mkp224o hellow -B -S 5 -j 16 -d keys
...
>calc/sec:7007205.907981, succ/sec:0.000000, rest/sec:0.000000, elapsed:15.111171sec                           [27/9839]
hellowthsxpoy4pq4vensv2drz6qv562d3sfqg5pk4nwqnik4hii53ad.onion

hellowo:

./mkp224o hellowo -B -S 5 -j 16 -d keys
...
>calc/sec:7318805.372164, succ/sec:0.000000, rest/sec:0.000000, elapsed:155.117082sec
hellowoy3ahurx54upkhdut2ww6kwsmzrrtve6imdul4f5yr4kpsb2id.onion

Should I need more samples so that we can calculate the variance? This is pretty random at best.

@cathugger
Copy link
Owner

cathugger commented Dec 21, 2019 via email

@YoRyan
Copy link

YoRyan commented Dec 22, 2019

The math (a geometric discrete probability distribution) tells us that, on average, it will take 32^(# of chars) iterations to find a desired address. So take that number and divide by your calcs/sec to get the expected run time until your first match.

For example, my i7-3770 does ~15,400,000 calcs/sec. To find a 7-character prefix, the anticipated run time is

32^7/15400000 = 2231.15 secs = 37 mins

And indeed, it took me 40 minutes to find my first address.

(Note that this compares very favorably with shallot, which quotes "1 day" for a 1.5GHz CPU and a 7-character prefix.)

@stevefan1999-personal
Copy link
Author

The math (a geometric discrete probability distribution) tells us that, on average, it will take 32^(# of chars) iterations to find a desired address. So take that number and divide by your calcs/sec to get the expected run time until your first match.

For example, my i7-3770 does ~15,400,000 calcs/sec. To find a 7-character prefix, the anticipated run time is

32^7/15400000 = 2231.15 secs = 37 mins

And indeed, it took me 40 minutes to find my first address.

(Note that this compares very favorably with shallot, which quotes "1 day" for a 1.5GHz CPU and a 7-character prefix.)

Did this means it takes for any x in length of digits, it should theoretically take 2^(5x) tries to have at least one match? This is huge 🤔

@YoRyan
Copy link

YoRyan commented Dec 22, 2019

Correct! But it's just an average. You could get especially lucky or unlucky and find something sooner or later.

@cathugger
Copy link
Owner

(Note that this compares very favorably with shallot, which quotes "1 day" for a 1.5GHz CPU and a 7-character prefix.)

"1.5GHz CPU" isn't very descriptive, though, and that table is quite outdated (history), so I'd say it's just difference of CPU powers, and docs being outdated.
One of the reasons I didn't put any stats in README.txt in the first place.

Did this means it takes for any x in length of digits, it should theoretically take 2^(5x) tries to have at least one match?

No. It means, that on average you should get one match every 2^(5x)=32^x tries. That's a little bit different. You can get very lucky and hit much sooner. Or very unlucky and hit much later. I've seen both happening.
Statistically, yeah, you'll get one match every 32^x tries (if using only one filter of x chars), but for that to show you'll need to average over much more than one hit, so I'm not sure if such expectation formula is very trustworthy when N=1.
This stuff is sorta confusing, but I just don't wanna overpromise such things in readme, especially when most users want only one or few addresses.

@scribblemaniac
Copy link
Contributor

I agree with the math that's been presented here so far. If you want to be a bit more conservative, you can calculate the number of trials t it would take to get at least 1 match with n characters and a probability p with the formula t = log(1-p)/log(1-1/32^n). This comes from the distribution X~Binomial(t, 1/32^n) and P(X>=1)=p. Here's a table of what the number of trials looks like for some values common values. The number of digits are on the left and the probability of finding at least one success along the top:

50% 80% 90% 95% 99%
2 710 1648 2357 3067 4714
3 22713 52738 75450 98163 150900
4 726818 1687618 2414435 3141252 4828869
5 23258160 54003775 77261934 100520094 154523868
6 744261118 1728120799 2472381917 3216643035 4944763834
7 23816355775 55299865590 79116221365 102932577139 158232442729

The way to read this is like so: "It is expected that you will find at least one match for a 2 character search after 710 trials 50% of the time." So while yes on average it will take about ~24 billion trials to find a match for a 7 digit prefix, it will take ~158 billion trials to be 99% sure that you will find a match.

@cathugger
Copy link
Owner

cathugger commented Dec 22, 2019 via email

@YoRyan
Copy link

YoRyan commented Dec 22, 2019

"1.5GHz CPU" isn't very descriptive, though, and that table is quite outdated (history), so I'd say it's just difference of CPU powers, and docs being outdated.

It also reflects the fact that mkp224o is a lot faster, especially with the batching code enabled. ;)

This stuff is sorta confusing, but I just don't wanna overpromise such things in readme, especially when most users want only one or few addresses.

Right, there is always some uncertainty involved, but as we've seen, there is a way to quantify the exact probabilities. It is possible to be more precise than "some time within the next 30 seconds or next 30 years."

@DadiBit
Copy link

DadiBit commented Apr 3, 2020

You can calculate the number of trials t it would take to get at least 1 match with n characters and a probability p with the formula t = log(1-p)/log(1-1/32^n).

In C++ (I think in C too) it's a bit faster log(1-p)/log(1-1/32)*pow(32,n-1); rather then log(1-p)/log(1-1/pow(32,n));. This is of course a one time calculation before running the program and requires way less the a ms, so it shouldn't matter too much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants