Example auto tune as of 2018 12 06 with Titan X

magnum edited this page Jan 18, 2019 · 4 revisions

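Example auto-tune. For the example below, we probably want to set min_keys_per_crypt to 12288, since that is the largest GWS that still finishes in about 45 ms per crypt_all() (beyond it the duration lifts off and roughly doubles with each step) while already reaching about 93% of the peak speed measured. Possibly 24576 instead, since that is where the unbroken run of "!" and "+" markers ends: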

Calculating best GWS for LWS=512; max. 200ms single kernel invocation.
Raw speed figures including buffer transfers:
gws:      3072    70471c/s  1154455922 rounds/s  43.591ms per crypt_all()!  <--- ~45 ms
gws:      6144   141378c/s  2316054396 rounds/s  43.457ms per crypt_all()!  <--- ~45 ms
gws:     12288   271768c/s  4452103376 rounds/s  45.215ms per crypt_all()+  <--- ~45 ms
gws:     24576   286046c/s  4686005572 rounds/s  85.916ms per crypt_all()+
gws:     49152   287033c/s  4702174606 rounds/s 171.241ms per crypt_all()
gws:     98304   288612c/s  4728041784 rounds/s 340.609ms per crypt_all()
gws:    196608   289690c/s  4745701580 rounds/s 678.682ms per crypt_all()+
gws:    393216   290196c/s  4753990872 rounds/s    1.354s per crypt_all()
gws:    786432   290719c/s  4762558658 rounds/s    2.705s per crypt_all()
gws:   1572864   290919c/s  4765835058 rounds/s    5.406s per crypt_all()
gws:   3145728   291966c/s  4782987012 rounds/s   10.774s per crypt_all()

Local worksize (LWS) 512, global worksize (GWS) 196608
DONE
Speed for cost 1 (key version [0:PMKID 1:WPA 2:WPA2 3:802.11w]) of 2
Raw:    289129 c/s real, 289129 c/s virtual, GPU util: 99%
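
The reasoning above can be sketched in a few lines of Python. This is only an illustration and not part of John the Ripper: the function names and the 90%/95% cut-offs are made up for the example, and the marker rule ("+" for more than 1% better than the best so far, "!" for more than twice as fast) is an assumption inferred from the figures in this run, not taken from the source code.

```python
import re

# The gws lines from the run above (annotations dropped).
LOG = """\
gws:      3072    70471c/s  1154455922 rounds/s  43.591ms per crypt_all()!
gws:      6144   141378c/s  2316054396 rounds/s  43.457ms per crypt_all()!
gws:     12288   271768c/s  4452103376 rounds/s  45.215ms per crypt_all()+
gws:     24576   286046c/s  4686005572 rounds/s  85.916ms per crypt_all()+
gws:     49152   287033c/s  4702174606 rounds/s 171.241ms per crypt_all()
gws:     98304   288612c/s  4728041784 rounds/s 340.609ms per crypt_all()
gws:    196608   289690c/s  4745701580 rounds/s 678.682ms per crypt_all()+
gws:    393216   290196c/s  4753990872 rounds/s    1.354s per crypt_all()
gws:    786432   290719c/s  4762558658 rounds/s    2.705s per crypt_all()
gws:   1572864   290919c/s  4765835058 rounds/s    5.406s per crypt_all()
gws:   3145728   291966c/s  4782987012 rounds/s   10.774s per crypt_all()
"""

LINE = re.compile(r"gws:\s+(\d+)\s+(\d+)c/s")

def parse(log):
    """Return (gws, candidates_per_second) pairs in log order."""
    return [(int(m.group(1)), int(m.group(2))) for m in LINE.finditer(log)]

def smallest_gws(rows, fraction):
    """Smallest GWS whose speed reaches `fraction` of the peak speed."""
    peak = max(speed for _, speed in rows)
    for gws, speed in rows:
        if speed >= fraction * peak:
            return gws

def replay_markers(rows, gain=1.01, jump=2.0):
    """Replay the assumed marker rule; thresholds are inferred, not official."""
    best_speed, best_gws, marks = 0, None, []
    for gws, speed in rows:
        if speed > gain * best_speed:
            marks.append("!" if speed > jump * best_speed else "+")
            best_speed, best_gws = speed, gws
        else:
            marks.append("")
    return marks, best_gws

ROWS = parse(LOG)
print(smallest_gws(ROWS, 0.90), smallest_gws(ROWS, 0.95))
print(replay_markers(ROWS))
```

With these figures, a 90% cut-off lands on 12288 and a 95% cut-off on 24576, the two candidates mentioned above. The replayed markers match the ones printed in the log and end on 196608, the GWS the auto-tune reported.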

The same run with more details (each per-kernel timing line precedes the gws line it belongs to):

Calculating best GWS for LWS=512; max. 200ms single kernel invocation.
Raw speed figures including buffer transfers:
xfer: 34.784us, init: 64.736us, loop: 78x555.744us, pass2: 46.080us, final: 83.584us, xfer: 9.472us
gws:      3072    70471c/s  1154455922 rounds/s  43.591ms per crypt_all()!
xfer: 67.040us, init: 53.120us, loop: 78x554.112us, pass2: 40.256us, final: 54.560us, xfer: 16.928us
gws:      6144   141378c/s  2316054396 rounds/s  43.457ms per crypt_all()!
xfer: 131.360us, init: 70.880us, loop: 78x574.848us, pass2: 51.264us, final: 86.304us, xfer: 31.584us
gws:     12288   271768c/s  4452103376 rounds/s  45.215ms per crypt_all()+
xfer: 259.872us, init: 110.880us, loop: 78x1.093ms, pass2: 82.816us, final: 90.400us, xfer: 60.832us
gws:     24576   286046c/s  4686005572 rounds/s  85.916ms per crypt_all()+
xfer: 516.672us, init: 210.816us, loop: 78x2.180ms, pass2: 168.864us, final: 162.400us, xfer: 119.488us
gws:     49152   287033c/s  4702174606 rounds/s 171.241ms per crypt_all()
xfer: 1.031ms, init: 350.688us, loop: 78x4.337ms, pass2: 334.848us, final: 298.560us, xfer: 237.024us
gws:     98304   288612c/s  4728041784 rounds/s 340.609ms per crypt_all()
xfer: 2.057ms, init: 623.840us, loop: 78x8.644ms, pass2: 641.504us, final: 551.424us, xfer: 471.200us
gws:    196608   289690c/s  4745701580 rounds/s 678.682ms per crypt_all()+
xfer: 4.125ms, init: 1.147ms, loop: 78x17.260ms, pass2: 1.235ms, final: 1.033ms, xfer: 940.096us
gws:    393216   290196c/s  4753990872 rounds/s    1.354s per crypt_all()
xfer: 8.295ms, init: 2.177ms, loop: 78x34.462ms, pass2: 2.372ms, final: 2.010ms, xfer: 1.876ms
gws:    786432   290719c/s  4762558658 rounds/s    2.705s per crypt_all()
xfer: 16.451ms, init: 4.284ms, loop: 78x68.882ms, pass2: 4.608ms, final: 3.950ms, xfer: 3.765ms
gws:   1572864   290919c/s  4765835058 rounds/s    5.406s per crypt_all()
xfer: 32.922ms, init: 8.428ms, loop: 78x137.270ms, pass2: 9.164ms, final: 7.818ms, xfer: 7.514ms
gws:   3145728   291966c/s  4782987012 rounds/s   10.774s per crypt_all()
xfer: 65.836ms, init: 16.762ms, loop: 78x274.142ms (exceeds 200ms)

Local worksize (LWS) 512, global worksize (GWS) 196608
DONE
Speed for cost 1 (key version [0:PMKID 1:WPA 2:WPA2 3:802.11w]) of 2
Raw:    289129 c/s real, 289129 c/s virtual, GPU util: 99%
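
To see where the time goes, the per-kernel figures can be added up and compared with the reported time per crypt_all(). The following is a minimal sketch, assuming each detail line belongs to the gws line that follows it and that the total is simply the sum of the listed parts (with the loop kernel counted 78 times); the helper names are hypothetical.

```python
import re

# Conversion factors to milliseconds.
UNITS = {"us": 1e-3, "ms": 1.0, "s": 1e3}

def to_ms(text):
    """Convert a figure such as '555.744us' or '1.093ms' to milliseconds."""
    m = re.fullmatch(r"([\d.]+)(us|ms|s)", text)
    return float(m.group(1)) * UNITS[m.group(2)]

def total_ms(detail):
    """Sum all parts of one detail line, expanding 'NxT' loop entries."""
    total = 0.0
    for part in detail.split(", "):
        _, value = part.split(": ")
        if "x" in value:
            count, each = value.split("x")
            total += int(count) * to_ms(each)
        else:
            total += to_ms(value)
    return total

# Detail lines copied from the log above, keyed by the GWS they precede.
DETAILS = {
    3072: "xfer: 34.784us, init: 64.736us, loop: 78x555.744us, "
          "pass2: 46.080us, final: 83.584us, xfer: 9.472us",
    12288: "xfer: 131.360us, init: 70.880us, loop: 78x574.848us, "
           "pass2: 51.264us, final: 86.304us, xfer: 31.584us",
    24576: "xfer: 259.872us, init: 110.880us, loop: 78x1.093ms, "
           "pass2: 82.816us, final: 90.400us, xfer: 60.832us",
}

for gws, detail in DETAILS.items():
    ms = total_ms(detail)
    # Candidates per second follow from GWS divided by the time per call.
    print(f"gws {gws}: {ms:.3f} ms per crypt_all(), {gws / (ms / 1000):.0f} c/s")
```

The sums come out within a small rounding error of the reported 43.591 ms, 45.215 ms and 85.916 ms, and almost all of the time is spent in the 78 loop kernel invocations. Up to GWS 12288 a single loop invocation stays near 0.56 ms, and from 24576 onward it roughly doubles with each step, which is why the speed levels off.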