Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculating distance more quickly #2

Open
jncornett opened this issue Aug 23, 2012 · 1 comment
Open

Calculating distance more quickly #2

jncornett opened this issue Aug 23, 2012 · 1 comment

Comments

@jncornett
Copy link

Rather than using math functions (such as math.sqrt and math.pow) to calculate euclidean distance, use the builtin operators: x**2 and x**0.5

The code object disassembly shows fewer instructions in the latter:

>>> def foper(x1, y1, x2, y2): #Using builtins
...     return ((x2-x1)**2 + (y2-y1)**2)**0.5
... 
>>> import math
>>> def ffunc(x1, y1, x2, y2): #Using functions from math module
...     return math.sqrt(math.pow(x2-x1,2) + math.pow(y2-y1,2))
... 
>>> dis.dis(foper)
  2           0 LOAD_FAST                2 (x2)
              3 LOAD_FAST                0 (x1)
              6 BINARY_SUBTRACT     
              7 LOAD_CONST               1 (2)
             10 BINARY_POWER        
             11 LOAD_FAST                3 (y2)
             14 LOAD_FAST                1 (y1)
             17 BINARY_SUBTRACT     
             18 LOAD_CONST               1 (2)
             21 BINARY_POWER        
             22 BINARY_ADD          
             23 LOAD_CONST               2 (0.5)
             26 BINARY_POWER        
             27 RETURN_VALUE        
>>> dis.dis(ffunc)
  2           0 LOAD_GLOBAL              0 (math)
              3 LOAD_ATTR                1 (sqrt)
              6 LOAD_GLOBAL              0 (math)
              9 LOAD_ATTR                2 (pow)
             12 LOAD_FAST                2 (x2)
             15 LOAD_FAST                0 (x1)
             18 BINARY_SUBTRACT     
             19 LOAD_CONST               1 (2)
             22 CALL_FUNCTION            2
             25 LOAD_GLOBAL              0 (math)
             28 LOAD_ATTR                2 (pow)
             31 LOAD_FAST                3 (y2)
             34 LOAD_FAST                1 (y1)
             37 BINARY_SUBTRACT     
             38 LOAD_CONST               1 (2)
             41 CALL_FUNCTION            2
             44 BINARY_ADD          
             45 CALL_FUNCTION            1
             48 RETURN_VALUE        
>>> 

This can also be empirically show by the timeit results:

>>> import timeit
>>> toper = timeit.Timer("((23-76)**2 + (45-43)**2)**0.5")
>>> tfunc = timeit.Timer("math.sqrt(math.pow(23-76,2) + math.pow(45-43,2))", "import math")
>>> toper.timeit()
0.11890888214111328
>>> tfunc.timeit()
0.4799330234527588

Using the operators is about 75% faster.

@vschiavoni
Copy link

Using native operators is faster but not that faster. These are my results when *_2 and *_0.5 are replaced in the source code of clustering.py

%time python2.7 main.py > k-means-result.txt
python2.7 main.py > k-means-result.txt 22.45s user 0.14s system 99% cpu 22.610 total

This is the version where only math.pow is replaced with native operators:
%time python2.7 main.py > k-means-result_FAST.txt
python2.7 main.py > k-means-result_FAST.txt 19.85s user 0.14s system 99% cpu 20.001 total

This is the version where both math.pow and math.sqrt are replaced:

[veleno@irmbp:k-means]%time python2.7 main.py > k-means-result_FASTER.txt
python2.7 main.py > k-means-result_FASTER.txt 19.76s user 0.15s system 99% cpu 19.929 total

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants