-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pairwise distances: first implementation #2
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice first start! 👏
The generated .c
file is still present. You might want to erase it from your local repo cache using something like:
git rm --cached **/*.c
I left some comment, but in overall it's good: the Python API is concise and the implementation is done in cdef
functions which should map to C code with no much interaction with the CPython interpreter.
Looking forward to your future iterations. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The link for the benchmarks returns a 404.
The implementation looks good. You can now compare with pairwise_distances(metric="l1")
from sklearn on datasets with different sizes (on single thread and then on multi-thread).
src/pairwise_dist/_pairwise_dist.pyx
Outdated
|
||
for i in prange(n_rows_X_a, nogil=True): | ||
for j in range(n_rows_X_b): | ||
_compute_dist(X_a[i], X_b[j], n_features, &distances[i, j]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this implementation passing a pointer for distances[i, j] seems unecessary. You could just do
distances[i, j] = _compute(dist(X_a[i], X_b[j], n_features))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this issue scikit-learn/scikit-learn#17299. It looks like a good reproducer for it. Is it faster to do as explained in the issue ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are good with this PR!
No description provided.