Avoid copying data back and forth between the Python runtime and the C++ library #66

Open
unzvfu opened this issue Feb 21, 2018 · 2 comments

unzvfu commented Feb 21, 2018

According to the benchmark, copying to and from the C++ library currently takes 99% of the time.
Edit: Not true. Copying does take a lot of time in some places (see #79), but the copying mentioned below is not "99%" bad, more like "5%" bad.
Really, the C++ library should just be passed the address of the memory it needs to read, making the copy redundant. This could take advantage of the cleaner array interface that a resolution to issue #64 would provide.

This issue partly supersedes issue #29 where Brian says:

We also are dealing with "nice" python bitarrays which require some manipulation (1) before passing into native code. We might want to consider adding an accelerated interface that takes our custom bit packed data as plain python bytes.

1: [ffi.new("char[128]", bytes(f[0].tobytes())) for f in filters1]

I've started experimenting in

  • Branch feature-chunked-speedup for a C implementation of many x many comparisons.
  • Branch feature-direct-cffi builds on top of that by accessing bitarray data from C without a memory copy. It only does a bitarray popcount for now.
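# Assume ba is a bitarray, and ffi and lib come from the compiled cffi module
addr = ba.buffer_info()[0]       # address of the bitarray's underlying buffer
pntr = ffi.cast("char *", addr)  # reinterpret the address as a C pointer, no copy
lib.popcount(pntr)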
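
For anyone who wants to try the idea without building the branch, here is a minimal stdlib-only sketch of the same pattern. It is not the code on feature-direct-cffi: it assumes ctypes as a stand-in for cffi, array.array as a stand-in for bitarray (both expose buffer_info()), and a hypothetical pure-Python popcount_view in place of lib.popcount.

```python
import array
import ctypes


def popcount_view(view):
    """Count set bits in an iterable of byte values (stand-in for lib.popcount)."""
    return sum(bin(b).count("1") for b in view)


# array.array stands in for a bitarray here; both expose buffer_info()
data = array.array("B", [0b11110000, 0b00000001, 0xFF])
addr, length = data.buffer_info()

# Map a ctypes array onto the existing buffer; no bytes are copied
view = (ctypes.c_ubyte * length).from_address(addr)

total = popcount_view(view)

# Writing through the view changes the original array, which shows
# that both names refer to the same memory
view[1] = 0b00000011
shared = data[1] == 0b00000011
```

The view does not own the memory, so data has to stay alive and must not be resized for as long as the view is in use.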

The comments in issue #18 might still be relevant.

Aha! Link: https://csiro.aha.io/features/ANONLINK-68

@hardbyte
Collaborator

We still need to measure just how much of the time is taken by memory copying and type conversions.
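
A rough way to start measuring could look like the sketch below. It assumes ctypes in place of our cffi bindings, plain bytearrays in place of the bitarray filters, and a made-up filter count. It only times how long it takes to build the C-side buffers, with and without a copy, so it says nothing about the comparison work itself.

```python
import ctypes
import timeit

N_FILTERS = 1000    # hypothetical number of filters
FILTER_BYTES = 128  # matches the char[128] buffers quoted in the issue

filters = [bytearray(FILTER_BYTES) for _ in range(N_FILTERS)]
Buf = ctypes.c_char * FILTER_BYTES


def with_copy():
    # Allocates a new C buffer per filter and copies the bytes into it
    return [Buf.from_buffer_copy(f) for f in filters]


def without_copy():
    # Wraps each existing buffer in place; no bytes are copied
    return [Buf.from_buffer(f) for f in filters]


t_copy = timeit.timeit(with_copy, number=100)
t_view = timeit.timeit(without_copy, number=100)
print(f"copy: {t_copy:.4f}s  view: {t_view:.4f}s")
```

Running the same comparison against the real bindings and real filter sizes would be needed before drawing any conclusion about where the time goes.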

@hardbyte
Collaborator

There was some further discussion on switching to array - #121 (comment)
