Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: replace CFFI with a native C extension #76

Merged
merged 18 commits into from Aug 25, 2021

Conversation

tseaver
Copy link
Contributor

@tseaver tseaver commented Aug 24, 2021

Toward #68.

@tseaver tseaver requested a review from a team as a code owner August 24, 2021 22:09
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 24, 2021
Also, run unit tests during 'check' scripts.

Drop installation of 'cffi' / 'pycparser'.
$ venv/bin/python setup.py build_ext \
--include-dirs=$(pwd)/usr/include \
--library-dirs=$(pwd)/usr/lib \
--rpath=$(pwd)/usr/lib
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: this change, and the equivalent ones in the scripts/ directory files below) is needed to ensure that we build with the expected local version of libcrc32c.so: pip doesn't provide a fine-enough-grained way to pass through these arguments to the bulld_ext command, so we just run it ourselves, and let pip use the already-build extension when making the wheel.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this info. I was originally hoping to upgrade the build system to cibuildwheel because it has built-in cross-compiling support. If I understand cibuildwheel correctly it might not handle more fine-tuned manual steps like this very well, in which case we'll have to manually cross-compile as well. I'm still learning about this process, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't worked with cibuildwheel at all. If it is run from a Github action, maybe we could configure the build_ext step to run before kicking it off?

package_dir={"": "src"},
ext_modules=ext_modules,
cmdclass={"build_ext": BuildExtWithDLL},
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: rather than implicitly falling back to a pure-Python version (which is horribly slow, and almost certainly not what the user really wants), I chose not to trap / wrap the build_ext failures: it made debugging such failures on CI much simpler, and I think makes for a more pleasant UX.

We might want to consider doing and explicit build with the CRC32C_PURE_PYTHON envvar set, generating a none-any wheel. OTOH, that means that users might get that wheel unexpectedly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only impacts the build step but not the end user-facing fallback feature, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. The difference here is that the user has to make an explicit choice at build time to build / install the pure Python fallback: given how poor its performance is, I think it makes sense to have the user opt-in.

Once they've done that, then the runtime warning will be emitted when they first import the library.

"""
return google_crc32c._crc32c_cffi.lib.crc32c_value(chunk, len(chunk))


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: part of the win in this PR is avoiding this Python function call to get the chunk length computed: instead, the extension does that using PyArg_ParseTuple.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -178,32 +178,34 @@ def test_extend_w_multiple_chunks():
crc = 0

for chunk in iscsi_chunks(7):
crc = google_crc32c.extend(crc, chunk)
chunk_bytes = bytes(chunk)
crc = google_crc32c.extend(crc, chunk_bytes)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: one consequence of moving to the C extension is that we no longer implicitly take a list-of-int as a substitute for bytes. Given that no real user is ever going to pass that, I don't think it matters.

@tseaver
Copy link
Contributor Author

tseaver commented Aug 25, 2021

@andrewsg, @crwilcox I think this PR is ready for review.

@@ -39,7 +39,7 @@ py -%PYTHON_VERSION% -m pip install cmake

git submodule update --init --recursive

FOR %%V IN (32,64) DO (ddd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅

except SystemExit:
# If installation fails, it is likely a compilation error with CFFI
# Try to install again.
warnings.warn(
Copy link
Collaborator

@crwilcox crwilcox Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be good to continue serving this warning? I think there is a runtime one as well, so may not be crucial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I found was that trapping the SystemExit here caused CI to lose the error message from the build_ext step. I found that way more useful than the warning. Having the warning at runtime is a good belt-and-suspenders, though.

@tseaver
Copy link
Contributor Author

tseaver commented Aug 25, 2021

Note to reviewers: FWIW, I tried first to have the library build generate a static library (libcrc32c.a / crc32c.lib). That worked find for the local linux build, but I couldn't make that work cleanly on CI. I think that would be a better DX (they wouldn't need to get libcrc32c.so / crc32c.dll installed somewhere on the library search path), so we could revisit if you think it is important.

Copy link

@andrewsg andrewsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your time and expertise here.

$ venv/bin/python setup.py build_ext \
--include-dirs=$(pwd)/usr/include \
--library-dirs=$(pwd)/usr/lib \
--rpath=$(pwd)/usr/lib

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this info. I was originally hoping to upgrade the build system to cibuildwheel because it has built-in cross-compiling support. If I understand cibuildwheel correctly it might not handle more fine-tuned manual steps like this very well, in which case we'll have to manually cross-compile as well. I'm still learning about this process, though.

@@ -42,6 +42,14 @@ ${VENV}/bin/cmake \
# Install `libcrc32c` into CRC32C_INSTALL_PREFIX.
make all install

cd ${REPO_ROOT}

${VENV}/bin/python setup.py build_ext \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own edification, why does this step and the next need to be moved into the build_libcrc32c.sh file due to the migration off of cffi?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip wheel doesn't provide fine-grained enough support for passing the --include-dirs / --library-dirs / --rpath options down to the build_ext command: we want to ensure that the extension is built against the expected headers / libraries, and so we run the biuld_ext command explicitly to compile the extension before invoking pip wheel.

package_dir={"": "src"},
ext_modules=ext_modules,
cmdclass={"build_ext": BuildExtWithDLL},
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only impacts the build step but not the end user-facing fallback feature, right?

"""
return google_crc32c._crc32c_cffi.lib.crc32c_value(chunk, len(chunk))


Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@tseaver tseaver merged commit b1bf461 into master Aug 25, 2021
@tseaver tseaver deleted the 68-perf-native-c-extension branch August 25, 2021 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants