Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLD, TST: Build and test Pyodide wheels for pandas in CI #57896

Merged
merged 39 commits into from
May 8, 2024

Conversation

agriyakhetarpal
Copy link
Contributor

@agriyakhetarpal agriyakhetarpal commented Mar 18, 2024

Description

This PR adds a workflow, emscripten.yml, to the GitHub Actions CI activities where WebAssembly wheels for pandas using Pyodide/Emscripten are built and then tested. I have fixed up a few of the tests, but most of them are skipped for now and these differences probably do not affect major functionality.

@agriyakhetarpal
Copy link
Contributor Author

Here is one of the latest workflow runs on my fork: https://github.com/agriyakhetarpal/pandas/actions/runs/8331354944/job/22798065116

The associated failures right now are mostly coming from a singular test that (which is associated with a hypothesis generator that brings additional test cases?)

For example, there is a floating point inaccuracy, here – which could be corrected by setting the relative and absolute tolerances differently:

E           AssertionError: DataFrame.iloc[:, 2] (column name="dt_as_dt") are different
E           
E           DataFrame.iloc[:, 2] (column name="dt_as_dt") values are different (100.0 %)
E           [index]: [0, 1]
E           [left]:  [253402127999999, 1564703999999]
E           [right]: [253402127999998, 1564703999998]

and some tzlocal tests, which seem to bring overflow errors. Note: this output is truncated, please see the entirety of them in the workflow logs:

_______________ test_apply_out_of_range[tzlocal()-BusinessHour] ________________
[XPASS(strict)] OverflowError inside tzlocal past 2038
_____________ test_apply_out_of_range[tzlocal()-BusinessMonthEnd] ______________
[XPASS(strict)] OverflowError inside tzlocal past 2038
____________ test_apply_out_of_range[tzlocal()-BusinessMonthBegin] _____________
[XPASS(strict)] OverflowError inside tzlocal past 2038

I am not completely sure how to fix these tests – I don't get why these aren't skipped completely? I would appreciate another pair of eyes on this; happy to receive pointers from someone experienced with the codebase :)

@WillAyd
Copy link
Member

WillAyd commented Mar 19, 2024

I am not completely sure how to fix these tests – I don't get why these aren't skipped completely? I would appreciate another pair of eyes on this; happy to receive pointers from someone experienced with the codebase :)

At least for the second test it looks like it is imperatively skipping for platforms that are not 64 bit. I take it the platform these are running on is still 64 bit but we are cross compiling to a 32 bit WASM target?

if freq == "YE" and not IS64 and isinstance(tz, tzlocal):

@agriyakhetarpal
Copy link
Contributor Author

At least for the second test it looks like it is imperatively skipping for platforms that are not 64 bit. I take it the platform these are running on is still 64 bit but we are cross compiling to a 32 bit WASM target?

Hi, @WillAyd, yes – cross-compilation is indeed what is happening here and the platform is 64-bit, but the Python interpreter that runs the test suite is 32-bit (so it should, in theory, skip this, but it doesn't). I disabled the check in d54d198 which I hope should fix it.

@mroeschke mroeschke added the Testing pandas testing functions or related to the test suite label Mar 20, 2024
See pyodide/pyodide#4654 for
more discussion. This commit resolves a build error
coming from the `pyodide build` command which
broke due to a new `build` release by PyPA.
@agriyakhetarpal
Copy link
Contributor Author

Pyodide had a new release recently to fix an error in the pyodide build command. I have just merged main and pushed a944f52 to fix that. The next step from my end would be to fix some of the failing tests (which I did not have much progress on the last time and I don't want them to xfail them yet), but also to wait here until #53007 lands – I suppose that it would make running the tests against a built wheel a bit more difficult, but without the PR getting merged, I'm not sure how that will be navigated here. The good thing is that pandas-tests will be a pure Python package and will have a few APIs to collect and pass data files around, and installation for pure Python wheels is supported out-of-the-box in Pyodide.

If there's something else here that I should do in the meantime, I'm happy to do that – open to suggestions about resolving the two failing tests.

Copy link
Contributor

github-actions bot commented May 2, 2024

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label May 2, 2024
@agriyakhetarpal
Copy link
Contributor Author

Putting up my response to the above comment that I am still interested in working on this.

@rgommers
Copy link
Contributor

rgommers commented May 2, 2024

@agriyakhetarpal it seems reasonable to either skip the tzlocal test under WASM or lower the test tolerance specifically when running under WASM. If that is all that is needed to get the test suite to pass, I suggest doing that and then marking this as ready for review. It's easier for reviewers to give feedback when something works already.

@mroeschke mroeschke added this to the 3.0 milestone May 3, 2024
@mroeschke
Copy link
Member

Thanks for sticking with this. It's looking close to merge

@agriyakhetarpal
Copy link
Contributor Author

With most of the tests skipped where it is not possible or relevant to fix them, we are down to zero failures, and all other tests pass or xpass (or are xfailed for other reasons). The logs from the workflow run are here: https://github.com/pandas-dev/pandas/actions/runs/8945452670/job/24574452414. The doc build failures at the time of writing look to be unrelated to the changes made in this PR.

Some of the tests that have been skipped can be, perhaps, adapted to use a 32-bit version of them; I am happy to add a few of those where necessary. I'm marking this PR ready for review so that it can get the attention of more maintainers.

@agriyakhetarpal agriyakhetarpal changed the title [WIP]: BLD, TST: Build and test Pyodide wheels for pandas in CI BLD, TST: Build and test Pyodide wheels for pandas in CI May 3, 2024
@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review May 3, 2024 22:41
@mroeschke mroeschke merged commit 4f743f9 into pandas-dev:main May 8, 2024
47 checks passed
@mroeschke
Copy link
Member

Great work here @agriyakhetarpal

@agriyakhetarpal agriyakhetarpal deleted the add-emscripten-ci branch May 8, 2024 17:01
@agriyakhetarpal
Copy link
Contributor Author

Thanks for the merge, @mroeschke! I wish to ask if I can write a PR directly for pushing these wheels nightly to the scientific-python Anaconda.org index, or is it preferred to open an issue first and then link a PR to it?

For context, this is a part of the greater goal as highlighted in #57891 (comment), in order to push towards interactive Sphinx-based documentation via JupyterLite. Doing this will be a much more trivial change in comparison; I am happy to contribute further.

@agriyakhetarpal
Copy link
Contributor Author

An example that highlights some interactive documentation is available for the API reference for PyWavelets: https://pywavelets.readthedocs.io/en/latest/ref/dwt-discrete-wavelet-transform.html (work on the Usage Examples section is underway)

@mroeschke
Copy link
Member

Feel free just to open a PR directly to add the wheels configurations to .github/workflows/wheels.yml and pyproject.toml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: out-of-tree Pyodide builds in CI for pandas
4 participants