Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish out of tree build system (except xbuildenv deploy) #2823

Merged
merged 43 commits into from Jul 6, 2022

Conversation

hoodmane
Copy link
Member

@hoodmane hoodmane commented Jul 1, 2022

This completes the out of tree build CLI. This PR is paired up with:
numpy/numpy#21895
I have also successfully built scikit-learn, statsmodels, pandas, and astropy with this.

The last thing we need to do after this is set up deployment of the cross build environment. We can deploy one version to s3 for each tagged commit. I will do that in a separate PR after this is merged.

This is on top of #2821.

Checklists

  • Add a CHANGELOG entry
  • Add / update tests
  • Add new / update outdated documentation

]
os.environ["PYTHONPATH"] = ":".join(pythonpath)
try:
hostsitepackages = get_hostsitepackages()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call will fail if we haven't installed the xbuildenv yet.

Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @hoodmane. Probably Roman should review this too, but I think we are going in the proper direction. There are some hard-coded strings that would better be parameterized, but I think that can be done in a separate PR.

So if I understand correctly, this PR is to make the following available?

git clone https://github.com/a-package-that-can-be-built-with-pypa-build/pkg
cd pkg
pip install pyodide-build
pywasm
# pkg-1.0.0-wasm-cp310-cp310-emscripten_3_1_14_wasm32.whl


def main(args: argparse.Namespace) -> None:
xbuildenv_path = Path(args.xbuild_env[0])
version = "2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your future plan of versioning build envs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One version per Pyodide release. After we merge this I am planning to add a CI job to deploy the cross build environment as part of deploy-release. It would also be nice to periodically deploy a "nightly" build at some point. Maybe once a month or once every other week or something.

rmtree(xbuildenv_path, ignore_errors=True)
with NamedTemporaryFile(suffix=".tar") as f:
urlretrieve(
f"http://pyodide-cache.s3-website-us-east-1.amazonaws.com/xbuildenv/{version}.tar",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this your personal S3 bucket?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roman set it up. We could use it for other things too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as long as it remains a bit of dev oriented cache and we make no quality of service promises, I think it should be OK. Eventually, this could probably be also done with github releases though that's harder for nightly deployments.

Also if you can make it a .tar.gz to reduce the bandwidth that would be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is there anything platform/OS dependent in this tar? For instance, I see a,

xbuildenv/site-packages-extras/numpy/core/lib/libnpymath.a

is this intentional?

Should we add the linux-x86_64 suffix to indicate that it's linux specific? And also maybe name it something more explicit than "2.tar" otherwise it's not ideal when downloaded manually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not linux specific, it's an Emscripten static library.

2.tar

I was thinking about naming them by version e.g., 0.21.0.tar.gz. But probably github releases is a good idea for the release versions.

@hoodmane
Copy link
Member Author

hoodmane commented Jul 5, 2022

So if I understand correctly, this PR is to make the following available?

Yup. See here for instance:
https://github.com/hoodmane/numpy/blob/12c656f228eb548a8f633876466345f15e39b095/.github/workflows/emscripten.yml#L49

@ryanking13
Copy link
Member

So if I understand correctly, this PR is to make the following available?

Yup. See here for instance: https://github.com/hoodmane/numpy/blob/12c656f228eb548a8f633876466345f15e39b095/.github/workflows/emscripten.yml#L49

Nice, it looks very cool :)

@hoodmane
Copy link
Member Author

hoodmane commented Jul 6, 2022

@rth would you mind reviewing this?

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! The results that you manage to use it in numpy CI is great!

A few comments below otherwise LGTM.

rmtree(xbuildenv_path, ignore_errors=True)
with NamedTemporaryFile(suffix=".tar") as f:
urlretrieve(
f"http://pyodide-cache.s3-website-us-east-1.amazonaws.com/xbuildenv/{version}.tar",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as long as it remains a bit of dev oriented cache and we make no quality of service promises, I think it should be OK. Eventually, this could probably be also done with github releases though that's harder for nightly deployments.

Also if you can make it a .tar.gz to reduce the bandwidth that would be better.

rmtree(xbuildenv_path, ignore_errors=True)
with NamedTemporaryFile(suffix=".tar") as f:
urlretrieve(
f"http://pyodide-cache.s3-website-us-east-1.amazonaws.com/xbuildenv/{version}.tar",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is there anything platform/OS dependent in this tar? For instance, I see a,

xbuildenv/site-packages-extras/numpy/core/lib/libnpymath.a

is this intentional?

Should we add the linux-x86_64 suffix to indicate that it's linux specific? And also maybe name it something more explicit than "2.tar" otherwise it's not ideal when downloaded manually.

@@ -5,6 +5,7 @@
setuptools.setup(
entry_points={
"console_scripts": [
"pywasm = pyodide_build.out_of_tree.__main__:main",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit confusing with https://github.com/mohanson/pywasm

So I guess it would be a bit difficult to use subcommands here. Sill maybe something like pyodide-build-wasm even if it's a bit more verbose? pywasmbuild would also be better. Ideally, it would be to have pyodide in the name but maybe it will be too complex of a name if we include it.

Copy link
Member

@rth rth Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although if you are calling it as pywasm build (I missed the subcommand part), build in the name would indeed be a bit redundant.

Maybe we should just call this command pyodide and make this the new CLI API? So it would be pyodide build when called in CI of other projects. Or pyodide cross-build or any similar sub-command.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main distinction in my mind is between in-tree and out-of-tree commands. E.g., mkpkg should be a public in-tree command, whereas it makes no sense out of tree. This is an out of tree command. People are likely to only need one CLI as appropriate to whether they are doing an in tree or out of tree build. So I think it makes sense to keep these as separate CLIs.

@hoodmane
Copy link
Member Author

hoodmane commented Jul 6, 2022

Is it okay if we fix the name of the CLI entrypoint and the .tar.gz extension in a subsequent PR? I think it would be easier to handle these things separately.

@rth
Copy link
Member

rth commented Jul 6, 2022

Yes, sure thanks! Very nice progress on this!

In scikit-learn/scikit-learn#23727 I was saying that it's maybe a bit early to run in CI, but it looks like it's closer than I thought thanks to all the work you did! Though having a robust runner for large test suites probably still needs some work, particularly those that can potentially produce fatal errors due to scipy.

@hoodmane hoodmane merged commit 03a05ab into pyodide:main Jul 6, 2022
@hoodmane hoodmane deleted the out-of-tree branch July 6, 2022 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants