Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW: Better package splitting for multi-output recipes with negative glob in outputs/files #5216

Open
wants to merge 42 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
42 commits
Select commit Hold shift + click to select a range
a545986
conda_build.build: simple implementation of files exclusion
duncanmmacleod Feb 19, 2021
14e3548
tests: add test for glob exclusion in files
duncanmmacleod Feb 19, 2021
7fbc142
add docs for negative file matches
duncanmmacleod Feb 19, 2021
67904ce
news: add negative file match news entry
duncanmmacleod Feb 19, 2021
e56c573
conda_build.build: support separate include/exclude file lists
duncanmmacleod Nov 9, 2021
be19766
Merge branch 'main' into files-exclude
carterbox Mar 5, 2024
b667724
REF: Only consider new files to the PREFIX when file picking
carterbox Mar 6, 2024
063ff89
Update docs/source/resources/define-metadata.rst
carterbox Mar 6, 2024
1ab7638
Merge branch 'main' into files-exclude
carterbox Mar 8, 2024
de7715d
Update define-metadata.rst
carterbox Mar 8, 2024
f56e201
Update files-exclude.rst
carterbox Mar 8, 2024
0537c19
Merge branch 'main' into files-exclude
carterbox Mar 20, 2024
30a5e1b
STY: Add missing __future__ import
carterbox Mar 20, 2024
a92964f
STY: Address RUFF lints
carterbox Mar 20, 2024
a2e3963
STY: Remove extra whitespace in news
carterbox Mar 20, 2024
bfac4e0
Merge branch 'main' into files-exclude
jaimergp Mar 21, 2024
3c0b090
BUG: Use default value to make API backwards-compatible
carterbox Mar 21, 2024
fd5e956
Update news/files-exclude.rst
carterbox Mar 21, 2024
4c04d49
REF: Prevent errors when outputs/files is None
carterbox Mar 21, 2024
e16c470
Merge remote-tracking branch 'origin/files-exclude' into files-exclude
carterbox Mar 21, 2024
e598d6a
TST: Robust testing of new filematching
carterbox Mar 21, 2024
8dd906d
TST: Use defaults packages instead of conda-forge
carterbox Mar 21, 2024
642906b
DOC: Use admonition to warn users about explicit file lists
carterbox Mar 28, 2024
d1b12a1
DOC: Update explicit file lists docs
carterbox Mar 28, 2024
9dbd17c
DOC: Fix RST syntax
carterbox Apr 2, 2024
c3a26dd
DOC: Fix RST syntax
carterbox Apr 2, 2024
84aad48
DOC: Fix RST syntax
carterbox Apr 2, 2024
249a065
DOC: Fix RST syntax
carterbox Apr 2, 2024
55248d4
Merge branch 'main' into files-exclude
carterbox Apr 3, 2024
e42b6f0
TST: Prevent missing glob match error in test
carterbox Apr 3, 2024
a62e4b9
DOC: Fix RST syntax
carterbox Apr 10, 2024
e28565f
Update news to new template style
carterbox Apr 10, 2024
ee89c04
DOC: Fixup syntax highlighting in adjacent section
carterbox Apr 10, 2024
b0645c2
Merge branch 'main' into files-exclude
carterbox Apr 15, 2024
822b477
TST: Mark subpackage tests as serial tests to avoid errors on osx
carterbox Apr 23, 2024
475c0ec
Merge branch 'main' into files-exclude
carterbox Apr 23, 2024
4f988f9
TST: Avoid package collisions in subpackage tests
carterbox Apr 25, 2024
b33f6f6
TST: Avoid more package collisions in tests
carterbox Apr 25, 2024
256017a
Fix tests
isuruf May 31, 2024
e5fdcbf
Merge branch 'main' into files-exclude
isuruf May 31, 2024
606ee08
Update docs/source/resources/define-metadata.rst
isuruf May 31, 2024
6d33054
Remove serial
isuruf May 31, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
33 changes: 25 additions & 8 deletions conda_build/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import subprocess
import sys
import time
import typing
import warnings
from collections import OrderedDict, deque
from os.path import dirname, isdir, isfile, islink, join
Expand Down Expand Up @@ -1802,7 +1803,7 @@ def post_process_files(m: MetaData, initial_prefix_files):
return new_files


def bundle_conda(output, metadata: MetaData, env, stats, **kw):
def bundle_conda(output, metadata: MetaData, env, stats, new_prefix_files: typing.Set[str], **kw):
log = utils.get_logger(__name__)
log.info("Packaging %s", metadata.dist())
get_all_replacements(metadata.config)
Expand Down Expand Up @@ -1915,10 +1916,26 @@ def bundle_conda(output, metadata: MetaData, env, stats, **kw):
if files:
# Files is specified by the output
# we exclude the list of files that we want to keep, so post-process picks them up as "new"
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
if isinstance(files, dict):
# When file matching with include/exclude lists, only
# new_prefix_files are considered. Files in the PREFIX from other
# recipes (dependencies) are ignored
include = files.get("include", [])
exclude = files.get("exclude", [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This syntax will be affected by the "key is set to an empty block, which means None in YAML" kind of issue. We should change it to:

Suggested change
include = files.get("include", [])
exclude = files.get("exclude", [])
include = files.get("include") or []
exclude = files.get("exclude") or []

It might also be a good opportunity to sneak @jakirkham's changes added by #4971.

Copy link
Contributor

@jaimergp jaimergp Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See line 1818:

- files = output.get("files", [])
+ files = output.get("files") or []

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that files.get("include") may return None, but include must be a list. So we use the falsiness of None to replace it with []. Equivalent to:

include = files['include'] if ('include' in files and files['include'] != None) else []

Copy link
Contributor

@jaimergp jaimergp Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might have

- output: something
  files:
    include:
      - path
    exclude:
      - not-this-path  # [linux]

and then you'll have files["exclude"] = None on macOS, for example (linux is false). Not sure at what point we are ensuring that include is a list.

Under that circumstance files.get("exclude", []) will be None instead of the expected []. Is that what you mean? I think we are saying the same 😬

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We are saying the same thing. I'm just trying to figure out if we should be doing something stronger than converting Falsy values into the empty list. Perhaps convert any non-list into a list? Something like:

include = files.get("include") if isinstance(files.get("include"), list) else []

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might silence errors when users populate it with a dict or something accidentally. I think we can just play it simple with falsy promotions.

exclude_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(exclude, metadata.config.host_prefix)
}
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(include, metadata.config.host_prefix)
}
keep_files = new_prefix_files.intersection(keep_files) - exclude_files
jaimergp marked this conversation as resolved.
Show resolved Hide resolved
else:
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
pfx_files = set(utils.prefix_files(metadata.config.host_prefix))
initial_files = {
item
Expand Down Expand Up @@ -2093,7 +2110,7 @@ def bundle_conda(output, metadata: MetaData, env, stats, **kw):
return final_outputs


def bundle_wheel(output, metadata: MetaData, env, stats):
def bundle_wheel(output, metadata: MetaData, env, stats, new_prefix_files: typing.Set[str]):
ext = ".bat" if utils.on_win else ".sh"
with TemporaryDirectory() as tmpdir, utils.tmp_chdir(metadata.config.work_dir):
dest_file = os.path.join(metadata.config.work_dir, "wheel_output" + ext)
Expand Down Expand Up @@ -2817,7 +2834,7 @@ def build(
# This is wrong, files has not been expanded at this time and could contain
# wildcards. Also well, I just do not understand this, because when this
# does contain wildcards, the files in to_remove will slip back in.
if "files" in output_d:
if "files" in output_d and not isinstance(output_d["files"], dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if "files" in output_d and not isinstance(output_d["files"], dict):
if not isinstance(output_d.get("files") or (), dict):

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this condition to:

                    if (
                        "files" in output_d
                        and output_d["files"] is not None
                        and not isinstance(output_d["files"], dict)
                    ):

output_d["files"] = set(output_d["files"]) - to_remove

# copies the backed-up new prefix files into the newly created host env
Expand All @@ -2833,7 +2850,7 @@ def build(
with utils.path_prepended(m.config.build_prefix):
env = environ.get_dict(m=m)
pkg_type = "conda" if not hasattr(m, "type") else m.type
newly_built_packages = bundlers[pkg_type](output_d, m, env, stats)
newly_built_packages = bundlers[pkg_type](output_d, m, env, stats, new_prefix_files)
# warn about overlapping files.
if "checksums" in output_d:
for file, csum in output_d["checksums"].items():
Expand Down
18 changes: 17 additions & 1 deletion docs/source/resources/define-metadata.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1298,7 +1298,7 @@ build prefix. Explicit file lists support glob expressions.
Directory names are also supported, and they recursively include
contents.

.. code-block:: none
.. code-block:: yaml

outputs:
- name: subpackage-name
Expand All @@ -1308,6 +1308,22 @@ contents.
- *.some-extension
- somefolder/*.some-extension

Files can be excluded by specifying `files` as a dictionary separating
files to `include` from those to `exclude`:
carterbox marked this conversation as resolved.
Show resolved Hide resolved

.. code-block:: yaml

outputs:
- name: subpackage-name
files:
include:
- a-file
- a-folder
- *.some-extension
- somefolder/*.some-extension
exclude:
- *.exclude-extension

Scripts that create or move files into the build prefix can be
any kind of script. Known script types need only specify the
script name. Currently the list of recognized extensions is
Expand Down
25 changes: 25 additions & 0 deletions news/files-exclude.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
Enhancements:
-------------

* implemented negative matching for `files` for `outputs`

Bug fixes:
----------

* <news item>

Deprecations:
-------------

* <news item>

Docs:
-----

* <news item>

Other:
------

* <news item>

1 change: 1 addition & 0 deletions tests/test-recipes/split-packages/copying_files/build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@ echo "weee" > $PREFIX/somedir/subpackage_file1
# test glob patterns
echo "weee" > $PREFIX/subpackage_file1.ext
echo "weee" > $PREFIX/subpackage_file2.ext
echo "weee" > $PREFIX/subpackage_file3.ext
9 changes: 6 additions & 3 deletions tests/test-recipes/split-packages/copying_files/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,12 @@ requirements:
outputs:
- name: my_script_subpackage
files:
- subpackage_file1
- somedir
- "*.ext"
include:
- subpackage_file1
- somedir
- "*.ext"
exclude:
- "*3.ext"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we replace the old test with a new one instead of adding a new case? Maybe I would add a new output in this meta.yaml and then test the different behaviors differently, if possible (e.g. how the list-of-str behaviour does copy anything in PREFIX, but include/exclude doesn't?).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the test recipe to include multiple outputs: one uses the old file matching and one uses the new file matching.

test:
script: subpackage_test.py
script_interpreter: python
Original file line number Diff line number Diff line change
Expand Up @@ -31,3 +31,7 @@
contents = contents.decode()
assert "weee" in contents, 'incorrect file contents: %s' % contents
print("glob OK")

filename = os.path.join(os.environ['PREFIX'], 'subpackage_file3.ext')
assert not os.path.isfile(filename)
print("glob exclude OK")