Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Cargo include/exclude rules to gitignore patterns #4268

Closed
behnam opened this issue Jul 10, 2017 · 47 comments · Fixed by #7170
Closed

Change Cargo include/exclude rules to gitignore patterns #4268

behnam opened this issue Jul 10, 2017 · 47 comments · Fixed by #7170
Assignees

Comments

@behnam
Copy link
Contributor

behnam commented Jul 10, 2017

Based on discussion on this PR: #4256

Current status on master: Stage 1.1.

FAQ

How to migrate?

If you are receiving warnings about some include/exclude pattern (like xyz) matching new files in the future, and that's not what you want, please consider adding a leading slash to the pattern (xyz becomes /xyz). (Pending on PR: #4378)

Got problems with the Warnings?

If you are here because of a cargo warning that doesn't look right, or you cannot find a fix that works in both old (current) and new interpretations, please file an issue, with content of the package.include/package.exclude configs in your manifest file, as well as a copy of the warnings you have received.

Also, don't forget to put a link to this issue in your report, and mention the POC (@behnam) in the issue.

Goal

The current interpretation of Cargo's package.include and package.exclude configs is based on UNIX Globs, as implemented in the glob crate. Now we want these configs to work as similar to gitignore as possible. The gitignore specification is also based on Globs, but has a bunch of additional features that enable easier pattern writing and more control. Therefore, we are migrating the interpretation for the rules of these configs to use the ignore crate, and treat each include/exclude rule as a single line in a gitignore file.

Major features supported by the new (gitignore-like) interpretation are:

  • Allow matching patterns fixed on the package root by using a leading slash (/) character.

    For example, exclude = [ "/data" ] will exclude file/directory named data under the root package directory. No other file/directory named data (like, /src/data) will be excluded.

  • Matching all paths under a directory by only naming the directory itself.

    For example, you can have include = [ "/src" ] instead of include = [ "/src/*" ] or include = [ "/src/**" ].

  • Support patterns with a trailing slash (/), which mean only match directories.

    For example, you can have exclude = [ "tables/" ] to exclude all directories named tables, with no effect on files.

Migration Stages

Stage 1. Implement new interpretation of include/exclude rules (the gitignore sytanx), compare the results with existing method, and warn if they don't agree on paths.

Also, error on any usage of negated patterns (rules staring with !), as they are treated literally at the moment, and will get a special meaning after the switch, which we want disabled by default. (! is the only new special character.)

Stage 1.1. Add leading-slash support to existing glob-based matching, to enable migration for patterns relying on matching the items in root. (Like "src/**/*", which should match other directories named src, except in the root.)

Stage 2. After at least one release and updating the docs, change the default to the new approach; and still warn if the new approach doesn't agree with the old approach.

Stage 3. In a later release, drop the old approach.

Possible Follow-ups

  • Enable negated patterns (rules staring with !) for both include/exclude rules. This gives more control to users.

  • Allow having both include/exclude rules at the same time. Not exactly related to this change, but easy and intuitive to do after these changes.

  • If needed, update workspace.exclude to stay close to how package.exclude works.

Fixes #3578

@behnam
Copy link
Contributor Author

behnam commented Jul 11, 2017

Turns out ripgrep's ignore also has some issues in edge cases. I have added integration tests that reveal the bugs: BurntSushi/ripgrep#551

I'm going to expand the test and cover more cases and make sure ignore package is in good shape before we move forward here.

behnam added a commit to behnam/rust-cargo that referenced this issue Jul 18, 2017
…ches

Add gitignore-like pattern matching logic to `list_files()` and throw
warnings for paths getting different inclusion/exclusion results from
the old and the new methods.

Migration Tracking: <rust-lang#4268>
bors added a commit that referenced this issue Jul 18, 2017
[sources/path] Add gitignore-like pattern matching and warn on mismatches

Add gitignore-like pattern matching logic to `list_files()` and throw
warnings for paths getting different inclusion/exclusion results from
the old and the new methods.

Migration Tracking: <#4268>
behnam added a commit to behnam/rust-cargo that referenced this issue Jul 19, 2017
bors added a commit that referenced this issue Jul 19, 2017
[src/doc/manifest] Add section on Migrating to `gitignore`-like pattern matching

Tracking issue: #4268
@alexcrichton
Copy link
Member

When munging around in the manifests today I realized that we've got another place where we're using the glob crate but probably shouldn't be. The workspace::expand_member_path function uses glob to expand glob inputs and it's actually a bug that the is_excluded function doesn't respect globs, but that's being fixed in #4297 (review)

@alexcrichton
Copy link
Member

@behnam would you be interested in looking to remove the usage of glob in workspace interpretation? I found it to be somewhat nontrivial but I don't think we'll have many backwards compatibility concerns as it was very very recently added.

@behnam
Copy link
Contributor Author

behnam commented Jul 19, 2017

Right, @alexcrichton. I added that to the follow up list yesterday and was thinking to get to it when getting to Stage 2. But thinking about it again, we can add a similar warning mechanism there and move forward changes on workspace.exclude (and related configs) alongside the current plan for package.exclude (and friends).

So, yes, I'll start working on it later today.

@alexcrichton
Copy link
Member

Thanks @behnam! Although I still need to get #4297 landed...

@behnam
Copy link
Contributor Author

behnam commented Jul 24, 2017

Sure, @alexcrichton. Please go ahead and land #4297.

Getting gitignore-like matching work for workspaces is a bit more complicated than for package include/exclude.

In package include/exclude, all paths are guaranteed to be a child of the root path, which is where the cargo manifest sits. This is similar to gitignore behavior, where rules only apply to children paths of some root, which is where the .gitignore file sits, or the git repo root. That's why I have hard-coded the assumption as debug-assertion in the matching logic: BurntSushi/ripgrep#551.

In workspace members/exclude configs—similar to package dependencies—the path can point to directories outside of the package root, usually a sibling directory. If we want to change their interpretation to gitignore-like matching, we need to first specify the behavior.

The new behavior may not be hard to define, but it would be something new that we need to keep track of. Also, there would be a question of where it belongs to the ripgrep implementation, or should we hard-code this special case in a local abstraction in cargo.

I'm still pondering about how to implement it. Would be glad to hear your thoughts on this, too.

@alexcrichton
Copy link
Member

Hm yeah I wasn't sure if this'd be a great application for it :(. I wonder if maybe just exclude should be gitignore-like and otherwise include shoudl be glob based?

@behnam
Copy link
Contributor Author

behnam commented Jul 26, 2017

otherwise include should be glob based?

You mean workspace.members, right? Yes, I think globs make much more sense in that case. They are supposed to match directories (and not files), and they are usually expected to list patterns matching non-children paths.

Then, we can limit workspace.exclude to only paths under the root, since that's where auto-inclusion happens. And that resolves the aforementioned matching design issue.

From How to teach this? perspective, I think we can be more explicit with the package root directory concept, and that's the only place with auto-inclusion of files and packages. And, we can also say that in all package.include, package.exclude, workspace.exclude options, the patterns match against this package root directory.

@matklad
Copy link
Member

matklad commented Apr 16, 2018

@behnam curious, what is the current state of affairs here? I think we were intending to fix some rather bad bug with this (don't remember which one), but looks like this is stalled and the original bug is still present? It may be me just imaging things of course :-)

@behnam
Copy link
Contributor Author

behnam commented Apr 17, 2018

We haven't had a report of any issues for a couple of releases, so it should be safe now to flip the switch. @alexcrichton?

@alexcrichton
Copy link
Member

Sounds good to me!

@matklad
Copy link
Member

matklad commented Jul 20, 2018

@behnam would you want to send the PR to change the behavior then?

@behnam
Copy link
Contributor Author

behnam commented Jul 20, 2018

I can do so in 3-4 weeks from now. If anyone else can do so earlier, please feel free to take over.

Side note: I think we want to still keep both logic and earn against mismatched, but updating the wording to denote the change.

Also need to update the docs at the same time as flipping the switch.

behnam added a commit to behnam/rust-cargo that referenced this issue Nov 12, 2018
Enable gitignore-like pattern matching for package's `include`/`exclude`
rules, as suggested in rust-lang#4268 and initially implemented in rust-lang#4270.

We still warn on projects affected by the divergent behavior. We can
drop the warning and clean up the source and tests in one or more
releases.
bjfish added a commit to bjfish/wasmer that referenced this issue Dec 14, 2018
warning: Pattern matching for Cargo's include/exclude fields is changing and file `emtests/README.md` WILL be included in a future Cargo version.
See rust-lang/cargo#4268 for more info
warning: Pattern matching for Cargo's include/exclude fields is changing and file `spectests/README.md` WILL be included in a future Cargo version.
See rust-lang/cargo#4268 for more info
@nox
Copy link
Contributor

nox commented Feb 17, 2019

The following warning is emitted when I publish inert:

Pattern matching for Cargo's include/exclude fields is changing and file .vscode/settings.json WILL be excluded in a future Cargo version.

With the following package.exclude setting:

https://github.com/nox/inert/blob/ea06ac74f2cf3ee83672ed9ccabc09635dd5aa1a/Cargo.toml#L12

@behnam
Copy link
Contributor Author

behnam commented Feb 23, 2019

@nox, I believe that's because "/.vscode" doesn't match any files with the glob rules (matches the dir only), and changing it to "/.vscode/**" should silent the warning.

Also, AFAIU, that's an indicator of the current (glob) pattern not doing what you think it's doing, and the content of that dir, /.vscode, is present in your package, specifically /.vscode/settings.json. Could you plz double-check this, to make sure we're understanding the situation correctly.

@behnam
Copy link
Contributor Author

behnam commented Feb 23, 2019

About the status of this migration, I have tried getting back to it, but cannot get cargo package build properly on my MacPort-enabled system (conflicts of git and iconv libs). I'm not sure when I can spend time on it again.

If anyone can volunteer to take this step, it would be really appreciated. See 6b61d98 for the changes need to get to Stage 2.

@nox
Copy link
Contributor

nox commented Feb 24, 2019

@nox, I believe that's because "/.vscode" doesn't match any files with the glob rules (matches the dir only), and changing it to "/.vscode/**" should silent the warning.

Oh I see. Shouldn't it already behave as if I specified "/.vscode/**" though? Why is the glob needed? Is that going to change when Cargo starts relying on .gitignore? The Git mechanism does allow to forbid an entire tree by just ignoring its top directory.

@mathstuf
Copy link
Contributor

@nox Yes, that's what the warning is about. Cargo doesn't behave as if /** were appended now, but once the gitignore behavior lands, it will. Hence the notice about the behavior change.

linabutler added a commit to mozilla/dogear that referenced this issue Mar 4, 2019
I thought this is how Cargo works today, but it's not. See
rust-lang/cargo#4268.
bors added a commit that referenced this issue May 10, 2019
Migrate package include/exclude to gitignore patterns.

This moves to the next phase of #4268.

This also includes a fdew more changes which can be removed if desired:
- Add support for `!` negate gitignore patterns.
- Add a warning if both package.include and package.exclude are specified.
@DoumanAsh
Copy link

DoumanAsh commented May 22, 2019

Could please someone clarify the current state of cargo pattern matching?

I got warnings suddenly and looking at this issue, decided to change from using include to exclude for single directory for C sources
"/brotli/" - which I expect to exclude directory brotli in root of crate.

But currently it throws warnings on all files inside of it, so I'm a bit unsure whether we still use glob patterns in nightly or not?
The same if I remove leading slash, but I want to exclude only directory in root of crate.

@ehuss
Copy link
Contributor

ehuss commented May 22, 2019

@DoumanAsh the latest nightly just switched to gitignore-style patterns (#6924) (docs).

/brotli/ differs in how it is interpreted by the old system (glob) and the new one (ignore). It's just showing a warning to let you know that it has changed. I'm planning to remove the warning in 1.38. If I'm reading your description correctly, you can ignore it if you only use nightly to publish, since /brotli/ matches everything. If you want to avoid the warning, I think /brotli/** works on both systems without warnings. When in doubt, run cargo package --list to see what will be included.

@DoumanAsh
Copy link

Ah I see, thank you. --list confirms that it works as I intended.

@XAMPPRocky
Copy link
Member

Is there anything left to be implemented or can this issue be closed? Probably at very least there needs to be a PR to update the manifest documentation.

@ehuss
Copy link
Contributor

ehuss commented Jun 25, 2019

@XAMPPRocky I plan to remove the warning in 1.38, so I'd like to leave it open until then.

The documentation was updated in the last PR. You can put "nightly" in the docs path to see the latest.

bors added a commit that referenced this issue Jul 23, 2019
Remove include/exclude glob warning.

This removes the warning when a package include/exclude pattern produces different results from glob vs gitignore.

The pre-warnings were added in 1.21 (released 2017-10-12) #4270.
The migration was done in 1.36 (released 2019-07-04) #6924.
This will remove the warnings in 1.38 (to be released 2019-09-26).

Closes #4268
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.