Multiple --not-match-f override each other, instead of being applied additively #724

Open
includesec-erik opened this issue May 9, 2023 · 5 comments

@includesec-erik

includesec-erik commented May 9, 2023

Describe the bug
When running the following command (or any similar command with more than one --not-match-f), I expect every file whose basename ends with _test.go or _proto.go to be excluded from cloc's count. Instead, cloc appears to honor only a single --not-match-f (I believe the last one given) rather than all of them.

cloc . --not-match-f=".*\_proto.go" --not-match-f=".*\_test.go"
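
To make the difference concrete, here is a small Python sketch of the two behaviors on made-up basenames. It is not cloc's code: the "last pattern wins" function models the behavior as reported (which is itself the reporter's guess), and the sketch assumes the filter is an unanchored regex search on the basename.

```python
import re

# Patterns copied from the command above.
PATTERNS = [r".*\_proto.go", r".*\_test.go"]

# Hypothetical file basenames, for illustration only.
FILES = ["main.go", "api_proto.go", "server_test.go", "util.go"]


def keep_last_only(files, patterns):
    """Model of the reported behavior: only the last pattern filters."""
    last = re.compile(patterns[-1])
    return [f for f in files if not last.search(f)]


def keep_additive(files, patterns):
    """Model of the expected behavior: drop a file if any pattern matches."""
    compiled = [re.compile(p) for p in patterns]
    return [f for f in files if not any(c.search(f) for c in compiled)]


print(keep_last_only(FILES, PATTERNS))  # ['main.go', 'api_proto.go', 'util.go']
print(keep_additive(FILES, PATTERNS))   # ['main.go', 'util.go']
```

In the first result api_proto.go survives because only the _test.go pattern was applied, which matches the symptom described above.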

cloc; OS; OS version

  • cloc version: 1.96
  • Perl version: v5.14.2
  • Ubuntu Linux

To Reproduce
See comment on this youtube video for repro:
https://www.youtube.com/watch?v=eRLTkDMsCqs

Expected result
cloc applies all --not-match-f filters instead of only one.

Thanks for considering this, Al. Perhaps cloc's default behavior for this command line option could be changed so that repeated filters are additive, instead of only a single filter being respected? Apparently this unexpected behavior has been around a while!

BTW, does this also apply to the --match-f, --match-d, and --not-match-d command line options?

@AlDanial
Owner

None of the --match-* or --not-match-* switches may be repeated. I didn't see the need since a single regex can handle multiple cases. Your two --not-match-f cases can be condensed to

cloc . --not-match-f=".*\_(proto|test).go"
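
As a quick sanity check of the condensed pattern, the following Python sketch compares it against the two original patterns on a few made-up basenames. It assumes the filter is an unanchored regex search on the file's basename; the sample names are hypothetical, and Python's `re` module stands in for Perl's regex engine here.

```python
import re

# The two patterns from the original report and the condensed form.
separate = [re.compile(r".*\_proto.go"), re.compile(r".*\_test.go")]
combined = re.compile(r".*\_(proto|test).go")

# Hypothetical basenames, for illustration only.
names = ["main.go", "api_proto.go", "server_test.go", "proto.go", "x_testing.go"]


def excluded_by_separate(name):
    """True if either of the two original patterns matches."""
    return any(p.search(name) for p in separate)


def excluded_by_combined(name):
    """True if the single condensed pattern matches."""
    return combined.search(name) is not None


for name in names:
    # Both approaches should agree on every name.
    assert excluded_by_separate(name) == excluded_by_combined(name)
    print(name, excluded_by_combined(name))
```

Note that the unescaped `.` before `go` matches any character. A stricter variant such as `_(proto|test)\.go$` would anchor the match to the end of the name, though that is a suggestion rather than something discussed in this thread.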

I'm sure I'm overlooking situations where multiple copies of --not-match-f really are necessary. If you can describe such a use case I'll update the code to accommodate it.

@includesec-erik
Author

includesec-erik commented May 11, 2023

Hi @AlDanial, thanks for the reply! Given your info, I'd categorize this as an enhancement request issue, not a bug.

You're correct in stating that all possible matches can be thought of and specified in a single regex, thanks for pointing that out.

I would say, though, that for users who are less experienced with regex, or when I'm trying to explain to another party how to use cloc over email or a phone call, it is tremendously simpler to use multiple parameters to build a list of filters. From what I've seen working with other tech professionals who use other command line tools, an additive list of filters is a commonly expected pattern that works in other tools (Tokei, for instance).
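
For users who would rather maintain a plain list of patterns, one workaround while the switch cannot be repeated is to join the list into a single alternation before calling cloc. A minimal Python sketch follows; `combine_patterns` and `cloc_args` are hypothetical helper names, not part of cloc, and the only cloc option used is the `--not-match-f` switch from this thread.

```python
import re
import shlex


def combine_patterns(patterns):
    """Join several regexes into one alternation.

    Each pattern is wrapped in a non-capturing group so that an
    alternation inside one pattern cannot leak into its neighbors.
    """
    return "|".join(f"(?:{p})" for p in patterns)


def cloc_args(path, patterns):
    """Build an argument list with a single --not-match-f (hypothetical helper)."""
    return ["cloc", path, f"--not-match-f={combine_patterns(patterns)}"]


patterns = [r".*\_proto.go", r".*\_test.go"]
args = cloc_args(".", patterns)

# Print a shell-quoted command line that could be pasted into a terminal.
print(shlex.join(args))

# Check that the joined pattern still matches what each part matched.
joined = re.compile(combine_patterns(patterns))
assert joined.search("api_proto.go")
assert joined.search("server_test.go")
assert not joined.search("main.go")
```

The list stays readable and easy to explain, while cloc still receives one regex.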

I totally understand why you might want to decline this enhancement request if implementing the behavior change is a big ask, but if it is a smaller ask, please consider it! Thank you.

@AlDanial
Owner

It's not a big ask and I'm familiar with additive options (cloc's --force-lang and --script-lang can be specified multiple times). Still, the request will need to go on the back burner until I finish #722 (which will take me some time to implement cleanly).
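
For readers unfamiliar with the pattern, a repeatable (additive) option collects every occurrence into a list instead of keeping only the last one. The sketch below shows the idea with Python's `argparse`. It is an analogy only: cloc is written in Perl and its actual option handling is not shown in this thread. The program name `count-sketch` is made up.

```python
import argparse


def parse(argv):
    """Parse a command line where --not-match-f may be given several times."""
    parser = argparse.ArgumentParser(prog="count-sketch")
    parser.add_argument("path")
    # action="append" accumulates each occurrence into a list.
    parser.add_argument(
        "--not-match-f",
        action="append",
        default=[],
        dest="not_match_f",
        metavar="REGEX",
    )
    return parser.parse_args(argv)


args = parse([".", r"--not-match-f=.*\_proto.go", r"--not-match-f=.*\_test.go"])
print(args.not_match_f)  # ['.*\\_proto.go', '.*\\_test.go']
```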

@includesec-erik
Author

includesec-erik commented May 11, 2023

Sounds good @AlDanial
Fight the good fight against Text::Glob!

AlDanial added a commit that referenced this issue May 27, 2023:
this is a partial implementation, still need --match-f, --match-d

@AlDanial
Owner

I've begun work on this; try the latest commit to kick the tires on additive --not-match-f and --not-match-d
