Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i/prompting: implement path pattern matching and validation #13866

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

olivercalder
Copy link
Member

Path pattern matching is implemented via the doublestar package, which emulates bash's globstar matching. Patterns may include '*' wildcard characters (which match any number of non-separator characters), '**' doublestars (which match zero or more subdirectories), '?' wildcard characters (which match exactly one non-separator character), and nested groups delimited by '{' and '}'. Notably, path patterns are not allowed to have character classes delimited by '[' and ']', nor inverted classes of the form "[^abc]".

There is a limit on the number of groups allowed in path patterns, but up to that limit, groups may be arbitrarily nested or sequential.

This PR is split from #13730.

interfaces/prompting/patterns.go Outdated Show resolved Hide resolved
break
}
switch r {
case '{':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm confused even more now. The spec does not mention {} for path patterns, so why validate this? Also, there's inconsistency between ValidatePathPattern and MatchPathPatter, where the latter is oblivious to {}, which means that a definition of a valid pattern is different for each of those calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatchPathPattern directly calls doublestar.Match to do matching (discussed some here: https://docs.google.com/document/d/1tBnefdukP69EUJOlH8bgD2hrvZCYoE8-1ZlqRRYlOqc/edit?pli=1#heading=h.a0e0vfqho1ip). This should be called on expanded patterns, but this isn't technically necessary, since doublestar.Match happily matches patterns with '{' '}' groups in them (see: https://pkg.go.dev/github.com/bmatcuk/doublestar#Match). Path pattern validation should occur on patterns before they are expanded, so they may contain '{' '}'-defined groups.

}
case '\\':
// Skip next rune
_, _, err = reader.ReadRune()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the escape sequence valid only for some of the characters that follow? eg what would a pattern like \c mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to doublestar.Match (https://pkg.go.dev/github.com/bmatcuk/doublestar#Match), \c matches c for any character c. I'm not adjusting that behavior at all, at the moment.

Comment on lines +98 to +101
matched, err := doublestar.Match(pattern, path)
if err != nil {
return false, err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't the pattern be validated prior to being used? Perhaps ValidatePathPatter should call doublestar.Match with say "/" as path, discard the result and only check the error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doublestar.ValidatePattern is called as part of ValidatePathPattern, so the pattern should be valid. This error should thus never occur when the PathPatternMatch is called in practice, but it's a check in case PathPatternMatch is called in isolation without being validated. So I'm inclined to leave the error return in place rather than discarding the error, for completeness.

Path pattern matching is implemented via the doublestar package, which
emulates bash's globstar matching. Patterns may include '*' wildcard
characters (which match any number of non-separator characters), '**'
doublestars (which match zero or more subdirectories), '?' wildcard
characters (which match exactly one non-separator character), and nested
groups delimited by '{' and '}'. Notably, path patterns are *not* allowed
to have character classes delimited by '[' and ']', nor inverted
classes of the form "[^abc]".

There is a limit on the number of groups allowed in path patterns, but
up to that limit, groups may be arbitrarily nested or sequential.

Signed-off-by: Oliver Calder <oliver.calder@canonical.com>

i/prompting: fix typo and add notes to remove test boilerplate

Signed-off-by: Oliver Calder <oliver.calder@canonical.com>

i/prompting: use separate test suite for patterns

Signed-off-by: Oliver Calder <oliver.calder@canonical.com>

i/prompting: improve unit test coverage

Signed-off-by: Oliver Calder <oliver.calder@canonical.com>
@olivercalder olivercalder force-pushed the prompting-path-pattern-matching-validation branch from 1cab837 to 6e34c06 Compare May 2, 2024 04:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Samuele review Needs a review from Samuele before it can land
Projects
None yet
2 participants