Support for pattern matching across fields #107

Open
embano1 opened this issue Jun 27, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@embano1
Member

embano1 commented Jun 27, 2023

What is your idea?

This has come up in several user discussions, especially around change data capture (CDC), which is often used with EventBridge Pipes and the EventBridge Kafka Sink Connector, so I am posting it here as a feature request.

Currently it is not possible to match a pattern across fields in a JSON object. Take a DynamoDB stream event (snippet below): I might want to filter events that have a common (or different) value across fields, such as the value of "S" under "Status" in "NewImage" and "OldImage".

{  
  "NewImage" : { 
    "Status" : { 
      "S" : "OK"
    } 
  },
  "OldImage": { 
    "Status": { 
      "S": "ERROR" 
    } 
  }  
}

Another (non-CDC) example:

{
  "detail": {
    "brand": "Ford",
    "model": "Focus",
    "colour": "Blue",
    "interior-colour": "Black"
  }
}

Suggested pattern:

{
  "detail": {
    "interior-colour": [ { "anything-but": "$.detail.colour" } ]
  }
}
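To make the intended semantics of the suggested pattern concrete, here is a minimal sketch of the comparison it would have to perform. It is not Ruler code: `CrossFieldSketch`, `resolve` and `anythingButField` are hypothetical names, the event is assumed to be already parsed into nested maps, only simple `$.a.b` paths are handled, and treating a missing field on either side as "no match" is an assumption rather than something the issue specifies.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Ruler: compares two fields of one event.
final class CrossFieldSketch {

    // Walks a dotted path such as "$.detail.colour"; empty when any step is missing.
    static Optional<Object> resolve(Map<String, Object> event, String path) {
        if (!path.startsWith("$.")) {
            throw new IllegalArgumentException("only simple $.a.b paths are supported: " + path);
        }
        Object current = event;
        for (String key : path.substring(2).split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            current = ((Map<?, ?>) current).get(key);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }

    // "anything-but" across fields: both fields must exist (assumption) and differ.
    static boolean anythingButField(Map<String, Object> event, String fieldPath, String otherPath) {
        Optional<Object> field = resolve(event, fieldPath);
        Optional<Object> other = resolve(event, otherPath);
        return field.isPresent() && other.isPresent() && !field.get().equals(other.get());
    }

    public static void main(String[] args) {
        Map<String, Object> event = Map.of("detail", Map.of(
                "brand", "Ford", "model", "Focus",
                "colour", "Blue", "interior-colour", "Black"));
        // The interior colour differs from the exterior colour, so this event matches.
        System.out.println(anythingButField(event, "$.detail.interior-colour", "$.detail.colour"));
    }
}
```

A real implementation would have to fit this into Ruler's state machine instead of resolving paths per rule, which is where the performance questions further down the thread come from.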

Would you be willing to make the change?

No

Additional context

Add any other context (such as images, docs, posts) about the idea here.

embano1 added the enhancement label Jun 27, 2023
@baldawar
Collaborator

One follow-up thought on this issue: if we were to support more payload formats (like Avro or Protobuf), then using $. as an indicator that we're using JSONPath would be confusing. We can work around this by following a matcher-like pattern. For example, [ { 'jsonpath' : '$.detail.colour' } ].

@timbray
Collaborator

timbray commented Jun 29, 2023

I agree with Rishi on syntax. I worry about implementation. If there are a bunch of these rules, the match-finding performance obviously becomes linear in their number. Right?

@embano1
Member Author

embano1 commented Jun 30, 2023

> I agree with Rishi on syntax. I worry about implementation

+1

> If there are a bunch of these rules, the match-finding performance obviously becomes linear in their number. Right?

That is definitely a concern, so we might have to impose limits, e.g., on the number of such comparisons and on the path depth in a rule.
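As a rough illustration of what such limits could look like, the sketch below rejects a rule whose cross-field references exceed a maximum count or path depth. Everything here is hypothetical: `CrossFieldLimits`, `depth`, `validate` and the two constants are made up for this example, and the thread does not propose concrete numbers.

```java
import java.util.List;

// Hypothetical guard, not an existing Ruler API: rejects rules that use too many
// cross-field comparisons or reference paths that are too deep.
final class CrossFieldLimits {
    // Illustrative values only; the thread does not propose concrete numbers.
    static final int MAX_COMPARISONS = 5;
    static final int MAX_PATH_DEPTH = 4;

    // Depth of "$.detail.colour" is 2: the number of keys after the "$" root.
    static int depth(String path) {
        if (!path.startsWith("$.") || path.length() == 2) {
            throw new IllegalArgumentException("expected a path like $.a.b, got: " + path);
        }
        return path.substring(2).split("\\.").length;
    }

    // Throws when the referenced paths of one rule exceed either limit.
    static void validate(List<String> referencedPaths) {
        if (referencedPaths.size() > MAX_COMPARISONS) {
            throw new IllegalArgumentException(
                    "too many cross-field comparisons: " + referencedPaths.size());
        }
        for (String path : referencedPaths) {
            if (depth(path) > MAX_PATH_DEPTH) {
                throw new IllegalArgumentException("path too deep: " + path);
            }
        }
    }

    public static void main(String[] args) {
        validate(List.of("$.detail.colour"));
        System.out.println("rule accepted");
    }
}
```

Checking the limits when a rule is added, not when an event is matched, keeps the cost off the matching path.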

@baldawar
Collaborator

baldawar commented Jul 5, 2023

Wildcard is a good reference. Depending on how many you use, the performance degrades but there's a way to understand the worst-case complexity upfront via the evaluator. We could offer a similar solution or update the existing evaluator.

There are also some improvements we can make by trading off other dimensions. Take this rule:

{
    "first": {
        "second": [{
            "jsonpath": "$.detail.colour"
        }]
    }
}

This can be recompiled to two sub-rules:

{
    "$equal": [{
            "jsonpath": "$.detail.colour"
        },
        {
            "jsonpath": "$.first.second"
        }
    ]
}

and then perform the computation similar to $or

private static void parseIntoOrRelationship(final List<Map<String, List<Patterns>>> rules,

or by making both of these terminal rules for a match

Set<Double> terminalSubRuleIds = nameState.getTerminalSubRuleIdsForPattern(pattern);

There will always be some cost, but if many folks have to build their own inefficient wrappers around Ruler to support this behaviour, then it's best to just include it.
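The recompilation step described in this comment can be sketched roughly as follows. This is only an illustration under assumptions, not Ruler's implementation: `EqualRewriteSketch`, `EqualPair`, `rewrite`, `resolve` and `holds` are hypothetical names, rules and events are assumed to be already parsed into nested maps and lists, and only simple `$.a.b` paths are handled. It does not touch Ruler's actual `$or` handling or its terminal sub-rule machinery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Ruler code: rewrites { "jsonpath": ... } matchers into
// explicit equality pairs and checks them against an event parsed into nested maps.
final class EqualRewriteSketch {

    // One "$equal" sub-rule: the field the matcher sat on and the path it referenced.
    static final class EqualPair {
        final String fieldPath;
        final String referencedPath;

        EqualPair(String fieldPath, String referencedPath) {
            this.fieldPath = fieldPath;
            this.referencedPath = referencedPath;
        }
    }

    // Collects an EqualPair for every { "jsonpath": ... } matcher found in the rule.
    static List<EqualPair> rewrite(Map<String, Object> rule) {
        List<EqualPair> pairs = new ArrayList<>();
        collect(rule, "$", pairs);
        return pairs;
    }

    private static void collect(Map<?, ?> node, String prefix, List<EqualPair> out) {
        for (Map.Entry<?, ?> entry : node.entrySet()) {
            String path = prefix + "." + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                collect((Map<?, ?>) value, path, out);
            } else if (value instanceof List) {
                for (Object matcher : (List<?>) value) {
                    if (matcher instanceof Map) {
                        Object ref = ((Map<?, ?>) matcher).get("jsonpath");
                        if (ref instanceof String) {
                            out.add(new EqualPair(path, (String) ref));
                        }
                    }
                }
            }
        }
    }

    // Resolves a simple "$.a.b" path; returns null when any step is missing.
    static Object resolve(Map<String, Object> event, String path) {
        Object current = event;
        for (String key : path.substring(2).split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // The pair holds when both paths resolve and their values are equal.
    static boolean holds(Map<String, Object> event, EqualPair pair) {
        Object left = resolve(event, pair.fieldPath);
        Object right = resolve(event, pair.referencedPath);
        return left != null && left.equals(right);
    }

    public static void main(String[] args) {
        Map<String, Object> rule = Map.of("first",
                Map.of("second", List.of(Map.of("jsonpath", "$.detail.colour"))));
        Map<String, Object> event = Map.of(
                "first", Map.of("second", "Blue"),
                "detail", Map.of("colour", "Blue"));
        for (EqualPair pair : rewrite(rule)) {
            System.out.println(pair.fieldPath + " == " + pair.referencedPath
                    + " -> " + holds(event, pair));
        }
    }
}
```

Doing the rewrite once at rule-compile time means the per-event work is limited to resolving two paths per pair, which is the linear cost discussed earlier in the thread.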
