Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added filter order optimizations #3389

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

Cali0707
Copy link
Member

This PR brings the filter order optimizations from eventing core to the java filter implementation

Proposed Changes

  • Create a FilterListOptimizer class that handles the optimization loop
  • Use the FilterListOptimizer class with the AnyFilter and AllFilter

Release Note

The Any filter and All filter are now dynamically reordered for performance

Docs

Signed-off-by: Calum Murray <cmurray@redhat.com>
@knative-prow knative-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 10, 2023
@knative-prow
Copy link

knative-prow bot commented Oct 10, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Cali0707

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/data-plane labels Oct 10, 2023
@Cali0707
Copy link
Member Author

/cc @pierDipi

@knative-prow knative-prow bot requested a review from pierDipi October 10, 2023 20:41
@codecov
Copy link

codecov bot commented Oct 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 58.34%. Comparing base (794302d) to head (afaffb0).
Report is 186 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3389      +/-   ##
============================================
- Coverage     61.48%   58.34%   -3.14%     
============================================
  Files           181       91      -90     
  Lines         12356     9279    -3077     
  Branches        265        0     -265     
============================================
- Hits           7597     5414    -2183     
+ Misses         4159     3431     -728     
+ Partials        600      434     -166     
Flag Coverage Δ
java-unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Cali0707
Copy link
Member Author

/hold
Until I run benchmarks to confirm this is faster...

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 11, 2023
import java.util.concurrent.locks.ReadWriteLock;
import org.slf4j.Logger;

public class FilterListOptimizer extends Thread {
Copy link
Member

@pierDipi pierDipi Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thread is way bigger than a go routing and during our scalability tests we reached the maximum thread count (which can't be easily increased in many platforms), so having more threads is a bit problematic until we have loom threads I don't really recommend increasing thread count (1 per consumer group is a lot)

Copy link
Member

@pierDipi pierDipi Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe what we might gain from runtime optimizations we will lose in higher memory usage and risk of blocking the Vertx event loop

Comment on lines 32 to 38
private final List<FilterCounter> filters;

private final ArrayBlockingQueue<Integer> indexSwapQueue;

private final FilterListOptimizer filterListOptimizer;

private final ReadWriteLock readWriteLock;
Copy link
Member

@pierDipi pierDipi Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, Vertx doesn't like locks and blocking operations, I'm pretty sure with this implementation we will block the event loop and that causes basically to block any event delivery for a particular trigger

Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707
Copy link
Member Author

@pierDipi @matzew not sure if either of you have experience with Vert.x + JMH, but I am getting the following error while running benchmarks and am not sure how to fix it:

<JMH had finished, but forked VM did not exit, are there stray running threads? Waiting 24 seconds more...>

Non-finished threads:

Thread[#33,DestroyJavaVM,5,main]

Thread[#32,vert.x-eventloop-thread-0,5,main]
  at java.base@20.0.2/sun.nio.ch.EPoll.wait(Native Method)
  at java.base@20.0.2/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:121)
  at java.base@20.0.2/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:130)
  at java.base@20.0.2/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:147)
  at app//io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
  at app//io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:879)
  at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:526)
  at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
  at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
  at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  at java.base@20.0.2/java.lang.Thread.runWith(Thread.java:1636)
  at java.base@20.0.2/java.lang.Thread.run(Thread.java:1623)

Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 17, 2023
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2023
@Cali0707
Copy link
Member Author

Cali0707 commented Oct 18, 2023

/unhold
/cc @pierDipi @matzew @aliok @creydr
With these changes we see the following performance tradeoffs:

AnyFilter

 name                                                                                                         mode    old count   new count   units    diff        
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterCreation                            thrpt   4.535       2.046       ops/us   -54.88423%  
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterEvaluation                          thrpt   37.575      60.47       ops/us   +60.93147%  
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterCreation     thrpt   2.994       1.487       ops/us   -50.33400%  
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterEvaluation   thrpt   23.86       45.153      ops/us   +89.24141%  
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterCreation                                          thrpt   2.977       1.506       ops/us   -49.41216%  
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterEvaluation                                        thrpt   21.248      62.942      ops/us   +196.22553% 
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterCreation                                        thrpt   2.994       1.509       ops/us   -49.59920%  
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterEvaluation                                      thrpt   107.058     88.027      ops/us   -17.77635%  
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterCreation                                       thrpt   2.832       1.465       ops/us   -48.26977%  
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterEvaluation                                     thrpt   106.5       87.972      ops/us   -17.39718%  
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterCreation                                 thrpt   8.954       2.513       ops/us   -71.93433%  
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterEvaluation                               thrpt   122.93      91.047      ops/us   -25.93590%  

AllFilter

 name                                                                                 mode    old count   new count   units    diff       
 AllFilterBenchmark.AllFilterFirstMatchEndOfArray.benchmarkFilterCreation             thrpt   2.698       1.518       ops/us   -43.73610% 
 AllFilterBenchmark.AllFilterFirstMatchEndOfArray.benchmarkFilterEvaluation           thrpt   69.308      68.486      ops/us   -1.18601%  
 AllFilterBenchmark.AllFilterFirstMatchStartOfArray.benchmarkFilterCreation           thrpt   2.746       1.461       ops/us   -46.79534% 
 AllFilterBenchmark.AllFilterFirstMatchStartOfArray.benchmarkFilterEvaluation         thrpt   42.821      51.506      ops/us   +20.28210% 
 AllFilterBenchmark.AllFilterMatchAllSubFilters.benchmarkFilterCreation               thrpt   2.797       1.42        ops/us   -49.23132% 
 AllFilterBenchmark.AllFilterMatchAllSubFilters.benchmarkFilterEvaluation             thrpt   17.866      18.442      ops/us   +3.22400%  
 AllFilterBenchmark.AllFilterNoMatchingFilters.benchmarkFilterCreation                thrpt   4.095       1.863       ops/us   -54.50549% 
 AllFilterBenchmark.AllFilterNoMatchingFilters.benchmarkFilterEvaluation              thrpt   69.715      68.882      ops/us   -1.19486%  
 AllFilterBenchmark.AllFilterOneNonMatchingFilterInMiddle.benchmarkFilterCreation     thrpt   2.649       1.502       ops/us   -43.29936% 
 AllFilterBenchmark.AllFilterOneNonMatchingFilterInMiddle.benchmarkFilterEvaluation   thrpt   39.46       47.646      ops/us   +20.74506% 
 AllFilterBenchmark.AllFilterWithExactFilter.benchmarkFilterCreation                  thrpt   7.963       2.515       ops/us   -68.41643% 
 AllFilterBenchmark.AllFilterWithExactFilter.benchmarkFilterEvaluation                thrpt   107.595     98.031      ops/us   -8.88889%  

I think these are reasonable tradeoffs, but if not let me know and I'm happy to close this PR :)

Also, it is worth noting that the AllFilterBenchmarks don'tn test for a case when this re-ordering will lead to the largest speedup i.e. one non-matching filter at the end of the filters array, so we would see an even larger speedup there.

@knative-prow knative-prow bot requested review from aliok and creydr October 18, 2023 19:12
@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 18, 2023
@Cali0707
Copy link
Member Author

/test

Copy link

knative-prow bot commented Nov 15, 2023

@Cali0707: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test build-tests
  • /test channel-integration-tests-sasl-plain
  • /test channel-integration-tests-sasl-ssl
  • /test channel-integration-tests-ssl
  • /test channel-reconciler-tests-sasl-plain
  • /test channel-reconciler-tests-sasl-ssl
  • /test channel-reconciler-tests-ssl
  • /test integration-tests
  • /test reconciler-tests
  • /test reconciler-tests-namespaced-broker
  • /test unit-tests
  • /test upgrade-tests

The following commands are available to trigger optional jobs:

  • /test reconciler-tests-keda
  • /test reconciler-tests-loom
  • /test reconciler-tests-namespaced-broker-loom

Use /test all to run the following jobs that were automatically triggered:

  • build-tests_eventing-kafka-broker_main
  • channel-integration-tests-sasl-plain_eventing-kafka-broker_main
  • channel-integration-tests-sasl-ssl_eventing-kafka-broker_main
  • channel-integration-tests-ssl_eventing-kafka-broker_main
  • channel-reconciler-tests-sasl-plain_eventing-kafka-broker_main
  • channel-reconciler-tests-sasl-ssl_eventing-kafka-broker_main
  • channel-reconciler-tests-ssl_eventing-kafka-broker_main
  • integration-tests_eventing-kafka-broker_main
  • reconciler-tests-namespaced-broker_eventing-kafka-broker_main
  • reconciler-tests_eventing-kafka-broker_main
  • unit-tests_eventing-kafka-broker_main
  • upgrade-tests_eventing-kafka-broker_main

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Cali0707
Copy link
Member Author

/test reconciler-tests-keda

Copy link

knative-prow bot commented Nov 15, 2023

@Cali0707: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
channel-integration-tests-ssl_eventing-kafka-broker_main afaffb0 link true /test channel-integration-tests-ssl
reconciler-tests-keda_eventing-kafka-broker_main afaffb0 link false /test reconciler-tests-keda

Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Cali0707
Copy link
Member Author

Okay, with the benchmark logging issues fixed, I was able to benchmark this optimization and track the memory usage diff. The results were:

 name                                                                                                                       mode    old count   new count   units    diff
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterCreation                                          thrpt   4.687       2.062       ops/us   -56.00597%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterCreation:gc.alloc.norm                            thrpt   1240.0      1608.07     B/op     +29.68306%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterEvaluation                                        thrpt   40.176      44.505      ops/us   +10.77509%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFilters.benchmarkFilterEvaluation:gc.alloc.norm                          thrpt   0           0           B/op     +0.00000%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterCreation                   thrpt   3.161       1.599       ops/us   -49.41474%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterCreation:gc.alloc.norm     thrpt   1776.0      2152.179    B/op     +21.17562%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterEvaluation                 thrpt   22.754      40.077      ops/us   +76.13167%
 AnyFilterBenchmark.AnyFilter2EventsMatch2DifferentFiltersOneFilterMatchesNeither.benchmarkFilterEvaluation:gc.alloc.norm   thrpt   0           0           B/op     +0.00000%
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterCreation                                                        thrpt   3.139       1.565       ops/us   -50.14336%
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterCreation:gc.alloc.norm                                          thrpt   1776.0      2120.075    B/op     +19.37359%
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterEvaluation                                                      thrpt   24.342      59.753      ops/us   +145.47285%
 AnyFilterBenchmark.AnyFilterFirstMatchAtEnd.benchmarkFilterEvaluation:gc.alloc.norm                                        thrpt   0           0           B/op     +00.00000%
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterCreation                                                      thrpt   2.898       1.573       ops/us   -45.72119%
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterCreation:gc.alloc.norm                                        thrpt   1840.0      1992.083    B/op     +8.26538%
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterEvaluation                                                    thrpt   95.931      92.619      ops/us   -3.45248%
 AnyFilterBenchmark.AnyFilterFirstMatchAtStart.benchmarkFilterEvaluation:gc.alloc.norm                                      thrpt   0           0           B/op     +0.00000%
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterCreation                                                     thrpt   2.919       1.598       ops/us   -45.25522%
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterCreation:gc.alloc.norm                                       thrpt   1840.0      2248.075    B/op     +22.17798%
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterEvaluation                                                   thrpt   95.821      88.887      ops/us   -7.23641%
 AnyFilterBenchmark.AnyFilterMatchAllSubfilters.benchmarkFilterEvaluation:gc.alloc.norm                                     thrpt   0           0           B/op     +0.00000%
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterCreation                                               thrpt   9.04        2.954       ops/us   -67.32301%
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterCreation:gc.alloc.norm                                 thrpt   672.0       920.068     B/op     +36.91488%
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterEvaluation                                             thrpt   119.506     92.287      ops/us   -22.77626%
 AnyFilterBenchmark.AnyFilterWithExactFilterBenchmark.benchmarkFilterEvaluation:gc.alloc.norm                               thrpt   0           0           B/op     +0.00000%

So, on average this increased the memory usage of an any filter by approx. 400 bytes. For an all filter, this should be the same as the same code was used.

IMO, this is a reasonable tradeoff for the large (100%+) throughput improvements

Copy link
Contributor

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions github-actions bot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/data-plane size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants