Run Filewatchers triggerfunction only once on changes on the watched file #3752

Open
wants to merge 7 commits into
base: main
from

creydr
Contributor

@creydr creydr commented Mar 11, 2024

We have seen the FileWatcher unit tests fail intermittently with:

09:56:39.778 [main] DEBUG dev.knative.eventing.kafka.broker.core.eventbus.ContractPublisher - Contract unchanged generation=0 lastGeneration=0
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.310 s -- in dev.knative.eventing.kafka.broker.core.eventbus.ContractPublisherTest
[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   FileWatcherTest.testFileModification:62 » ConditionTimeout Condition dev.knative.eventing.kafka.broker.core.file.FileWatcherTest$$Lambda/0x00007fc8b318c7a0 was not fulfilled within 10 seconds.
[INFO] 
[ERROR] Tests run: 130, Failures: 0, Errors: 1, Skipped: 0
[INFO] 

This happens because the MODIFY event can be sent twice: once for the update to the file content and once for the update to the file's modification date (see https://stackoverflow.com/a/25221600):

created file
15:53:06.445 [contract-file-watcher] INFO  dev.knative.eventing.kafka.broker.core.file.FileWatcher - Started watching /tmp/test10156420168117534422.txt
15:53:06.446 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - running trigger func initialize
running trigger func
updating file
15:53:06.446 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
15:53:06.446 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_MODIFY on test10156420168117534422.txt
running trigger func
15:53:06.446 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
15:53:06.447 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_MODIFY on test10156420168117534422.txt
running trigger func
15:53:16.477 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Interrupted exception. Stopping filewatching thread

This can be fixed either by adding a Thread.sleep() right after the watcher.take(), or by checking the file's modification date and calling the trigger function only if that date changed.

This is addressed in this PR. In addition, the trigger function is now called only when the watched file itself was updated (the FileWatcher watches a directory and gets notified about changes to every file in it): 94f3210

08:26:52.687 [contract-file-watcher] INFO  dev.knative.eventing.kafka.broker.core.file.FileWatcher - Started watching /tmp/test4810167469550644550.txt
08:26:52.688 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Calling trigger function for initial run
08:26:52.688 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
08:26:52.689 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_MODIFY for file: /tmp/test4810167469550644550.txt, count: 1
08:26:52.689 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Calling trigger func as we got a ENTRY_MODIFY on /tmp/test4810167469550644550.txt
08:26:52.689 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
08:26:52.689 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_MODIFY for file: /tmp/test4810167469550644550.txt, count: 1
08:26:52.696 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Modification date didn't change (1710142012688 - 1710142012688) . Skipping...
08:26:52.729 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
08:26:52.730 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_CREATE for file: /tmp/.java_pid93201.tmp, count: 1
08:26:52.730 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Calling trigger func as we got a ENTRY_CREATE on /tmp/.java_pid93201.tmp
08:26:54.041 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Contract updates
08:26:54.041 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Got ENTRY_DELETE for file: /tmp/.com.google.Chrome.80NSSy, count: 1
08:26:54.041 [contract-file-watcher] DEBUG dev.knative.eventing.kafka.broker.core.file.FileWatcher - Modification date didn't change (0 - 0) . Skipping...
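
For readers following along, here is a minimal sketch of the two checks described above: ignore events for other files in the watched directory, and skip an event when the watched file's modification date did not change. The class `ModificationGate` and its method `shouldTrigger` are hypothetical names chosen for illustration; this is not the code from the PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not the PR's FileWatcher): decides whether a watch event should run the trigger. */
final class ModificationGate {
    private final Path watchedFile;
    // Modification time (epoch millis) of the last event that was let through; -1 means "none yet".
    private long lastModifiedMillis = -1L;

    ModificationGate(Path watchedFile) {
        this.watchedFile = watchedFile.toAbsolutePath().normalize();
    }

    /**
     * @param changed path of the file the watch event refers to
     * @return true only if the event is for the watched file and its
     *         modification time differs from the last accepted one
     */
    synchronized boolean shouldTrigger(Path changed) throws IOException {
        if (!watchedFile.equals(changed.toAbsolutePath().normalize())) {
            return false; // event for a sibling file in the watched directory
        }
        if (!Files.exists(watchedFile)) {
            return false; // file was deleted, nothing to reload
        }
        long current = Files.getLastModifiedTime(watchedFile).toMillis();
        if (current == lastModifiedMillis) {
            return false; // duplicate ENTRY_MODIFY for the same write
        }
        lastModifiedMillis = current;
        return true;
    }
}
```

In a `WatchService` loop, the argument would typically be built by resolving the event's relative context path against the watched directory, for example `dir.resolve((Path) event.context())`. One caveat of this approach: two real writes that land on the same timestamp value would be treated as one, so the check assumes writes are spaced further apart than the file system's timestamp resolution.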

@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/data-plane size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 11, 2024

codecov bot commented Mar 11, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 52.48%. Comparing base (7a3464c) to head (ada785a).
Report is 1 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...ve/eventing/kafka/broker/dispatcher/main/Main.java | 0.00% | 1 Missing ⚠️ |
| ...tive/eventing/kafka/broker/receiver/main/Main.java | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3752      +/-   ##
============================================
+ Coverage     52.44%   52.48%   +0.04%     
- Complexity      874      877       +3     
============================================
  Files           342      342              
  Lines         21415    21431      +16     
  Branches        284      286       +2     
============================================
+ Hits          11231    11248      +17     
  Misses         9274     9274              
+ Partials        910      909       -1     
Flag Coverage Δ
java-unittests 74.29% <92.85%> (+0.15%) ⬆️

Member

@pierDipi pierDipi left a comment

Good catch @creydr !

/lgtm
/approve

@@ -42,13 +42,13 @@ public void tearDown() throws Exception {
Files.deleteIfExists(tempFile.toPath());
}

@Test
@RepeatedTest(20)
Member

💯

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2024

knative-prow bot commented Mar 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: creydr, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pierDipi
Member

I guess some formatting issues on the build test

@knative-prow knative-prow bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2024
@pierDipi
Member

/lgtm

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2024
@matzew
Contributor

matzew commented Mar 11, 2024

Do we want to backport this?


knative-prow bot commented Mar 11, 2024

@creydr: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| channel-integration-tests-ssl_eventing-kafka-broker_main | ada785a | link | true | /test channel-integration-tests-ssl |
| channel-integration-tests-sasl-ssl_eventing-kafka-broker_main | ada785a | link | true | /test channel-integration-tests-sasl-ssl |
| channel-reconciler-tests-sasl-plain_eventing-kafka-broker_main | ada785a | link | true | /test channel-reconciler-tests-sasl-plain |
| channel-integration-tests-sasl-plain_eventing-kafka-broker_main | ada785a | link | true | /test channel-integration-tests-sasl-plain |
| integration-tests_eventing-kafka-broker_main | ada785a | link | true | /test integration-tests |
| reconciler-tests-keda_eventing-kafka-broker_main | ada785a | link | true | /test reconciler-tests-keda |
| channel-reconciler-tests-sasl-ssl_eventing-kafka-broker_main | ada785a | link | true | /test channel-reconciler-tests-sasl-ssl |
| channel-reconciler-tests-ssl_eventing-kafka-broker_main | ada785a | link | true | /test channel-reconciler-tests-ssl |
| upgrade-tests_eventing-kafka-broker_main | ada785a | link | true | /test upgrade-tests |
| reconciler-tests_eventing-kafka-broker_main | ada785a | link | true | /test reconciler-tests |
| reconciler-tests-namespaced-broker_eventing-kafka-broker_main | ada785a | link | true | /test reconciler-tests-namespaced-broker |


@creydr
Contributor Author

creydr commented Mar 11, 2024

/hold

{"@timestamp":"2024-03-11T09:21:19.386Z","@version":"1","message":"failed to parse from JSON","logger_name":"dev.knative.eventing.kafka.broker.core.eventbus.ContractPublisher","thread_name":"contract-file-watcher","level":"ERROR","level_value":40000,"stack_trace":"com.google.protobuf.InvalidProtocolBufferException: Expect message object but got: null\n\tat com.google.protobuf.util.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1481)\n\tat com.google.protobuf.util.JsonFormat$ParserImpl.merge(JsonFormat.java:1458)\n\tat com.google.protobuf.util.JsonFormat$ParserImpl.merge(JsonFormat.java:1322)\n\tat com.google.protobuf.util.JsonFormat$Parser.merge(JsonFormat.java:486)\n\tat dev.knative.eventing.kafka.broker.core.eventbus.ContractPublisher.parseFromJson(ContractPublisher.java:98)\n\tat dev.knative.eventing.kafka.broker.core.eventbus.ContractPublisher.updateContract(ContractPublisher.java:74)\n\tat dev.knative.eventing.kafka.broker.core.file.FileWatcher.run(FileWatcher.java:127)\n\tat java.base/java.lang.Thread.run(Thread.java:1583)\n"}

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2024
@Cali0707
Member

Cali0707 commented Apr 4, 2024

I think we maybe need to check if the file is empty somewhere in

if (Thread.interrupted()) {
return;
}
try (final var fileReader = new FileReader(newContract);
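
As a rough illustration of that idea, the sketch below reads the file and reports "nothing to parse" when it is missing, empty, or contains only whitespace, so the caller can skip the JSON parsing step. The stack trace above ("Expect message object but got: null") looks consistent with the parser being handed an empty document, though that is an assumption. `ContractFileReader` and `readNonEmpty` are hypothetical names, not part of the repository.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper (not from the repository): guards the parser against empty input. */
final class ContractFileReader {
    private ContractFileReader() {}

    /**
     * @return the file content, or Optional.empty() if the file is missing,
     *         zero-length, or whitespace-only
     */
    static Optional<String> readNonEmpty(Path file) throws IOException {
        // isRegularFile follows symbolic links by default.
        if (!Files.isRegularFile(file) || Files.size(file) == 0) {
            return Optional.empty();
        }
        String content = Files.readString(file);
        return content.isBlank() ? Optional.empty() : Optional.of(content);
    }
}
```

A caller could then parse only when a value is present, for example `ContractFileReader.readNonEmpty(path).ifPresent(json -> parse(json))`, where `parse` stands in for the existing parsing code.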

@creydr
Contributor Author

creydr commented Apr 5, 2024

> I think we maybe need to check if the file is empty somewhere in
>
> if (Thread.interrupted()) {
> return;
> }
> try (final var fileReader = new FileReader(newContract);

This PR has some issues with symlinks ATM. I have a WIP change for it but need to recheck on it...

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/data-plane do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
4 participants