New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZEPPELIN-5981] Check binded interpreter inside Paragraph.jobAbort() #4693
base: master
Are you sure you want to change the base?
Conversation
@elbakramer Thank you for the contribution. BTW, could you please check the CI? All tests for core-modules failed. I hope you investigate and fix them before starting reviews. |
@jongyoul I think the failures are due to CI actions taking too long, hopefully not by the code change. Could you please check if it's the CI issue first? Or is there any way to investigate the CI or just rerun by myself? |
@elbakramer I don't think we have any critical issues in CI now but let me restart it. FYI, I checked the log briefly and it looks like hanging for a long time in some tests. |
Hello, I re-ran the test but there are still some error with the same situation before. I checked the current CI and it didn't have a problem. Could you please check why the CI stops and hangs? |
I think now CI does not hang and all tests for core modules finish properly with the green check icons. But I'm still curious whether the errors and warnings shown in the Annotations block below should still be resolved. Could you please check if it's ok or not? https://github.com/elbakramer/zeppelin/actions/runs/7013954933 What I've found about the hang is that, it seems like the In order to prevent that creation, I've added some checks before actually calling the method:
I couldn't think of a direct way to check the existence of the interpreter instance. So it just indirectly checks whether if the job is currently running or not, assuming that there would be an interpreter instance if the job is in a running state. |
Thank you for checking and fixing it. I think your analysis looks good. Let's see how the CI goes. I ran it again. |
Could you please rebase this PR to trigger the latest CI again? one of them still fails. |
What is this PR for?
Inside an open zeppelin notebook, after running a paragraph by clicking the play icon button, canceling the running paragraph by clicking the pause icon button (or the Ctrl+Option+C shortcut) does not interrupt the running process and cancel the paragraph.
I've checked that
CANCEL_PARAGRAPH
message gets well passed through the websocket connection but server was somehow completely ignoring the message.After some more investigation, I've found that the reason for that was the
Paragraph
'sjobAbort()
method, which does nothing due to the null interpreter member.zeppelin/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Paragraph.java
Line 505 in 73e1aa3
I've checked some other methods of
Paragraph
such ascompletion()
,execute()
,jobRun()
,recover()
and they were initializing interpreter before doing something by callinggetBindedInterpreter()
, so I've changed thejobAbort()
method to do the same.zeppelin/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Paragraph.java
Line 258 in 73e1aa3
zeppelin/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Paragraph.java
Line 333 in 73e1aa3
zeppelin/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Paragraph.java
Line 399 in 73e1aa3
zeppelin/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Paragraph.java
Line 810 in 73e1aa3
Other than that, I have some doubts on whether it's right thing to return true on failing what was requested.
But after all, I just sticked to the original behavior (returning true for every case) to prevent some unexpected side effects.
What type of PR is it?
Bug Fix
Todos
N/A
What is the Jira issue?
How should this be tested?
for i in range(10): time.sleep(1)
Screenshots (if appropriate)
N/A
Questions: