[Bug] Kyuubi return query status (FINISH_STATE) quickly, but job still executing. #6302
Comments
Hello @DA1OOO,
cc @cxzl25 and @ulysses-you
Hi @DA1OOO, can you find
/**
 * Cancel all jobs in the given job group ID.
 * @param cancelFutureJobs if true, future submitted jobs in this job group will be cancelled
 */
def cancelJobGroup(groupId: String, cancelFutureJobs: Boolean = false): Unit = {
  logInfo(s"Asked to cancel job group $groupId with cancelFutureJobs=$cancelFutureJobs")
  eventProcessLoop.post(JobGroupCancelled(groupId, cancelFutureJobs))
}
@XorSum
$ grep "2ec5749d-9b85-454e-825f-b7fb13b868e9" _driver.log
But when I manually cancel, the log you mentioned does appear.
$ grep "c5c17ebd-9aaa-48d4-ae17-42203fb6fa60" _driver.log
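The grep check above can also be scripted. The following is a minimal sketch, assuming the driver log is plain text, that the UUIDs being grepped are the Spark job group IDs, and that cancel requests are logged in the `Asked to cancel job group <id>` format from the `cancelJobGroup` snippet quoted earlier. The function name `find_cancel_requests` is hypothetical and not part of Kyuubi or Spark.

```python
import re
from typing import Iterable, List

# Message prefix taken from the logInfo call in the quoted cancelJobGroup snippet.
CANCEL_PATTERN = re.compile(r"Asked to cancel job group (\S+)")


def find_cancel_requests(lines: Iterable[str], group_id: str) -> List[str]:
    """Return the log lines that record a cancel request for group_id."""
    hits = []
    for line in lines:
        match = CANCEL_PATTERN.search(line)
        # Compare the captured ID exactly so a prefix of another ID never matches.
        if match and match.group(1) == group_id:
            hits.append(line.rstrip("\n"))
    return hits
```

It could be called as `find_cancel_requests(open("_driver.log"), "2ec5749d-9b85-454e-825f-b7fb13b868e9")`; an empty list would mean no cancel request was logged for that group.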
Could you provide more information to help me reproduce this bug?
@XorSum
In table ios_test_behavior, the countdata column holds JSON strings, and
@XorSum Is the above information enough to reproduce this case?
@DA1OOO I tried running the code below on Kyuubi 1.9 and Spark 3.4.2, but I couldn't reproduce your bug. In other words, I can see the
I think it may be related to the number of data rows, because when I do not use explode in the subquery, the problem cannot be reproduced. I don't have a test dataset; we hit this case in our production environment.
We don't need the real datasets from your production environment.
OK, I will try to reproduce this case with a test dataset later.
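A synthetic dataset for such a reproduction could be generated along the following lines. This is only a sketch under stated assumptions: the column names `vendor_id`, `countdata`, and `d_s` come from the query in the bug description, but the value types, the key names inside `countdata`, the row counts, and the JSON Lines output format are all hypothetical choices, and the helper names are invented for illustration.

```python
import json
import random
from typing import Dict, Iterable, Iterator


def make_rows(num_rows: int, keys_per_row: int,
              d_s: int = 20240408, seed: int = 0) -> Iterator[Dict[str, object]]:
    """Yield rows whose countdata column is a JSON string with keys_per_row keys."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    for i in range(num_rows):
        # Each key becomes one output row once json_object_keys is exploded.
        countdata = {f"k{j}": rng.randint(0, 100) for j in range(keys_per_row)}
        yield {
            "vendor_id": f"v{i}",                # assumed to be a string ID
            "countdata": json.dumps(countdata),  # JSON stored as a string
            "d_s": d_s,                          # partition value used in the query
        }


def to_json_lines(rows: Iterable[Dict[str, object]]) -> str:
    """Serialize rows as JSON Lines, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)
```

Exploding the keys yields `num_rows * keys_per_row` rows, so both parameters can be raised to make the right-hand subquery large.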
@XorSum After building this project, the following is the SQL that can be used to reproduce this bug. This is my Kyuubi version information and the query's start and finish times. This is the actual Spark job information. As you can see, the query's state had already changed to FINISHED before job 310 finished. Related log:
Code of Conduct
Search before asking
Describe the bug
When I submit a query to Spark through Kyuubi, Kyuubi returns the query status (FINISH_STATE) quickly, but the Spark job is still executing.
I then found a similar bug described in issue #5232 (@XorSum). That issue says the fix was merged into 1.8.0, but I tried 1.8.0, 1.8.1, and 1.9.0, and the problem occurs every time. Is there any configuration that needs to be modified?
My SQL also has the shape
(empty table) left join (big table).
My SQL is as follows:
select *
from (
  select uid
  from db.table1
  where d_s = 20240408 and abtest_id like 'sadjjfljlajf'
) t1
left join (
  select vendor_id
  FROM db.table2 LATERAL VIEW explode (json_object_keys (countdata)) t1 AS kv
  WHERE d_s = 20240408
) t2 on t2.vendor_id = t1.uid
limit 5;
In this SQL, the left subquery result is empty (0 rows):
select count(1) from db.table1 where d_s = 20240408 and abtest_id like 'sadjjfljlajf'
The right subquery result is relatively big (4432411935 rows):
select count(1) FROM db.table2 LATERAL VIEW explode (json_object_keys (countdata)) t1 AS kv WHERE d_s = 20240408
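The shape described above, (empty table) left join (big table), has one property worth spelling out: a left join emits at least one row per left-side row and nothing else, so an empty left side gives an empty result regardless of how large the right side is. The pure-Python sketch below illustrates left-join semantics only; it is not Spark's implementation, and the function name `left_join` and the sample values are hypothetical. Whether this property is what lets the query report FINISHED early is not established in this thread.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def left_join(left: Iterable[str],
              right: Iterable[str]) -> List[Tuple[str, Optional[str]]]:
    """Left-join two key columns, pairing each left key with its right matches."""
    # Index the right side by key, keeping duplicates.
    right_index: Dict[str, List[str]] = {}
    for vendor_id in right:
        right_index.setdefault(vendor_id, []).append(vendor_id)

    result: List[Tuple[str, Optional[str]]] = []
    for uid in left:
        matches = right_index.get(uid)
        if matches:
            # One output row per matching right-side row.
            result.extend((uid, m) for m in matches)
        else:
            # Unmatched left rows are kept, padded with None (SQL NULL).
            result.append((uid, None))
    return result
```

With an empty `left`, the loop over left-side rows never runs, so the result is empty no matter what `right` contains.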
Affects Version(s)
Kyuubi 1.8.0 / 1.8.1 / 1.9.0 | Spark 3.4.2
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?