Search before asking

I have searched in the issues and found no similar issues.
What would you like to be improved?
While using Kyuubi with Tableau through the Thrift server exposed by Kyuubi, we noticed that everything works fine for small datasets. For larger datasets, however, the data transfer fails when the collect operation is called because the driver goes OOM. For exactly this reason Kyuubi exposes the kyuubi.operation.incremental.collect flag, which can be set to true to collect results incrementally. While this looks like an ideal solution (and it is), there are performance bottlenecks when using this flag.
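For context, here is a minimal sketch of the difference between the two collect modes as I understand them, assuming the incremental path behaves roughly like Spark's toLocalIterator; the table name and the send-to-client step are placeholders, not Kyuubi's actual code:

```scala
import org.apache.spark.sql.SparkSession

object CollectModesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("collect-modes-sketch").getOrCreate()
    val df = spark.sql("SELECT * FROM some_large_table") // placeholder query

    // Full collect: the whole ~32M-row result is materialized on the driver
    // at once, which is what makes the driver go OOM.
    // val allRows = df.collect()

    // Incremental collect: only one partition is resident on the driver at a
    // time, so memory stays bounded, but each partition fetch runs as a
    // separate Spark job, which is a likely source of the slowdown described above.
    val it = df.toLocalIterator()
    var streamed = 0L
    while (it.hasNext) {
      it.next() // each row would be serialized into the thrift response here
      streamed += 1
    }
    println(s"streamed $streamed rows")

    spark.stop()
  }
}
```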
I ran the same query through STS (Spark Thrift Server) & the Kyuubi Thrift server.
Query data size:
Data size: ~5 GB
Record count: 32854241
Results with spark.sql.shuffle.partitions = 50:
STS took 12-15 minutes to run the job & transfer the data to Tableau.
Kyuubi Thrift was left running for 2+ hours & the transferred data size was only ~600 MB.
As can be seen, there is a significant performance difference between the two.
Kyuubi version: 1.9
Spark version: 3.1.2
Running Kyuubi on AWS EMR (version 6.5.0), only on the primary node.

kyuubi-defaults.conf config:

kyuubi.ha.addresses ..compute.internal
kyuubi.operation.incremental.collect true
spark.submit.deployMode cluster # (have tried with client as well)
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.scheduler.mode FAIR
spark.rdd.compress true
spark.shuffle.service.enabled true
spark.sql.hive.convertMetastoreParquet false
spark.sql.catalogImplementation hive
spark.sql.shuffle.partitions 50
spark.kryoserializer.buffer.max 1g
spark.driver.maxResultSize 25g
spark.driver.memory 35g
spark.executor.memory 25g
spark.driver.memoryOverhead 4g
spark.executor.memoryOverhead 3g
spark.cleaner.periodicGC.interval 10min
How should we improve?

Looking at the STS & Kyuubi code, I could see a lot of similarities but also differences here & there. One major point I noticed is that some logs present in STS are missing from Kyuubi, and those logs would help show what is happening in Kyuubi during data transfer.
For example, the logs below are printed by STS but not by Kyuubi:
24/04/24 06:49:51 INFO SparkExecuteStatementOperation: Received getNextRowSet request order=FETCH_NEXT and maxRowsL=10000 with a1d07d3a-d6bb-4706-99c0-728fa8115816
24/04/24 06:49:52 INFO SparkExecuteStatementOperation: Returning result set with 10000 rows from offsets [1320000, 1330000) with a1d07d3a-d6bb-4706-99c0-728fa8115816
To investigate further, adding more logs would be a good starting point to check what exactly is happening & where the bottleneck is.
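For illustration, a rough sketch of the kind of per-fetch logging meant here; the object, method, and parameter names below are placeholders, not Kyuubi's actual fetch path:

```scala
import org.slf4j.LoggerFactory

object FetchLoggingSketch {
  private val logger = LoggerFactory.getLogger(getClass)

  // Log one FETCH_NEXT round trip the same way STS does, so that the offsets
  // reached over time (and therefore the transfer rate) become visible.
  def logFetch(statementId: String, maxRows: Int, startOffset: Long, returned: Int): Unit = {
    logger.info(s"Received getNextRowSet request order=FETCH_NEXT and maxRows=$maxRows with $statementId")
    logger.info(s"Returning result set with $returned rows from offsets [$startOffset, ${startOffset + returned}) with $statementId")
  }
}
```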
Please feel free to suggest, or ask for any additional information if needed.
Are you willing to submit PR?
Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
No. I cannot submit a PR at this time.