
[TASK][MEDIUM] Fix compatibility with Apache Spark master branch #6246

pan3793 opened this issue Apr 3, 2024 · 12 comments
pan3793 commented Apr 3, 2024

What's the level of this task?

MEDIUM

Code of Conduct

Search before creating

  • I have searched in the task list and found no similar tasks.

Mentor

  • I have sufficient expertise on this task, and I volunteer to be a mentor of this task to guide contributors through the task.

Skill requirements

  • Knowledge on GitHub Actions, Maven, Spark

Background and Goals

Kyuubi has daily testing on the Spark master branch, but unfortunately, it has been failing for a while.

Implementation steps

Check and fix the daily test with the Apache Spark master branch
https://github.com/apache/kyuubi/actions/workflows/nightly.yml

Additional context

Introduction of 2024H1 Kyuubi Code Contribution Program

@ShockleysxX

Please assign this task to me; I will try to solve it.


pan3793 commented Apr 3, 2024

@ShockleysxX thanks, assigned


pan3793 commented Apr 7, 2024

Revoked, as you have already been assigned another task.

@liuxiaocs7

According to the log displayed in CI (screenshot omitted), running the suite locally with

./build/mvn clean install -Pscala-2.13 -Pspark-master -pl extensions/spark/kyuubi-spark-lineage -am -Dmaven.javadoc.skip=true -V -Dtest=none -DwildcardSuites=org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite

reproduces the same failure:

Discovery starting.
Discovery completed in 810 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
*** RUN ABORTED ***
  java.lang.NoClassDefFoundError: jakarta/servlet/Servlet
  at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
  at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
  at scala.Option.map(Option.scala:242)
  at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
  at scala.Option.getOrElse(Option.scala:201)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1111)
  at org.apache.spark.sql.SparkListenerExtensionTest.spark(SparkListenerExtensionTest.scala:42)
  ...
  Cause: java.lang.ClassNotFoundException: jakarta.servlet.Servlet
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
  at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
  at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
  at scala.Option.map(Option.scala:242)
  at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:686)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2937)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1117)
  ...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.595 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.653 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.428 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 50.013 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 13.887 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 19.960 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------

Add <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version> to the spark-master profile, matching Spark's own pom (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L147).
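A sketch of the pom.xml fragment this implies (the property name comes from Spark's pom; the exact shape and placement of Kyuubi's spark-master profile is assumed here):

```xml
<profile>
  <id>spark-master</id>
  <properties>
    <!-- Spark master moved to the jakarta.servlet namespace (Servlet API 5.x) -->
    <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version>
  </properties>
</profile>
```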

This surfaces a new error:

Discovery starting.
Discovery completed in 883 milliseconds.
Run starting. Expected test count is: 33
SparkSQLLineageParserHelperSuite:
ANTLR Tool version 4.13.1 used for code generation does not match the current runtime version 4.9.3
ANTLR Runtime version 4.13.1 used for parser compilation does not match the current runtime version 4.9.3
*** RUN ABORTED ***
  java.lang.ExceptionInInitializerError:
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:730)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:761)
  ...
  Cause: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
  at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
  at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  ...
  Cause: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 4 (expected 3).
  at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:187)
  at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:2949)
  at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:58)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:55)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(AbstractSqlParser.scala:68)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:701)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:700)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:699)
  ...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.822 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  7.390 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 24.927 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [01:30 min]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 32.613 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. FAILURE [ 18.767 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------

Continue by adding <antlr4.version>4.13.1</antlr4.version> to the spark-master profile as well (https://github.com/apache/spark/blob/b299b2bc06a91db630ab39b9c35663342931bb56/pom.xml#L210).
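With both fixes applied, the profile's properties section would look roughly like this (a sketch; versions track Spark's pom at the referenced commit, and the profile layout is assumed):

```xml
<profile>
  <id>spark-master</id>
  <properties>
    <jakarta.servlet-api.version>5.0.0</jakarta.servlet-api.version>
    <!-- Spark master generates parsers with ANTLR 4.13.1; the runtime must match -->
    <antlr4.version>4.13.1</antlr4.version>
  </properties>
</profile>
```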

With both properties in place, build and test pass in this module:

[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  5.369 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.648 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 14.716 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 51.174 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.505 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 44.672 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

However, building the engine module with ./build/mvn clean install -Pscala-2.13 -Pspark-master -pl externals/kyuubi-spark-sql-engine -am -Dmaven.javadoc.skip=true -V -Dtest=none -DskipTests fails:

[INFO] compiling 59 Scala sources to /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/target/scala-2.13/classes ...
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:27: object ws is not a member of package javax
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:307: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecutePython.scala:320: not found: value UriBuilder
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:80: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:135: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:258: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:341: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:421: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:510: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EnginePage.scala:547: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:101: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] [Error] /home/liuxiao/Code/JavaProjects/kyuubi/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala:234: type mismatch;
 found   : HttpServletRequest (in javax.servlet.http)
 required: HttpServletRequest (in jakarta.servlet.http)
[ERROR] 12 errors found
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Project Parent 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Project Parent .............................. SUCCESS [  4.661 s]
[INFO] Kyuubi Project Util ................................ SUCCESS [  3.557 s]
[INFO] Kyuubi Project Util Scala .......................... SUCCESS [ 21.494 s]
[INFO] Kyuubi Project Common .............................. SUCCESS [ 52.340 s]
[INFO] Kyuubi Project Embedded Zookeeper .................. SUCCESS [  9.851 s]
[INFO] Kyuubi Project High Availability ................... SUCCESS [ 25.424 s]
[INFO] Kyuubi Project Events .............................. SUCCESS [ 14.723 s]
[INFO] Kyuubi Dev Spark Lineage Extension ................. SUCCESS [ 27.378 s]
[INFO] Kyuubi Project Hive JDBC Client .................... SUCCESS [ 13.132 s]
[INFO] Kyuubi Project Engine Spark SQL .................... FAILURE [  8.426 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE

It seems the javax and jakarta namespaces now conflict between Spark 4.x and the Kyuubi codebase. @pan3793, any suggestions?


pan3793 commented Apr 8, 2024

You may find a solution in apache/spark#45154.

@liuxiaocs7

Hi @pan3793, I have browsed through what was done in apache/spark#45154, which mainly replaced the javax namespaces with jakarta in Spark master; that could indeed be the main cause of this CI error.

I tried to make some changes locally and found that there are currently some compilation issues, mainly in the web-related code under the Kyuubi Project Engine Spark SQL module.

In parts of this module, Kyuubi uses classes such as javax.servlet.http.HttpServletRequest, which is incompatible when compiling against the latest Spark, whose web-related classes now live under the jakarta namespace.

But after modifying the code to use jakarta, the CI jobs for other Spark versions fail instead.

It seems difficult to handle both javax and jakarta in the current setup and pass all CI, because Kyuubi uses classes such as WebUIPage/UIUtils from the Spark WebUI, which themselves import classes from the corresponding namespace.


pan3793 commented Apr 9, 2024

We did a similar thing for Flink previously.

Suppose there are incompatible classes in different versions:

// in the old version
package old

class Old {
  OldR method()
}

// in the new version
package new

class New {
  NewR method()
}

Then we can introduce a Shim class to handle it.

package shim

class Shim {
  R method() {
     // dynamically bind the runtime class using reflection.
  }
}
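As a concrete illustration of the shim idea (hypothetical names, not Kyuubi's actual code), here is a Java sketch that invokes getParameter reflectively, so it works with either a javax.servlet or a jakarta.servlet HttpServletRequest without compiling against either namespace:

```java
// Hypothetical sketch of the shim pattern: the method is looked up and
// invoked reflectively, so this class compiles without depending on
// either the javax.servlet or the jakarta.servlet namespace.
public final class ServletRequestShim {
    private final Object request; // javax.* or jakarta.* HttpServletRequest

    public ServletRequestShim(Object request) {
        this.request = request;
    }

    // Dynamically binds getParameter(String) on the runtime class.
    public String getParameter(String name) {
        try {
            java.lang.reflect.Method m =
                request.getClass().getMethod("getParameter", String.class);
            m.setAccessible(true); // the concrete request class may not be public
            return (String) m.invoke(request, name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Incompatible servlet request type: " + request.getClass(), e);
        }
    }
}
```

The idea is that only the shim touches the servlet types, and it resolves them at runtime rather than at compile time; the rest of the code (e.g. the EnginePage/EngineSessionPage call sites) would depend only on the shim.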

@liuxiaocs7

Thanks for your help, I'll take a look later.


pan3793 commented Apr 17, 2024

@liuxiaocs7 do you have progress on this task? 4.0.0 preview version is on the way, I hope we can fix the compatibility before that.

@liuxiaocs7

> @liuxiaocs7 do you have progress on this task? 4.0.0 preview version is on the way, I hope we can fix the compatibility before that.

Sorry for the late reply; yes, it is in progress.

@pan3793 pan3793 assigned pan3793 and unassigned liuxiaocs7 May 20, 2024

pan3793 commented May 20, 2024

Revoked as there has been no progress for a while; Spark 4.0.0-preview1 is on the way, so I'm working on this myself.

@pan3793 pan3793 changed the title [TASK][EASY] Fix compatibility with Apache Spark master branch [TASK][MEDIUM] Fix compatibility with Apache Spark master branch May 21, 2024