
[Discuss][CHIP-3] Spark UI Tab with progress grid. #685

Open
nikhilsimha opened this issue Feb 20, 2024 · 1 comment

nikhilsimha commented Feb 20, 2024

Problem

Depending on the parallelism, we simultaneously run several joinPart jobs. Besides that, the bootstrap and the final join may also run in parallel. In the future ([CHIP-4]) we aim to have non-uniform step sizes for each joinPart and to parallelize the construction of the bootstrap and the final join. In addition, we are planning to add caching to GroupBys.

This makes it hard to make sense of the progress of the job - especially when the number of join parts exceeds 5. We have seen Joins with as many as 135 joinParts!

Goal

We aim to create an additional tab in the Spark UI, if possible, that shows the progress of the current job.

We have to indicate the progress of several components:

  1. Bootstrap
  2. JoinParts
  3. JoinPart Caches (in the future)
  4. Final Join

Each of these parts can have different step days, intricate dependencies, and varying degrees of parallelism.
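As a rough illustration of what the grid has to represent, here is a minimal sketch of a progress model. All of these names are hypothetical, not existing classes in the codebase:

import java.time.LocalDate

// Hypothetical progress model for the grid. Each node is one unit of work:
// a bootstrap step, a joinPart step, a cache build, or a final-join step.
sealed trait NodeKind
case object Bootstrap extends NodeKind
case object JoinPart extends NodeKind
case object JoinPartCache extends NodeKind
case object FinalJoin extends NodeKind

sealed trait NodeState
case object Pending extends NodeState
case object Running extends NodeState
case object Done extends NodeState
case object Failed extends NodeState

// stepDays can differ per component; dependsOn captures the intricate
// dependencies (e.g., final-join steps depend on joinPart steps).
case class ProgressNode(
    id: String,
    kind: NodeKind,
    stepStart: LocalDate,
    stepDays: Int,
    dependsOn: Seq[String],
    state: NodeState)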

There is a lot of detail in the proposed UI, but it is best described with a mock-up. See below.

(mock-up image: "overwatch" progress grid)

[ChatGPT][Un-tested] Feasibility of adding a custom Spark UI tab

Adding a custom tab to the Apache Spark UI involves extending the Spark UI with your own Scala or Java code. The process generally involves creating classes that extend the SparkUITab and WebUIPage classes provided by Spark, and then integrating these into the Spark UI. Here's a step-by-step example in Scala, assuming you're familiar with Spark's programming model and have a basic setup ready:

  1. Create a Custom Page Class: This class will represent the content of your custom tab.
import javax.servlet.http.HttpServletRequest

import scala.xml.Node

import org.apache.spark.ui.{SparkUITab, WebUIPage}

// The parent must be the tab (SparkUITab), not a bare WebUI, so that
// attachPage in step 2 type-checks. The empty prefix makes this the tab's
// landing page. Spark's own pages wrap content with UIUtils.headerSparkPage
// to get the standard chrome, but that helper - like SparkUITab itself - is
// private[spark]; see the packaging caveat in step 3.
class MyCustomPage(parent: SparkUITab)
  extends WebUIPage("") {
  // Define how your page should render its content
  override def render(request: HttpServletRequest): Seq[Node] = {
    <html>
      <body>
        <div>Hello, this is my custom page!</div>
      </body>
    </html>
  }
}
  2. Create a Custom Tab Class: This class will add a new tab to the Spark UI, holding your custom page(s).
import org.apache.spark.ui.{SparkUITab, SparkUI}

class MyCustomTab(sparkUI: SparkUI)
  extends SparkUITab(sparkUI, "myCustomTab") {
  // Add your custom page to this tab
  attachPage(new MyCustomPage(this))
  // You can add more pages here
}
  3. Integrate Your Custom Tab with Spark UI: To integrate your custom tab, you need to access the SparkUI instance. This might be done in your Spark application's driver program, a custom Spark listener, or through other extension points provided by Spark. Here's a basic example to illustrate how it might look in a Spark listener:
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}

// Caveat: in stock Spark, SparkListenerApplicationStart does not carry a
// SparkUI reference, so the event alone is not enough. One (untested)
// workaround is to reach the UI through the active SparkContext. Note that
// SparkContext.ui is private[spark], so this class has to be compiled into
// a package under org.apache.spark - a common trick for custom UI tabs.
class MyCustomSparkListener extends SparkListener {
  override def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit = {
    SparkContext.getOrCreate().ui.foreach { ui =>
      ui.attachTab(new MyCustomTab(ui))
    }
  }
}
  4. Register Your Listener with Spark: You can register your custom listener by adding spark.extraListeners your.package.MyCustomSparkListener to your spark-defaults.conf file, or by passing it as a configuration parameter to spark-submit:
--conf spark.extraListeners=your.package.MyCustomSparkListener

Replace your.package with the actual package name where your MyCustomSparkListener class resides.
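For context, a full spark-submit invocation might look like the following; the JAR and class names are placeholders, not real artifacts from this repo:

spark-submit \
  --class your.package.YourSparkApp \
  --jars custom-ui-tab.jar \
  --conf spark.extraListeners=your.package.MyCustomSparkListener \
  your-app.jar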

Note: This example assumes you have a basic understanding of Scala, Spark, and how to compile and include your custom code with your Spark application. Integrating custom UI components requires your code to be compiled into a JAR that should be included in your Spark application's classpath. Depending on your Spark deployment mode (e.g., standalone, YARN, Mesos), the way you include your custom JAR might vary.
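If you build with sbt, a minimal dependency declaration might look like this (the version is illustrative). Marking Spark as "provided" keeps it out of the assembled JAR, since the cluster supplies it at runtime:

// build.sbt (illustrative): compile the custom UI classes against Spark.
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.5.0" % "provided"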


nikhilsimha commented Feb 20, 2024

Feedback:

  1. Get resource allocation delay and incorporate into the UI
  2. Another tab for tuning tips
  3. Try to show memory consumption
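For item 1, here is a minimal, untested sketch of how the resource-allocation delay could be measured from standard listener events; the class name is hypothetical:

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart, SparkListenerExecutorAdded}

// Hypothetical sketch: time from application start to the first executor
// registration, as a proxy for resource-allocation delay.
class AllocationDelayListener extends SparkListener {
  @volatile private var appStartTime: Long = -1L
  @volatile private var firstExecutorTime: Long = -1L

  override def onApplicationStart(event: SparkListenerApplicationStart): Unit =
    appStartTime = event.time

  override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit =
    if (firstExecutorTime < 0) firstExecutorTime = event.time

  // Defined once both timestamps have been observed.
  def allocationDelayMs: Option[Long] =
    if (appStartTime >= 0 && firstExecutorTime >= 0) Some(firstExecutorTime - appStartTime)
    else None
}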
