
add option to do a spark-submit with a SparkListener to gather events from Spark #113

Open
pjfanning opened this issue Oct 26, 2017 · 5 comments


@pjfanning
Contributor

pjfanning commented Oct 26, 2017

I was at Emily Curtin's Spark Summit Europe presentation today (which was very interesting). An attendee asked if Spark Bench gathered Spark executor metrics.
A SparkListener can be used to gather benchmark data such as how long tasks took to run and how much data was shuffled (basically any data that can be seen in the Spark UI could be picked up and summarised).
https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/scheduler/SparkListener.html
spark-submit --conf spark.extraListeners=com.mycompany.MetricsListener
https://github.com/LucaCanali/sparkMeasure has a spark listener that gathers metrics.
https://github.com/groupon/sparklint also has one.
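
A listener along those lines might look roughly like this (just a sketch, not taken from either project; the class name only matches the hypothetical com.mycompany.MetricsListener above):

package com.mycompany

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Rough sketch: accumulate task-level metrics as the application runs.
// A zero-arg constructor is needed so spark.extraListeners can instantiate it.
class MetricsListener extends SparkListener {
  private var taskCount = 0L
  private var totalExecutorRunTimeMs = 0L
  private var shuffleReadBytes = 0L

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    taskCount += 1
    Option(taskEnd.taskMetrics).foreach { m =>
      totalExecutorRunTimeMs += m.executorRunTime
      shuffleReadBytes += m.shuffleReadMetrics.totalBytesRead
    }
  }
}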

One possible design would be to

  • run spark-submit with a SparkListener that outputs the event data (e.g. as CSV)
  • run another Spark job to summarise the event data and include the summary metrics with the other benchmark data

Another approach would be to run Spark with spark.eventLog.enabled=true (and spark.eventLog.dir set) and parse the JSON-lines output. https://github.com/groupon/sparklint also has code that summarises event logs to create metrics.
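
As a rough sketch of that second approach (the event log directory and application id below are made up), the JSON-lines log could be summarised with Spark itself:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count, sum}

val spark = SparkSession.builder.appName("event-log-summary").getOrCreate()

// Each line of the event log is a JSON object with an "Event" field;
// task-level metrics live under "Task Metrics".
val events = spark.read.json("/tmp/spark-events/local-1234567890123")

events
  .filter(col("Event") === "SparkListenerTaskEnd")
  .selectExpr("`Task Metrics`.`Executor Run Time` as executorRunTimeMs")
  .agg(
    count("*").as("taskCount"),
    sum("executorRunTimeMs").as("totalExecutorRunTimeMs"))
  .show()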

@ecurtin
Contributor

ecurtin commented Oct 28, 2017

Hi @pjfanning! I'm so glad you thought the talk was interesting :) For anybody else reading who wants to see it, they've told us it will be posted on Nov 3.

What you've outlined here is a great suggestion! While I have not tried it myself yet, adding listeners through the spark-submit conf should already work through existing means, like this:

spark-bench = {
  spark-submit-config = {
    spark-home = // ...
    spark-args = {
      // master, etc
    }
    conf = {
      "spark.extraListeners" = "com.mycompany.MetricsListener"
    }
  }
}

If that works out of the box, then getting that output bundled with the spark-bench output would be the logical next step.

@pjfanning Is this something you'd be interested in investigating?

Thanks again for your helpful suggestion! I am in shaky wifi territory for the next two days but will be back in regular communication after that :)

@pjfanning
Contributor Author

@ecurtin I may not have much time over the coming weeks but if I do find some time, I'll try prototyping something.

@ecurtin
Contributor

ecurtin commented Oct 30, 2017

👍

@pjfanning
Contributor Author

I have a very early prototype at https://github.com/pjfanning/spark-bench/pull/2/files

Running bin/spark-bench.sh examples/minimal-example.conf on a distro built with my change outputs:

+-------+-------------+-------------+------------------+-----+------+------+---+-----------------+-----------------+--------------------+----------------------------+--------------------+--------------------+-----------------+-----------------------+------------+------------------+-------------------------+-------------------+--------------------+
|   name|    timestamp|total_runtime|    pi_approximate|input|output|slices|run|spark.driver.host|spark.driver.port|spark.extraListeners|hive.metastore.warehouse.dir|          spark.jars|      spark.app.name|spark.executor.id|spark.submit.deployMode|spark.master|spark.authenticate|spark.authenticate.secret|       spark.app.id|         description|
+-------+-------------+-------------+------------------+-----+------+------+---+-----------------+-----------------+--------------------+----------------------------+--------------------+--------------------+-----------------+-----------------------+------------+------------------+-------------------------+-------------------+--------------------+
|sparkpi|1509741483169|   1468030834|3.1425311425311424|     |      |    10|  0|    192.168.1.100|            64309|com.ibm.sparktc.s...|        file:/Users/pj.fa...|file:/Users/pj.fa...|com.ibm.sparktc.s...|           driver|                 client|    local[*]|              true|            not.so.secret|local-1509741482934|One run of SparkP...|
+-------+-------------+-------------+------------------+-----+------+------+---+-----------------+-----------------+--------------------+----------------------------+--------------------+--------------------+-----------------+-----------------------+------------+------------------+-------------------------+-------------------+--------------------+

**** MetricsSparkListener ****
stageCount=2
taskCount=11
jobCount=2
executorAddCount=1
executorRemoveCount=0

The aim is to gather more metrics with the listener and to include them alongside the other benchmark results.
This would involve writing the metric data to a file, having spark-bench read that data, and extending the benchmark output with these additional metrics.
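
For example (only a sketch; the output path is a placeholder that spark-bench would need to know about), the listener could dump its summary when the application ends:

import java.io.PrintWriter
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd, SparkListenerTaskEnd}

class MetricsSparkListener extends SparkListener {
  private var taskCount = 0L
  private var totalExecutorRunTimeMs = 0L

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    taskCount += 1
    Option(taskEnd.taskMetrics).foreach(m => totalExecutorRunTimeMs += m.executorRunTime)
  }

  override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
    // Placeholder path: spark-bench would read this file after the spark-submit finishes.
    val out = new PrintWriter("/tmp/spark-bench-metrics.csv")
    try {
      out.println("taskCount,totalExecutorRunTimeMs")
      out.println(s"$taskCount,$totalExecutorRunTimeMs")
    } finally out.close()
  }
}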

@xiandong79

A CSV file recording the durations of all tasks would be better.
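
For example (rough sketch only, with a placeholder output path), the listener could append one row per completed task:

import java.io.{FileWriter, PrintWriter}
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Sketch: one CSV row per task with its stage, task id and duration.
class TaskDurationCsvListener extends SparkListener {
  private val out = new PrintWriter(new FileWriter("/tmp/task-durations.csv", true))
  out.println("stageId,taskId,durationMs")

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val info = taskEnd.taskInfo
    out.println(s"${taskEnd.stageId},${info.taskId},${info.duration}")
    out.flush()
  }
}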
