spark-history-server not actually working with the docker-compose.yaml in readme #160

Open
tlzxsun opened this issue Oct 10, 2023 · 2 comments

tlzxsun commented Oct 10, 2023

There are two issues:

  1. spark-history-server and the master/worker are not on the same node (they run in separate containers), so the history server cannot read the logs written to /tmp/spark-events.
  2. There seems to be no spark-defaults.conf on the master/worker nodes, and spark.eventLog.enabled defaults to false, so no event logs are written.

I managed to make it work by mounting /spark/conf/spark-defaults.conf and /tmp/spark-events from the host, and it seems to be working now.
But I don't think this is a good solution, so I am hoping for a better one.
Thanks a lot~

```yaml
version: '3'
services:
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-worker-2:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-2
    depends_on:
      - spark-master
    ports:
      - "8082:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-history-server:
    image: bde2020/spark-history-server:3.3.0-hadoop3.3
    container_name: spark-history-server
    depends_on:
      - spark-master
    ports:
      - "18081:18081"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
```
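
One possible way to avoid the host-specific paths in the workaround above is a named Docker volume shared by all containers. The following is only a sketch, not a tested fix for these images: the volume name `spark-events` and the relative path `./conf/spark-defaults.conf` are assumptions chosen for illustration, and the config file still has to be supplied by you. Since Spark writes event logs from the driver process, whichever container runs the driver needs both the config and the volume; the second worker is omitted for brevity and would mirror `spark-worker-1`.

```yaml
# Sketch: share event logs through a named volume instead of a host directory.
# Assumed contents of ./conf/spark-defaults.conf (hypothetical path):
#   spark.eventLog.enabled          true
#   spark.eventLog.dir              file:///tmp/spark-events
#   spark.history.fs.logDirectory   file:///tmp/spark-events
version: '3'
services:
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
    volumes:
      # Named volume: every service below sees the same /tmp/spark-events
      - spark-events:/tmp/spark-events
      # Read-only bind mount of the config file
      - ./conf/spark-defaults.conf:/spark/conf/spark-defaults.conf:ro
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - spark-events:/tmp/spark-events
      - ./conf/spark-defaults.conf:/spark/conf/spark-defaults.conf:ro
  spark-history-server:
    image: bde2020/spark-history-server:3.3.0-hadoop3.3
    container_name: spark-history-server
    depends_on:
      - spark-master
    ports:
      - "18081:18081"
    volumes:
      # The history server reads the logs the driver wrote into the shared volume
      - spark-events:/tmp/spark-events
      - ./conf/spark-defaults.conf:/spark/conf/spark-defaults.conf:ro

# Top-level declaration; Docker creates and manages the volume
volumes:
  spark-events:
```

This still mounts a config file from the host, so it only removes the dependency on a host directory for the event logs themselves.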

erick093 commented Dec 2, 2023

Could you please share the spark-defaults.conf file that you are using?

tlzxsun commented Dec 7, 2023

> 2. spark.eventLog.enabled

Sorry, all the files are gone now. But I think the conf contained only one line:
spark.eventLog.enabled true
