[BUG] spark.eventLog.enable and spark.eventLog.dir not working #2018

Open
ahululu opened this issue May 9, 2024 · 2 comments

ahululu commented May 9, 2024

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.

If your request is for a new feature, please use the Feature request template.

  • ✋ I have searched the open/closed issues and my issue is not listed.

Reproduction Code [Required]

  hadoopConf:
    # EMRFS filesystem
    fs.s3.customAWSCredentialsProvider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
    fs.s3.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    fs.s3a.endpoint: s3.us-east-1.amazonaws.com
    fs.s3.buffer.dir: /mnt/s3
    fs.s3.getObject.initialSocketTimeoutMilliseconds: "2000"
    mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem: "2"
    mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem: "true"
  sparkConf:
    # Required for EMR Runtime
    spark.driver.extraClassPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*
    spark.driver.extraLibraryPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
    spark.executor.extraClassPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*
    spark.executor.extraLibraryPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
    spark.hadoop.hive.metastore.client.factory.class: com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory
    # History logs
    spark.eventLog.dir: s3://abc/def/
    spark.eventLog.enable: "true"

Steps to reproduce the behavior:

Expected behavior

An event log file should be written to s3://abc/def/

Actual behavior

No log file is produced.

Environment & Versions

  • Spark Operator App version: v1beta2-1.3.8-3.1.1-amzn-4
  • Helm Chart Version: spark-operator-7.0.0
  • Kubernetes Version: AWS EKS 1.29.0
  • Apache Spark version: 3.5.0

peter-mcclonski (Contributor) commented

Two items of note:
1- Try using s3a:// rather than s3://
2- Try adding the following to your sparkConf: spark.eventLog.logBlockUpdates.enabled: "true"
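
As a rough sketch, the two suggestions above would change the "History logs" part of the sparkConf block roughly as follows. The bucket and prefix are the reporter's placeholders. One further observation, offered as an assumption rather than a confirmed diagnosis: Spark's documented property name is spark.eventLog.enabled (ending in "d"), while the reproduction sets spark.eventLog.enable, a key Spark does not recognize. The sketch uses the documented spelling; nothing in this thread confirms that it resolves the issue.

```yaml
  sparkConf:
    # History logs
    # Suggestion 1: use the s3a:// scheme instead of s3://
    spark.eventLog.dir: s3a://abc/def/
    # Assumption: documented Spark key is "enabled"; the reproduction has "enable"
    spark.eventLog.enabled: "true"
    # Suggestion 2: also log block updates
    spark.eventLog.logBlockUpdates.enabled: "true"
```

The remaining sparkConf and hadoopConf entries from the reproduction would stay unchanged.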

ahululu (Author) commented May 13, 2024

> Two items of note:
> 1- Try using s3a:// rather than s3://
> 2- Try adding the following to your sparkConf: spark.eventLog.logBlockUpdates.enabled: "true"

Neither suggestion worked.
