Configure spark.kubernetes.container.image depending on the Linux distribution #361
Conversation
…ffleTracking.enabled has been moved to the spark configuration spark-default.conf
@@ -269,6 +269,15 @@ def configure(self, opts, ports):
    conf.set('spark.kubernetes.namespace', os.environ.get('SPARK_USER'))
    conf.set('spark.master', self._retrieve_k8s_master(os.environ.get('KUBECONFIG')))

    # The image used by the Spark executor should be set by the spawner, as we need
    # different images for different platform types
    # Get the value of 'SPARK_K8S_EXECUTOR_CONTAINER_IMAGE_URL' environment variable
Can we make the name of the variable more generic? Potentially we could use this for the physical Hadoop clusters too, right? Something like SPARK_EXECUTOR_IMAGE
    # The image used by the Spark executor should be set by the spawner, as we need
    # different images for different platform types
    # Get the value of 'SPARK_K8S_EXECUTOR_CONTAINER_IMAGE_URL' environment variable
    # The default value is a stopgap till we implement the changes in the spawner
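The diff above reads the image URL from an environment variable with a stopgap default. A minimal sketch of that lookup, assuming a hypothetical placeholder default (the real default URL is not shown in this excerpt):

```python
import os

# Placeholder default image URL (assumption for illustration only); the PR
# uses a hardcoded stopgap default until the spawner sets the variable.
DEFAULT_EXECUTOR_IMAGE = "registry.example.com/spark-executor:latest"

def resolve_executor_image(environ=None):
    """Return the executor container image URL, preferring the value set
    by the spawner via SPARK_K8S_EXECUTOR_CONTAINER_IMAGE_URL."""
    if environ is None:
        environ = os.environ
    return environ.get("SPARK_K8S_EXECUTOR_CONTAINER_IMAGE_URL",
                       DEFAULT_EXECUTOR_IMAGE)
```

The resolved value would then be passed to `conf.set('spark.kubernetes.container.image', ...)` as in the surrounding `configure` method.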
This comment will get outdated soon, right? :) Similarly, the default could be made Alma9 already?
    conf.set('spark.shuffle.service.enabled', 'false')
    conf.set('spark.dynamicAllocation.shuffleTracking.enabled', 'true')
    # The image used by the Spark executor should be set by the spawner, as we need
    # different images for different platform types
Can't we check the platform directly at runtime and adapt then, instead of having to set an extra config in the spawner, which might go out of sync?
You mean that the SparkConnector could detect the platform it is running on and, depending on that, set this option dynamically? The thing is, we need to know the exact URL of the image to propagate to the executor.
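A sketch of the runtime-detection idea floated here: the distribution ID can be read from /etc/os-release, but as the reply notes, mapping it to a full image URL still requires knowing the registry path from somewhere. The parsing below is standard os-release handling; any mapping to image URLs would be an assumption on top of it.

```python
def detect_platform(os_release_text):
    """Return the distribution ID (e.g. 'centos', 'almalinux') parsed from
    the contents of /etc/os-release, or None if no ID field is present."""
    for line in os_release_text.splitlines():
        if line.startswith("ID="):
            # Values may be quoted, e.g. ID="almalinux"
            return line.split("=", 1)[1].strip().strip('"')
    return None
```

This only yields the platform name; the connector would still need a configured registry/image prefix to turn it into a `spark.kubernetes.container.image` value.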
Which was already there.
What is already there exactly? We make the tag available as an environment variable (VERSION_DOCKER_IMAGE), but not the whole URL. Also, note that Luca uses his own images, so not even that tag is useful. In the current solution, the URLs are hardcoded in the SparkConnector's code, which should be avoided.
@LucaCanali @diocas we could think of just reusing the SWAN image for the Spark executors, get the tag from the environment (which is already there) and the rest of the URL from some configuration file for the SparkConnector -- what is dynamic is the tag, not the URL itself, but better in a config file than in the SparkConnector's code.
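The proposal above — tag from the environment, the rest of the URL from a SparkConnector config file — could look roughly like this. The repository value and config-file shape are assumptions for illustration; only the VERSION_DOCKER_IMAGE variable name comes from the comment above.

```python
import os

def compose_executor_image(repository, environ=None):
    """Build the executor image URL from a repository read from the
    connector's config file (assumption: the config supplies a plain
    'registry/path' string) and the tag SWAN already exports as
    VERSION_DOCKER_IMAGE. Returns None if the tag is not set."""
    if environ is None:
        environ = os.environ
    tag = environ.get("VERSION_DOCKER_IMAGE")
    if tag is None:
        return None  # caller falls back to a configured default image
    return f"{repository}:{tag}"
```

The design point being argued: only the tag is dynamic per session, so it belongs in the environment, while the static registry path belongs in configuration rather than in the SparkConnector's code.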
…ing on the Linux distribution
This is superseded by https://github.com/swan-cern/jupyter-extensions/pull/372/files, since we are migrating to Alma9 for Spark k8s. The solution in that PR still hardcodes the URL of the image, though; ideally it should be turned into a configuration option for the connector.
This adds configuration for setting spark.kubernetes.container.image.
The configuration is done via an environment variable that can be set by the spawner.
This allows supporting executor containers built for different OSes (notably CC7 and Alma9).