pyspark not starting task on GPU #8094

Answered by revans2
saifmasood asked this question in Q&A

@saifmasood thank you for filing this.

Reading a CSV file happens in two stages. The first stage is schema discovery, which runs whenever you do not provide a schema for the CSV data, as is the case in your query. We have not optimized schema discovery for CSV or JSON for a number of reasons. The output from the plugin shows that it saw the schema discovery portion and tried to translate at least parts of it to the GPU.
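To illustrate the point about schema discovery, here is a minimal sketch of passing an explicit schema to the CSV reader so that the discovery stage should not be needed. The column names and types in `CSV_SCHEMA_DDL` and the helper name `read_csv_with_schema` are hypothetical and not taken from the original query; `spark` is assumed to be an existing `SparkSession`.

```python
# Hypothetical column layout: replace with the real columns of your CSV file.
CSV_SCHEMA_DDL = "id INT, name STRING, amount DOUBLE"


def read_csv_with_schema(spark, path, schema=CSV_SCHEMA_DDL):
    """Read a CSV file with an explicit schema.

    `spark` is assumed to be an existing SparkSession. Because the schema is
    supplied up front (as a DDL string), Spark does not have to scan the file
    to infer column types before the actual read.
    """
    return spark.read.csv(path, schema=schema, header=True)
```

A call would look like `df = read_csv_with_schema(spark, "/path/to/data.csv")`. A `StructType` can be passed in place of the DDL string if you prefer to build the schema programmatically.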

I see a few potential problems with your configs depending on what mode you are running in.

If you are in local mode, Spark does not deal with GPU resources well at all and will hang. Please remove all requests for GPU resources in local mode. Probabl…
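As a rough sketch of the local-mode advice, the helper below strips GPU resource requests from a dictionary of Spark settings when the master is a local one. The function names and the sample settings are illustrative assumptions, not the configuration from the original question; the `spark.*.resource.gpu.*` keys are the standard Spark resource-scheduling settings.

```python
# Prefixes of the standard Spark settings that request GPU resources.
GPU_RESOURCE_PREFIXES = (
    "spark.driver.resource.gpu.",
    "spark.executor.resource.gpu.",
    "spark.task.resource.gpu.",
)


def is_local_mode(master: str) -> bool:
    """True for "local" and "local[...]", but not for "local-cluster[...]"."""
    return master == "local" or master.startswith("local[")


def drop_gpu_requests_in_local_mode(master: str, conf: dict) -> dict:
    """Return a copy of `conf` without GPU resource requests in local mode."""
    if not is_local_mode(master):
        return dict(conf)
    return {
        key: value
        for key, value in conf.items()
        if not key.startswith(GPU_RESOURCE_PREFIXES)
    }


# Illustrative settings only (assumed, not from the original question).
example_conf = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "1",
}
local_conf = drop_gpu_requests_in_local_mode("local[*]", example_conf)
```

The resulting dictionary can then be applied with `SparkSession.builder.config(key, value)` for each entry. On a real cluster master the settings are returned unchanged.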

Answer selected by saifmasood