Using read_csv within Databricks to open a local file #2177

Answered by zero323
ChuckConnell asked this question in Q&A

In addition to #2177 (comment), this is simply not going to work: internally, the read_* methods use standard Spark data sources

reader = default_session().read

so the same restrictions apply. Any path you want to read has to be accessible to every Spark worker in your cluster, and that is not the case when the file lives on your local file system.

In general, you should migrate your data to distributed file storage first; in the case of DBFS, the official CLI should do the trick.

dbfs cp -r ...

Answer selected by HyukjinKwon