toDF() isn't working on the shell #1

Open
rush4ratio opened this issue Aug 16, 2017 · 9 comments

@rush4ratio

rush4ratio commented Aug 16, 2017

I get the same error (see the attached screenshot) when trying orgs.toDF().show() or memberships.select_fields(['organization_id']).toDF().distinct().show().
[screenshot attachment: todf]

@mohitsax
Contributor

Thanks for using AWS Glue.

Please refer to step 5 in the AWS Glue documentation on using a REPL shell: http://docs.aws.amazon.com/glue/latest/dg/tutorial-development-endpoint-repl.html
To resolve this error, stop the existing SparkContext and create a new one through GlueContext:
from awsglue.context import GlueContext
from pyspark.context import SparkContext
spark.stop()
glueContext = GlueContext(SparkContext.getOrCreate())
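
Once the GlueContext has been rebuilt this way, toDF() on a DynamicFrame should return a Spark DataFrame; a minimal usage sketch, with placeholder Data Catalog names rather than ones taken from this issue:

# placeholder names; substitute your own Data Catalog database and table
orgs = glueContext.create_dynamic_frame.from_catalog(
    database="my_database", table_name="organizations")
orgs.toDF().show()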

If you have further questions, you can also use the AWS Glue Forum: https://forums.aws.amazon.com/forum.jspa?forumID=262

@rush4ratio
Author

Thanks for the suggestion. I've tried it but unfortunately I still get the same error.

@mohitsax
Contributor

Thanks for trying out the fix. We were not able to reproduce the error on a REPL shell after applying the fix above.
Could you please open a support ticket?

@Sergeant007

The fix with spark.stop() worked for me. Let me also post the exact error message here for better indexing by search engines:

Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /home/glue/metastore_db.

@yupinh

yupinh commented Nov 30, 2017

One workaround is to disable Hive support when the SparkContext is initialized:

# switch the SQL catalog from the Hive (Derby-backed) implementation to the in-memory one
newconf = sc._conf.set("spark.sql.catalogImplementation", "in-memory")
sc.stop()
sc = SparkContext.getOrCreate(newconf)
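
Roughly the same effect can be had by rebuilding the session through SparkSession.builder; a sketch of that alternative form, assuming it replaces the three lines above and the shell's original spark session is still running:

from pyspark.sql import SparkSession
# stop the running Hive-backed session, then rebuild it with the in-memory catalog
spark.stop()
spark = (SparkSession.builder
         .config("spark.sql.catalogImplementation", "in-memory")
         .getOrCreate())
sc = spark.sparkContext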

Let me know if this causes you additional problems.

@laurikoobas

The spark.stop() "fix" worked for me as well.
The specific error message was:

ERROR Schema: Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Failed to start database 'metastore_db'

@jgoeglein

I ran into this as well. Why isn't the development environment set up to support this from the beginning? There's a lot of Glue documentation out there that uses .toDF() and doesn't work out of the box (for example, the first example in https://github.com/aws-samples/aws-glue-samples/blob/master/FAQ_and_How_to.md).

@zalmane

zalmane commented Sep 2, 2019

Just ran into this. The above did not work for me.
Ended up starting pyspark with the following flag:
./bin/gluepyspark --conf spark.sql.catalogImplementation=in-memory
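
The same property can also be set once instead of being passed on every launch; a sketch, assuming the standard conf/spark-defaults.conf location under the Spark installation:

# conf/spark-defaults.conf
spark.sql.catalogImplementation    in-memory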

@davehowell

apache/spark@ac9c053 is a Spark patch that should fix this in the Spark 3.0.0 release.

moomindani pushed a commit that referenced this issue Mar 2, 2023
Fixed Python library dependency on the Delta Lake example notebook