Spark dataframes lead to continuous Spark queries #221

Open
sparekh-bbg opened this issue Aug 18, 2021 · 0 comments

Using this extension in a notebook that also uses PySpark leads to continuous Spark queries. Spark evaluates lazily: it only runs a query when an output operation (an action) is called on a dataframe. With the extension loaded, however, Spark queries run continuously even when the notebook code requests no output.

My guess is that the extension continuously tries to convert or display every variable that is a Spark dataframe. This is especially problematic with larger dataframes, because it keeps the Spark instance permanently busy.

Environment: JupyterLab version 3 and lckr-jupyterlab-variableinspector 3.0.9.

One mitigation may be to have a setting/option to skip variables that point to Spark DFs.
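
To illustrate the kind of skip I have in mind, here is a minimal sketch. It is not the extension's actual code: `is_spark_dataframe`, `inspectable_variables`, and the `skip_spark` option are hypothetical names, and the check assumes that Spark dataframe classes are named `DataFrame` and live in a module under `pyspark.sql`. It inspects only the type, so it never imports pyspark and never calls a method on the dataframe.

```python
def is_spark_dataframe(obj):
    """Return True if obj looks like a Spark DataFrame, judging by its type only.

    Assumption: Spark DataFrame classes are named "DataFrame" and are defined
    in a module whose name starts with "pyspark.sql".
    """
    for cls in type(obj).__mro__:
        module = getattr(cls, "__module__", "") or ""
        if cls.__name__ == "DataFrame" and module.startswith("pyspark.sql"):
            return True
    return False


def inspectable_variables(namespace, skip_spark=True):
    """Hypothetical filter: build the name -> summary map an inspector would show."""
    result = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # ignore private and IPython-internal names
        if skip_spark and is_spark_dataframe(value):
            # Show a placeholder; do not touch the object, so no Spark job starts.
            result[name] = "<Spark DataFrame, skipped>"
        else:
            result[name] = repr(value)
    return result
```

With something like this behind a user-facing setting, a notebook holding large Spark DFs would still list them in the inspector, but the inspector itself would not cause any query to run.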
