
variable scope problems in function definition #687

Closed
seegy opened this issue Jun 13, 2019 · 14 comments · Fixed by #688
@seegy

seegy commented Jun 13, 2019

Hi there,

I'm using Enterprise Gateway on our Kubernetes cluster (following these instructions).

Using a Python notebook (in my case the kernel is elyra/kernel-spark-py:dev), I found that I cannot use variables inside a function definition that are defined outside of it.

Example:

a = 123

def fun():
    print(a)
    
fun()

ends in:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/usr/local/bin/kernel-launchers/python/scripts/launch_ipykernel.py in <module>
     4     print(a)
     5
----> 6 fun()

/usr/local/bin/kernel-launchers/python/scripts/launch_ipykernel.py in fun()
     2
     3 def fun():
----> 4     print(a)
     5
     6 fun()

NameError: name 'a' is not defined

Same with import...

Any ideas?

@kevin-bates
Member

hmm - I think this might be related to running as a Spark application somehow since I see the very same thing running in a YARN cluster:
[screenshot]

Also, I'm not able to reproduce this using the regular (non-spark) python kernel in k8s...
[screenshot]

@kevin-bates
Member

ok - this is an EG issue and was introduced by this commit: 2cdac42

@jcrist - we're going to need your help on this one. It seems the use of namespace is producing side effects in the embedded kernel running on Spark. If I revert the Dask changes, I see the expected results. Note that there was another side effect of the Dask changes that I addressed in PR #540 - so whatever happens for this issue, we'll need to ensure those behaviors don't reappear.

Your help is greatly appreciated!

@jcrist
Contributor

jcrist commented Jun 13, 2019

Hmmm, that's unfortunate. This looks to be due to ipython/ipython#62, and doesn't have a good workaround.

The namespace thing was to avoid polluting the started kernel namespace with all the stuff enterprise gateway uses to setup a kernel. I recommend the following:

def start_ipython(locals, ns, **kwargs):
    from IPython import embed_kernel

    # Drop everything except dunder entries so the kernel's namespace
    # isn't polluted with the launcher's own variables.
    for k in list(locals):
        if not k.startswith('__'):
            del locals[k]

    # Seed the namespace with only the variables we want exposed.
    locals.update(ns)

    embed_kernel(local_ns=locals, **kwargs)


if __name__ == '__main__':
    # Existing code...

    # Replace the `embed_kernel` call with this:
    start_ipython(locals(), namespace, connection_file=connection_file, ip=ip)

I've tested that this works locally, and it should fix things for you. I don't have time to test it on enterprise-gateway myself, but I'm happy to respond to the issue here.
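The scoping behavior jcrist points to (ipython/ipython#62) can be reproduced with plain exec, independent of IPython or Enterprise Gateway: when code runs against separate globals and locals dicts, module-level assignments land in the locals dict, but functions resolve free variables against globals. A minimal sketch (illustrative only, not launcher code):

src = """
a = 123
def fun():
    print(a)
fun()
"""

# Separate globals/locals dicts: 'a' is stored in the locals dict,
# but fun() resolves 'a' against the globals dict -> NameError.
g, l = {}, {}
try:
    exec(src, g, l)
except NameError as e:
    print(e)  # name 'a' is not defined

# A single dict serving as both globals and locals behaves like a
# module top level, so the lookup succeeds and 123 is printed.
exec(src, {})

This is why passing a pre-populated dict as embed_kernel's single local_ns (as in start_ipython above) avoids the NameError.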

@kevin-bates
Member

Thank you for your quick response - it's greatly appreciated!

I have applied the changes above to the version you created (prior to #540). This does appear to address the scenario listed in this issue. However, any attempts to reference sc, etc., result in a recursive call that blows the stack. Here are the last few frames produced when running sc.appName...

/hadoop/yarn/local/usercache/gateway/appcache/application_1559854930815_0039/container_e05_1559854930815_0039_01_000001/launch_ipykernel.py in __getattr__(self, name)
    115             self._init_thread.join(timeout=None)
    116             # now return attribute/function reference from actual Spark object
--> 117             return getattr(globals()[self._spark_session_variable], name)
    118 
    119 

/hadoop/yarn/local/usercache/gateway/appcache/application_1559854930815_0039/container_e05_1559854930815_0039_01_000001/launch_ipykernel.py in __getattr__(self, name)
    113         else:
    114             # wait on thread to initialize the Spark session variables in global variable scope
--> 115             self._init_thread.join(timeout=None)
    116             # now return attribute/function reference from actual Spark object
    117             return getattr(globals()[self._spark_session_variable], name)

/opt/conda/lib/python2.7/threading.pyc in join(self, timeout)
    928         if not self.__started.is_set():
    929             raise RuntimeError("cannot join thread before it is started")
--> 930         if self is current_thread():
    931             raise RuntimeError("cannot join current thread")
    932 

/opt/conda/lib/python2.7/threading.pyc in currentThread()
   1149     """
   1150     try:
-> 1151         return _active[_get_ident()]
   1152     except KeyError:
   1153         ##print "current_thread(): no current thread for", _get_ident()

RuntimeError: maximum recursion depth exceeded

@jcrist
Contributor

jcrist commented Jun 13, 2019

Ahh, hmmm, this has to do with this bit of code, which is accessing raw globals (generally an antipattern):

https://github.com/jupyter/enterprise_gateway/blob/a1b0f679e51d08f4238b4ba9b9efd1afabb27973/etc/kernel-launchers/python/scripts/launch_ipykernel.py#L124

Is there a noticeable lag if you initialize the Spark context variables before starting the kernel? Since the Spark application is already running, I wouldn't expect much of a delay. Otherwise you'll need to propagate the locals dict down through all the Spark lazy-loading machinery so that the proper namespace is being updated.
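jcrist's second suggestion (propagating the namespace dict down into the lazy-loading machinery so the proxy never touches raw globals()) could be sketched roughly as below. LazyInit, its attribute names, and the fake Spark object are hypothetical stand-ins for illustration, not the actual launcher code:

import threading

class LazyInit:
    # Hypothetical proxy: defers attribute access until a background
    # thread has placed the real object into the shared namespace dict.
    def __init__(self, name, namespace, init_thread):
        self._name = name
        self._namespace = namespace          # shared dict, not globals()
        self._init_thread = init_thread

    def __getattr__(self, attr):
        # Wait for the background initialization to finish, then
        # delegate to the real object now stored in the namespace.
        self._init_thread.join()
        real = self._namespace[self._name]
        return getattr(real, attr)

# Usage sketch with a fake "SparkContext" built in the background.
ns = {}

def init_spark():
    ns["sc"] = type("FakeSparkContext", (), {"appName": "demo"})()

thread = threading.Thread(target=init_spark)
ns["sc"] = LazyInit("sc", ns, thread)   # users see the proxy first
thread.start()

print(ns["sc"].appName)  # delegates once init completes: demo

Because the proxy looks the real object up in the same dict the init thread writes to, the lookup can never return the proxy itself, avoiding the __getattr__ recursion shown in the traceback above.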

@kevin-bates
Member

kevin-bates commented Jun 13, 2019

I see. So you're asking whether we could skip the thread and do the init synchronously? Yeah, the Spark context creation takes on the order of 5-10 seconds, and we need that time to be in the background so users can do non-spark things first (and not have to wait the additional time for sc creation).

I'm barely following what's going on but will try the locals dict propagation.

EDIT: @jcrist Actually, I have no idea what to do here. 🙁

kevin-bates added a commit to kevin-bates/enterprise_gateway that referenced this issue Jun 13, 2019
The python kernel launcher was not initializing the namespace used
in the embed_kernel properly.  This change reverts an earlier change
to use the spark context variables from the global scope, instead
moving them to the local namespace.

Fixes jupyter-server#687
@kevin-bates
Member

@jcrist - ok, I think I've battled through things and ferreted out what your comments were saying relative to the locals() dict. Your help on this has been invaluable!

Could you please review the changes and let us know if we can make this better?

kevin-bates added a commit to kevin-bates/enterprise_gateway that referenced this issue Jun 14, 2019
The python kernel launcher was not initializing the namespace used
in the embed_kernel properly.  This change reverts an earlier change
to use the spark context variables from the global scope, instead
moving them to the local namespace.

Fixes jupyter-server#687
Closes jupyter-server#688
@seegy
Author

seegy commented Jun 14, 2019

Wow. I'm impressed by your engagement!

Let me know if I can help somehow.

@kevin-bates
Member

@seegy - if you'd like to take an updated image for a spin, I've pushed an image to elyra/kernel-spark-py:scope. Since this image is built on elyra/kernel-py, it too has been pushed with a scope tag.

To confirm you're running with the correct image, ensure that its /usr/local/bin/kernel-launchers/python/scripts/launch_ipykernel.py file contains the updated code.

I'd like to wait for @jcrist's opinion before merging.

@seegy
Author

seegy commented Jun 15, 2019

hey @kevin-bates ,

unfortunately, I'm on vacation for the upcoming week, so I won't have access to our k8s environment...
But I'll let you know once I find the time to test the update.

@kevin-bates
Member

@seegy , no worries. Enjoy your vacation! I moved this to WIP since we're still working things out.

kevin-bates added a commit to kevin-bates/enterprise_gateway that referenced this issue Jun 19, 2019
The python kernel launcher was not initializing the namespace used
in the embed_kernel properly.  This change reverts an earlier change
to use the spark context variables from the global scope, instead
moving them to the local namespace.

Fixes jupyter-server#687
Closes jupyter-server#688
kevin-bates added a commit that referenced this issue Jun 19, 2019
* Fix scoping issues relative to python kernels on Spark

The python kernel launcher was not initializing the namespace used
in the embed_kernel properly.  This change reverts an earlier change
to use the spark context variables from the global scope, instead
moving them to the local namespace.

Fixes #687
Closes #688
@kevin-bates
Member

@seegy and @suryag10, Since this issue is now closed and the fix has been merged, I have updated the docker default images elyra/kernel-py:dev and elyra/kernel-spark-py:dev and removed those tagged with scope.

Thank you for reporting this issue and for your timely testing - it's been appreciated!

@seegy
Author

seegy commented Jun 24, 2019

[screenshot]
👌
thanks a lot!

@esevan
Contributor

esevan commented Jun 25, 2019

@kevin-bates Thanks. Saved my day 👍
