
ExecutePreprocessor using jupyter_kernel_mgmt APIs #809

Open
wants to merge 12 commits into main

Conversation

@takluyver (Member) commented May 7, 2018

This ports the execute preprocessor to my experimental jupyter_kernel_mgmt and jupyter_protocol APIs. I did this to start giving those APIs some 'real world' use.

except ioloop.TimeoutError:
raise TimeoutError("Cell execution timed out")
except ErrorInKernel as e:
reply = e.reply_msg
@takluyver (Member Author):

API question: should the client object raise an exception if the reply has status: 'error'? At the moment it does, but based on this and similar code in jupyter_kernel_test, I'm inclined to change it to return the reply, whatever the status of that reply.
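For illustration, a minimal sketch of the two designs being weighed here (the client object and field layout are placeholders, not the settled jupyter_kernel_mgmt API):

# Current design: the client raises when the reply has status 'error',
# so callers who want the reply must unpack it from the exception.
try:
    reply = client.execute_interactive(code)
except ErrorInKernel as e:
    reply = e.reply_msg

# Alternative design: always return the reply, whatever its status,
# and let the caller inspect it.
reply = client.execute_interactive(code)
if reply.content['status'] == 'error':
    ...  # handle the error reply here instead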

@takluyver (Member Author) commented:

This PR now drops testing for Python < 3.5. I had planned that using the new kernel management code would be the time to require Python 3.

Everything actually works with Python 3.4, but it's an annoyance for the tests, because there's a minor change in IPython 7 which affects it, and IPython 7 only installs on Python 3.5 and above.

@takluyver takluyver added this to the 6.0 milestone Oct 25, 2018
@takluyver takluyver changed the title WIP: ExecutePreprocessor using jupyter_kernel_mgmt APIs ExecutePreprocessor using jupyter_kernel_mgmt APIs Oct 25, 2018
@takluyver (Member Author) commented:

I think I'm now happy with this PR itself, if anyone wants to take a look.

The larger question that I haven't tried to tackle yet is how to integrate the new APIs with the notebook server itself.

@minrk (Member) commented Oct 25, 2018

This is really awesome, thanks @takluyver! I'm really excited about the kernel-mgmt packages.

self.log.debug("Executing cell:\n%s", cell.source)
exec_reply = self._wait_for_reply(msg_id, cell)
try:
reply = self.kc.execute_interactive(
Member:

Great to see execute_interactive working out here

idle_timeout=self.iopub_timeout,
raise_on_no_idle=self.raise_on_iopub_timeout,
)
except ioloop.TimeoutError:
Member:

Not a problem for now, but could this ioloop.TimeoutError -> TimeoutError happen in jupyter_protocol? Does that seem appropriate to you?

Member Author:

Yup, I think it would be appropriate for the BlockingKernelClient to do that translation.
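A rough sketch of what that translation might look like (hypothetical wrapper code, not the actual jupyter_kernel_mgmt implementation; loop and async_client are assumed attributes):

from tornado import ioloop

class BlockingKernelClient:
    def execute_interactive(self, code, timeout=None, **kwargs):
        try:
            # run_sync raises tornado's TimeoutError if the coroutine
            # doesn't finish within the given timeout
            return self.loop.run_sync(
                lambda: self.async_client.execute_interactive(code, **kwargs),
                timeout=timeout)
        except ioloop.TimeoutError as e:
            # Re-raise as the builtin TimeoutError, so callers don't
            # need to import tornado just to catch it
            raise TimeoutError("Timeout waiting for execute reply") from e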

@mgeier (Contributor) commented Nov 7, 2018

This looks like it could potentially fix #878 and therefore make #886 obsolete, which is great!

However, when I'm trying to build the nbsphinx docs (nbsphinx uses nbconvert's ExecutePreprocessor), I'm getting a myriad of messages like this:

ERROR:traitlets:Exception from message handler <function IOLoopKernelClient._execution_future.<locals>.watch_for_idle at 0x7f1d094a88c8>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jupyter_kernel_mgmt/client.py", line 115, in _call_handlers
    handler(msg, channel)
TypeError: watch_for_idle() takes 1 positional argument but 2 were given
etc., etc.

... and it takes much longer than usual. In the end, though, everything seems to be executed correctly.

Steps to reproduce: https://github.com/spatialaudio/nbsphinx/blob/master/CONTRIBUTING.rst

UPDATE: I'm getting the same error messages when simply running python3 -m nbconvert --execute on my notebooks, so it doesn't seem to have anything to do with nbsphinx.
Am I missing some up-to-date dependency?

@takluyver (Member Author) commented:

It may fix #878, but it will quite possibly break something else you rely on. ;-)

Thanks for the report; I've been doing a bunch of changes recently in connection with making the notebook server use the same new APIs, and I've probably missed something.

@takluyver (Member Author) commented:

At some point, in connection with this, I'm thinking of making it so that nbconvert --execute on a Python notebook defaults to the Python nbconvert is running on; I think not doing that is a common source of confusion. As kernel names proliferate, it's also less useful to store in notebook metadata the name of the last kernel it was run with, because that name may only be meaningful on one system.

@takluyver (Member Author) commented:

This commit should fix the problem you described: takluyver/jupyter_kernel_mgmt@8a74331
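Judging from the traceback above, _call_handlers invokes each handler as handler(msg, channel) while the watch_for_idle closure only accepted msg, so presumably the fix is along these lines (a reconstruction for illustration, not the actual commit; the message field access is schematic):

# Before: raises TypeError on every message, because the dispatcher
# calls handler(msg, channel)
def watch_for_idle(msg):
    ...

# After: accept the channel argument too
def watch_for_idle(msg, channel):
    if (msg.header['msg_type'] == 'status'
            and msg.content['execution_state'] == 'idle'):
        ...  # resolve the execution future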

@mgeier (Contributor) commented Nov 9, 2018

It may fix #878, but it will quite possibly break something else you rely on. ;-)

I'm not really worried about that. The only thing I'm relying on is that I can use an empty string (the default value for the traitlet) for kernel_name, in which case it should use the kernel stored in the notebook:

pp = nbconvert.preprocessors.ExecutePreprocessor(kernel_name='')

... and if I specify a non-empty string, it should use that string to start an appropriate kernel.
I didn't think anything could go wrong with those simple assumptions ... until #878 happened!
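For the non-empty case, correspondingly (the kernel name here is just an example):

pp = nbconvert.preprocessors.ExecutePreprocessor(kernel_name='python3')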

I'm thinking of making it so that nbconvert --execute on a Python notebook defaults to the Python nbconvert is running on

This sounds quite strange to me, TBH.

Why should a Python notebook have a different behavior than, say, a Julia notebook?
Shouldn't the fact that nbconvert is implemented in Python be an implementation detail?
Obviously Python 2 vs. Python 3 would be a problem, but Python 2 is dying, anyway.
But what happens if the notebook uses PyPy? Or some other Python implementation?

As kernel names proliferate, it's also less useful to store in notebook metadata the name of the last kernel it was run with, because that name may only be meaningful on one system.

That's unfortunate, but I don't think it's reason enough to simply ignore the information stored in the notebook.

Probably some fallback mechanism should be implemented for the case where the specified kernel is not found?
But then there should be at least a warning message. And probably there should even be an additional configuration parameter to enable and disable this behavior? And the solution shouldn't be Python-specific, because the same problem might happen with other languages as well, right?

This commit should fix the problem you described: takluyver/jupyter_kernel_mgmt@8a74331

Thanks, this indeed fixes the problem!

@takluyver (Member Author) commented:

Shouldn't the fact that nbconvert is implemented in Python be an implementation detail?

Yes and no. I see quite a few people installing Jupyter into an environment, running it from that environment, and then getting confused that it's not seeing the things they installed in that environment. This is inconsistent and based on hidden state: neither the notebook metadata nor your installed kernelspecs are clearly visible. It's a recurring source of confusion.

The fact that it's Python matters to a degree, because it gives us an obvious default kernel to use: the one on the same Python interpreter as the parent process. There's no obvious default for other languages.

My plan for command-line tools like nbconvert is something like this:

  1. If the user explicitly specifies a kernel to run with (e.g. with a command-line parameter), use that.
  2. Otherwise, get the language from the notebook metadata.
  3. If there is exactly one kernel available for that language, run with that.
  4. If the language is Python, use the Python kernel in nbconvert's environment.
  5. If we have multiple kernels available for the language, tell the user what they are and how to explicitly specify one to use.

This will undoubtedly be less convenient in some cases, but it's hopefully easier to understand and more predictable.
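As a very rough sketch of that resolution order (the available mapping is illustrative; the pyimport/kernel fallback name follows this PR's convention):

def resolve_kernel(explicit_name, nb_metadata, available):
    # available: mapping of kernel name -> info dict with a 'language' key
    # 1. An explicitly specified kernel always wins.
    if explicit_name:
        return explicit_name

    # 2. Otherwise, go by the language recorded in the notebook metadata.
    language = nb_metadata.get('language_info', {}).get('name')
    matches = [name for name, info in available.items()
               if info.get('language') == language]

    # 3. Exactly one kernel available for that language: use it.
    if len(matches) == 1:
        return matches[0]

    # 4. For Python, fall back to the kernel in nbconvert's own environment.
    if language == 'python':
        return 'pyimport/kernel'

    # 5. Ambiguous: make the user choose explicitly.
    raise RuntimeError("Multiple kernels available for %r: %s; "
                       "please specify one explicitly" % (language, matches))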

@mgeier (Contributor) commented Nov 13, 2018

neither the notebook metadata nor your installed kernelspecs are clearly visible.

But that's no reason to unconditionally ignore them, right?

If this "hidden" information is useless, it should be removed.
If not, it should be taken into account.

I'm not a conda user, so I don't know about those problems and I don't have an idea how to solve them, but blatantly ignoring stuff doesn't seem to be the right solution ...

I have the feeling that you should at least add one step to your five-step plan mentioned above:

  1. [same as above]
  2. If a kernel is specified in the notebook metadata, check if it exists and if yes, use it
  3. [continue with 2. from above]

I still think that taking the fact that nbconvert is implemented in Python into account is an ugly hack, but if it really is the only way to provide a good experience to a large part of the user base, then I guess it could be justified.

@takluyver (Member Author) commented:

If this "hidden" information is useless, it should be removed.

I do want to remove the precise kernel name from notebook metadata - if I run a notebook with kernel foo on my computer and send it to you, there's no guarantee that you have a kernel foo or that it is remotely compatible if you do. Kernel names are only meaningful in the context of a particular system/user/environment, whereas notebook metadata should be globally meaningful.

The reason this hasn't happened already is that it's not clear how to select a kernel instead when opening a notebook interactively. Should notebooks be associated with a particular environment, or just with a language? If they're associated with an environment, how does that link work? Or does associating a file with an environment break people's expectations about how they run the same code in different environments?

The kernelspecs are useful for telling Jupyter what kernels exist and how to start them. But they're a source of confusion because a notebook can behave in different ways depending on what kernelspecs you have installed. It's a particular issue with the default python3 (/python2) kernel - we special case it so Jupyter will work without explicitly installing it, but then if you do install it, you've effectively pinned Jupyter to whichever environment you installed it from.

We designed the kernelspec mechanism with the idea that one kernelspec would typically represent one language, or one major version of a language. So you might have your Python 2 and 3 kernels and an R kernel. We explicitly put off dealing with environments, because it had already been a long and tiring discussion, and we wanted to move forwards. So kernelspecs have ended up being used for environments, and they're not a good fit. That's what I'm now trying to address.

@takluyver (Member Author) commented:

(Writing that has helped me further my thinking a bit. I'm going to add a post to my relevant notebook PR).

@mgeier (Contributor) commented Nov 13, 2018

👍 That sounds good! Removing the kernel name is definitely better than ignoring it!

I don't really grok all that notebook metadata, but it always seemed to me that there is a lot of redundant information in there. It sounds good to me to remove some of it.

@mgeier (Contributor) commented Nov 18, 2018

@takluyver

It may fix #878, but it will quite possibly break something else you rely on. ;-)

It turns out that you were absolutely right!

The extra_arguments argument to nbconvert.preprocessors.ExecutePreprocessor seems to be ignored when using this PR.

@kevin-bates (Member) left a comment:

@takluyver - I used the PR review to update status relative to the async jupyter_kernel_mgmt PR takluyver/jupyter_kernel_mgmt#23 - in particular since a change was necessary to handle notebooks that already contain provider-prefixed kernel names.

We should probably rebase this PR onto master before completing the exercise, but this appears to be a good first step.

finally:
for attr in ['nb', 'km', 'kc']:
delattr(self, attr)
kernel_name = 'spec/' + nb.metadata.get('kernelspec', {})[
Member:

This will produce a duplicated 'spec/' prefix when the existing notebook file has been persisted using the new kernel providers. I modified this to the following when investigating how async support in jupyter_kernel_mgmt affects existing clients (per this comment).

        # Check for unset kernel_name first and attempt to pull from metadata.  Only then
        # should we check for provider id prefixes, otherwise we could end up with a double prefix.
        kernel_name = self.kernel_name
        if not kernel_name:
            try:
                kernel_name = nb.metadata.get('kernelspec', {})['name']
            except KeyError:
                kernel_name = 'pyimport/kernel'

        # Ensure kernel_name is of the new form in case of older metadata
        if '/' not in kernel_name:
            kernel_name = 'spec/' + kernel_name

kernel_name = 'pyimport/kernel'

self.log.info("Launching kernel %s to execute notebook" % kernel_name)
conn_info, self.km = kf.launch(kernel_name, cwd=path)
Member:

To address the async jupyter_kernel_mgmt changes, the only change required to get things working is replacing this line with the following (along with an accompanying import asyncio statement)...

conn_info, self.km = asyncio.get_event_loop().run_until_complete(kf.launch(kernel_name, cwd=path))

However, what is strange (and independent of this change) is that when I run nbconvert via a debugger (pycharm), I get the following exception:

  File "/Users/kbates/repos/oss/gateway-experiments/nbconvert/nbconvert/exporters/exporter.py", line 315, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/Users/kbates/repos/oss/gateway-experiments/nbconvert/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/Users/kbates/repos/oss/gateway-experiments/nbconvert/nbconvert/preprocessors/execute.py", line 318, in preprocess
    nb.metadata['language_info'] = info_dict['language_info']
  File "/opt/anaconda3/envs/kernel-mgmt-dev/lib/python3.6/contextlib.py", line 92, in __exit__
    raise RuntimeError("generator didn't stop")

Since this occurs in either case, I figured it was a side effect of having a yield in a context manager - but I'm also a relative novice in Python, so that may not be what's going on. When run from the command line, everything works in both cases.

These changes also ignore the case where the kernel manager instance could be passed as an argument. We should remove that parameter if that's no longer applicable.

Member:

@takluyver -

Since this occurs in either case, I figured its a side-effect of having a yield in a contextmanager - but I'm also a relative novice in python - so that may not be what's going on. When run from the command line, also in both cases, everything works.

Well, obviously a yield in a context manager is correct (yes, I've learned a little), but I suspect something is getting side-effected by the async changes in conjunction with the context manager. If I break things down to not use a context manager, nbconvert --execute no longer encounters the "generator didn't stop" issue.
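For reference, contextlib raises that RuntimeError when the generator behind a @contextmanager doesn't stop after its single allowed yield; a minimal standalone reproduction (unrelated to nbconvert's actual code):

from contextlib import contextmanager

@contextmanager
def setup_preprocessor():
    yield 'resources'
    # A second yield means the generator doesn't stop when contextlib
    # resumes it on exit, so __exit__ raises the RuntimeError
    yield 'oops'

with setup_preprocessor():
    pass  # RuntimeError: generator didn't stop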

I've gone ahead and created a PR against your branch in case you want to take these changes. If not, just close the PR. Thanks

@MSeal (Contributor) commented Feb 4, 2020

As an FYI, the execute preprocessor code has been lifted to https://github.com/jupyter/nbclient; shortly I'll be releasing the first version there and changing the files this PR touches to reference that library instead. I know this disrupts this PR severely, since it would need to move to that repo, but that change for the 6.0 release finally got some traction, and this PR has been open for a very long time.

On a side note, any extra input on the code sitting in nbclient before we release would be helpful. For now it's a faithful clone of the functionality, with a few minor interface changes from what's in master right now.

@kevin-bates (Member) commented:

Thanks for the FYI @MSeal - that's good to know and makes sense. When/if nbclient moves to the new framework, it will need these changes.

@davidbrochart (Member) commented:

What is the status of this PR? Is somebody working on it?

@takluyver (Member Author) commented:

Not at the moment, but there is an enhancement proposal under discussion (jupyter/enhancement-proposals#45) which would include adopting jupyter_kernel_mgmt as an official Jupyter package. That's probably necessary before any serious Jupyter infrastructure relies on it.

@davidbrochart (Member) commented:

Thanks, looks like the JEP is moving forward.

@MSeal MSeal removed this from the 6.0 milestone Jul 2, 2020
@willingc willingc added the status:pending-jep Needs JEP acceptance or action label Sep 22, 2020