WIP: Expose some information about notebook execution state #79

golf-player · 2020-05-30T23:07:23Z

I'd find it useful to have the client expose some information about what's going on during the execution.

I'm particularly interested in knowing what's happening in the execution at the current time, and also which cell is being run.

This is very, very rough, and just to give you the gist of what I'm looking for. Let me know if y'all would be interested in such a feature and I'll make it less rough and add tests and stuff.

Please let me know what you think. Maybe rather than doing it like this, exposing hooks for users could be the way to go?

davidbrochart · 2020-05-31T09:12:24Z

Great, I think it would be interesting to have. We could also attach timing information to these events?

golf-player · 2020-05-31T20:01:16Z

What kind of timing information? Like when a particular state transition happened?

davidbrochart · 2020-05-31T21:02:28Z

Yes, we already have some timing information in the cell metadata, so this would cover the rest of the execution process.

golf-player · 2020-05-31T22:00:28Z

@davidbrochart I've made an update. You mean something like that?

davidbrochart · 2020-06-01T07:23:27Z

Yes, and I guess we don't need self.state anymore?

golf-player · 2020-06-01T08:11:23Z

good point; removed.

davidbrochart · 2020-06-01T08:17:34Z

nbclient/client.py

@@ -32,6 +33,15 @@ def timestamp():
    return datetime.datetime.utcnow().isoformat() + 'Z'


+class ExecutionState(enum.Enum):
+    NOTHING = 0
+    STARTUP = 1


Maybe IDLE instead of NOTHING, STARTING_UP instead of STARTUP?

davidbrochart · 2020-06-01T08:48:02Z

nbclient/client.py

@@ -323,7 +323,7 @@ def reset_execution_trackers(self):
        self.output_hook_stack = collections.defaultdict(list)
        # our front-end mimicing Output widgets
        self.comm_objects = {}
-        self.state_history = []
+        self.state_history = [ExecutionState.IDLE, timestamp()]


It must be self.state_history = [(ExecutionState.IDLE, timestamp())]. Maybe we should add a basic test?

hah I noticed this immediately after pushing. I will add some state checking to tests regardless
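
A minimal sketch of the difference raised in this review comment. `ExecutionState.IDLE` and `STARTING_UP` are the names proposed earlier in the thread; the `timestamp()` body here is a stand-in for the helper in the diff, and the variable names are illustrative only.

```python
import datetime
import enum


class ExecutionState(enum.Enum):
    IDLE = 0
    STARTING_UP = 1


def timestamp():
    # Stand-in for the helper in nbclient/client.py; returns an ISO 8601 string.
    return datetime.datetime.now(datetime.timezone.utc).isoformat()


# As pushed: the state and the timestamp are two separate list entries.
flat_history = [ExecutionState.IDLE, timestamp()]

# As suggested: one (state, timestamp) tuple per transition.
state_history = [(ExecutionState.IDLE, timestamp())]

# Later transitions are appended in the same shape.
state_history.append((ExecutionState.STARTING_UP, timestamp()))

# Only the tuple form can be unpacked uniformly.
for state, when in state_history:
    print(state.name, when)
```

With the flat form, the same loop would fail on the first entry, because an enum member cannot be unpacked into two names.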

davidbrochart · 2020-06-01T08:49:17Z

nbclient/client.py

@@ -512,7 +534,7 @@ async def async_execute(self, reset_kc=False, **kwargs):
            info_msg = await self.async_wait_for_reply(msg_id)
            self.nb.metadata['language_info'] = info_msg['content']['language_info']
            self.set_widgets_metadata()
-
+        self._update_state(ExecutionState.COMPLETE)


Should we add self._update_state(ExecutionState.IDLE) just after that?

I guess idle and complete have some overlap, but they're still separate useful states, right?

Yes, IDLE always follows COMPLETE, except for the first state.

makes sense.
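
The ordering agreed on here could look roughly like the sketch below. It is not the code from the PR: `StateTracker` and the `EXECUTING` member are hypothetical, and only `IDLE`, `COMPLETE`, `_update_state` and `state_history` come from the thread.

```python
import datetime
import enum


class ExecutionState(enum.Enum):
    IDLE = 0
    EXECUTING = 1  # hypothetical member, used only for this illustration
    COMPLETE = 2


class StateTracker:
    """Hypothetical stand-in for the state-tracking part of the client."""

    def __init__(self):
        self.state_history = []
        # IDLE is the first state, before anything has run.
        self._update_state(ExecutionState.IDLE)

    def _update_state(self, state):
        # Record each transition together with the time it happened.
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        self.state_history.append((state, now))

    def run(self):
        self._update_state(ExecutionState.EXECUTING)
        # ... cells would be executed here ...
        self._update_state(ExecutionState.COMPLETE)
        # IDLE immediately follows COMPLETE, as discussed above.
        self._update_state(ExecutionState.IDLE)


tracker = StateTracker()
tracker.run()
names = [state.name for state, _ in tracker.state_history]
print(names)
```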

MSeal · 2020-06-10T18:53:34Z

So I'm a little late to the conversation here, but one thing to consider is having the hook and responding to status more in papermill where there's more of a pattern of user registered control and plugin capability on top of nbclient. I don't see any issue with tracking current state here as done with the enum pattern though. Would need some tests here for a merge.

golf-player · 2020-06-11T03:19:01Z

Hooks IMO is the superior solution here. I am curious though, why the plugin-friendliness belongs to papermill rather than some mixin or something in nbclient. Is it intended for people to extend nbclient with inheritance? If so, are the public (by convention, things not starting with an underscore) members stable enough to depend on? Or should I wait till 1.0.0?

@MSeal I see that you also maintain or at least contribute to papermill, so that's why it's possible there. Generally, if I subclassed nbclient and extended things to put hooks in places I need them, would that be a good long-term solution? If so, I'd close this issue.

#81 also references this issue and asks for similar things.

MSeal · 2020-06-11T03:43:56Z

Is it intended for people to extend nbclient with inheritance

Generally yeah. Or by calling the methods from higher order logical constructs that have further aims. e.g. https://github.com/nteract/testbook (alpha) is using nbclient but to support all sorts of other execution pattern requirements by organizing its actions in wrapper classes.

The difference is that nbclient is the low level primitive library with the encapsulation of cell execution captured. Papermill is an opinionated library with plugin registry systems and flexibility for higher level abstractions. It's not a hard rule, but more of a guideline. Papermill predates nbclient some so more of the flexibility and opinions grew there. Rather than moving the execution logic solely to papermill we wanted to have a less opinionated, simple execution library that can be inherited / called functionally from various applications so nbclient was made.

All that being said, I'm not opposed to adding capabilities to nbclient. I was just pointing out that registering hooks fits more naturally with the abstractions in papermill than nbclient given the goal of the two libraries. So don't take my post as a rule to avoid improvements in nbclient please.

Generally, if I subclassed nbclient and extended things to put hooks in places I need them, would that be a good long-term solution?

That's reasonable as well since nbclient is meant to be reused in more complex execution patterns. I would maybe think of having a new notebook execution function that takes a post-cell function and leave the rest as vanilla nbclient. We're unlikely to rework the execution contracts in nbclient before 1.0 and will continue to support methods we expose as best as we can.

@davidbrochart uses nbclient in other libraries, so he might have a different flavor on how he views the libraries.

Hope that clarifies some

golf-player · 2020-06-11T03:48:51Z

It clarifies a lot, thanks. NBConvert predates papermill (I think), and I think that's why I mixed things up a bit.

I'll rework this to add hooks then. Thanks for the information.

MSeal · 2020-06-11T03:51:00Z

Yeah nbconvert (where nbclient came from) does predate papermill, though its execution library was not being maintained for a while there.

golf-player · 2020-06-13T04:44:52Z

Anyway, I've added 4 basic hooks, which IMO expose very useful information. What other hooks do you think should exist, do you think this is being done the right way? Would love some opinions.

golf-player · 2020-06-17T17:29:48Z

@MSeal @davidbrochart any thoughts/comments about this?

davidbrochart · 2020-06-17T18:08:03Z

nbclient/util.py

+        future = hook(*args)
+    else:
+        loop = asyncio.get_event_loop()
+        future = loop.run_in_executor(None, hook, *args)


Why not just call hook(*args) here?

I think it'd be preferable to use kwargs so it's more forward compatible?

I don't want to block execution, which calling hook(*args) will do, so I'm executing it on a threadpool executor

Yeah good call on the kwargs.

OK just remembered why I went with args..
run_in_executor only takes args, not kwargs....

I suppose it's worth it to use functools.partial for this though
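
A sketch of the `functools.partial` approach mentioned here. `loop.run_in_executor(executor, func, *args)` forwards positional arguments only, so keyword arguments have to be bound beforehand. The hook and the `run_hook_in_executor` name are hypothetical, and the sketch awaits the future only so that the result can be shown; the PR as discussed at this point did not await it.

```python
import asyncio
import functools


def hook(cell, cell_index):
    # Hypothetical synchronous hook supplied by a user.
    return f"cell {cell_index}: {cell['source']}"


async def run_hook_in_executor(hook, **kwargs):
    loop = asyncio.get_running_loop()
    # Bind the keyword arguments first, then hand the zero-argument
    # callable to the default thread pool executor.
    bound = functools.partial(hook, **kwargs)
    return await loop.run_in_executor(None, bound)


result = asyncio.run(
    run_hook_in_executor(hook, cell={"source": "print(1)"}, cell_index=0)
)
print(result)
```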

MSeal

Seems simple enough overall. I'll think on if we want to skip traitlets for this or leave them, but for now, with the minor comments addressed and some clearer docstrings on the hook options, I'd be fine with a merge.

MSeal · 2020-06-18T16:22:06Z

nbclient/client.py

@@ -223,6 +223,35 @@ class NotebookClient(LoggingConfigurable):

    kernel_manager_class = Type(config=True, help='The kernel manager class to use.')

+    on_kernel_create = Any(


This might be deceptive to users, because it's only run when nbclient makes a kernelmanager and not when a kernel is created. The kernel creation aspect is fairly abstracted away from nbclient so I would instead make a notebook_start hook after the kernel setup is completed if you're going for pre-cell execution hooks.

MSeal · 2020-06-18T16:29:19Z

nbclient/client.py

+        help="""A callable which executes when the kernel is created.""",
+    ).tag(config=True)
+
+    on_cell_start = Any(


These are somewhat awkward as Any traitlets :/ Until we decide to move off them I guess this is how it'd be

Should we use a Callable trait? It should also be typed as t.Callable.

Is there a Callable trait? I couldn't find one, and I copied the Any from the timeout func.

IMO, typing on traits is a bit redundant (except in this sort of case)

It could be typed like t.Optional[t.Callable] since it can be None

You're right, it looks like Callable has been added to traitlets in the past but then removed. I still think static typing traits is valuable, because it can catch bugs before runtime. You're right, it should be t.Optional[t.Callable].

MSeal · 2020-06-18T16:29:50Z

nbclient/util.py

+        future = hook(*args)
+    else:
+        loop = asyncio.get_event_loop()
+        future = loop.run_in_executor(None, hook, *args)


I think it'd be preferable to use kwargs so it's more forward compatible?

MSeal · 2020-06-18T16:31:07Z

nbclient/client.py

-        if self.force_raise_errors or not cell_allows_errors:
-            if (exec_reply is not None) and exec_reply['content']['status'] == 'error':
+        if (exec_reply is not None) and exec_reply['content']['status'] == 'error':
+            run_hook(self.on_cell_error, cell, cell_index)


Do you want this hook for any error (including ignored ones)? It might require that it be specified that suppressed errors would trigger the error handling hook. The caller below may not know if it's a suppressed error or not

Yeah I explicitly did it this way since people wouldn't use the hook if they were also suppressing errors. What kind of errors make it here that wouldn't be suppressed?
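
One way to address the concern that the hook cannot tell suppressed errors from fatal ones is to pass that information along. The sketch below is an assumption, not the PR's code: `check_reply`, the `suppressed` keyword and the local `CellExecutionError` class are hypothetical, while `force_raise_errors`, `cell_allows_errors` and `on_cell_error` mirror names from the diff.

```python
class CellExecutionError(Exception):
    """Local stand-in for the error raised when a cell fails."""


def check_reply(exec_reply, cell, cell_index, *, on_cell_error=None,
                force_raise_errors=False, cell_allows_errors=False):
    if exec_reply is None or exec_reply["content"]["status"] != "error":
        return
    # An error is suppressed when the cell allows errors and raising is not forced.
    suppressed = cell_allows_errors and not force_raise_errors
    if on_cell_error is not None:
        # The hook fires for every error; `suppressed` (hypothetical) tells it which kind.
        on_cell_error(cell=cell, cell_index=cell_index, suppressed=suppressed)
    if not suppressed:
        raise CellExecutionError(f"error in cell {cell_index}")


seen = []
error_reply = {"content": {"status": "error"}}
cell = {"source": "1 / 0"}

# Suppressed error: the hook runs and nothing is raised.
check_reply(error_reply, cell, 0, on_cell_error=lambda **kw: seen.append(kw),
            cell_allows_errors=True)

# Unsuppressed error: the hook runs and the error is raised afterwards.
raised = False
try:
    check_reply(error_reply, cell, 1, on_cell_error=lambda **kw: seen.append(kw))
except CellExecutionError:
    raised = True

print([(kw["cell_index"], kw["suppressed"]) for kw in seen], raised)
```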

golf-player · 2020-06-20T00:48:09Z

@MSeal thanks for the input and review. I've addressed most of the things. Not too sure what was to be made clearer in the docstring, but I've given it a stab.

I've no idea why this is breaking tests...

MSeal · 2020-06-22T16:43:05Z

@golf-player probably good for now. Let's fix those two whitespace issues causing the linter tests to fail and I think we can merge.

golf-player · 2020-06-22T23:51:15Z

@MSeal fixed the trailing whitespace issue (not sure what I was doing when tox told me it was a problem....)

golf-player · 2020-06-23T00:10:16Z

also fixed conflicts and added typing to the function in util.py

davidbrochart · 2020-06-24T05:23:53Z

@golf-player that's great, could you add a test?

golf-player · 2020-06-24T20:32:58Z

Yeah I'll do some tests sometime this week.

This will enable tracking of execution process without subclassing the way papermill does.

chrisjsewell · 2020-08-01T14:55:24Z

Hey guys, +1 for this PR 👍. In executablebooks/jupyter-book#833 (comment) we were discussing logic for skipping cell execution, e.g. if the cell contains a certain metadata tag.
It feels like with a small adjustment to these hooks that might be possible, something like:

response = run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)
if response is False:
    self.log.debug("Skipping cell execution due to hook response %s", cell_index)
    return cell

(although the current async nature of the hook I guess makes it trickier)

chrisjsewell · 2020-08-01T15:41:55Z

nbclient/client.py

+        default_value=None,
+        allow_none=True,
+        help=dedent("""
+        A callable which executes before a cell is executed.


This may be just my ignorance on async, but is this sentence technically true?
It looks like in run_hook you are enforcing these functions to be asynchronous, with no await, meaning that although they start execution before the cell is executed, they may not actually finish before the cell is executed?
Also, if this is the case, is it wise to be passing a non-copy of the thread-unsafe cell object to the hook?

I believe you're right. There's no guarantee it executes prior to the cell being executed, which means potentially people could expect mutating the cell or something in the hook would occur prior to the actual execution of the cell. Thanks for the catch there.

Do you have any suggestions on how to handle this?

I mostly wanted this feature so I could do something like a live indicator of what cell was running at a given time (which wouldn't be a problem), so this didn't occur to me. And for the same reason, I didn't want to block the execution of a cell with a hook. Maybe an option to make the hook block? Or perhaps making the hook a coro means it gets executed as a Task, and otherwise it blocks?

Heya, yeh I think the hook calls should always be await'ed. If you still wanted to add this non-blocking type behaviour I would say add it within your hook function, rather than it being intrinsically within nbclient (although personally I wouldn't advise it, because it feels like it could possibly result in a "mess" of task completion timings)
It also feels like maybe you should just specify that all hook functions should be Awaitable, rather than having this async wrapping behaviour that again is not made super clear to users from the traitlet
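
A rough sketch of the "always await the hook" idea from this reply, assuming hooks are called with keyword arguments. The `run_hook` body, `execute_cell` and the two example hooks are illustrative, not the PR's implementation; the point is only that the hook finishes before the cell runs, whether it is a plain function or a coroutine function.

```python
import asyncio
import inspect


async def run_hook(hook, **kwargs):
    # No hook registered: nothing to do.
    if hook is None:
        return None
    result = hook(**kwargs)
    # Coroutine functions return an awaitable; wait for it to finish.
    if inspect.isawaitable(result):
        result = await result
    return result


events = []


def sync_hook(cell, cell_index):
    events.append(("sync", cell_index))


async def async_hook(cell, cell_index):
    await asyncio.sleep(0)
    events.append(("async", cell_index))


async def execute_cell(cell, cell_index, on_cell_start=None):
    # The hook is awaited, so it completes before the cell is executed.
    await run_hook(on_cell_start, cell=cell, cell_index=cell_index)
    events.append(("executed", cell_index))


async def main():
    cell = {"source": "1 + 1"}
    await execute_cell(cell, 0, on_cell_start=sync_hook)
    await execute_cell(cell, 1, on_cell_start=async_hook)
    await execute_cell(cell, 2)


asyncio.run(main())
print(events)
```

A user who still wants non-blocking behaviour could start their own task or thread inside the hook, which keeps that choice out of nbclient itself.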

golf-player · 2020-08-03T02:40:42Z

Sorry, I've been generally non-productive the last month or so and let this thing slip. I'll get it up to date with master and hopefully finish things off this coming week.

Assuming I'm able to address the new comments

davidbrochart · 2020-09-15T06:40:20Z

Hi @golf-player, do you still plan to work on this PR? We were discussing in jupyter/nbconvert#1380 and we think that it would be helpful.

devintang3 · 2021-12-17T19:09:20Z

Hi @golf-player @davidbrochart - I'm interested in this enhancement as well. If I'm following the conversation correctly, it sounds like the only things missing are a rebase and tests? If so, I'd be willing to continue this.

Not too sure what the etiquette is on continuing another person's PR as well, so any guidance on that would be appreciated.

davidbrochart · 2021-12-20T08:19:18Z

Hi @devintang3, if you want to continue that work, great! Rebasing and adding tests would be a good start.
@golf-player could give you commit rights on his branch, or you can open a new PR based on his branch, so that you pick up the commits.

davidbrochart reviewed Jun 1, 2020

golf-player force-pushed the master branch from 1c923a2 to b9faf64 Compare June 13, 2020 03:33

davidbrochart reviewed Jun 17, 2020

MSeal suggested changes Jun 18, 2020

davidbrochart reviewed Jun 20, 2020

golf-player force-pushed the master branch from bc5d0f4 to 3ff3e87 Compare June 23, 2020 00:09

golf-player force-pushed the master branch from 3ff3e87 to 629c763 Compare June 24, 2020 02:17

davidbrochart reviewed Jun 24, 2020

Add basic hooks during execution

cb5e705

golf-player force-pushed the master branch from 629c763 to cb5e705 Compare June 24, 2020 20:36

chrisjsewell mentioned this pull request Aug 1, 2020

Tag to skip cell execution executablebooks/jupyter-book#833

Open

chrisjsewell reviewed Aug 1, 2020

davidbrochart mentioned this pull request Sep 14, 2020

Execute preprocess cells jupyter/nbconvert#1380

Merged

chrisjsewell mentioned this pull request Sep 16, 2020

Incompatibility with nbconvert 6 chrisjsewell/pytest-notebook#10

Closed

davidbrochart mentioned this pull request Sep 9, 2021

Create some extension points for notebook / cell execution #158

Open

devintang3 mentioned this pull request Dec 28, 2021

Client hooks #188

Merged
