Reconnect to running session: keeping output #2833

Closed
oersted opened this issue Aug 15, 2017 · 66 comments · Fixed by jupyterlab/jupyter-collaboration#279

Comments

@oersted

oersted commented Aug 15, 2017

There have been many requests in the past to:

  • Reconnect to a long-running notebook and keep receiving new output.
  • Reconnect to a notebook running on the server side and recover all output when reconnecting.

I'd like to resurface this issue. To the best of my knowledge, these features have been planned at least since 2015 (earliest reference I found: link). It hasn't been implemented yet because it directly conflicts with the architecture of the stack and would require a big refactor.

I don't know if this is easier in the JupyterLab context or not. But now that this project is in very active development, it might be a good opportunity to revisit this and try to get it in now at the foundations.

I found two open issues on the topic in the old notebook repo:

@blink1073
Member

Hi @oersted, this work needs to be done on the server side, and JupyterLab uses the same server as the classic notebook, so the work to be done is the same.

@grahamanderson

Is there some kind of alternative?
I have long-running (data-science-y and scraping) Jupyter notebooks running remotely on a GCloud compute instance.
It kind of defeats the purpose if the script halts the moment I close my MacBook. Or am I mistaken?
On my end, when I lose the browser connection, I can't reconnect to the Jupyter notebook without halting the script and losing output.
Is there a plausible workaround?

@vidartf
Member

vidartf commented Sep 15, 2017

You could make a python client ("headless frontend") that auto-saves when new messages are received, and automatically shuts down when last pending execution is finished. If you spin this up in a separate process (from the notebook) before starting your execution, I think you should be able to do what you want. But this is not really a JupyterLab issue.
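A minimal sketch of the recording half of such a "headless frontend" follows. It assumes the IOPub messages arrive as plain dicts in the Jupyter messaging format (for example from a jupyter_client kernel client, which is not shown here); the names `HeadlessRecorder` and `output_from_message` are hypothetical, and the output is saved as simple JSON rather than a full notebook file.

```python
import json


def output_from_message(msg):
    """Convert one IOPub message dict into an nbformat-style output, or None."""
    msg_type = msg["header"]["msg_type"]
    content = msg["content"]
    if msg_type == "stream":
        return {"output_type": "stream", "name": content["name"], "text": content["text"]}
    if msg_type in ("display_data", "execute_result"):
        output = {
            "output_type": msg_type,
            "data": content["data"],
            "metadata": content.get("metadata", {}),
        }
        if msg_type == "execute_result":
            output["execution_count"] = content.get("execution_count")
        return output
    if msg_type == "error":
        return {
            "output_type": "error",
            "ename": content["ename"],
            "evalue": content["evalue"],
            "traceback": content["traceback"],
        }
    return None


class HeadlessRecorder:
    """Hypothetical recorder: saves outputs on every message, tracks pending requests."""

    def __init__(self, path, pending_ids):
        self.path = path
        self.pending = set(pending_ids)  # msg_ids of execute requests still running
        self.outputs = {msg_id: [] for msg_id in pending_ids}

    def handle(self, msg):
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id not in self.outputs:
            return  # message belongs to a request we are not tracking
        output = output_from_message(msg)
        if output is not None:
            self.outputs[parent_id].append(output)
            self.save()  # "auto-save" whenever new output arrives
        elif (msg["header"]["msg_type"] == "status"
              and msg["content"].get("execution_state") == "idle"):
            self.pending.discard(parent_id)  # this request has finished

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.outputs, f)

    @property
    def done(self):
        """True once every tracked execution has reported idle."""
        return not self.pending
```

A driver loop would feed each received message to `handle()` and exit once `done` is true, which corresponds to shutting down when the last pending execution has finished.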

@ZmeiGorynych

So am I correct in understanding that when you close the browser, the notebook keeps running, but there is simply no way to connect to it again to view the output? If so, the logical thing to do is to have it dump summaries to a file every so often, and use another, quick-to-restart notebook to display these.
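The "dump summaries to a file" idea could look roughly like the sketch below. The helper names `log_progress` and `read_progress` and the JSON-lines file format are assumptions chosen for illustration, not something from this thread.

```python
import json
import time


def log_progress(path, **fields):
    """Append one JSON record to the progress file; call this from the long-running notebook."""
    record = {"time": time.time(), **fields}
    # Opening in append mode and closing right away means each record is on disk
    # before the next step starts.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_progress(path, last=5):
    """Return the most recent records; call this from a separate monitoring notebook."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    return [json.loads(line) for line in lines[-last:]]
```

The long-running notebook would call something like `log_progress("progress.jsonl", step=i)` inside its loop, and the second notebook can call `read_progress("progress.jsonl")` at any time, whether or not the first notebook still has a browser attached.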

@grahamanderson

grahamanderson commented Sep 15, 2017

Thanks for responding :) Perhaps it's not a JupyterLab-specific issue.
That said, what is the process for...

  1. executing code on an AWS/GCloud hosted notebook
  2. putting your macbook to sleep
  3. logging back into the notebook as it's running code?

How do Jupyter experts work with long running (remotely hosted) scripts?

Am I supposed to log in to the notebook only after the scripts have finished? Sometimes, scripts take days to execute...

On my end, my code is inserting rows into a SQLite3 (which I can check) and sending me notifications by pushbullet. So, there are notifications. I'm also experimenting with KeepAlive Settings and Tmux on both ends (local and Gcloud)

If this question is not appropriate to JupyterLab (which is currently open in Chrome), what is an appropriate forum to ask this question?
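The SQLite approach mentioned above (the running code inserts rows that can be checked from elsewhere) might look like this minimal sketch. The table name `progress`, its columns, and the helper names are assumptions for illustration.

```python
import sqlite3


def open_progress_db(path):
    """Open (and create if needed) a small progress table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS progress ("
        "id INTEGER PRIMARY KEY, "
        "ts TEXT DEFAULT CURRENT_TIMESTAMP, "
        "message TEXT)"
    )
    conn.commit()
    return conn


def record(conn, message):
    """Insert one progress row; committing makes it visible to other connections."""
    conn.execute("INSERT INTO progress (message) VALUES (?)", (message,))
    conn.commit()


def latest(conn, n=5):
    """Return the n most recent rows, oldest first."""
    rows = conn.execute(
        "SELECT id, ts, message FROM progress ORDER BY id DESC LIMIT ?", (n,)
    ).fetchall()
    return rows[::-1]
```

The long-running notebook calls `record()` as it goes; any other process, including a fresh notebook opened after reconnecting, can open the same file and call `latest()`.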

@blink1073
Member

Hi @grahamanderson, I think the mailing list would be the best forum.

@Sarnath

Sarnath commented Oct 24, 2018

So, will this issue be addressed? It's so irritating when you have no idea what is happening in your long-running notebook. :(

@blink1073
Member

@Sarnath, this will be implemented when we implement real time collaboration, which is in progress.

@SaschaHeyer

@blink1073 any updates?

@blink1073
Member

The RTC work is ongoing and being tracked in #5382. Adding the server side model is an extra step after that groundwork is laid.

@cossio

cossio commented Jan 17, 2020

The RTC work is ongoing and being tracked in #5382. Adding the server side model is an extra step after that groundwork is laid.

Awesome! Really looking forward to this issue being fixed.

@aseedb

aseedb commented Mar 2, 2020

I am desperately waiting for this issue to be fixed. Please please fix it :-)

@hoangmt

hoangmt commented Mar 16, 2020

I have been reading all the related issues but there is no solution proposed. I really need this function.

@jasongrout
Contributor

We too would love it if people would help contribute toward solving this!

@konradsitarz

I'll need this functionality and I'm willing to help. Can anyone give me some insight into where to start? ✌️

@jasongrout
Contributor

I think the first step is to see if the current output buffering in the notebook server is working in JupyterLab. If there is one client connected to a kernel, and that client disconnects, then the server should be buffering output for when that one client reconnects. To test this, have a kernel that is generating output, then disconnect and reconnect to that kernel and see if output is buffered and sent to the client. There should be some messages in the server log about the buffering as well.
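The buffering behaviour described above can be pictured with the toy model below. It is only an illustration of buffer-and-replay for a single client; `KernelChannel` is a hypothetical name and this is not the notebook server's actual code.

```python
class KernelChannel:
    """Toy model: buffer kernel output while no client is attached, replay on reconnect."""

    def __init__(self):
        self.client = None  # callable that receives messages, or None when disconnected
        self.buffer = []

    def connect(self, client):
        self.client = client
        # Replay everything that arrived while nobody was listening, in order.
        for msg in self.buffer:
            client(msg)
        self.buffer.clear()

    def disconnect(self):
        self.client = None

    def emit(self, msg):
        """Called whenever the kernel produces output."""
        if self.client is None:
            self.buffer.append(msg)
        else:
            self.client(msg)
```

For the real test, a cell that prints a counter once per second for a minute or so gives enough time to close and reopen the browser tab and check whether the missing numbers show up.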

@eldad-a

eldad-a commented May 5, 2020

Thanks to all the Devs for their continuous contribution!
This scientific ecosystem is so helpful.

There is a partial work-around, which I re-post here for those who may find it helpful; it does not resolve the matter, yet provides some of the desired functionality.
Complemented with a log-file which is continuously updated, one can monitor the progress of the execution.

There is a "half" of a work-around based on the ipycache extension.
Well, it may not be even a "half" but it is the best I found so far:
A cell can be "cached" such that running it again would load the output instead of executing the code again.

For example:
After executing a cell containing

%load_ext ipycache
%mkdir -vp ipycache
import time

A cell can be "decorated" using the %%cache magic:

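%%cache ipycache/test_long
# Print a counter once per second (Python 3 syntax; `time` was imported in the previous cell).
i = 1
while i < 30:
    print('{}, '.format(i), end='')
    time.sleep(1)
    i += 1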

The browser can then be closed.
Running the last cell again will not execute it again but will load the cached output (and display it).
However, one has to wait for the execution to complete.
So this will not help in monitoring the progress of running code, and I am not sure how it deals with errors.

Originally posted in ipython/ipython#4140 (comment)

@hamishcraze

Hey guys, possibly a dumb suggestion, but why not just send all the output of a running cell to the first cell? If the issue is that cells lose track of which output maps to them, can't you just have a workaround like:

"Insert generic message about why this is happening"
"Hey we see you reconnected to a running terminal, rerouting all output to first cell"

I mean I'm getting the messages of output under websocket traffic so the session is still receiving output.
That seems like an easy work around. Or am I missing something nuanced about the problem?
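To make the suggestion concrete: in the Jupyter messaging protocol, output messages carry a `parent_header` whose `msg_id` identifies the execute request that produced them, and a freshly loaded frontend may no longer know which cell sent that request. A rough sketch of "route known output to its cell, everything else to the first cell" follows; the function name and the `cell_for_request` mapping are hypothetical.

```python
def route_outputs(messages, cell_for_request, fallback_cell=0):
    """Group message contents by cell index.

    cell_for_request maps execute-request msg_ids to cell indices. Messages whose
    parent request is unknown (e.g. after a page reload) go to fallback_cell.
    """
    routed = {}
    for msg in messages:
        parent_id = msg.get("parent_header", {}).get("msg_id")
        cell = cell_for_request.get(parent_id, fallback_cell)
        routed.setdefault(cell, []).append(msg["content"])
    return routed
```

This only covers display routing; it does not address saving those outputs into the notebook document, which is the part the server-side work discussed in this thread is about.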

@saulshanabrook
Member

Just a note that the current work on a real time data model is meant to address this issue as well: https://github.com/jupyterlab/rtc

@davidbrochart
Contributor

I made some progress towards restoring notebook state, using jupyverse/jpterm:

Peek.2023-11-09.11-29.mp4

@sa-

sa- commented Nov 9, 2023

@davidbrochart That's awesome! Is this done by storing a Ydoc in the kernel?

I'm imagining a setup with kernel gateway or enterprise gateway. It would be very nice to have jupyter running locally, and then ideally nothing would break if there is a power or internet outage during my 2 hour long cell execution

@davidbrochart
Contributor

There is a YDoc representing the notebook but not in the kernel. This YDoc lives in the (jupyverse) server and in the (jpterm) client.
I'm currently working on doing the same for widgets, and in this case yes, there will be a YDoc representing the widget in the kernel, in the server and in the client. The widget will synchronize between the kernel and the server using the Comm protocol, and between the server and the client using a WebSocket. It will allow widget state restore as well.
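As a very rough picture of why a server-side document helps with restoring state, here is a toy model in which the server owns the notebook state and clients are only views of it. It uses a plain list of dicts, not a real YDoc or any CRDT, and the class names are hypothetical; a real YDoc exchanges incremental updates and also handles concurrent edits, which this sketch ignores.

```python
import copy


class Client:
    """A view of the notebook; holds its own copy of the cells."""

    def __init__(self):
        self.cells = []


class ServerNotebook:
    """Toy server-side notebook model (not a YDoc)."""

    def __init__(self, n_cells):
        self.cells = [{"outputs": []} for _ in range(n_cells)]
        self.clients = []

    def attach(self, client):
        # A new or returning client starts from a snapshot of the current state.
        client.cells = copy.deepcopy(self.cells)
        self.clients.append(client)

    def detach(self, client):
        self.clients.remove(client)

    def append_output(self, cell_index, text):
        # The server copy is updated whether or not any client is connected.
        self.cells[cell_index]["outputs"].append(text)
        for client in self.clients:
            client.cells[cell_index]["outputs"].append(text)
```

Because kernel output lands in the server's copy first, a client that attaches later sees everything produced while it was away.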

@Wh1isper
Contributor

Wh1isper commented Nov 9, 2023

This seems to take my earlier comment further: by using Y-CRDT (YDoc) to establish consistency, we can develop a wide variety of applications. 👍

@sa-

sa- commented Nov 9, 2023

@davidbrochart that is great to hear. I would like to contribute but I'm new to the jupyter codebase - is there anything I can help with?

@jliebers

jliebers commented Nov 9, 2023

@davidbrochart Your video looks amazing. I just want to applaud 👏 any effort towards resolving this issue, since it is one of the biggest disappointments that JupyterLab currently has (at least to me :) ). Any solution towards resolving this issue is highly welcome - thank you for your work in this direction!

@davidbrochart
Contributor

And here is a demo showing full notebook state recovery, including widgets:

Peek.2023-11-10.11-29.mp4

@sa-

sa- commented Nov 10, 2023

That's amazing!

@TeaCult

TeaCult commented Nov 24, 2023

Wouldn't it be easier to keep a running session as an HTML document on the server side, and connect to it just to change this HTML? That way the session itself would never disconnect; only the renderer and editor of this HTML could disconnect. Basically, I am suggesting running a Jupyter HTML document on the server, connecting to it like Google Sheets (multiple users or a single one), editing this HTML, triggering runs, and rendering output from it. Wouldn't that preserve state easily?

Another thing comes to mind: whenever a notebook disconnects, the server could keep dumping output to a temporary HTML file that is an exact copy of the disconnected one, and show it in a list of sessions. When you click on a session, you download this HTML, and the server keeps dumping output to it. Could it also hold cell and widget state?

Or does it already work like this ?

@davidbrochart
Contributor

It doesn't work like this, there is no such thing as an HTML document on the server side.

@jianghuife

And here is a demo showing full notebook state recovery, including widgets:

Peek.2023-11-10.11-29.mp4

How do I achieve this effect?

@davidbrochart
Contributor

If you are asking about how to do it in jpterm, please open an issue there.
Otherwise, there is a PR in JupyterLab.

@3f6a

3f6a commented Apr 12, 2024

Is this really closed? If so, yay! 🥳 Very exciting

@krassowski
Member

Well, it is not yet implemented in jupyter-server, and not released but there definitely is some progress here ;)

@astrojuanlu

xref jupyter-server/jupyter_server#1274

@3f6a

3f6a commented Apr 16, 2024

Well, it is not yet implemented in jupyter-server

Does this mean that this feature will become available in jupyterlab-server before jupyter-server? Or something else?

@davidbrochart
Contributor

No, it means that this feature will be available if you run JupyterLab with jupyverse instead of jupyter-server, until it is implemented in jupyter-server.
