Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NotebookCell is missing outputs field #1419

Open
karthiknadig opened this issue Feb 6, 2024 · 15 comments
Open

NotebookCell is missing outputs field #1419

karthiknadig opened this issue Feb 6, 2024 · 15 comments
Labels
feature-request Request for new features or functionality
Milestone

Comments

@karthiknadig
Copy link
Member

karthiknadig commented Feb 6, 2024

See vscode.d.ts : https://github.com/microsoft/vscode/blob/ad05897293964be52ad97505124830937d096953/src/vscode-dts/vscode.d.ts#L14441

{
"name": "NotebookCell",
"properties": [
{
"name": "kind",
"type": {
"kind": "reference",
"name": "NotebookCellKind"
},
"documentation": "The cell's kind"
},
{
"name": "document",
"type": {
"kind": "base",
"name": "DocumentUri"
},
"documentation": "The URI of the cell's text document\ncontent."
},
{
"name": "metadata",
"type": {
"kind": "reference",
"name": "LSPObject"
},
"optional": true,
"documentation": "Additional metadata stored with the cell.\n\nNote: should always be an object literal (e.g. LSPObject)"
},
{
"name": "executionSummary",
"type": {
"kind": "reference",
"name": "ExecutionSummary"
},
"optional": true,
"documentation": "Additional execution summary information\nif supported by the client."
}
],
"documentation": "A notebook cell.\n\nA cell's document URI must be unique across ALL notebook\ncells and can therefore be used to uniquely identify a\nnotebook cell or the cell's text document.\n\n@since 3.17.0",
"since": "3.17.0"
},

@dbaeumer
Copy link
Member

dbaeumer commented Feb 7, 2024

Actually LSP has no support for NotebookCellOutput since I didn't see a use case since LSP servers have no notebook execution and you usually don't have language smarts in outputs.

Could you explain the use case why you think that should flow to the server?

@dbaeumer dbaeumer added the feature-request Request for new features or functionality label Feb 7, 2024
@dbaeumer dbaeumer added this to the Backlog milestone Feb 7, 2024
@karthiknadig
Copy link
Member Author

This came up in a discussion with @rebornix when we were going over each field in NotebookCell vs NotebookCellData. Basically, when creating, moving, or reordering cells as a part of notebook formatting would we preserve the output of the original cells. How would we handle preserving output when formatting?

@dbaeumer
Copy link
Member

dbaeumer commented Feb 8, 2024

Actually formatting a notebook (e.g. ordering the cells) might not be something a LSP server should do. I think a LSP server should be able to format a code cell of a specific language. If we have a notebook with TypeScript, C# and Python cells which language server should be responsible to format the overall notebook?

@karthiknadig
Copy link
Member Author

karthiknadig commented Feb 8, 2024

Formatting tools like ruff, black support formatting notebooks. Currently these tools are formatting and saving directly to disk. This can break history, and some functionality since VS Code now has to re-read the file from disk. Why shouldn't a python tool refactor (combine, split, move or remove) just the python cells?

Think import organizer pulling all the import statements from various cells into a top cell that just does all imports? Code refactoring that splits or combines cells of some language. Currently all of this is being done by writing the notebook to disk after formatting.

@rebornix and I are thinking about the selecting a overall formatter. Since the current DocumentSelector paradigm is not sufficient for this.

@dbaeumer
Copy link
Member

dbaeumer commented Feb 9, 2024

I think we need to keep two things apart:

  • formatting or rewriting the content of a cell(s) in a language specific manner. This needs to be done by the corresponding language server.
  • taking the output of the above formatting and basically recreating a notebook from it based on the former notebook . IMO this is nothing a language server should do since language servers very likely will never have the full picture of the notebook. (e.g. a Python language server will very likely never see the C# cells)

@rebornix
Copy link
Member

rebornix commented Feb 9, 2024

Put outputs and ployglot aside (for now), I personally find that there is a need for notebook level language support. It's between project/workspace and single file, notebook formatting and code actions are two good examples where the language server needs to know about the context of notebooks and might need to adjust the notebook document accordingly.

I'm imagining scenarios: users work on Python or Julia or C# notebook (single language) document, and they want the document to be formatted as a whole (e.g., imports are all moved to the first cell). Current LSP messages/types can't express this though.

@karthiknadig
Copy link
Member Author

I want to clarify that when I mentioned ruff and black, this applies to notebook document as a whole. That is, you can run ruff mynotebook.ipynb. Currently, ruff-lsp constructs a in-memory notebook JSON using data that it gets from LSP notebook sync, writes the formatted content to the disk.
reference:
A custom object with output tracking: https://github.com/astral-sh/ruff-lsp/blob/4afe87915f89574340f4a7c36d398478cc5574a5/ruff_lsp/server.py#L361-L367
JSON object from LSP Notebook sync: https://github.com/astral-sh/ruff-lsp/blob/4afe87915f89574340f4a7c36d398478cc5574a5/ruff_lsp/server.py#L387-L416
Since LSP does not allow inserting/removing cells, any structure changes as a part of formatting results is discarding whole formatting: https://github.com/astral-sh/ruff-lsp/blob/4afe87915f89574340f4a7c36d398478cc5574a5/ruff_lsp/server.py#L1340-L1349

If the formatting tool decides to add/remove cells then I think we need to allow this to occur over LSP.

@dhruvmanila
Copy link

dhruvmanila commented Feb 12, 2024

Hi, chiming in as the author of the PR implementing Notebook support in ruff-lsp. For context, I work at Astral.

From what I understood from this conversation, and correct me if I'm wrong, is that "formatting" in this context is referred not just for code formatting (like textDocument/formatting) but also something like code actions which, as @karthiknadig mentioned, could potentially add or remove code cells (for instance, aggregating all imports in a code cell).

Now, I just want to clarify how ruff-lsp performs Notebook formatting. As mentioned by @karthiknadig, it does create an in-memory Notebook JSON which is according to the official format spec and is what is expected by Ruff. This is then sent to Ruff via stdin which formats the Python code cells and returns the formatted Notebook JSON to stdout. The server converts the output JSON to a list of TextDocumentEdit which is then sent as a workspace edit. This means that the server doesn't touch the notebook file on disk and we only update the cell content via LSP protocol (using edits).

The reason for this process is that there's no official support for Notebook document formatting (notebookDocument/formatting). Now, each cell is represented as a TextDocument which is what VS Code sends a formatting request for. So, notebook formatting is only supported at a cell level and when someone wants to actually format an entire notebook, VS Code will send multiple requests for each code cell.

And, if someone invokes the ruff command on the Notebook file directly (ruff notebook.ipynb), in that case we do write the formatted file on disk while preserving the order and other metadata. But, this is unrelated to how our server works.

Does this help? Feel free to ping me with any other questions or input as required :)

@karthiknadig
Copy link
Member Author

@dhruvmanila the issue is that if you use WorkspaceEdit API from VS code extension you can add or remove cells, but from LSP there is no option to do that for cases like ipynb.

One of the asked for features for formatting in the black/autopep8/isort extension repos is to properly support formatting for ipynb files, which means allowing not only document edits but allow notebook structural changes.

@dbaeumer
Copy link
Member

What is describe doesn't sound like formatting to me (not in the sense as we use formatting in VS Code). It sounds more like code actions.

And I am not sure if we should add this to LSP since it would require to mirror the whole Notebook structure inclusive NotebookEdits into LSP. What I could think about is a Notebook protocol which is based on the BaseProtocol (see: https://microsoft.github.io/language-server-protocol/specifications/base/0.9/specification/) which would allow a server to acts as both a language server and a notebook server.

@rebornix
Copy link
Member

Since it's still evolving pretty fast (we had notebook code actions in VS Code a few months ago and now we are exploring notebook level formatting), I wonder we can firstly try custom messages but craft a spec for it. With @karthiknadig 's auto code generation tool, we can auto build libraries for notebook-specific protocol, which work like plugins to the existing standard LSP libraries.

Then language servers like ruff-lsp can import the types as

from notebooklsprotocol.types import (
    NOTEBOOK_DOCUMENT_FORMATTING
}

if we see common interest on it, we can then make it a real notebook protocol.

@dbaeumer
Copy link
Member

Some tips: both the client and the server supports feature you can plug into them (@rebornix is this what you refer to as plugins?). This allows to develop new features outside of this repository but already have some nice way to integrate them since they participate on the client and server side in things like capabilities initialization. If you need support with feel free to ping.

@rebornix
Copy link
Member

Some tips: both the client and the server supports feature you can plug into them (@rebornix is this what you refer to as plugins?).

Yes this is accurate. I like this approach!

@dhruvmanila
Copy link

(Sorry for the late reply.)

One of the asked for features for formatting in the black/autopep8/isort extension repos is to properly support formatting for ipynb files, which means allowing not only document edits but allow notebook structural changes.

As @dbaeumer pointed out, I'm not sure if making structural changes sounds like formatting. Although, if a cell contains only imports that are unused, then we might as well delete the entire cell.

Since it's still evolving pretty fast (we had notebook code actions in VS Code a few months ago and now we are exploring notebook level formatting), I wonder we can firstly try custom messages but craft a spec for it.

If I understand this correctly, this will kind off be like an experimental feature which the server can opt-in. I like this approach as well.

@karthiknadig
Copy link
Member Author

Sorry for overloading "formatting", my concern was that currently there is no way to add or remove cells. The core WorkspaceEdit.set API provides a way to insert or remove cells. Currently, with LSP we can create, delete, rename, edit documents, and apply snippets. One of the things we can't do from LSP is insert or delete cells, which you can do from the core API.

I also understand there is larger issue of document selection, to associate with a particular language server, and how the notebook itself is handled, as in the notebook server case. Currently, I think we can improve the experience with notebooks. If we can define it as an extension over LSP using similar schema, we could generate the types for easier consumption by existing language servers for narrow cases like single language notebooks. The lsprotocol package already supports registration of custom features to LSP. We can investigate and see if such an extension is sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request Request for new features or functionality
Projects
None yet
Development

No branches or pull requests

4 participants