UTF-8 support #1224

Open
kleinesfilmroellchen opened this issue Apr 5, 2023 · 4 comments
Labels
feature-request: Request for new features or functionality
help wanted: Issues identified as good community contribution opportunities
Milestone
Backlog

Comments

@kleinesfilmroellchen

For about half a year now, the LSP specification has supported UTF-8 position encoding (since version 3.17.0), and multiple servers, such as clangd and rust-analyzer, have supported it since then or earlier (in fact, clangd "invented" this capability with an LSP extension).

This feature would be highly useful in this client library. For my server (and probably many others), UTF-8 is much simpler to handle, already supported by the LSP interface library (lsp-types in my case), and of course supported by many IDEs. Only the glue code client shim doesn't cooperate yet.

Reopens #748 in a new issue, since the author of that issue and the maintainers of this repo have not done so.
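
For readers unfamiliar with the negotiation referred to above: in LSP 3.17 the client lists the encodings it accepts in `general.positionEncodings`, and the server answers with a single `positionEncoding` in its capabilities. A minimal sketch follows, assuming simplified types (the specification types `PositionEncodingKind` as a string with three predefined values) and a hypothetical `pickEncoding` helper that is not part of any library:

```typescript
// Simplified shapes modeled on the LSP 3.17 capability fields.
type PositionEncodingKind = "utf-8" | "utf-16" | "utf-32";

interface ClientCapabilities {
  general?: { positionEncodings?: PositionEncodingKind[] };
}

interface ServerCapabilities {
  positionEncoding?: PositionEncodingKind;
}

// Hypothetical server-side helper: prefer UTF-8 when the client offers it,
// otherwise fall back to UTF-16, which every client must support.
function pickEncoding(client: ClientCapabilities): PositionEncodingKind {
  // A client that omits the list is treated as offering only UTF-16.
  const offered = client.general?.positionEncodings ?? ["utf-16"];
  return offered.indexOf("utf-8") !== -1 ? "utf-8" : "utf-16";
}

const clientCaps: ClientCapabilities = {
  general: { positionEncodings: ["utf-8", "utf-16"] },
};
const serverCaps: ServerCapabilities = {
  positionEncoding: pickEncoding(clientCaps),
};
```

The request in this issue amounts to the client library advertising `"utf-8"` in that list and converting positions on its side.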

@dbaeumer added the "help wanted" label on Apr 6, 2023
@dbaeumer added this to the Backlog milestone on Apr 6, 2023
@dbaeumer
Member

dbaeumer commented Apr 6, 2023

The biggest concern I have with this is that it forces the client to open and read every file that is part of a response to do the conversion. Servers usually already processed the content and can IMO do the conversion more efficiently.

But if someone finds a good implementation I am happy to have a look.
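
To make the concern concrete: a UTF-8 column can only be mapped to a UTF-16 column by walking the text of that line, because the mapping depends on every character before the position. That is why the client would need the content of each file referenced in a response. A rough sketch of the per-line conversion, assuming a hypothetical helper name `utf8ColumnToUtf16` (not an existing API of this library) and that the line text is already available as a JavaScript string:

```typescript
// Hypothetical helper: map a UTF-8 byte column to a UTF-16 code unit column.
// A column that falls inside a multi-byte character is rounded forward to the
// end of that character; a column past the end is clamped to the line length.
function utf8ColumnToUtf16(lineText: string, utf8Column: number): number {
  let bytes = 0; // UTF-8 bytes consumed so far
  let i = 0; // current index in UTF-16 code units
  while (i < lineText.length && bytes < utf8Column) {
    const cp = lineText.codePointAt(i)!;
    // Number of bytes this code point occupies in UTF-8.
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    // Code points above U+FFFF take two UTF-16 code units (a surrogate pair).
    i += cp > 0xffff ? 2 : 1;
  }
  return i;
}

// "a", then U+1F600 (4 bytes in UTF-8, 2 units in UTF-16), then "b".
const sampleLine = "a\u{1F600}b";
const columnAfterEmoji = utf8ColumnToUtf16(sampleLine, 5); // 3
```

The loop itself is linear in the line length; the cost raised in the comment above is obtaining `lineText` for files the client has not opened.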

@tooltitude-support

> The biggest concern I have with this is that it forces the client to open and read every file that is part of a response to do the conversion. Servers usually already processed the content and can IMO do the conversion more efficiently.

Does that mean you are using UTF-16 internally in VS Code? Am I right?

@dbaeumer
Member

Yes, VS Code is written in JavaScript, which is UTF-16 based.
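
As an illustration of what "UTF-16 based" means for positions: a JavaScript string's `length` counts UTF-16 code units, so the same text has different lengths under each of the three LSP position encodings. A small sketch, where `measure` is a hypothetical helper written for this example:

```typescript
// Hypothetical helper: length of a string in each LSP position encoding unit.
function measure(text: string): { utf8: number; utf16: number; utf32: number } {
  let utf8 = 0; // bytes
  let utf32 = 0; // code points
  for (let i = 0; i < text.length; ) {
    const cp = text.codePointAt(i)!;
    utf8 += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    utf32 += 1;
    // Skip both halves of a surrogate pair.
    i += cp > 0xffff ? 2 : 1;
  }
  // String.length is already the UTF-16 code unit count.
  return { utf8, utf16: text.length, utf32 };
}

// "a" (U+0061), "ä" (U+00E4), and U+1F600.
const lengths = measure("a\u00e4\u{1F600}");
// lengths is { utf8: 7, utf16: 4, utf32: 3 }
```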

@kleinesfilmroellchen
Author

kleinesfilmroellchen commented Apr 16, 2023

> The biggest concern I have with this is that it forces the client to open and read every file that is part of a response to do the conversion.

Doing the conversion is necessary in general, isn't it? Most files are UTF-8, and Node converts that and other encodings to UTF-16. I don't see the point.

> Servers usually already processed the content and can IMO do the conversion more efficiently.

It still complicates simple servers written in UTF-8-based languages, such as Python or Rust. Even the JVM handles UTF-8 more transparently than JavaScript, even though it internally uses UTF-16.
