Encoding in StatusBar #405

Open
inkydragon opened this issue Jun 22, 2023 · 2 comments

@inkydragon
Contributor

inkydragon commented Jun 22, 2023

The encoding in the status bar currently shows the editor's code page instead of the file's encoding.

To reproduce the issue, open a file that is not UTF-8 encoded.
There are many test files in uchardet: NotepadNext\src\uchardet\test

void EditorInfoStatusBar::updateEncoding(ScintillaNext *editor)
{
switch(editor->codePage()) {

qDebug("Using codec: '%s'", codec ? codec->name().constData() : "");


I am not sure how to fix this issue:

  1. Update editor->codePage() after detecting the file encoding
  2. Or add a new property (in ScintillaNext) to save the file encoding
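
To make option 2 concrete, here is a minimal standard-library sketch of keeping the file encoding separate from Scintilla's code page. Every name in it (`FileEncoding`, `DocumentEncodingState`, `statusBarLabel`) is hypothetical and does not exist in NotepadNext; the only real value is 65001, which is Scintilla's `SC_CP_UTF8`.

```cpp
#include <string>

// All names here are hypothetical; none of them exist in NotepadNext.
enum class FileEncoding { Utf8, Utf8Bom, Utf16LE, Utf16BE, Latin1 };

// Option 2: track the on-disk encoding separately from the code page
// that Scintilla uses for its in-memory buffer.
struct DocumentEncodingState {
    int scintillaCodePage = 65001;  // value of Scintilla's SC_CP_UTF8
    FileEncoding fileEncoding = FileEncoding::Utf8;
};

// The status bar would read the stored file encoding, not the code page.
std::string statusBarLabel(const DocumentEncodingState &state) {
    switch (state.fileEncoding) {
    case FileEncoding::Utf8:    return "UTF-8";
    case FileEncoding::Utf8Bom: return "UTF-8 BOM";
    case FileEncoding::Utf16LE: return "UTF-16 LE";
    case FileEncoding::Utf16BE: return "UTF-16 BE";
    case FileEncoding::Latin1:  return "ISO-8859-1";
    }
    return "Unknown";
}
```

With this split, the code page can stay fixed while the label follows whatever was detected when the file was opened.
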
@dail8859
Owner

The short answer is...I don't know. 😄

I barely know how to properly handle encoding, but when it comes to mixing Qt, uchardet (which isn't being fully utilized), and Scintilla (which has its own codec handling), it gets confusing very quickly.

> I am not sure how to fix this issue:

Maybe both? Scintilla needs the proper code page, but the file encoding might also need to be stored somewhere so the file can be saved back to disk properly. Unfortunately, I don't have any good answers.

I tried to dig into Notepad++ a while back to see how it handles this, but the code is a bit messy and hard to match up with Qt in places.

If it is something you are wanting to look into I think whatever you determine is fine. Even if it isn't the perfect solution, I think any kind of improvements are desirable.

@dail8859
Owner

I spent some time looking at this. I think the first step (not a perfect solution) is to focus on UTFxx and ignore uchardet for now. So something like this:

  1. Detect the file encoding (Qt can do this by checking for BOMs), save this information.
  2. Set Scintilla's code page to UTF-8.
  3. Convert the file contents to UTF-8.
  4. Load it in the editor
  5. User edits file as normal...
  6. When saving the file, convert the UTF8 bytes from Scintilla to the previously stored encoding information.
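
The steps above can be sketched without Qt, as a rough illustration only. This uses just the C++ standard library instead of Qt's codecs, covers UTF-8 and UTF-16 BOMs (not UTF-32), treats a missing BOM as UTF-8, and all names (`Encoding`, `Detected`, `detectBom`, `toUtf8`, `fromUtf8`) are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical names; a standard-library-only sketch, not NotepadNext code.
enum class Encoding { Utf8, Utf8Bom, Utf16LE, Utf16BE };

struct Detected {
    Encoding encoding;
    std::size_t bomLength;  // bytes to skip before the text starts
};

// Step 1: detect the encoding from a BOM; no BOM is assumed to mean UTF-8.
Detected detectBom(const std::string &bytes) {
    auto b = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };
    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return {Encoding::Utf8Bom, 3};
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return {Encoding::Utf16LE, 2};
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return {Encoding::Utf16BE, 2};
    return {Encoding::Utf8, 0};
}

// Append one code point to a UTF-8 string.
void appendUtf8(std::string &out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Step 3: convert the file bytes to UTF-8 for the editor buffer.
// Unpaired surrogates are passed through unchecked in this sketch.
std::string toUtf8(const std::string &bytes, const Detected &d) {
    if (d.encoding == Encoding::Utf8 || d.encoding == Encoding::Utf8Bom)
        return bytes.substr(d.bomLength);
    const bool be = d.encoding == Encoding::Utf16BE;
    auto unit = [&](std::size_t i) -> std::uint32_t {
        const std::uint32_t a = static_cast<unsigned char>(bytes[i]);
        const std::uint32_t c = static_cast<unsigned char>(bytes[i + 1]);
        return be ? (a << 8) | c : (c << 8) | a;
    };
    std::string out;
    for (std::size_t i = d.bomLength; i + 1 < bytes.size(); i += 2) {
        std::uint32_t cp = unit(i);
        // Combine a surrogate pair into one code point when both halves exist.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 3 < bytes.size()) {
            const std::uint32_t low = unit(i + 2);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        appendUtf8(out, cp);
    }
    return out;
}

// Step 6: convert the editor's UTF-8 back to the stored encoding, BOM included.
// Assumes the input is valid UTF-8.
std::string fromUtf8(const std::string &utf8, Encoding enc) {
    if (enc == Encoding::Utf8) return utf8;
    if (enc == Encoding::Utf8Bom) return "\xEF\xBB\xBF" + utf8;
    const bool be = enc == Encoding::Utf16BE;
    std::string out = be ? "\xFE\xFF" : "\xFF\xFE";
    auto put = [&](std::uint32_t u) {
        const char hi = static_cast<char>(u >> 8), lo = static_cast<char>(u & 0xFF);
        out += be ? hi : lo;
        out += be ? lo : hi;
    };
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        const int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        std::uint32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
        for (int k = 1; k <= extra && i + k < utf8.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        i += extra + 1;
        if (cp >= 0x10000) {  // outside the BMP: needs a surrogate pair
            cp -= 0x10000;
            put(0xD800 + (cp >> 10));
            put(0xDC00 + (cp & 0x3FF));
        } else {
            put(cp);
        }
    }
    return out;
}
```

The `Detected` value from step 1 is the "saved information": it is the only thing that has to survive between load and save, since the editor buffer itself is always UTF-8.
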

As you suggested, it is better to show the file's encoding in the status bar.

I have most of this working already, but the encoding APIs changed enough between Qt5 and Qt6 to make this annoying. Qt5 is actually easier to work with.

Things that still need to be considered:

  • Single byte encodings e.g. ASCII, Latin-1
  • Generating BOMs if needed
  • uchardet
  • Other non-UTF code pages
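
For the single-byte case, Latin-1 is probably the easiest to reason about because each byte maps directly to the code point with the same value. A rough sketch, assuming hypothetical helper names (`latin1ToUtf8`, `utf8ToLatin1`) and plain standard C++ rather than Qt:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers; not part of NotepadNext, Qt, or Scintilla.

// Latin-1 bytes 0x00..0xFF map directly to code points U+0000..U+00FF.
std::string latin1ToUtf8(const std::string &latin1) {
    std::string out;
    for (unsigned char ch : latin1) {
        if (ch < 0x80) {
            out += static_cast<char>(ch);
        } else {
            out += static_cast<char>(0xC0 | (ch >> 6));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        }
    }
    return out;
}

// Returns false when the text holds a character Latin-1 cannot represent,
// which can happen if the user typed one while editing.
bool utf8ToLatin1(const std::string &utf8, std::string &out) {
    out.clear();
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {
            out += static_cast<char>(lead);
            continue;
        }
        // Only the two-byte leads 0xC2 and 0xC3 encode U+0080..U+00FF.
        if ((lead != 0xC2 && lead != 0xC3) || i + 1 >= utf8.size())
            return false;
        const unsigned char cont = static_cast<unsigned char>(utf8[++i]);
        out += static_cast<char>(((lead & 0x03) << 6) | (cont & 0x3F));
    }
    return true;
}
```

The failure path is the part that needs a decision: saving back to a single-byte encoding can lose characters, so the save code would have to warn, substitute, or switch the file to UTF-8.
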
