Problematic edit of diacritics that are in form of decomposed characters #506

Open
pskowronek opened this issue May 20, 2023 · 2 comments
Assignees
Labels
bug pr-welcome Pull requests are welcome as this issue might not get looked at for a while

Comments

@pskowronek

pskowronek commented May 20, 2023

Description
Editing diacritics that are stored as decomposed characters is problematic (see here). A diacritic can be represented either as a single precomposed Unicode character or in decomposed form: a base letter followed by a combining accent. The precomposed form works fine, but the decomposed form breaks during editing: if you place the cursor right after such a character, or slightly toward its middle, and press Tab or Space, the character is split in two, a letter and its accent.
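
To make the two representations concrete, here is a minimal sketch using only the JDK (`java.text.Normalizer` and `Character.getType`). The class and method names are hypothetical and are not part of RSyntaxTextArea; the letter `ó` is just an example.

```java
import java.text.Normalizer;

class DiacriticForms {
    /** True if the text contains at least one combining mark (Unicode category Mn, Mc or Me). */
    static boolean hasCombiningMarks(String text) {
        return text.codePoints().anyMatch(cp -> {
            int type = Character.getType(cp);
            return type == Character.NON_SPACING_MARK
                    || type == Character.COMBINING_SPACING_MARK
                    || type == Character.ENCLOSING_MARK;
        });
    }

    /** Converts decomposed sequences to their precomposed form where one exists (NFC). */
    static String toComposed(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "\u00F3";    // "ó" as a single code point
        String decomposed = "o\u0301"; // "o" followed by COMBINING ACUTE ACCENT

        System.out.println(composed.length());                       // 1
        System.out.println(decomposed.length());                     // 2
        System.out.println(composed.equals(decomposed));             // false
        System.out.println(toComposed(decomposed).equals(composed)); // true
        System.out.println(hasCombiningMarks(decomposed));           // true
    }
}
```

Both strings render the same, but the decomposed one occupies two `char` offsets, so a caret can land between the letter and its accent.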

These decomposed characters are used by the newest macOS versions for filenames on the file system (and by older macOS versions on certain volumes, such as those mounted from sparse bundles). I guess they can also appear when people with standard keyboard layouts type diacritics; there is probably a way to type a letter and then add an accent to it.

Steps to Reproduce
Specific steps to reproduce the behavior:

  1. Replace the sample file RSyntaxTextAreaDemo/src/main/resources/org/fife/ui/rsyntaxtextarea/demo/JavaExample.txt with one that contains decomposed characters, like this one (unzip it first):
    javaexampletxt.zip
  2. Run the demo app: ./gradlew run
  3. The diacritics are displayed properly
  4. Place the cursor after a diacritic, or in the middle of one, and press Tab or Space
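
If the attached file is not available, a test file with decomposed characters could be produced from any UTF-8 text containing diacritics. This is only a sketch under that assumption: the class name and the command-line arguments are hypothetical, and the result is not guaranteed to match the attached sample.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.Normalizer;

class DecomposeSample {
    /** Rewrites precomposed characters as base letter + combining mark(s) (NFD). */
    static String toDecomposed(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DecomposeSample.java <input> <output>");
            return;
        }
        // Read the source text as UTF-8, decompose it, and write it back as UTF-8.
        String text = Files.readString(Path.of(args[0]), StandardCharsets.UTF_8);
        Files.writeString(Path.of(args[1]), toDecomposed(text), StandardCharsets.UTF_8);
    }
}
```

Characters without a canonical decomposition (for example `ł`) are left unchanged by NFD, so only some diacritics in a given text will end up in decomposed form.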

Expected behavior

The word is split properly and the diacritic character stays intact.

Actual behavior

The diacritic character is split into two characters: a letter and its accent.

Screenshots
Initially presented OK:
rsta-show

After hitting Tab near diacritics:
rsta-edit

Java version
I used Java 14 because this project uses an older Gradle, but I also tried muCommander, which uses RSyntaxTextArea, with Java 20: the problem is still there.

macOS version
10.15.7

Additional context
By the way, I can still see this behavior in the newest IntelliJ IDEA v2023.1.2 (which also uses Java/Swing).
More details can be found here: mucommander/mucommander#941

@pskowronek pskowronek added the bug label May 20, 2023
@bobbylight bobbylight self-assigned this Jun 14, 2023
@bobbylight bobbylight added the pr-welcome Pull requests are welcome as this issue might not get looked at for a while label Jun 14, 2023
@bobbylight
Owner

I might need some help on this one, as I don't know much about this topic, but I'm happy to take a look!

@pskowronek
Author

pskowronek commented Jun 17, 2023

@bobbylight I can try to assist; however, I have no idea how it should be solved technically. Please see the referenced bug in muCommander, where I gathered some links (especially mucommander/mucommander#941 (comment)). A similar bug is still present in IntelliJ IDEA, though.

One idea was to check whether the cursor is in the middle of a composite character, but how do we tell accurately whether that is the case? Probably by checking whether the next character is an accent and, if so, taking the character before the cursor and checking via Java whether it is composite (I think Java 20 can tell that, though I don't have the API at hand). The next question is how many characters before the cursor should be checked.
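
The idea above could be sketched roughly as follows, using only the JDK. `isInsideCombiningSequence` is the "next character is an accent" check, and `snapToGraphemeBoundary` uses `java.text.BreakIterator.getCharacterInstance()`, which treats a base letter plus its combining marks as one user-perceived character, so there is no need to decide how many characters to look back. The class and method names are hypothetical, how this would be wired into the editor's caret handling is not shown, and snapping forward rather than backward is an arbitrary choice.

```java
import java.text.BreakIterator;

class CaretSnap {
    /** True if the char at this offset is a combining mark, i.e. the caret sits inside a decomposed character. */
    static boolean isInsideCombiningSequence(String text, int offset) {
        if (offset <= 0 || offset >= text.length()) {
            return false; // start and end of the text are always safe positions
        }
        int type = Character.getType(text.codePointAt(offset));
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    /** Returns the offset unchanged if it is a character boundary, otherwise the next boundary after it. */
    static int snapToGraphemeBoundary(String text, int offset) {
        if (offset <= 0) {
            return 0;
        }
        if (offset >= text.length()) {
            return text.length();
        }
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        return it.isBoundary(offset) ? offset : it.following(offset);
    }

    public static void main(String[] args) {
        String text = "zo\u0301b"; // 'z', 'o', COMBINING ACUTE ACCENT, 'b'
        System.out.println(isInsideCombiningSequence(text, 2)); // true: between 'o' and its accent
        System.out.println(snapToGraphemeBoundary(text, 2));    // 3: moved past the accent
        System.out.println(snapToGraphemeBoundary(text, 1));    // 1: already a boundary
    }
}
```

An insertion at the snapped offset would then go after the whole letter-plus-accent sequence instead of between its two parts.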

A good starting point is here
