Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2 character wide glyphs don't work properly in st #56

Open
JimPix1 opened this issue Feb 28, 2022 · 19 comments
Open

2 character wide glyphs don't work properly in st #56

JimPix1 opened this issue Feb 28, 2022 · 19 comments

Comments

@JimPix1
Copy link

JimPix1 commented Feb 28, 2022

Unicode 9 added glyphs and emojis that are 2 characters wide instead of one which has caused problems for me like overlap with text (adding a space fixes this which isn't a huge issue) and the main issue of characters in a line that contains a 2 char wide glyph getting deleted. It's quite random and it can be worked around but is very annoying regardless. Letters in words can be deleted randomly just because I have a desktop emoji in the same line.

This is also an st issue, not restricted to flexipatch and afaik some terminals have fixed the issue but not st. Anyone know how it could be fixed?

@bakkeby
Copy link
Owner

bakkeby commented Mar 1, 2022

I am only aware of the wide glyphs patch ref. https://www.reddit.com/r/suckless/comments/jt90ai/update_support_for_proper_glyph_rendering_in_st/

It can be enabled in dwm-flexipatch via

st-flexipatch/patches.def.h

Lines 394 to 399 in 4a156b9

/* Adds proper glyphs rendering in st allowing wide glyphs to be drawn as-is as opposed to
* smaller or cut glyphs being rendered.
* https://github.com/Dreomite/st/commit/e3b821dcb3511d60341dec35ee05a4a0abfef7f2
* https://www.reddit.com/r/suckless/comments/jt90ai/update_support_for_proper_glyph_rendering_in_st/
*/
#define WIDE_GLYPHS_PATCH 0

From what I have heard it doesn't work in all cases or for all fonts.

@JimPix1
Copy link
Author

JimPix1 commented Mar 1, 2022

I do have the patch but it doesn't seem to solve it. I'm using the Hack font and JoyPixels for emojis and glyphs

@bakkeby
Copy link
Owner

bakkeby commented Aug 11, 2022

I am not sure of the origin of this other than that someone on reddit wrote it:
https://0x0.st/oeSQ.diff

Edit: originates from here: https://www.reddit.com/r/suckless/comments/jt90ai/update_support_for_proper_glyph_rendering_in_st/

I haven't had time to look too closely at it. It may be that it is too invasive or complicated to integrate into this project but adding it here as a reference.

From what I understand the mechanism for drawing characters on the screen is changed from:

  • drawing the background, drawing the character, move to the next character to
  • draw the background for the whole line, draw all the characters on that line, move to the next line

This should avoid wide glyphs from being cut when the background is drawn for the next character.

@bakkeby
Copy link
Owner

bakkeby commented Aug 11, 2022

I had a look at that patch and it is similar to the original wide glyphs patch with some minor tweaks. The aim of that is to avoid wide glyphs from being cut, which is different to the issue @JimPix1 describes.

Using the font2 patch and having fallback fonts that have wider glyphs than the primary font (which should be a monocle font with fixed widths) will create an overlap.

@JimPix1
Copy link
Author

JimPix1 commented Aug 11, 2022

Yeah my issue isn't the fonts being cut, it's opposite haha.
It's almost like the 2nd character of the glyph is just a space, and only the first part of it is the glyph. It's honestly very odd and I still haven't managed to fix this.

@bakkeby
Copy link
Owner

bakkeby commented Aug 11, 2022

Does it work for you if you do not use the WIDE_GLYPHS_PATCH?

I get that it is cropped by default.

image

Compared to when the wide glyphs patch is used.

image

I have noticed that if I use lsd (LSDeluxe) that the icon takes up two characters, but that the first character is the actual glyph and the second is a space as you say. You can see this if you try to select the text in which case it will be cut in half if you only select one or the other half. It could very well be that this is the same in your case. It makes sense doing it that way.

@JimPix1
Copy link
Author

JimPix1 commented Aug 11, 2022

if i dont use the wide glyphs patch it looks much worse
image
image

@bakkeby
Copy link
Owner

bakkeby commented Aug 11, 2022

Right, but why do you have in your prompt?

The plain text suggests that the front does not have the character, but why this one?

https://charbase.com/fef0-unicode-arabic-letter-alef-maksura-final-form

@JimPix1
Copy link
Author

JimPix1 commented Aug 11, 2022

My prompt isn't related to this

The link you sent is for fef0, what I sent was fe0f, although fe0f also isn't related to the glyph I showed so it's very peculiar

@bakkeby
Copy link
Owner

bakkeby commented Aug 12, 2022

My prompt isn't related to this

🤦🏻‍♂️ yes sorry, at first I saw the trailing > and assumed that was the end of your prompt :)

fe0f, that makes more sense.
https://charbase.com/fe0f-unicode-variation-selector

https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)

So that is a combination character that is a hint that we want the emoji-style with colour rather than monochrome when rendering the preceding character.

I can't replicate getting the <fe0f> in plain text, maybe it depends on how you are copying this. If I artificially insert that the combination character is possibly ignored (using fe0e to indicate the monochrome version does nothing, might have something to do with whether the font actually provides variants of the same character).

In any case this is not related to the overlap / cropping issue you are experiencing. I was just curious.

It is clear that in st each character are placed in their respective column. If the character is too wide to fit in that column then it will be cropped.

The wide glyphs patch / solution merely works around the cropping bit by allowing wide glyphs to bleed into the next column(s) / character(s).

I had a look at how this is handled in alacritty and it looks like there the character is allocated multiple columns depending on the width of the character.

image

and highlighting that character selects the whole block.

image

That seems like a nice way of handling it. Wonder what it would take to implement that in st.

@JimPix1
Copy link
Author

JimPix1 commented Aug 12, 2022

That does seem like a good solution and would solve all of the irritating problems and glitches I get

@veltza
Copy link
Contributor

veltza commented Aug 12, 2022

Desktop Computer emoji is combined by two unicode codepoints: U+1F5A5, U+FE0F

So try to avoid those combined emojis because st is not designed to handle unicode variation selectors (U+FE0F is variation selector-16). I have tested several terminals like Alacritty, XFCE terminal and Wezterm, and they all have the same problem.

The only terminal that seems to work well with all emojis is Kitty. So, if someone is willing to fix emojis in st, take a look at Kitty's source code, because I'm afraid the suckless guys won't fix emoji-related issues.

Keep in mind that some text editors like nano don't like combined emojis either, even on Kitty. Vim and Kitty seem to work fine.

@bakkeby
Copy link
Owner

bakkeby commented Aug 12, 2022

Flag emoji's is another example where emojis are combined, e.g.

Emoji: 🇨🇦 Flag: Canada

Which may appear in the terminal as the two emojis C and A.

The behaviour that I noted from alacritty where the emoji uses two character blocks seem to be the case also in a stock st.

image

The only way I get the overlapping bit is if I use the font2 patch and use a larger emoji fallback font. Is this only an issue when it comes to using glyphs from different fonts?

Edit: and more so if the fallback font is a non-monospaced font

@veltza
Copy link
Contributor

veltza commented Aug 12, 2022

Here is the problem with Desktop Computer emoji: U+1F5A5, U+FE0F

wcwidth(0x1f5a5) returns 1
wcwidth(0xfe0f) returns 0

So, st thinks that the emoji is not a wide glyph character.

And because wcwidth() is a system-dependent function...

The behavior of wcwidth() depends on the LC_CTYPE category of the
current locale.

...it needs to be rewritten like Kitty did:

https://github.com/kovidgoyal/kitty/blob/master/kitty/wcwidth-std.h
https://github.com/kovidgoyal/kitty/blob/master/kitty/wcswidth.c

I'm not an expert in this field, so there might be an easier solution too, but unfortunately, wcwidth is not the only problem, because the variation selector needs to be taken care of too.

@bakkeby
Copy link
Owner

bakkeby commented Aug 12, 2022

Interestingly kitty also supports combined characters.

I don't know that much about this either. In the above case I'd think that the U+1F5A5, U+FE0F should be treated as one character, not two.

I would think the problem is that when a grapheme cluster of 18 bytes like 👨‍👩‍👦 is found then it is treated as five characters; three emojis and two combining characters.

👨<200d>👩<200d>👦

I wonder if libgrapheme would make this easier, but I suspect that if this was an idea that someone would have tried to integrate this already.

Most likely there is an underlying limitation in the Xft library that prevents this anyway. After all Xft does not officially support colour emoji yet, but that said BGRA was merged recently ref. MR12. Pending release.

@veltza
Copy link
Contributor

veltza commented Aug 12, 2022

In the above case I'd think that the U+1F5A5, U+FE0F should be treated as one character, not two.

Exactly. That is a part of the problem.

I wonder if libgrapheme would make this easier, but I suspect that if this was an idea that someone would have tried to integrate this already.

I need to take a look at that too.

Here is a quick and dirty fix, if someone wants to experiment with it. It attempts to resolve only the variation selector 16 issues, so it's just a hack. It looks like it's working fine with vim, bat, cat and less, but not in shell's command line.

diff --git a/st.c b/st.c
index 04380ff..bd6e61f 100644
--- a/st.c
+++ b/st.c
@@ -3094,6 +3094,8 @@ tputc(Rune u)
 		len = utf8encode(u, c);
 		if (!control && (width = wcwidth(u)) == -1)
 			width = 1;
+		if (u == 0xFE0F)
+			width = 1;
 	}
 
 	if (IS_SET(MODE_PRINT))
@@ -3239,6 +3241,12 @@ check_control_code:
 			gp[1].u = '\0';
 			gp[1].mode = ATTR_WDUMMY;
 		}
+	} else if (u == 0xFE0F) {
+		if (term.c.x > 0 && !(term.line[term.c.y][term.c.x-1].mode & ATTR_WDUMMY)) {
+			term.line[term.c.y][term.c.x-1].mode |= ATTR_WIDE;
+			gp->mode = ATTR_WDUMMY;
+			gp->u = u;
+		}
 	}
 	if (term.c.x+width < term.col) {
 		tmoveto(term.c.x+width, term.c.y);

emoji-fix

This fix is only for testing. Do not use it!

Edit: I just found out that this fix doesn't work well because there are some line wrapping issues. So back to the drawing board.

@veltza
Copy link
Contributor

veltza commented Aug 13, 2022

Forget about that crappy fix, because it doesn't work well with text editors. I learned that emoji support is just a big rabbit hole and there isn't an easy or universal solution for it. The problem is that both the terminal and the program must support emojis in exactly the same way. For example, if a text editor or shell thinks an emoji's character width is 2 and the terminal thinks it is 1, or vice versa, cursor movement is broken. Even though Kitty is very good at emojis, some emojis are still broken in vim. That could be vim's fault too.

@bakkeby
Copy link
Owner

bakkeby commented Aug 13, 2022

I tried cloning and installing libxft (instead of using the libxft-bgra library) as color emoji support has been merged in.

I can confirm that it does now show emojis in colour, and I can also confirm that the library does not support grapheme clusters. I suppose that new Xft version is not going to be the saviour in this case.

I hacked dwm to use libgrapheme to work out the utf-8 offsets just to see if it would work.

@JimPix1
Copy link
Author

JimPix1 commented Aug 13, 2022

I don't really mind this problem it's just the fact that the wide glyphs will mess up any text after it on the same line which you can work around but it's very annoying and you essentially need to delete the glyph whenever you edit a line containing it and add it back after

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants