Accuracy calculation not working properly in Vietnamese #4218

Open
2 tasks done
Miaozxje opened this issue Apr 25, 2023 · 22 comments
Labels
stuck We don't know how to proceed with this

Comments

@Miaozxje

Did you clear cache before opening an issue?

  • I have cleared my cache

Is there an existing issue for this?

  • I have searched the existing issues

Does the issue happen when logged in?

Yes

Does the issue happen when logged out?

Yes

Does the issue happen in incognito mode when logged in?

Yes

Does the issue happen in incognito mode when logged out?

Yes

Account name

No response

Account config

No response

Current Behavior

When typing in Vietnamese, we have to type 2 or 3 keys in order to produce 1 correct letter,
and it seems that Monkeytype counts those keystrokes as incorrect.

Example: when we type "ước", it is actually a combination of

  • "uo", "s", "w" and "c" in Telex: the prefix is "uo" and the other characters can come in a different order
  • "uo", "7", "2" and "c" in VNI: the prefix is also "uo" and the other characters can come in a different order

(Telex and VNI are the 2 most common Vietnamese typing methods), and Monkeytype counts this as 2 errors.

Expected Behavior

Based on the example above,
Monkeytype should count that input as "Correct" and not "Incorrect" as it does now.

Steps To Reproduce

  1. Open Monkeytype
  2. Select Vietnamese as typing language
  3. Type using the Telex or VNI input method

Environment

  • OS: Windows 11
  • Browser: Google Chrome
  • Browser Version: 12.0.5615.138 (Official Build) (64-bit)

Anything else?

No response

@Miaozxje Miaozxje added the bug Something isn't working label Apr 25, 2023
@Miodec Miodec added help wanted Extra attention is needed stuck We don't know how to proceed with this labels May 5, 2023
@ngntrgduc

Same problem. I think it would be better to drop error counting when typing in Vietnamese, or maybe to check whether a word is correct after the user presses the Space key, because with a custom test of "ước" like the one above, using the Telex method:

  • If I type "uocws" fast, Monkeytype counts 5 errors, but when I type slower it counts just 2 errors (same as above). Sometimes I get 4 errors, and I don't know why.
  • If I type "uowcs", Monkeytype counts 5 errors, or 4 errors if I type slower.
  • If I type "uwowcs", Monkeytype counts 6 errors, or 5 errors if I type slower.
    ...

All possible ways to type "ước" in Telex (as I remember 🥲): uocws, uocsw, uowcs, uowsc, uwowsc, uwowcs, uoswc, uoscw, usowc, usocw, uwosc, uwocs, ... (and maybe more). I think they should all be counted as correct.

I don't use the VNI method, but I think it will have the same issue.

@Miodec
Member

Miodec commented May 5, 2023

All possible ways to type "ước" in Telex (as I remember 🥲): uocws, uocsw, uowcs, uowsc, uwowsc, uwowcs, uoswc, uoscw, usowc, usocw, uwosc, uwocs, ... (and maybe more). I think they should all be counted as correct.

giphy

@Miodec
Member

Miodec commented Jul 17, 2023

So, I just gave it another test, and it seems to be working fine, giving me 100% accuracy (on Windows 11 22H2 and MacOS 13.4). I tried typing ước with uocws, uowcs and uwowcs - all gave me 100% accuracy.

@ngntrgduc

But in my case, with a custom test:

image

  • uocws gives:

image

  • uowcs gives:

image

  • uwowcs gives:

image

Each of them gives me a different result 🥲.

@Miodec
Member

Miodec commented Jul 17, 2023

So, I just learned about these - are you using Unikey, EVKey or OpenKey?

@ngntrgduc

I'm currently using Unikey 4.3 RC5.

@Miodec
Member

Miodec commented Jul 17, 2023

Can you just quickly test without it? Using the native system Telex layout?

@ngntrgduc

No, I don't have any system Telex layout 🥲.

@Miodec
Member

Miodec commented Jul 17, 2023

Well, can you add it and test with it?

@ngntrgduc

As far as I know, most people in my country use third-party apps like Unikey, so I think it would be better to solve the problem for the Unikey application.

And yes, I added the system Telex layout and tested it (on Windows 11 22H2). The result is the same as yours.

@Miaozxje
Author

Miaozxje commented Jul 17, 2023

Well, can you add it and test with it?

Here is a video of me typing using EVKey 5.0.1:
https://www.youtube.com/watch?v=1D7RqEObf6A

@Miodec
Member

Miodec commented Jul 18, 2023

Which one do you guys think is the most popular one out of the three? If most of the Vietnamese typing population uses this kind of software, it's gonna be really hard to get this fixed, because it looks like that software simply does NOT send the essential events (composition start and end) which would allow the correct accuracy calculation.
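
To illustrate why those events matter, here is a rough sketch (not Monkeytype's actual code). It assumes a simplified, hypothetical event model (`ImeEvent`) in place of the browser's real composition and input events, and a hypothetical `countMistakes` helper. While a composition is in progress, intermediate input is not judged; only the committed text is.

```typescript
// Simplified stand-ins for the browser's composition and input events (hypothetical model).
type ImeEvent =
  | { kind: "compositionstart" }
  | { kind: "compositionend"; data: string }
  | { kind: "input"; data: string };

// Count characters that do not match `target` at their position.
function countMistakes(events: ImeEvent[], target: string): number {
  const expected = Array.from(target.normalize("NFC"));
  let composing = false;
  let position = 0;
  let mistakes = 0;

  const judge = (text: string): void => {
    for (const ch of Array.from(text.normalize("NFC"))) {
      if (ch !== expected[position]) mistakes++;
      position++;
    }
  };

  for (const ev of events) {
    if (ev.kind === "compositionstart") {
      composing = true;
    } else if (ev.kind === "compositionend") {
      composing = false;
      judge(ev.data); // only the committed text is judged
    } else if (!composing) {
      judge(ev.data); // no composition in progress: every character is judged
    }
  }
  return mistakes;
}

// An input method that reports a composition: intermediate letters are ignored.
const withComposition: ImeEvent[] = [
  { kind: "compositionstart" },
  { kind: "input", data: "u" },
  { kind: "input", data: "o" },
  { kind: "input", data: "c" },
  { kind: "compositionend", data: "ước" },
];

// The same raw letters without composition events: each one is judged as typed.
const withoutComposition: ImeEvent[] = [
  { kind: "input", data: "u" },
  { kind: "input", data: "o" },
  { kind: "input", data: "c" },
];
```

With these sample streams, `countMistakes(withComposition, "ước")` finds no mistakes, while `countMistakes(withoutComposition, "ước")` flags the raw "u" and "o" because they do not match "ư" and "ớ".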

@Miodec Miodec removed bug Something isn't working help wanted Extra attention is needed labels Jul 18, 2023
@Miaozxje
Author

Which one do you guys think is the most popular one out of the three? If most of the Vietnamese typing population uses this kind of software, it's gonna be really hard to get this fixed, because it looks like that software simply does NOT send the essential events (composition start and end) which would allow the correct accuracy calculation.

As far as I know, Unikey may be the most common, but my opinion is that you guys should separate it into 3 different typing languages.

@Miodec
Member

Miodec commented Jul 24, 2023

What do you mean by "separate it into 3 different typing languages"?

@Miodec
Member

Miodec commented Jul 25, 2023

So, the Unikey website mentions that its source code has been integrated into macOS since 2007
image

But when I try to test Vietnamese input using the built-in input method, it works without issues (100% accuracy is possible). So something weird is going on; someone changed something at some point.

Could any of you confirm whether UniKey is working or not working? (It's only available on Windows.)

@Miaozxje
Author

So, the Unikey website mentions that its source code has been integrated into macOS since 2007

But when I try to test Vietnamese input using the built-in input method, it works without issues (100% accuracy is possible). So something weird is going on; someone changed something at some point.

Could any of you confirm whether UniKey is working or not working? (It's only available on Windows.)

I can confirm that UniKey does not work on Windows; accuracy is calculated in much the same way as in the EVKey test video I uploaded above.

@Miaozxje
Author

What do you mean by "separate it into 3 different typing languages"?

Sorry for the lack of information.
It was just an idea that came to mind at that point, when you showed that UniKey, EVKey and OpenKey handle the events needed for calculating accuracy differently.
In that situation, I thought you guys could split Vietnamese into 3 languages (for Unikey, EVKey and OpenKey),
but in the end I realized it's not so smart to do this.

@Miodec
Member

Miodec commented Jul 31, 2023

Well, I didn't say UniKey, EVKey and OpenKey do something different from each other. I said they all do something different compared to the native input method.

@Miodec
Member

Miodec commented Jul 31, 2023

I can confirm that UniKey does not work on Windows; accuracy is calculated in much the same way as in the EVKey test video I uploaded above.

Well, that's not good.

@Miaozxje
Author

Miaozxje commented Aug 5, 2023

I found this website: https://vntype.web.app, which is forked from Monkeytype and has better accuracy and wpm calculation in Vietnamese (I'm using EVKey).
Can you look around to get an idea of how they calculate these things?

@Miodec
Member

Miodec commented Aug 23, 2023

I found this website: https://vntype.web.app, which is forked from Monkeytype and has better accuracy and wpm calculation in Vietnamese (I'm using EVKey). Can you look around to get an idea of how they calculate these things?

The problem with that fork is that it's very old, so it's very hard to find the changes they made. And they haven't updated any of the GitHub links / contact links, so I don't know who to contact about this.

@SteveFour

SteveFour commented Apr 28, 2024

Explanation of the issue:

The general problem with the popular Vietnamese typing apps on Windows is the way they edit letters.

Let's say the current word is "được".
Using the "Telex" input language (*), here's how the system responds:

Key typed    Word on screen
d            d
d            đ
u            đu
w            đư
o            đưo
w            đươ
c            đươc
j            được

Bold & italic letter indicate what letter changed.
The result is (usually) the same, but Vietnamese input methods mainly take one of two approaches to producing it.

I'm not particularly knowledgeable on this subject, but I'll cite the original sources and explain them as I understand them.

"An input method's job is to transform a chain of letters into the desired word; in my case, "dduwowcj" into "được".

Most of the popular Vietnamese input methods solve this by using a "fake backspace": when I type the second "d", the app sends a fake backspace to delete the first "d" and replaces it with "đ".

But according to the source, preedit should be the "correct" solution to this typing problem: it creates a temporary buffer inside the app, which lets the input method change the text easily. Only after the input method commits the preedit does the text become part of the document... Because text is edited differently this way, most OSes recommend marking preedit text differently, usually by underlining it.

Preedit is commonly used in input methods for Japanese, Chinese, Korean, etc."
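
As a rough illustration of the fake-backspace approach, the sketch below replays the table above as a list of edits and judges every inserted character immediately, the way a naive per-keystroke checker would. The `Edit` shape, the `replayNaively` helper and the exact number of backspaces per key are assumptions made for illustration; real apps may emit a different sequence, and this is not how Monkeytype is implemented.

```typescript
// One keystroke's visible effect: delete `backspaces` characters, then insert text.
interface Edit {
  backspaces: number;
  insert: string;
}

// Apply the edits in order and count every inserted character that does not
// match the target at its position, even if it is deleted again later.
function replayNaively(edits: Edit[], target: string): { text: string; mistakes: number } {
  const expected = Array.from(target.normalize("NFC"));
  const buffer: string[] = [];
  let mistakes = 0;
  for (const edit of edits) {
    const start = Math.max(0, buffer.length - edit.backspaces);
    buffer.splice(start, edit.backspaces);
    for (const ch of Array.from(edit.insert.normalize("NFC"))) {
      if (ch !== expected[buffer.length]) mistakes++;
      buffer.push(ch);
    }
  }
  return { text: buffer.join(""), mistakes };
}

// Assumed edit stream for typing "dduwowcj" (one entry per key).
const dduwowcj: Edit[] = [
  { backspaces: 0, insert: "d" },  // d -> "d"
  { backspaces: 1, insert: "đ" },  // d -> "đ"
  { backspaces: 0, insert: "u" },  // u -> "đu"
  { backspaces: 1, insert: "ư" },  // w -> "đư"
  { backspaces: 0, insert: "o" },  // o -> "đưo"
  { backspaces: 1, insert: "ơ" },  // w -> "đươ"
  { backspaces: 0, insert: "c" },  // c -> "đươc"
  { backspaces: 2, insert: "ợc" }, // j -> "được"
];

const replayed = replayNaively(dduwowcj, "được");
```

Under these assumptions `replayed.text` ends up as the correct word "được", yet `replayed.mistakes` is 4, because "d", "u", "o" and "ơ" were each judged before being replaced.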

The default Windows input method for Vietnamese also uses preedit. I find that it does not have the accuracy issue the others have. It's just that not many people use it.
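
For contrast, a minimal sketch of the preedit idea, with hypothetical names (`PreeditSession`, `createPreeditSession`): intermediate states live only in a temporary buffer, and the document receives text once, on commit. This is a toy model, not the Windows implementation.

```typescript
interface PreeditSession {
  update(text: string): void; // replace the whole temporary buffer
  commit(): void;             // hand the buffer to the document and clear it
}

function createPreeditSession(onCommit: (text: string) => void): PreeditSession {
  let buffer = "";
  return {
    update(text: string): void {
      buffer = text;
    },
    commit(): void {
      onCommit(buffer);
      buffer = "";
    },
  };
}

// The "document" only ever sees committed text.
const committedText: string[] = [];
const preedit = createPreeditSession((text) => {
  committedText.push(text);
});

// Intermediate states from the table above stay inside the preedit buffer.
for (const step of ["d", "đ", "đu", "đư", "đưo", "đươ", "đươc", "được"]) {
  preedit.update(step);
}
preedit.commit();
```

Here `committedText` contains only the finished word, so a checker that looks at committed text never sees the intermediate letters.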

Proposed solution:

My proposed solution to the accuracy problem with input methods that use fake backspace is to add a "compatibility accuracy mode", which only counts mistakes after the word is written (space is pressed). In other words: check mistakes only on completed words.
This lets Monkeytype count mistakes the same way whether the input method uses preedit or fake backspace.

This compatibility solution works well for a lot of languages that use input methods, and should be enabled by default for Vietnamese.
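
A minimal sketch of what such a mode could look like, assuming the checker can read the current text-field value. The helper names (`mistakesInWord`, `mistakesOnCompletedWords`) are made up for illustration and are not part of Monkeytype. Only words already followed by a space are compared; the fragment still being typed is ignored.

```typescript
// Compare one finished word with the expected word, position by position.
function mistakesInWord(typed: string, expected: string): number {
  const a = Array.from(typed.normalize("NFC"));
  const b = Array.from(expected.normalize("NFC"));
  let mistakes = Math.abs(a.length - b.length); // missing or extra characters
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) mistakes++;
  }
  return mistakes;
}

// Judge only completed words: everything before the last space in the field.
function mistakesOnCompletedWords(fieldValue: string, targetWords: string[]): number {
  const parts = fieldValue.split(" ");
  const completed = parts.slice(0, -1); // the last part is still being typed
  return completed.reduce(
    (sum, word, i) => sum + mistakesInWord(word, targetWords[i] ?? ""),
    0,
  );
}
```

For example, with target words "được" and "ước", the field value "đưo" (mid-word, no space yet) yields no mistakes, "được " yields none either, and "đươc ướ" yields one mistake for the missing tone mark in the first, completed word.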

Sources:

These sources are in Vietnamese; there are few to no sources written in English:

https://lewtds.github.io/2014/07/31/uoc-mo-bo-go-kieu-unikey/

https://notes.huy.rocks/posts/go-tieng-viet-linux.html

I hope this idea helps the team close this issue.
The reason this issue has been stuck for so long is that the way input methods handle words is a very niche topic, discussed mostly by Linux users.

(*) I refer to Telex here as a typing language (the rules for translating "dduwowcj" into "được") to distinguish it from preedit and fake backspace, which determine how the letters change while typing. I have to spell this out because Wikipedia also calls Telex an input method.


4 participants