[Bug]: Hindi glyphs (with matras/ diacritics/ vowels) not displaying in any PsychoPy version (both coder and builder) #6135

Open
niket-agrawal opened this issue Jan 6, 2024 · 7 comments
Labels
🐞 bug Issue describes a bug (crash or error) or undefined behavior.

Comments

@niket-agrawal

PsychoPy Version

2022.2.4

What OS are your PsychoPy running on?

Windows 10

Bug Description

We have faced this major issue with PsychoPy for more than 3 years: it cannot display any character or string that contains a matra (vowel sign).

It displays कमल correctly without any errors but crashes as soon as we add word with matra कमाल

It works correctly on Pavlovia (online environment) though. I guess because it offloads that task to web-fonts.

Let me first explain what is a matra :

In Hindi, a "matra" refers to the diacritical marks or vowel signs used in the script. Matras are an integral part of the script and convey accurate pronunciation. Matras do not mean anything on their own and need an "akshara", i.e. a consonant, to make a proper syllable.
For example: म is an akshara (consonant) and ा is a matra (diacritic denoting a vowel). When they combine, they form a single composite glyph: म + ा = मा
These matras are very interesting and unique to Indian languages. Each matra has its own properties and can be displayed on any side of the akshara/consonant (ी at right, ि at left, ै at top, ू at bottom, and there are others too). A special sign can even combine with an akshara/consonant to make it half. Some more examples are listed below:

प + ी = पी (displayed at right)
प + ि = पि (displayed at left)
प + ू = पू (displayed at bottom)
प + ै = पै (displayed at top)
प + ् + य = प्य (प is displayed only as a half form when combined with ्) as opposed to प + य = पय
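To make the examples above concrete, here is a minimal sketch that lists the Unicode code points behind a few of these strings using only the Python standard library. The helper name `describe` is made up for illustration and is not part of PsychoPy or pyglet. It shows that a matra is a separate code point stored after its consonant, so what looks like one glyph on screen is a string of length 2 or more.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name, category) per code point."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


for word in ("कमल", "कमाल", "प्य"):
    # len() counts code points, not the glyphs a reader sees
    print(word, "has", len(word), "code points")
    for row in describe(word):
        print("   ", *row)
```

Consonants are reported with category `Lo` (letter), while matras and the virama ् are reported with a mark category (`Mc` or `Mn`).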

PsychoPy, in its current form, is not able to display any Hindi text that contains matras. We currently rely on jsPsych or OpenSesame.

C:\Users\XXXXXXXXXXXXXXX\Desktop\codes\PsychopyHindi>python test_by_code.py
pygame 2.1.0 (SDL 2.0.16, Python 3.8.10)
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
  File "test_by_code.py", line 10, in <module>
    text = psychopy.visual.TextStim(win=win,text=word_to_display)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\visual\text.py", line 227, in __init__
    self.setText(text, log=False)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\visual\text.py", line 387, in setText
    setAttribute(self, 'text', text, log)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\tools\attributetools.py", line 134, in setAttribute
    setattr(self, attrib, value)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\tools\attributetools.py", line 27, in __set__
    newValue = self.func(obj, value)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\visual\text.py", line 378, in text
    self._setTextShaders(text)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\psychopy\visual\text.py", line 396, in _setTextShaders
    self._pygletTextObj = pyglet.text.Label(
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\__init__.py", line 452, in __init__
    super(Label, self).__init__(document, x, y, width, height,
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\__init__.py", line 273, in __init__
    super(DocumentLabel, self).__init__(document,
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 820, in __init__
    self.document = document
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 929, in _set_document
    self._init_document()
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 1043, in _init_document
    self._update()
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 966, in _update
    lines = self._get_lines()
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 942, in _get_lines
    glyphs = self._get_glyphs()
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\text\layout.py", line 1085, in _get_glyphs
    glyphs.extend(font.get_glyphs(text[start:end]))
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\font\base.py", line 394, in get_glyphs
    self.glyphs[c] = glyph_renderer.render(c)
  File "C:\Program Portable\PsychoPy\psychopy_2022.2\lib\site-packages\pyglet\font\win32.py", line 432, in render
    ord(text), ord(text), byref(abc)):
TypeError: ord() expected a character, but string of length 2 found
3.3274  WARNING         Monitor specification not found. Creating a temporary one...

Expected Behaviour

Ideally, it should not crash and should display all the valid glyphs (vowels, consonants, conjuncts) correctly. Some documentation and in-depth discussion about ligature glyphs can be found at the following links:

Steps to Reproduce

Reproduce with PsychoPy Builder (any version)

  1. Make a simple experiment with just text to display कमाल
  2. Or download the zipped package display_word.zip
  3. It contains a .psyexp file and three excel files.
  4. stim_en.xlsx contains words in English (working), stim_hi.xlsx contains words in Hindi (will cause program to crash)
  5. stim_hi_working.xlsx contains words in Hindi without matras (program works as expected)

Reproduce with PsychoPy Coder (any version)

Here is the code to reproduce the error, just comment/ uncomment relevant lines to see the error

import psychopy.visual
import psychopy.event

win = psychopy.visual.Window(size=[400, 400],units="pix",fullscr=False)

#word_to_display = "hello" #displays correctly
#word_to_display = "कमल" #displays correctly, क + म + ल
word_to_display = "कमाल" #crashes with the TypeError shown above, क + म + ा + ल

text = psychopy.visual.TextStim(win=win,text=word_to_display)
text.draw()

win.flip()

psychopy.event.waitKeys()

win.close()

Additional context

No response

@niket-agrawal niket-agrawal added the 🐞 bug Issue describes a bug (crash or error) or undefined behavior. label Jan 6, 2024
@MichaelWoodc

MichaelWoodc commented Feb 22, 2024

Is this correct? If not, could you share a rendered example?

image

If this is correct, I think the issue in your case is with pyglet:
Psychopy_video_experiment.venv\lib\site-packages\pyglet\font\win32.py", line 432, in render
ord(text), ord(text), byref(abc)):

The relevant portion:

image
It seems to want exactly one character: ord() only accepts a string of length 1 (a single Unicode code point), but here it is handed a string of length 2, I think because the consonant and its matra are passed in together.
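The failure at the bottom of the traceback can be reproduced without PsychoPy or pyglet. The sketch below is plain Python; the helper `try_ord` is a hypothetical name used only for this illustration. It shows that `ord()` rejects a consonant-plus-matra string as a whole, while each code point on its own is accepted.

```python
def try_ord(text):
    """Return ord(text), or the TypeError message if text is not one code point."""
    try:
        return ord(text)
    except TypeError as exc:
        return f"TypeError: {exc}"


cluster = "मा"  # म followed by the matra ा
print(len(cluster))       # number of code points in the string
print(try_ord(cluster))   # ord() refuses the two-code-point string
print([hex(ord(ch)) for ch in cluster])  # each code point alone is fine
```

This only illustrates the Python-level error. It is not the change that was made to win32.py.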

If you have a virtual environment, set it up and go to line 430 in your win32.py file and make it like this:
image
With this one small change everything seems to work fine. (Did that display correctly?) Just add three lines of code and indent the existing ones. I'm surprised to see the issue surface this way: pyglet can render the text, but evidently relies on built-in functions that are not set up to handle those characters.

I'm thinking:

  1. The issue is pyglet's reliance on what eventually leads to a built-in function that can't handle that character
  2. Changing this one thing in pyglet fixed the issue, so I'd change pyglet

@niket-agrawal
Author

Hi @MichaelWoodc, amazing. Thanks a lot for your help. Yes I can confirm that it is indeed correct rendering.

word_to_display = "समर्पण" #displays incorrectly, स + म + र + ् + प + ण
word_to_display = "क्षत्रिय" #displays incorrectly, क + ् + ष + त + ् + र + ि + य

However, some more complex Hindi conjuncts still show up as separate characters instead of a single glyph.
Some examples are:

  • र + ् + प = र्प
  • क + ् + ष = क्ष

Rendering Hindi can be challenging due to its complex script, where multiple units combine to form a single glyph. The good news is that the program can now render individual characters, a task it was unable to do previously.
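One way to see why these conjuncts need more than character-by-character drawing: Unicode has no single precomposed code point for them, so normalization leaves the sequence unchanged and the combined glyph has to come from the font and a text-shaping step. A minimal sketch, using only the standard library (the helper `nfc_length` is a made-up name for illustration):

```python
import unicodedata


def nfc_length(text):
    """Number of code points left after NFC (canonical composition)."""
    return len(unicodedata.normalize("NFC", text))


for conjunct in ("र्प", "क्ष"):
    # consonant + virama + consonant stays three code points after NFC
    print(conjunct, len(conjunct), nfc_length(conjunct))
```

Since the string itself never collapses to one character, a renderer that draws one glyph per code point will always show the pieces separately.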

@MichaelWoodc

MichaelWoodc commented Feb 26, 2024 via email

@niket-agrawal
Author

Thanks a lot. I tried changing backends:
The experiment does not load with pygame,
and with glfw the issue is the same as with your pyglet hack.

For now we are using images as text stimuli, but images are useless for eye tracking, where we need character-level data. Interestingly, the OpenSesame library was able to render our stimuli.

Thanks a lot again for your help @MichaelWoodc

@peircej
Member

peircej commented Feb 26, 2024

Actually, I'm most interested in whether we could solve this with TextBox2 rather than TextStim. TextBox2 is written in-house which means we have complete control over how layout works, so we should be able to apply a fix like this ourselves rather than requesting the fix in pyglet. TextBox2 is also much faster to update text than TextStim.

@niket-agrawal
Author

Oh, I wasn't aware of that. I'll try with some stimuli using TextBox2 and provide an update here.

@peircej
Member

peircej commented Feb 26, 2024

Just to be clear, I'm not suggesting that TextBox2 already solves the issue: it needs each character to be laid out manually by us, and I don't think we've handled this issue yet. On the plus side, the code that lays out the printable characters is only 20 lines long and is within our control, so a solution like the one @MichaelWoodc found could be very easy to implement here.
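As a rough illustration of what a per-character layout loop would need before it can place glyphs, here is a minimal sketch that groups code points into display clusters. It is not TextBox2 code, and `devanagari_clusters` is a hypothetical helper. It assumes a deliberately simplified rule: combining marks attach to the preceding character, and a consonant that follows a virama joins the same cluster. Real segmentation and shaping rules are more involved than this.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (्)


def devanagari_clusters(text):
    """Group code points into clusters using a simplified, assumed rule."""
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        follows_virama = bool(clusters) and clusters[-1].endswith(VIRAMA)
        if clusters and (is_mark or follows_virama):
            # matra, virama, or consonant after a virama: extend the cluster
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


for word in ("कमाल", "समर्पण", "क्षत्रिय"):
    print(word, devanagari_clusters(word))
```

Each cluster would then have to be handed to the font as a unit. Whether the font backend can return a single shaped glyph for such a unit is a separate question that this sketch does not address.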
