Bug: cut utf8 codepoint sequence in half #88

Open
iacore opened this issue Jan 30, 2022 · 1 comment

iacore commented Jan 30, 2022

I don't know how to describe this, but the bug is here:

suit/theme.lua

Lines 136 to 139 in 1767782

if opt.hasKeyboardFocus and (love.timer.getTime() % 1) > .5 then
local ct = input.candidate_text;
local ss = ct.text:sub(1, utf8.offset(ct.text, ct.start))
local ws = opt.font:getWidth(ss)

This bug crashes the application when the first codepoint of the IME candidate text is outside the ASCII range.

Here's a minimal reproducible example (main.lua):

function hex_dump (str)
    local len = string.len( str )
    local dump = ""
    local hex = ""
    local asc = ""
    
    for i = 1, len do
        if 1 == i % 8 then
            dump = dump .. hex .. asc .. "\n"
            hex = string.format( "%04x: ", i - 1 )
            asc = ""
        end
        
        local ord = string.byte( str, i )
        hex = hex .. string.format( "%02x ", ord )
        if ord >= 32 and ord <= 126 then
            asc = asc .. string.char( ord )
        else
            asc = asc .. "."
        end
    end

    
    return dump .. hex
            .. string.rep( "   ", 8 - len % 8 ) .. asc
end

function fromhex(a)
    local result = ""
    for i,x in ipairs(a) do
        result = result .. string.char(x)
    end
    return result
end

font = love.graphics.getFont( )
utf8 = require "utf8"
local ct = {
    text = fromhex({
        0xe5,0x87,0xb9
    }),
    start = 0,
}
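print(ct.text)
-- With ct.start = 0, utf8.offset returns 1, so sub(1, 1) keeps only byte 0xe5.
local ss = ct.text:sub(1, utf8.offset(ct.text, ct.start))
print("ct.text:")
print(hex_dump(ct.text))
print("ss:")
print(hex_dump(ss))
local ws = font:getWidth(ss) -- crash here: ss is not valid UTF-8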

iacore commented Jan 30, 2022

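I don't understand what the code is meant to do, but here's what's wrong.

With the default position argument, utf8.offset(s, 0) always returns 1 (the start of the first codepoint in s). So when ct.start is 0, ct.text:sub(1, 1) keeps only the first byte of the text. If the first codepoint is longer than one byte, ss ends in the middle of its byte sequence and font:getWidth(ss) receives invalid UTF-8.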
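
One possible fix, as a rough sketch only. It assumes that ct.start is the 0-based number of codepoints before the IME cursor (my reading of the start argument of love.textedited) and that it is never negative. The helper name prefix_before_cursor is made up for illustration and is not part of SUIT.

```lua
-- In LÖVE the utf8 module must be required; in Lua 5.3+ it is a global.
local utf8 = utf8 or require("utf8")

-- Hypothetical helper: returns the first `start` codepoints of `text`,
-- never splitting a multi-byte sequence.
local function prefix_before_cursor(text, start)
    -- Byte index where codepoint number start + 1 begins
    -- (nil if `start` is past the end of the text).
    local next_char = utf8.offset(text, start + 1)
    if not next_char then
        return text
    end
    return text:sub(1, next_char - 1)
end

local text = "\xe5\x87\xb9" -- the 3-byte codepoint from the report above

-- Original expression: utf8.offset(text, 0) is 1, so only byte 0xe5 is kept.
local broken = text:sub(1, utf8.offset(text, 0))
assert(#broken == 1 and not utf8.len(broken))

-- Sketch of the fix: empty prefix at start = 0, whole codepoint at start = 1.
assert(prefix_before_cursor(text, 0) == "")
assert(prefix_before_cursor(text, 1) == text)
```

Under those assumptions, the line in theme.lua could become local ss = prefix_before_cursor(ct.text, ct.start), so the measured prefix always ends on a codepoint boundary.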
