skeeto/branchless-utf8

Branchless UTF-8 Decoder

Full article: A Branchless UTF-8 Decoder

Example usage

#define N (1 << 20)  // 1 MiB

// input buffer with 3 bytes of zero padding (required: the decoder
// always reads 4 bytes, whatever the length of the next character)
char buf[N+3];
char *end = buf + fread(buf, 1, N, stdin);
end[0] = end[1] = end[2] = 0;

// output buffer: parsed code points
int len = 0;
uint32_t cp[N];

int errors = 0;
for (char *p = buf; p < end;) {
    int e;
    p = utf8_decode(p, cp+len++, &e);
    errors |= e;
}
if (errors) {
    // decode failure: at least one sequence in the input was invalid
}
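
To try the loop above without the repository's header, the following is a minimal sketch that compiles on its own. It makes several assumptions that are not part of this repository: `standin_decode` is a hypothetical, conventional (branching) decoder that only mimics the call shape of `utf8_decode` (pointer in, code point and error flag out, pointer to the next character returned), and `decode_all` and `CAP` are names invented for illustration. It is not the branchless implementation described in the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAP 1024  /* hypothetical capacity chosen for this sketch */

/* Hypothetical stand-in with the same call shape as utf8_decode().
 * It branches and is NOT the repository's decoder. Sets *e nonzero on
 * invalid input and always advances by at least one byte. */
static void *standin_decode(void *buf, uint32_t *c, int *e)
{
    static const uint32_t mins[] = {0, 0, 0x80, 0x800, 0x10000};
    static const unsigned char lead[] = {0, 0x7f, 0x1f, 0x0f, 0x07};
    unsigned char *s = buf;
    int len = s[0] < 0x80           ? 1
            : (s[0] & 0xe0) == 0xc0 ? 2
            : (s[0] & 0xf0) == 0xe0 ? 3
            : (s[0] & 0xf8) == 0xf0 ? 4 : 0;

    *e = 0;
    if (len == 0) {                  /* stray continuation or bad lead byte */
        *c = 0;
        *e = 1;
        return s + 1;
    }
    *c = s[0] & lead[len];
    for (int i = 1; i < len; i++) {
        if ((s[i] & 0xc0) != 0x80) { /* not a continuation byte */
            *e = 1;
            return s + i;
        }
        *c = (*c << 6) | (s[i] & 0x3f);
    }
    if (*c < mins[len]) *e = 1;               /* non-canonical (overlong) */
    if (*c >= 0xd800 && *c <= 0xdfff) *e = 1; /* surrogate half */
    if (*c > 0x10ffff) *e = 1;                /* out of range */
    return s + len;
}

/* Copy n bytes into a zero-padded buffer and decode them with the same
 * loop as the usage example. Returns the number of code points written
 * to out, which must have room for at least n entries. */
static size_t decode_all(const char *data, size_t n, uint32_t *out, int *errors)
{
    static char buf[CAP + 3];        /* static keeps the buffer off the stack */
    assert(n <= CAP);
    memcpy(buf, data, n);
    char *end = buf + n;
    end[0] = end[1] = end[2] = 0;    /* the 3 bytes of zero padding */

    size_t len = 0;
    *errors = 0;
    for (char *p = buf; p < end;) {
        int e;
        p = standin_decode(p, out + len++, &e);
        *errors |= e;
    }
    return len;
}
```

Calling `decode_all("h\xc3\xa9", 3, out, &err)` fills `out` with the code points U+0068 and U+00E9 and leaves `err` at zero. With the real header, the call to `standin_decode` would be replaced by `utf8_decode`. If the 1 MiB buffers from the usage example are declared inside a function, giving them static storage as done here avoids placing roughly 5 MiB on the stack.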
