
NitishShandilya/UTF8Converter-Multilingual

UTF8Converter-Multilingual

Language: Python
Description: A Python program that converts multilingual UTF-16 encoded characters to UTF-8 without changing the characters they represent. It has been tested with English, Arabic, Gujarati and Japanese text.
Detail: The program takes the absolute path of an input file as its first parameter. It reads the file in binary mode and assumes that it contains characters from Unicode's Basic Multilingual Plane (U+0000 to U+FFFF) in big-endian UTF-16, so that every 2 bytes correspond to one character and directly encode that character's Unicode code point. The program encodes each character in UTF-8 (1 to 3 bytes) and writes the encoded bytes to a file called utf8encoder_out.txt.
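The conversion described above can be sketched roughly as follows. This is a minimal illustration assuming Python 3, not necessarily the code in this repository; the function names `encode_code_point` and `utf16be_to_utf8` are hypothetical. It follows the standard UTF-8 bit layout: code points below U+0080 take 1 byte, those below U+0800 take 2 bytes, and the rest of the BMP takes 3 bytes.

```python
import sys


def encode_code_point(cp):
    """Return the UTF-8 bytes for one BMP code point (0x0000 to 0xFFFF)."""
    if cp < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xE0 | (cp >> 12),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def utf16be_to_utf8(data):
    """Convert big-endian UTF-16 bytes (BMP only) to UTF-8 bytes.

    Assumption: every 2-byte unit is a complete code point, as the
    description states, so surrogate pairs are not combined. A trailing
    odd byte, if any, is ignored.
    """
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        cp = (data[i] << 8) | data[i + 1]  # high byte first (big endian)
        out += encode_code_point(cp)
    return bytes(out)


def main(path):
    # Read the input as raw bytes and write the raw UTF-8 bytes back out.
    with open(path, "rb") as src:
        data = src.read()
    with open("utf8encoder_out.txt", "wb") as dst:
        dst.write(utf16be_to_utf8(data))


# Only convert when an input path was actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a hypothetical name such as `utf8encoder.py`, the sketch would be run as `python utf8encoder.py /absolute/path/to/input.txt`. Working on raw bytes rather than calling Python's built-in codecs keeps the bit manipulation visible, which appears to be the point of the exercise.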
