Use latin1 for text encoding? #10

martonmiklos · 2019-02-07T14:45:14Z

Hi folks!

First of all, thanks for all the effort put into this project!

I have some schematics with accented characters in the text, and rendering them raised an exception:

Traceback (most recent call last):
  File "altium.py", line 1615, in <module>
    main()
  File "altium.py", line 420, in main
    render(args.file, renderer.Renderer)
  File "altium.py", line 590, in __init__
    self.handle_children([objects])
  File "altium.py", line 627, in handle_children
    handler(self, owners, obj)
  File "altium.py", line 996, in handle_text_frame
    text=obj["TEXT"].decode("utf-8").replace("~1", "\n"),
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 5: invalid start byte

The problematic text was the following:

b'1x5 t\xfcskesor~190\xb0, 1,27mm'

Which corresponds to:

1x5 tüskesor\n90°, 1,27mm

I will do some experiments to map all the accented and special characters, but I am under the impression that Altium uses the latin1 character encoding rather than plain ASCII.
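For reference, a minimal sketch that reproduces the failure and checks the latin1 guess against the bytes quoted above. Only the byte string and the "~1" replacement come from the traceback; the variable names are made up for illustration, and Windows-1252 is included only because it agrees with Latin-1 on these particular bytes.

```python
# Bytes of the problematic TEXT property, copied from the report above.
raw = b'1x5 t\xfcskesor~190\xb0, 1,27mm'

# What the text is expected to look like once decoded.
expected = "1x5 t\u00fcskesor\n90\u00b0, 1,27mm"

# Decoding as UTF-8 fails, because 0xFC is not a valid UTF-8 start byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    utf8_ok = False
    print("utf-8 failed:", err)

# Both single-byte candidates decode the sample to the expected text.
results = {}
for codec in ("latin-1", "cp1252"):
    results[codec] = raw.decode(codec).replace("~1", "\n")
    print(codec, repr(results[codec]))
```

Since both codecs give the same result here, this sample alone cannot tell them apart.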

vadmium · 2019-02-08T08:36:20Z

I expect it uses something like Latin-1 or Windows-1252. I am happy to change line 996 to decode with Latin-1. However, I noted under https://github.com/vadmium/python-altium/blob/master/format.md#pin that I saw the byte 0x8E representing a broken bar (U+00A6, ¦), so the full story might not be so simple.
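To illustrate why the 0x8E observation does not fit either candidate, here is a small sketch using only Python's built-in codecs. It shows what 0x8E decodes to under Latin-1 and Windows-1252, and which byte those codecs use for the broken bar. It says nothing about what Altium actually does; the variable names are illustrative.

```python
import unicodedata

byte = b"\x8e"

# Latin-1 maps every byte to the code point of the same value,
# so 0x8E becomes U+008E, a C1 control character.
latin1_char = byte.decode("latin-1")

# Windows-1252 assigns a printable letter to 0x8E instead.
cp1252_char = byte.decode("cp1252")

for codec, ch in (("latin-1", latin1_char), ("cp1252", cp1252_char)):
    # Control characters have no Unicode name, hence the default value.
    name = unicodedata.name(ch, "<no name>")
    print(f"{codec}: U+{ord(ch):04X} {name}")

# In both codecs the broken bar U+00A6 is encoded as 0xA6, not 0x8E.
broken_bar_latin1 = "\u00a6".encode("latin-1")
broken_bar_cp1252 = "\u00a6".encode("cp1252")
print(broken_bar_latin1, broken_bar_cp1252)
```

So if 0x8E really does stand for a broken bar in pin descriptions, that byte is probably handled by some other mapping than a plain Latin-1 or Windows-1252 decode.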

I have come across parallel UTF-8 properties: for instance, as well as a property named TEXT, there is one named %UTF8%TEXT. Do you know whether your text frame object has a UTF-8 version of the text?
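A rough sketch of how the parallel property could be used, assuming the object behaves like a mapping from property names to bytes (as obj["TEXT"].decode(...) in the traceback suggests). The helper name get_text and the choice of cp1252 as the fallback are assumptions for illustration, not existing code in altium.py.

```python
def get_text(obj, name="TEXT", fallback_encoding="cp1252"):
    """Return a property as str, preferring the %UTF8% variant if present.

    Hypothetical helper: `obj` is assumed to map property names (str)
    to raw values (bytes).
    """
    utf8_key = "%UTF8%" + name
    if utf8_key in obj:
        # The parallel property is assumed to hold the UTF-8 form.
        return obj[utf8_key].decode("utf-8")
    # Fall back to a single-byte codec; bytes that the codec leaves
    # undefined become U+FFFD instead of raising an exception.
    return obj[name].decode(fallback_encoding, errors="replace")


# Usage with made-up objects:
legacy_only = {"TEXT": b"t\xfcske"}
with_utf8 = {"TEXT": b"t?ske", "%UTF8%TEXT": "t\u00fcske".encode("utf-8")}
print(get_text(legacy_only), get_text(with_utf8))
```

The "~1" to newline replacement from line 996 would still be applied to the returned string afterwards.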

martonmiklos · 2019-02-14T22:16:33Z

Hi @vadmium

I have not found any occurrence of the "UTF" string in the file.

I think I will create a text containing most of the accented and special characters, save it, and inspect the stored bytes to draw a more solid conclusion about the encoding.
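One possible way to evaluate such a test file is to compare each stored byte with what a few candidate codecs would produce. The sketch below builds that comparison table for the bytes 0x80 to 0xFF. The function name and the candidate list are assumptions; cp1250 is included only because the Hungarian letters ő and ű exist there but not in Latin-1 or Windows-1252.

```python
def decode_table(codecs=("latin-1", "cp1252", "cp1250")):
    """List how each byte from 0x80 to 0xFF decodes under each codec."""
    rows = []
    for value in range(0x80, 0x100):
        raw = bytes([value])
        row = [f"0x{value:02X}"]
        for codec in codecs:
            # Bytes the codec leaves undefined decode to U+FFFD.
            ch = raw.decode(codec, errors="replace")
            # Show non-printable results by code point instead.
            row.append(ch if ch.isprintable() else f"U+{ord(ch):04X}")
        rows.append(row)
    return rows


table = decode_table()
for row in table:
    print("  ".join(row))
```

Characters such as ő and ű would be the interesting ones to test, because ü and ° from the sample above sit at the same byte values in all three codecs.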
