lxml issue when editing taxonomy previously edited in 1.4.2 #38

Open
vincentfretin opened this issue Jan 14, 2017 · 4 comments

@vincentfretin
Member

I hit a big issue with lxml in version 1.4.4 several weeks ago, before the holidays. I haven't had time to look at a fix yet, so I just reverted to 1.4.2, which doesn't use lxml. I'm creating this issue so that maybe someone can look into it. @tomgross? I'm surprised no one else has had the problem.
Here is the traceback:

  File "/home/zope/webpro/eggs/collective.taxonomy-1.4.4-py2.7.egg/collective/taxonomy/vdex.py", line 149, in buildTree
    for termnode in self.makeSubtree(index, table):
  File "/home/zope/webpro/eggs/collective.taxonomy-1.4.4-py2.7.egg/collective/taxonomy/vdex.py", line 114, in makeSubtree
    langstringnode.text = langstring
  File "src/lxml/lxml.etree.pyx", line 1031, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:53218)
  File "src/lxml/apihelpers.pxi", line 715, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:24413)
  File "src/lxml/apihelpers.pxi", line 703, in lxml.etree._createTextNode (src/lxml/lxml.etree.c:24276)
  File "src/lxml/apihelpers.pxi", line 1443, in lxml.etree._utf8 (src/lxml/lxml.etree.c:31495)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

I get this traceback with taxonomies I edited through the web (TTW) with version 1.4.2. Somehow it stored UTF-8 encoded byte strings instead of unicode, and I think that is the issue we need to fix. When I upgrade to 1.4.4, I get the traceback.
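
For illustration, one possible way to guard against this is to decode any byte string to unicode before it is assigned to `langstringnode.text`. This is only a minimal sketch: the helper name `to_text` is hypothetical, and it assumes the stored byte strings really are UTF-8 encoded, as described above.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string.

    Hypothetical helper, not part of collective.taxonomy. Byte strings
    are decoded (assumed to be UTF-8); text strings pass through as-is.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A stored UTF-8 byte string becomes a proper unicode string.
label = to_text(b"caf\xc3\xa9")
```

With something like this, the assignment in `makeSubtree` could become `langstringnode.text = to_text(langstring)`, though fixing the stored data itself is probably the cleaner long-term option.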

@vincentfretin
Member Author

Ah, actually, to see the traceback you need to remove the except ValueError in jsonimpl.py:53, or else you just see an empty taxonomy. This was great when I saw empty taxonomies in prod! :-S
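
As a rough sketch of an alternative to silently swallowing the error, the handler could log the full traceback before falling back to an empty result. The function name `load_tree` and the logger name below are hypothetical, and this is not the actual code in jsonimpl.py.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("taxonomy.sketch")


def load_tree(build_tree):
    """Call build_tree(); on ValueError, log the traceback and return [].

    The empty fallback mirrors the "empty taxonomy" behaviour described
    above, but the error is now visible in the logs.
    """
    try:
        return build_tree()
    except ValueError:
        # logger.exception records the message plus the full traceback.
        logger.exception("Could not build the taxonomy tree")
        return []
```

That way an empty taxonomy in production would at least leave a trace in the error log.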

@tomgross
Member

I don't have any issues with the lxml version, but I haven't migrated any taxonomies from the old version.

@vincentfretin
Member Author

Ok, I'll figure it out and probably write an upgrade step to fix existing taxonomies.

@petschki
Member

I fought with unicode control characters (https://en.wikipedia.org/wiki/Unicode_control_characters) once before and was pretty much alone with my problem. See the issue at plone/plone.app.widgets#127. Our problem was that users pasted text with hidden unicode control characters into TinyMCE and broke the widget with that. I had to quick-patch the widget and wrote an upgrade step which cleaned the raw data of the IRichText field.
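
A cleanup along those lines might look roughly like the sketch below. The function name `strip_xml_invalid` is hypothetical, and the set of characters removed is an assumption: it covers the C0 control characters that XML 1.0 forbids plus U+FFFE and U+FFFF, which may not match the hidden characters from the plone.app.widgets issue.

```python
import re

# Characters XML 1.0 does not allow: C0 controls other than tab, LF and
# CR, plus the noncharacters U+FFFE and U+FFFF. Surrogates are not
# handled in this sketch.
_XML_INVALID = re.compile(u"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_xml_invalid(text):
    """Return text with XML-incompatible control characters removed."""
    return _XML_INVALID.sub(u"", text)


# Tab and newline survive; NUL and vertical tab are dropped.
cleaned = strip_xml_invalid(u"a\x00b\x0bc\td\n")
```

An upgrade step could apply such a function to each stored string, after first making sure the values are unicode rather than byte strings.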
