Sending special characters makes the official account reply with a "service failure" notice #420

Open
zhanghsgithub opened this issue Mar 4, 2019 · 5 comments · May be fixed by #809

Comments

@zhanghsgithub

  • Bug description

    • Current behavior:
      After the string is sent, the server receives the following message:
      <xml> <ToUserName><![CDATA[gh_***]]></ToUserName>\n <FromUserName><![CDATA[opid***]]></FromUserName>\n <CreateTime>1551481349</CreateTime>\n <MsgType><![CDATA[text]]></MsgType>\n <Content><![CDATA[ \x1dc]]></Content>\n <MsgId>22211872600753510</MsgId>\n <Encrypt><![CDATA[***]]></Encrypt>\n </xml>
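      After this content is sent, the XML cannot be parsed, so the WeChat client shows the "official account service failure" notice.
    • Expected behavior: the message is parsed normally and a reply is returned.
  • Environment

    • Platform: CentOS 7
    • WeRoBot version: 1.8.0
    • Python version: 3.6.2
  • Reproduction code or repo link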

from werobot import WeRoBot
    def parse_message(
        self, body, timestamp=None, nonce=None, msg_signature=None
    ):
        """
        Parse the received raw XML, decrypt it if necessary, and return a WeRoBot Message.
        :param body: The body of the request sent by the WeChat server.
        :return: WeRoBot Message
        """
        logger.debug(body)
        message_dict = parse_xml(body)
        if "Encrypt" in message_dict:
            xml = self.crypto.decrypt_message(
                timestamp=timestamp,
                nonce=nonce,
                msg_signature=msg_signature,
                encrypt_msg=message_dict["Encrypt"]
            )
            message_dict = parse_xml(xml)
        return process_message(message_dict)

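# Please provide the code that reproduces the bug here. If necessary, create a reproduction repo and paste the link here.
  • Reproduction steps

  • Additional information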
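
The failure can be reproduced without WeRoBot or a WeChat server. The following is a minimal sketch, not the reporter's own code: it feeds a hand-built message containing the \x1d control character to the standard library's expat parser, which is the parser xmltodict is built on. `SAMPLE_BODY` and `try_parse` are illustrative names, and the IDs in the sample are placeholders.

```python
from typing import Optional
from xml.parsers.expat import ExpatError, ParserCreate

# Hand-built sample modeled on the message in the report; IDs are placeholders.
SAMPLE_BODY = (
    "<xml>\n"
    "<ToUserName><![CDATA[gh_***]]></ToUserName>\n"
    "<MsgType><![CDATA[text]]></MsgType>\n"
    "<Content><![CDATA[ \x1dc]]></Content>\n"
    "</xml>"
)


def try_parse(xml_text: str) -> Optional[str]:
    """Return None if xml_text is well-formed, otherwise the expat error message."""
    parser = ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except ExpatError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Prints the parser's error message for the sample, or None if it parsed.
    print(try_parse(SAMPLE_BODY))
```

Removing the \x1d character from the sample should make `try_parse` return `None`, which points at that single character as the cause.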

@whtsky
Collaborator

whtsky commented Mar 4, 2019

Could you please paste the backend log?

@zhanghsgithub
Author

Traceback (most recent call last):
  File "/usr/local/mpchat/lib/python3.6/site-packages/bottle.py", line 862, in _handle
    return route.call(**args)
  File "/usr/local/mpchat/lib/python3.6/site-packages/bottle.py", line 1740, in wrapper
    rv = callback(*a, **ka)
  File "/usr/local/mpchat/lib/python3.6/site-packages/werobot/contrib/bottle.py", line 59, in werobot_view
    msg_signature=request.query.msg_signature
  File "/usr/local/mpchat/lib/python3.6/site-packages/werobot/robot.py", line 564, in parse_message
    message_dict = parse_xml(body)
  File "/usr/local/mpchat/lib/python3.6/site-packages/werobot/parser.py", line 15, in parse_xml
    xml_dict = xmltodict.parse(text)["xml"]
  File "/usr/local/mpchat/lib/python3.6/site-packages/xmltodict.py", line 327, in parse
    parser.Parse(xml_input, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 6, column 25
127.0.0.1 - - [04/Mar/2019 13:46:53] "POST /?signature=3025d78df2cf2765999531318a6f34a116e4c857&timestamp=1551678413&nonce=1721272341&openid=opioD1bxzqL7F5RQ23_SBoUZWvOg&encrypt_type=aes&msg_signature=78b985b311695173d3dc3845c213a2517d0d93ff HTTP/1.0" 500 958

@whtsky
Collaborator

whtsky commented Mar 4, 2019

I don't have much of a clue... This XML really is not well-formed.

"Sending special characters"

Is there a way to find out which character this is?
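
One way to answer that question is to use the position that expat attaches to the error. The sketch below is an assumption-laden helper, not part of WeRoBot: `find_bad_char` is a hypothetical name, and it relies on the documented `lineno` (1-based) and `offset` (0-based column) attributes of `ExpatError`. It assumes the input is a `str` with "\n" line endings, and the column mapping has only been reasoned through for ASCII text before the error.

```python
from typing import Optional, Tuple
from xml.parsers.expat import ExpatError, ParserCreate


def find_bad_char(xml_text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, character) at the expat error position, or None."""
    parser = ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except ExpatError as exc:
        # exc.lineno is 1-based; exc.offset is the 0-based column in that line.
        lines = xml_text.split("\n")
        line = lines[exc.lineno - 1]
        # Guard against a position past the end of the line (e.g. truncated input).
        char = line[exc.offset] if exc.offset < len(line) else ""
        return exc.lineno, exc.offset, char
    return None


if __name__ == "__main__":
    sample = "<xml>\n<Content><![CDATA[ \x1dc]]></Content>\n</xml>"
    found = find_bad_char(sample)
    if found is not None:
        lineno, column, char = found
        # repr() makes control characters visible in the log.
        print(lineno, column, repr(char))
```

Logging `repr(char)` rather than the character itself keeps control characters visible in the backend log.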

@overcat
Contributor

overcat commented Mar 7, 2019

WeChat does not seem to sanitize these characters properly, so the XML it sends to werobot is malformed.

Valid characters in XML 1.0:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

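The \x1dc in the issue can be split into \x1d + c, and \x1d is outside the range of valid characters.
You can test with \x1dc to see whether werobot crashes (the character may not copy directly; copy it in edit mode, or copy it from here).
In addition, messages such as ]]> also crash werobot.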
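
A possible workaround for the invalid-character case is to strip everything outside the Char production quoted above before handing the text to the parser. This is only a sketch under that assumption, not what WeRoBot does: `INVALID_XML_CHARS`, `strip_invalid_xml_chars`, and `parse_content` are hypothetical names, and the standard library's ElementTree stands in for xmltodict so the example is self-contained. It does not address the ]]> case.

```python
import re
import xml.etree.ElementTree as ET

# Complement of the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that XML 1.0 does not allow anywhere in a document."""
    return INVALID_XML_CHARS.sub("", text)


def parse_content(xml_text: str) -> str:
    """Parse a sanitized message and return the text of its Content element."""
    root = ET.fromstring(strip_invalid_xml_chars(xml_text))
    return root.findtext("Content", default="")


if __name__ == "__main__":
    body = "<xml><Content><![CDATA[ \x1dc]]></Content></xml>"
    print(repr(parse_content(body)))
```

Stripping silently changes what the user sent; substituting a visible placeholder such as U+FFFD instead of the empty string is an alternative if the handler needs to know something was removed. If the request body arrives as bytes, it would need to be decoded before this step.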

@Beelkic
Copy link

Beelkic commented Jan 2, 2023

Messages like ]]> still cause a crash today. I'm wondering whether there is a fix.
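
One heuristic that might help with the ]]> case is to re-escape the text inside the Content element before parsing, using the usual CDATA trick of splitting ]]> across two CDATA sections. This is a sketch only: `escape_cdata_end` is a hypothetical helper, it is not taken from WeRoBot or from the linked pull request, and it assumes the message contains a single `<Content><![CDATA[...]]></Content>` element. A user who sends text that itself contains `]]></Content>` would still defeat it.

```python
import re
import xml.etree.ElementTree as ET

# Captures the opening of the Content CDATA, its raw text, and its closing.
# The non-greedy middle group stops at the first "]]></Content>".
_CONTENT_RE = re.compile(
    r"(<Content><!\[CDATA\[)(.*?)(\]\]></Content>)", re.DOTALL
)


def escape_cdata_end(xml_text: str) -> str:
    """Split any "]]>" inside Content across two CDATA sections."""

    def _fix(match: "re.Match[str]") -> str:
        inner = match.group(2).replace("]]>", "]]]]><![CDATA[>")
        return match.group(1) + inner + match.group(3)

    return _CONTENT_RE.sub(_fix, xml_text, count=1)


if __name__ == "__main__":
    broken = "<xml><Content><![CDATA[]]>]]></Content></xml>"
    fixed = escape_cdata_end(broken)
    # Adjacent CDATA sections are joined back into one text value by the parser.
    print(repr(ET.fromstring(fixed).findtext("Content")))
```

Because the unescaped XML is genuinely ambiguous, this can only be a best-effort repair; messages without ]]> pass through unchanged.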

@SuperWildFireFox SuperWildFireFox linked a pull request Oct 24, 2023 that will close this issue