Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

< symbol Truncate the remaining text #3791

Open
alfie013 opened this issue Apr 12, 2017 · 35 comments
Open

< symbol Truncate the remaining text #3791

alfie013 opened this issue Apr 12, 2017 · 35 comments

Comments

@alfie013
Copy link

Hello All,

I have encountered a bug when posting a reply on a ticket when using a "<" symbol. When there is a "<" symbol on the ticket body, it will cut the remaining text. Here is a sample reply that i did.

My Response:

_Hi Kamran,

Alright. The threshold has been set to 10% so any printer that hits the toner level <10%, we should get notified. We'll handle this for you so you won't get bombarded by these notifications. So far, these are the ones we received.

adsii-dc2\T5 PrinterBlack: 8%
adsii-dc2\T5 ScannerBlack: 8%

Thank you!_

After Posting the update on the ticket:

_Hi Kamran,

Alright. The threshold has been set to 10% so any printer that hits the toner level
adsii-dc2\T5 PrinterBlack: 8%
adsii-dc2\T5 ScannerBlack: 8%

Thank you!_

Can you guys help me to modify the code to access that symbol? Any help will be much appreciated.

Server Information
osTicket Version ----------> v1.10 (901e5ea) — Up to date
Web Server Software -----> Microsoft-IIS/8.5
MySQL Version -----------> 5.7.17
PHP Version --------------> 7.0.15

Thank you,
Alfie

@ntozier
Copy link
Contributor

ntozier commented Apr 12, 2017

< and > are reserved HTML tag characters. Using them triggers the WYSIWYG editor and HTMLawed into thinking that it is an HTML tag. Invalid ones, and some specific ones are stripped. It sounds to me like you are running into the that.

@alfie013
Copy link
Author

Hello @ntozier,

Yes you're correct. Is there's a way to make those symbols work when posting an update on the ticket?

Thank you,

@ntozier
Copy link
Contributor

ntozier commented Apr 13, 2017

Not that I know of. You could try dropping to the code view in the editor and use &lt and &gt.

@alfie013
Copy link
Author

Hello @ntozier,

Thank you for your response, I appreciate it.
For the agent that will work, but how about for the client side. If they use that symbol, their message will be truncated. Hope there someone fixes this issue.

Thank you,

@alfie013
Copy link
Author

Hi Guys,

Can you help me on this?

Thank you,

@bhspiers
Copy link

Hi @ntozier
This is definitely a bug.

The problem is not with the editor, it correctly converts "less than" symbols to the html entity. Emails from clients are also affected, everything on a line following a "<" is stripped, when the character should just be escaped.

The problem actually occurs in class.format.php when (by default) Format::safe_html decodes these entities back to "<" before running the input through HtmLawed. Entities are already sanitized right? Yet the comments in safe_html seem to consider &lt; and &gt; unsafe.

My fix was to edit FORMAT::sanitize on line 340 and add 'decode'=>false to the options.
$text = Format::safe_html($text, array('spec' => $spec,'decode'=>false));

Am I missing something? Have I opened myself up to XSS?
Would you ask the devs to take a look?

Thanks
Ben

@jane-t
Copy link

jane-t commented Jan 5, 2018

Thanks for the fix. Works for me, it was causing real problems when we replied with formulas.

@tmcnag
Copy link

tmcnag commented Mar 6, 2018

+1 for a fix. This cannot possibly be expected/intended behavior. The only time a < should be treated as part of an HTML tag when writing a reply is if the user has explicitly switched to the HTML mode.

Edit: If I find some time I will test the fix provided by @bhspiers and see if I can get XSS through.

@caseyamcl
Copy link

This is an issue for me too, although it happens with more characters than just the <. While we haven't been able to pin down which characters or encodings cause the issue, we've noticed it periodically.

@JediKev
Copy link
Contributor

JediKev commented Aug 28, 2018

@caseyamcl

< will definitely truncate the text because we balance HTML tags in the text. This means if there is an opening tag without a closing tag then we strip it out due to incomplete/unbalanced HTML. We are aware that this can be a pain sometimes and we are looking to address this in the future.

Cheers.

@caseyamcl
Copy link

@JediKev Thanks a lot for the response!

@lucize
Copy link

lucize commented Dec 12, 2018

$text = Format::safe_html($text, array('spec' => $spec,'decode'=>false));

this is still not fixed in master, but it works!

@fredricj
Copy link

fredricj commented May 6, 2019

If you use the WYSIWYG editor, you expect the editor to treat the < as a literal < and not as a HTML opening tag.
This is bad since you cant notice it until after you have sent the message

@plong0
Copy link

plong0 commented May 17, 2019

Still an issue in v12 and our CSR staff do not enjoy their carefully written messages being unexpectedly truncated. I feel like this should be a pretty high priority for core team to have fixed in release versions.

Thank you @bhspiers for finding a patch.

plong0 added a commit to ils-corp/osTicket that referenced this issue May 17, 2019
@JediKev
Copy link
Contributor

JediKev commented May 17, 2019

@plong0

I will refer you to this comment above:
#3791 (comment)

We appreciate the feedback and this is something that’s on our list to address in future versions.

Cheers.

@plong0
Copy link

plong0 commented May 22, 2019

Thank you for the reply @JediKev - but if my editor is set in WYSIWYG mode (not HTML editing mode), shouldn't it be converting the "<" to &lt; before attempting any HTML processing on it?

Or am I overlooking an expectation that someone might be typing HTML tags into the WYSIWYG mode of the editor and expecting them to be rendered as HTML output?

One edge case I can think of would be copying and pasting HTML-formatted text into the WYSIWYG editor, but this issue is about the core/normal functionality.

@knels
Copy link

knels commented Jun 10, 2019

@JediKev, While it does strip random single '<' characters, it seems full tags like are passed through just fine, which can really. In either case, even switching to the code view (which does encode the characters as displayed) and submitting they end up passed straight through as normal characters. It also apparently allows clients to reply and embed iframes into their ticket updates, is this intentional?

@JediKev
Copy link
Contributor

JediKev commented Jun 10, 2019

@plong0 @knels

We use HTMLawed with specific configs (and other methods) to sanitize the content to "make HTML safe". This removes most tags but some are allowed like divs, tables, styling tags, etc. (for obvious reasons). If you've dealt with HTMLawed before you'll know it balances HTML which means that it will attempt to make the HTML content complete and if unsuccessful will remove tags that make it incomplete. For example, if Test < 123 is passed everything after the < including the tag itself will be removed as the tag cannot be completed. If Test <div> 123 is passed, then HTMLawed will return Test <div> 123 </div> as it could complete or "balance" the HTML. This prevents rendering issues and even some XSS/Injection attempts. Since we render the messages/responses in HTML on the page we have to be strict about balancing HTML to avoid broken rendering and any chance of XSS/Injection attempts.

To answer @knels question of ... allows clients to reply and embed iframes into their ticket updates, is this intentional?; yes, the HTMLawed configs we set allows iframes with limited attributes.

Cheers.

@NdK73
Copy link

NdK73 commented Jul 4, 2019

Just had an user that sent a paste from a compilation error: nearly useless since #includes have had their "target" removed.
I think that use of HTMLawed if FLawed :) I'll try @bhspiers patch.

Text from WYSIWYG editor should stay WYSIWYG: if you enter HTML code, it's OK to escape it so it gets shown as source. If you switch to HTML mode, then you should be able to use some tags, and text should not be automatically escaped.
Text from mail depends on the format of the mail: HTML mails need sanitization (removing dangerous tags/attributes), text mails only need escaping.

Please just apply the "less surprise" principle.

@NdK73
Copy link

NdK73 commented Jul 5, 2019

Same problem with subject of internal notes.
Ticket subject seems handled correctly both from mail and from operator interface.

@NdK73
Copy link

NdK73 commented Jul 12, 2019

I "fixed" the problem with less-than in internal note titles changing line 343 of class.format.php to:
$text = Format::safe_html($text, array('spec' => $spec, 'decode'=>false, 'balance'=>false));
(added ", 'balance'=>false").
It's just a workaround, and possibly a dangerous one. But it seems it doesn't get interpreted as a start tag by the browser...
Anyway, is it wise to allow for HTML code (even if reduced) in titles?

@sjxiong
Copy link

sjxiong commented Jul 14, 2019

We run into the same problem and applied the same fix: to disable 'balance' in the safe_html() function in include/class.format.php. I am also concerned about the security implications of this.

This is really a show-stopper for us: we often use the "less-than" character in replies to customers. It also seems logical to me that anything entered in the WYSIWYG-editor should NOT be interpreted as HTML.

@NdK73
Copy link

NdK73 commented Oct 31, 2019

That patch does have a nasty side effect: HTML replies usually get truncated with an open 'div' tag (sometimes a 'p' one, too) and that messes the layout of the web interface.
So I think there are two distinct problems:

  1. badly truncated incoming messages (when removing quoted part)
  2. balancing does not work as expected
    Probably 2) can be resolved by using a different validator.

@caseyamcl
Copy link

To give an example of the issue that @NdK73 is referencing, see this image:

https://www.dropbox.com/s/qwvhm7qhtc6lizc/osTicket-problem.png?dl=0

@JediKev
Copy link
Contributor

JediKev commented Nov 14, 2019

@caseyamcl

That’s why we strip the tags...we have plans to eventually allow code in code blocks that will be escaped etc. so please stay tuned.

Cheers.

@JediKev
Copy link
Contributor

JediKev commented Nov 14, 2019

@caseyamcl @NdK73

The problem is that we deal with Emails that are HTML. The tags there do not need escaping otherwise the email will not render properly in the Ticket view. With this in mind, how do you determine what to render and what to escape? The solution is more often than not to allow code but only in a code block or something similar so we know that the code in the block needs escaping, the rest needs to be rendered to look like the actual email that was sent.

Cheers.

@NdK73
Copy link

NdK73 commented Nov 14, 2019

Nope: you can't trust the users to use codeblocks: they'll just paste code / error messages!
The only viable solution (IMVHO) is whitelisting tags: everything that's a known tag gets balanced. The rest is left as-is.
That's what mail clients do. It's a multi-pass process: first you balance all valid HTML tags, then strip the ones in a blacklist.

@JediKev
Copy link
Contributor

JediKev commented Nov 14, 2019

@NdK73

It's not that simple though...you have to think about the cases where you want to send HTML code to a User but not have it rendered as normal text. Since the code is HTML it will see it as email content and not escape it causing it to be rendered. You need a code-block to say "this text specifically is code that should be escaped".

Anyways, you're not wrong though. It will most likely be a combination of multiple things such as code blocks and whitelisted/blacklisted tags (which safe_html/HTMLawed currently does, whitelists and blacklists certain tags and removes the blacklisted tags).

Cheers.

@NdK73
Copy link

NdK73 commented Nov 14, 2019

There are two different flows: when receiving I have to accept "everything" (even possibly malformed/malicious contents), when sending I must send well-formed and clean code. If I'm generating HTML code, then I first have to htmlencode() everything the operator enters in the textarea. As operator, I expect that if I paste a HTML snippet the receiver will see it uninterpreted ("you have to use <div>&nbsp;</div> to apply this css" -- btw I've had to manually escape this example to have it rendered correctly here on github!).

@JediKev
Copy link
Contributor

JediKev commented Nov 14, 2019

@NdK73

Anyways, you're not wrong though. It will most likely be a combination of multiple things such as code blocks and whitelisted/blacklisted tags

@NdK73
Copy link

NdK73 commented Nov 20, 2019

For a partial solution (only addressing web inputs), you can have a look at https://www.froala.com/online-html-editor : it automatically encodes entered entities as soon as the user enters them. The downside is that operators and users are constrained to the "buttonized" tags... But is it really a downside?

Another interesting editor can be http://wymeditor.github.io -- it follows the "What You See Is What You Mean" paradigm. Probably worth having a look.

@nicoletta-maia
Copy link
Contributor

Hi all, to prevent the responses/notes from being truncated after "<" symbol, we have tried a workaround to replace automatically < and > with < >
In case it could helps, here our changes to htmldecode($var) in class.format.php:

--- a/include/class.format.php
+++ b/include/class.format.php
@@ -402,6 +402,15 @@ class Format {
         if (phpversion() >= '5.4.0')
             $flags |= ENT_HTML401;
 
+        $var = preg_replace_callback("/(&#(?:[0-9]+|x[0-9a-f]+);)/i", function($entity) use ($flags) {
+            return htmlentities(html_entity_decode($entity[1], $flags), $flags);
+        }, $var);
+
+        $var = str_replace(
+            array('&lt;', '&gt;'),
+            array('&#65308;', '&#65310;'),
+        $var);
+
         return htmlspecialchars_decode($var, $flags);
     }

@fredsted
Copy link

I've not quite been able to find out the security risks of doing a fix like this, or the one mentioned by @bhspiers. Does anyone know?

We often send code snippets containing html characters as part of our support tasks, and often we need to write "Whoops, sorry, our support system mangled the code, here it is as an attachment instead" – quite embarrassing.

@JediKev Now that it's 2021, any updates on this?

@JediKev
Copy link
Contributor

JediKev commented Aug 11, 2021

@fredsted

Allowing code within code-blocks is on our Feature Request list. This will most likely come to life in v2.0 (currently in development - no set/expected release date atm).

Cheers.

@ChrisTG742
Copy link
Contributor

Is there yet progress on this rather old issue? We just steped into this bug which led to some big confusion on the client side why we sent his quote "some text from client" without any comment/reaction from our side.
Actually the agent quoted the line like this:
"some text from client" <- yes, we are aware of this issue
The reply then was stripped to only the quote again, everything behind including the "<-" was missing.
The bad thing is, that no error is thrown, so no-one suspects that some info could be missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests