Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SuiteCRM strips and mangles text e.g. in Calls #10413

Open
rtpt-alexanderneumann opened this issue Apr 26, 2024 · 9 comments
Open

SuiteCRM strips and mangles text e.g. in Calls #10413

rtpt-alexanderneumann opened this issue Apr 26, 2024 · 9 comments

Comments

@rtpt-alexanderneumann
Copy link

rtpt-alexanderneumann commented Apr 26, 2024

Issue

We frequently discuss HTML and/or JavaScript issues with clients. We copy emails as text into the body of a call. I've recently upgraded to 7.14.3 (from a much older version) and I've discovered that SuiteCRM will mangle the text saved into the description of a Call (likely elsewhere as well):

  • Text between angle brackets < and > is removed, e.g. the text <script>fetch('/x')</script> is silently removed
  • For email addresses, a mailto: link is inserted in an odd way: the original text <foo@example.com> is replaced with <<a href="mailto:foo@example.com">foo@example.com</a>>, which is rendered in verbatim so the mailto link cannot be clicked (screenshots below)

Expected Behavior

I expect that text that I enter into the description field of a call (and everywhere else, too) is saved into the database as I entered it and it is displayed again (as text) with removing any parts.

Actual Behavior

  • SuiteCRM silently removes anything that looks like HTML or JavaScript
  • SuiteCRM replaces email addresses with a broken mailto: link (screenshots below)

Possible Fix

  1. Do not remove anything from the user's input
  2. On output, ensure that all special characters are encoded e.g. as HTML entities (via htmlentities())

Steps to Reproduce

  1. Log into SuiteCRM demo instance (version 7.14.3 at the time of writing)

  2. Edit a Call, enter the following text as the description:

From: <foo@example.com>
Subject: How to call fetch?

Is this the right way?

<script>fetch('/x')</script>

Screenshot from 2024-04-26 09-36-57

  1. Click Save, then the text has changed, everything between < and > was silently removed and the meaning is completely different:
From:
Subject: How to call fetch?

Is this the right way?

Screenshot from 2024-04-26 09-37-12

  1. When the call is edited again, it can be seen that this is not a display issue, but the text has been removed completely:
    Screenshot from 2024-04-26 09-37-33

  2. Edit the call again, enter the following text then click Save:

From: foo <foo@example.com>
Subject: How to call fetch?

Is this the right way?

Screenshot from 2024-04-26 09-37-46

  1. The description is displayed, but the email address has been modified and the following text is displayed:
From: foo <<a href="mailto:foo@example.com">foo@example.com</a>>

Screenshot from 2024-04-26 09-38-00

  1. Edit the call again, the text field shows the modified text:
From: foo <<a href="mailto:foo@example.com">foo@example.com</a>>

Screenshot from 2024-04-26 09-38-08

Context

This issue makes it very hard to trust SuiteCRM to not remove/modify Data I enter into the description fields. It has modified several calls in the background, which has changed the meaning of the text completely (sample described above).

We oftentimes discuss issues with HTML and JavaScript with clients and track such issues and support requests within SuiteCRM.

Since we regularly find and exploit XSS vulnerabilities, I know quite a bit about mitigations. Removing < and > before saving user-provided text into the database is not the right solution. For example, it does not help at all when user input is inserted into an HTML attribute. Instead, great care must be taken to always run the appropriate function (e.g. htmlentities()), which SuiteCRM seems to do right already.

It's just the input filter that's wrong here.

For me, this is a high-priority issue as it changes text I entered.

Please let me know if I can help in any way!

Your Environment

  • SuiteCRM Version used: 7.14.3
  • Browser name and version (e.g. Chrome Version 51.0.2704.63 (64-bit)): Firefox 125.0.2 on Linux
  • Environment name and version (e.g. MySQL, PHP 7): MariaDB 10.5.22
  • Operating System and version (e.g Ubuntu 16.04): PHP 8.2.10
@rtpt-alexanderneumann
Copy link
Author

At least some of the code in question was added in #6883 by @g-fantini. Can you maybe chime in? :)

@rtpt-alexanderneumann
Copy link
Author

rtpt-alexanderneumann commented Apr 26, 2024

The following patch disables cleaning HTML from text fields (which is the right thing to do in my professional opinion), it does not solve removing script tags yet:

diff --git a/data/SugarBean.php b/data/SugarBean.php
index 4f9ca9f25..d6b346f86 100755
--- a/data/SugarBean.php
+++ b/data/SugarBean.php
@@ -2522,11 +2522,6 @@ class SugarBean

                 if (isset($def['type']) && ($def['type'] == 'html' || $def['type'] == 'longhtml')) {
                     $this->$key = purify_html($this->$key);
-                } elseif (
-                    (strpos((string) $type, 'char') !== false || strpos((string) $type, 'text') !== false || $type == 'enum') &&
-                    !empty($this->$key)
-                ) {
-                    $this->$key = purify_html($this->$key);
                 }
             }
         }

Shall I submit a PR?

@pgorod
Copy link
Contributor

pgorod commented Apr 26, 2024

If you search for some of my many rants regarding the over-zealous clean-ups in SuiteCRM you can get an idea of the depths of this problem....

#9127

Just for laughs, I found this in my notes from a few years ago, when trying to work around one of these problems. This is a description of a code flow within SuiteCRM:

include/entryPoint.php:98 clean_incoming_data changes HTML tags in $_REQUEST, saves originals in RAW_REQUEST

Later in Save, handleAttachmentsProcessImages is called in modules/EmailTemplates/EmailTemplateFormBase.php:171
which calls processImages, which uses from_html (from db_utils?) to reverse HMTL tags to examine img links etc, but doesn't save it unless there are images... (incongruent). 
At the end of that, calls save from EmailTemplate, which calls parent::save from SugarBean
Which calls cleanBean, overriden in modules/EmailTemplates/EmailTemplate.php:855
This encodes vars, calls parent::cleanBean, puts back variables de-encoded...
cleanBean in SugarBean iterates all fields, has "if" for some types, applies this jewel:
   $this->$key = htmlentities(SugarCleaner::cleanHtml($this->$key, true));
   this calls HtmlSanitizer.php, cleanHtml
   which starts by running html_entity_decode but does not use ENT_QUOTES, so misses &#039; (')
   if ($removeHtml === true) $clean_html = [$sugarCleaner]$purifier->purify($dirty_html_decoded);
   which throws away data
Save, goes into DBManager, uses from_html on each field, which uses html_entity_decode(with ENTQUOTES) and str_ireplace

I know that sounds depressing, there are just too many layers in there, and some major errors were made years ago... now people are afraid to revert them. But in practice we can often solve the issues just by overriding a save function and putting back RAW values.

What we should be doing is simply:

  • don't strip anything when going into the system (this is the opposite of what is currently done!)
  • escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB
  • escape stuff that is dangerous to JS right before going out to the browser
  • same for HTML

@rtpt-alexanderneumann
Copy link
Author

I feared that it may be a lot deeper, but was too hesitant to look into it (not to stare into the abyss too much). I agree on most points, except this one:

escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB

Use Prepared Statements instead :)

@rtpt-alexanderneumann
Copy link
Author

rtpt-alexanderneumann commented Apr 26, 2024

Oh wow, I just found out that I (with a different account) reported the exakt same behavior back in 2017 #4446

@pgorod
Copy link
Contributor

pgorod commented Apr 26, 2024

I agree with the Prepared Statements, but that would be a very big change to apply everywhere in SuiteCRM code. The new v8 doesn't have any of these issues, I guess the way forward is just to progressively deprecate all of the old stuff. The new stuff naturally leads to much better practices (e.g. Symfony Doctrine)

@rtpt-alexanderneumann
Copy link
Author

The new v8 doesn't have any of these issues

Hm, I'm not sure I understand you correctly. The v8 demo instance also just removes everything between < and > in descriptions of calls (<script> tags, email addresses).

@pgorod
Copy link
Contributor

pgorod commented May 1, 2024

You're right, I guess I overstated things (a lot). The technologies in v8 are better and potentially provide a cleaner architecture and better code. But there are still many bits of v7 in the code flow and so these bugs still arise.

@chris001
Copy link
Contributor

chris001 commented May 2, 2024

If the app would call mysqli_real_escape_string on the the user provided text while saving into the database, and before outputting to the page, first process the text with htmlspecialchars(), then the exact user-generated text would display on the page correctly, and this issue bug would be solved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants