Skip to content

Latest commit

 

History

History
153 lines (121 loc) · 5.03 KB

5.Entities.md

File metadata and controls

153 lines (121 loc) · 5.03 KB

Entities are the variables that can be used in XML content to maintain consistency. Eg,

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE note [
<!ENTITY nbsp "&#xA0;">
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>

<note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body attr="&writer;">Don't forget me this weekend!</body>
    <footer>&writer;&nbsp;&copyright;</footer>
</note> 

You can define your own entities using DOCTYPE. FXP by default supports following XML entities;

Entity name Character Decimal reference Hexadecimal reference
quot " &#34; &#x22;
amp & &#38; &#x26;
apos ' &#39; &#x27;
lt < &#60; &#x3C;
gt > &#62; &#x3E;

However, since the entity processing can impact the parser's performance drastically, you can use processEntities: false to disable it.

XML Builder decodes default entities value. Eg

const jsObj = {
            "note": {
                "@heading": "Reminder > \"Alert",
                "body": {
                    "#text": " 3 < 4",
                    "attr": "Writer: Donald Duck."
                },
            }
        };

        const options = {
            attributeNamePrefix: "@",
            ignoreAttributes:    false,
            // processEntities: false
        };
        const builder = new XMLBuilder(options);
        const output = builder.build(jsObj);

Output:

<note heading="Reminder &gt; &quot;Alert">
    <body>
        3 &lt; 4
        <attr>Writer: Donald Duck.</attr>
    </body>
</note>

Side effects

Though FXP doesn't silently ignores entities with & in the values, following side effects are possible

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE note [
<!ENTITY nbsp "writer;">
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>

<note>
    <heading>Reminder</heading>
    <body attr="&writer;">Don't forget me this weekend!</body>
    <footer>&writer;&&nbsp;&copyright;</footer>
</note> 

Output

 {
    "note": {
        "heading": "Reminder",
        "body": {
            "#text": "Don't forget me this weekend!",
            "attr": "Writer: Donald Duck."
        },
        "footer": "Writer: Donald Duck.Writer: Donald Duck.Copyright: W3Schools."
    }
}

To deal with such situation, use &amp; instead of & in XML document.

Attacks

Following attacks are possible due to entity processing

  • Denial-of-Service Attacks
  • Classic XXE
  • Advanced XXE
  • Server-Side Request Forgery (SSRF)
  • XInclude
  • XSLT

Since FXP doesn't allow entities with & in the values, above attacks should not work.

HTML Entities

Following HTML entities are supported by the parser by default when htmlEntities: true.

Result Description Entity Name Entity Number
non-breaking space &nbsp; &#160;
< less than &lt; &#60;
> greater than &gt; &#62;
& ampersand &amp; &#38;
" double quotation mark &quot; &#34;
' single quotation mark (apostrophe) &apos; &#39;
¢ cent &cent; &#162;
£ pound &pound; &#163;
¥ yen &yen; &#165;
euro &euro; &#8364;
© copyright &copy; &#169;
® registered trademark &reg; &#174;
Indian Rupee &inr; &#8377;

In addition, numeric character references are also supported. Both decimal (num_dec) and hexadecimal(num_hex).

In future version of FXP, we'll be supporting more features of DOCTYPE such as ELEMENT, reading content for an entity from a file etc.

External Entities

You can set external entities without using DOCTYPE.

const xmlData = `<note>&unknown;&#xD;last</note> `;

const parser = new XMLParser();
parser.addEntity("#xD", "\r"); // &unknown;\rlast
let result = parser.parse(xmlData);

This way, you can also override the default entities.

> Next: HTML Document Parsing