Whitespace is removed when exporting table data as XML Format #422

Open
Valektrum opened this issue Aug 4, 2023 · 8 comments

Comments

@Valektrum

I'm trying to export my table data with the XML Format. However, I noticed that some of my text fields are losing some whitespace characters.

For example, the string " system normal " will be exported as "system normal".

It seems to be working fine with the Tab Delimited format, but I would prefer to use the XML Format.
My sanitize level is set to "None (Off)".
I'm using v4.0.15-beta with Access 2016.

@joyfullservice
Owner

Thanks for reporting this! Are you referring to data within a field? Would you be able to provide a screenshot or example?

@Valektrum
Author

In Access, my data for this column is " -   System  Normal   -". The extra whitespace is important.
After exporting with the mentioned settings, this is what I get in the XML:
- System Normal -
Every run of whitespace has been collapsed to a single space, and the whitespace at the beginning of the string has been removed.

@joyfullservice
Owner

Thanks, that is helpful!

@joyfullservice
Owner

I have reproduced this issue in the testing database. The change comes from the FormatXML function and is probably related to the XSLT function normalize-space(). I am not very familiar with XSLT, but I am checking whether this could be an easy fix or whether it might have other unintended effects...
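
For readers unfamiliar with normalize-space(), here is a minimal standalone sketch of what the function does to the sample value reported above. It is an illustration only, not part of the add-in: the stylesheet ignores its input document and simply prints the normalized string.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- normalize-space() strips leading and trailing whitespace and
       collapses every internal run of whitespace to one space. -->
  <xsl:template match="/">
    <xsl:value-of select="normalize-space(' -   System  Normal   -')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, this prints "- System Normal -", which matches the exported value shown in the report.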

@Valektrum
Author

Thank you for looking into this. This project is great!

@joyfullservice
Owner

Well, even with the help of ChatGPT, I was unable to fully figure out how to adjust the transformation while leaving the content intact. When I removed the template section with the normalize-space(.) function, it seems to have resolved the problem for table data export, but I don't know if that is going to cause other unintended effects. (I noticed that line breaks in the original XML may cause additional blank lines in the resulting XML.) Perhaps @bclothier could shed some light on this...

I will go ahead and push this change to the dev branch. You can build the add-in from source and test it out on your project. I left the original XSLT lines commented out in case we need to revert this change. Hope that helps!

joyfullservice added a commit that referenced this issue Aug 4, 2023
We don't want to change the actual data inside the tags when exporting table data to XML. #422
@joyfullservice
Owner

For reference, here is the original XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml"/>

  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

  <xsl:template match="*">
    <xsl:param name="indent" select="''"/>
    <xsl:text>&#xA;</xsl:text>
    <xsl:value-of select="$indent"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|*|text()">
        <xsl:with-param name="indent" select="concat($indent, '  ')"/>
      </xsl:apply-templates>
    </xsl:copy>
    <xsl:if test="count(../*) &gt; 0 and ../*[last()] = . and not(following-sibling::*)">
      <xsl:text>&#xA;</xsl:text>
      <xsl:value-of select="substring($indent, 3)"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

I removed the following:

  <xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

@bclothier
Contributor

The issue is that the XSLT transformation indents the output manually rather than letting the parser do it, because the parser's indentation would be too wide. To do the indenting, it treats the XML as if it were text and inserts the indentation itself.

The problem is that we don't necessarily want to strip spaces from the values, but we do want to strip the spaces between the tags (i.e., from the XML structure itself). We will have to see if we can find a way to make that distinction.
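
One possible way to make that distinction is sketched below. It is an untested assumption, not the fix adopted by the project: drop only the text nodes that consist entirely of whitespace (the line breaks and indentation between tags) and copy all other text nodes unchanged. The attribute and element templates are taken from the original XSLT quoted above; only the two text() templates are new.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml"/>

  <!-- Unchanged from the original: copy attributes. -->
  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>

  <!-- New (assumption): discard text nodes that hold only whitespace,
       i.e. the formatting between tags. -->
  <xsl:template match="text()[normalize-space(.) = '']"/>

  <!-- New (assumption): copy every other text node verbatim,
       so field values keep their leading and repeated spaces. -->
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Unchanged from the original: manual indentation of elements. -->
  <xsl:template match="*">
    <xsl:param name="indent" select="''"/>
    <xsl:text>&#xA;</xsl:text>
    <xsl:value-of select="$indent"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|*|text()">
        <xsl:with-param name="indent" select="concat($indent, '  ')"/>
      </xsl:apply-templates>
    </xsl:copy>
    <xsl:if test="count(../*) &gt; 0 and ../*[last()] = . and not(following-sibling::*)">
      <xsl:text>&#xA;</xsl:text>
      <xsl:value-of select="substring($indent, 3)"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The template with the predicate has a higher default priority than the plain text() template, so whitespace-only nodes are matched by it first. One caveat: a field whose entire value is whitespace would also be emptied, so this may not suit every table. The standard xsl:strip-space instruction is another option worth testing for the same purpose.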
