Support for RichTextRun in Spreadsheet #410

Open
freb opened this issue Jun 24, 2020 · 0 comments

Description

I was pulling content from an existing spreadsheet and noticed two cells that have content but return an empty string from cell.GetRawValue() and cell.GetString(). After digging into the raw XML, I noticed that the shared string for one of them looked like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst count="1" uniqueCount="1" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
	<si>
		<r>
			<t xml:space="preserve">some content and </t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color rgb="FF000000"/>
				<rFont val="Calibri"/>
				<family val="2"/>
			</rPr>
			<t>example.com</t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color theme="1"/>
				<rFont val="Calibri"/>
				<family val="2"/>
				<scheme val="minor"/>
			</rPr>
			<t>and more content.</t>
		</r>
	</si>
</sst>

I figured the runs were the issue. I dug through the code, and while RichTextRun exists, it doesn't seem to be used anywhere. I also inspected all attributes on cell.X() (sml.CT_Cell) and couldn't find the runs anywhere. From what I can tell, RichTextRuns are not even being parsed into the CT_Cell.

Expected Behavior

GetFormattedValue() should not be empty when a cell has content displayed in Excel. Ideally GetString() would be updated to return a plaintext version of the content, though according to how GetString is documented, it is currently working as expected.

Actual Behavior

GetFormattedValue() returns an empty string for cells with RichTextRuns. I also couldn't find any method to access the raw RichTextRun content directly through cell.X().

I've attached a spreadsheet with RichTextRun content in A1: wb.xlsx.
