Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some spreadsheets won't parse #360

Open
Daemach opened this issue Feb 15, 2024 · 4 comments
Open

Some spreadsheets won't parse #360

Daemach opened this issue Feb 15, 2024 · 4 comments

Comments

@Daemach
Copy link

Daemach commented Feb 15, 2024

I am trying to read an xlsx file and get the following message, even though there is a valid sheet. I am getting sheet names with getInfo() then looping over them to import, but it doesn't pick up any data.

The operative line is spreadsheet.setReturnCachedFormulaValues( true ).read( argumentcollection = args );

image
CB8188710A3B4338BE83A4D2FE9F65DF.xlsx

@Daemach
Copy link
Author

Daemach commented Feb 15, 2024

It seems like this was brought up and fixed in the past, but I don't remember how, and I can't find the original post...

The way around this is to use the inspection tool in Excel and have it remove everything it finds. The spreadsheet loads normally at that point. If POI has such a function available, it would also be nice to have access to a chainable execution of that.

@cfsimplicity
Copy link
Owner

Reading the file you attached seems to work as expected using this code:

spreadsheet = New spreadsheet.Spreadsheet();
path = ExpandPath( "CB8188710A3B4338BE83A4D2FE9F65DF.xlsx" );
data = spreadsheet.setReturnCachedFormulaValues( true ).read( path, "query" );
WriteDump( data );

datadump

I've no idea where the message you saw comes from or what it refers to.

It seems like this was brought up and fixed in the past

You mean here with the spreadsheet library? It doesn't ring any bells I'm afraid.

the way around this is to use the inspection tool in Excel and have it remove everything it finds

Is that the Document Inspector? There's no such tool in POI as far as I know.

@Daemach
Copy link
Author

Daemach commented Feb 16, 2024

I fear that I accidentally attached the file from which I had removed the offending data using the document inspector. https://app.screencast.com/ymcIgDAPSi9N1

Please try this one.

CB8188710A3B4338BE83A4D2FE9F65DF_broken.xlsx

@cfsimplicity
Copy link
Owner

Please try this one.

Same result. Reads into a query just fine using the code above.

From what you say it seems that the issue is caused by the presence of custom xml elements in your xlsx. As I say, I'm not aware of a POI method for removing those programmatically I'm afraid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants