Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XLSX read only grabbing 2 decimal places #1245

Open
alexp8 opened this issue Dec 14, 2023 · 3 comments
Open

XLSX read only grabbing 2 decimal places #1245

alexp8 opened this issue Dec 14, 2023 · 3 comments

Comments

@alexp8
Copy link

alexp8 commented Dec 14, 2023

I have an xlsx file where the cells are "formatted" to show 2 decimal places.

When I read in this file via XlsxReadOptions, only 2 decimal places are being parsed. e.g. 12.123 -> 12.12

Code:

XlsxReadOptions options = XslxRaedOptions.builder(filePath)
    .columnTypes(x -> ColumnType.STRING)
    .missingValueIndicator(indicators)
    .build();
    
Table table = Table.read().usingOptions(options);
@ccleva
Copy link
Contributor

ccleva commented Dec 15, 2023

Hi @alexp8. This is the expected behavior when reading numerical values from an xlsx file in a column of type STRING: the reader applies the numerical cell format so what you get in the Table is closer to what you see in the original file.

Try using ColumnType FLOAT (or DOUBLE) for your numerical columns, or let the reader infer the column type by sampling.

If you have a mixed numerical/textual column in the xlsx file you'll have to disable the formatting in the source.

@alexp8
Copy link
Author

alexp8 commented Dec 15, 2023

Thank you, do you have an example of the reader sampling to detect the data type?

@ccleva
Copy link
Contributor

ccleva commented Dec 15, 2023

My bad, for xlsx files column type is not done by sampling, but the principle is the same.

If you don't configure a column type for some columns, the default behavior is to infer the column type, in this case by checking the column cell types in the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants