[FEAT] Support Array of Struct Type #108

Closed
ChristianDu opened this issue May 6, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@ChristianDu

Describe the feature you'd like to be added to Parquet Viewer
Hi,
could you please add support for arrays?
I can't open array columns:
[Screenshot]

Example File:

arrayExample.zip

Definition of columns (Apache Spark Java):

return createStructType(Arrays.asList(
    createStructField("Product", StringType, true),
    createStructField("Orders", createArrayType(createStructType(Arrays.asList(
        createStructField("DateTime", TimestampType, true),
        createStructField("Quantity", DoubleType, true)
    )), false), true)
));

Thank you!

Share why this feature would be a good addition to the utility
Improves usability: currently, files with array columns can't be loaded correctly.


Note: There are no guarantees your feature will be implemented.

@ChristianDu added the enhancement label May 6, 2024
@mukunku changed the title from "[FEAT] Support Array Type" to "[FEAT] Support Array of Struct Type" May 10, 2024
@mukunku
Owner

mukunku commented May 10, 2024

Hey @ChristianDu, as you saw, I've added support for struct arrays in v3.0.0. Thanks for opening the issue and sharing a sample file.

Going to close out this issue but feel free to re-open if the issue persists.

@mukunku closed this as completed May 10, 2024
@ChristianDu
Author

Hi @mukunku,

thank you very much for implementing the feature so fast.
It works with a few files, but I get an error for one column.

createStructField(COLUMN_NAME, createArrayType(createStructType(Arrays.asList(
    createStructField(SUB_COL_1, TimestampType, true),
    createStructField(SUB_COL_2, DoubleType, true),
    createStructField(SUB_COL_3, DoubleType, true)
)), false), true);

I will try to provide you with an example file because I can't upload the exact file.

Error:


Specified cast is not valid.

Something went wrong (CTRL+C to copy):

System.InvalidCastException: Specified cast is not valid.

at ParquetViewer.Engine.ParquetEngine.ReadListField(DataTableLite dataTable, ParquetRowGroupReader groupReader, Int32 rowBeginIndex, ParquetSchemaElement itemField, Int32 fieldIndex, Int64 skipRecords, Int64 readRecords, Boolean isFirstColumn, CancellationToken cancellationToken, IProgress`1 progress)

at ParquetViewer.Engine.ParquetEngine.ProcessRowGroup(DataTableLite dataTable, ParquetRowGroupReader groupReader, Int64 skipRecords, Int64 readRecords, CancellationToken cancellationToken, IProgress`1 progress)

at ParquetViewer.Engine.ParquetEngine.PopulateDataTable(DataTableLite dataTable, ParquetReader parquetReader, Int64 offset, Int64 recordCount, CancellationToken cancellationToken, IProgress`1 progress)

at ParquetViewer.Engine.ParquetEngine.ReadRowsAsync(List`1 selectedFields, Int32 offset, Int32 recordCount, CancellationToken cancellationToken, IProgress`1 progress)

at ParquetViewer.MainForm.<>c__DisplayClass33_0.<b__1>d.MoveNext()

--- End of stack trace from previous location ---

at ParquetViewer.MainForm.LoadFileToGridview()

at System.Threading.Tasks.Task.<>c.b__128_0(Object state)

at InvokeStub_SendOrPostCallback.Invoke(Object, Span`1)

at System.Reflection.MethodBaseInvoker.InvokeWithOneArg(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)


@ChristianDu
Author

@mukunku
I could reproduce the error. It happens when every array of the column is empty.
I created an example file (same columns as the previous one but this one has empty arrays):

emptyArrayError.zip
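
For readers wondering why an all-empty array column is a distinct edge case: in Parquet's nested encoding, an empty list is recorded only through its definition level and contributes no value to the leaf column, so a column where every array is empty stores zero actual values. The thread does not say what caused the cast error or how it was fixed, so the following is only a minimal sketch of list reassembly from levels, not ParquetViewer's code. The class name `ListAssembly` and the level constants are hypothetical, chosen for a single optional leaf such as `Orders.Quantity` in the schema above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from ParquetViewer or Parquet.Net.
class ListAssembly {
    // Assumed definition levels for an optional list of structs with an optional leaf:
    static final int LIST_NULL = 0;       // the whole array is null
    static final int LIST_EMPTY = 1;      // the array exists but has no elements
    static final int VALUE_NULL = 2;      // an element exists, its leaf value is null
    static final int VALUE_PRESENT = 3;   // an element exists and has a stored value

    // Rebuilds one list per row. Repetition level 0 starts a new row;
    // repetition level 1 continues the current row's list.
    // `values` holds only the entries whose definition level is VALUE_PRESENT.
    static List<List<Double>> assemble(int[] rep, int[] def, double[] values) {
        List<List<Double>> rows = new ArrayList<>();
        int next = 0; // index of the next unread stored value
        for (int i = 0; i < def.length; i++) {
            if (rep[i] == 0) {
                rows.add(def[i] == LIST_NULL ? null : new ArrayList<>());
            }
            if (def[i] >= VALUE_NULL) {
                List<Double> current = rows.get(rows.size() - 1);
                if (def[i] == VALUE_PRESENT) {
                    current.add(values[next++]);
                } else {
                    current.add(null);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Every array is empty: three rows, but no stored values at all.
        System.out.println(assemble(new int[]{0, 0, 0}, new int[]{1, 1, 1}, new double[]{}));
        // Mixed: a two-element array, an empty array, and a null array.
        System.out.println(assemble(new int[]{0, 1, 0, 0}, new int[]{3, 3, 1, 0},
                new double[]{1.5, 2.0}));
    }
}
```

In the all-empty case the reader still has to produce one (empty) list per row even though the value buffer has length zero, which is the situation the attached file exercises.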

@mukunku
Owner

mukunku commented May 10, 2024

Thanks. I will take a look 👍🏼

@mukunku reopened this May 10, 2024
@mukunku
Owner

mukunku commented May 10, 2024

@ChristianDu Can you try this alpha version with your file?

ParquetViewer_#108.zip

I'm not sure whether you use the regular exe or the self-contained one, but I can't upload the self-contained build in a comment, so I shared the regular one.

If you're not comfortable testing the exe that's okay too! I can add it to the release as usual.

@ChristianDu
Author

Looks good! File is loading and showing the data correctly.
Thanks for the fast fix.

@ChristianDu
Author

Implemented and fixed. Thx

@mukunku
Owner

mukunku commented May 10, 2024

Thanks for confirming! https://github.com/mukunku/ParquetViewer/releases/tag/v3.0.0.1 has the fix now 💪🏼
