Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Circular static init in ColumnType prevents correct singleton init #1205

Open
uosis opened this issue Apr 20, 2023 · 2 comments
Open

Circular static init in ColumnType prevents correct singleton init #1205

uosis opened this issue Apr 20, 2023 · 2 comments

Comments

@uosis
Copy link

uosis commented Apr 20, 2023

There is a circular dependency between singleton fields in ColumnType and singleton fields in its implementations. This makes correct initialization impossible.

Observe the following peculiar behavior (using Scala REPL for ease of illustration; eq in Scala is == in Java, i.e. reference equality):

> import $ivy.`tech.tablesaw:tablesaw-core:0.43.1`
> assert(tech.tablesaw.columns.numbers.DoubleColumnType.instance() eq tech.tablesaw.api.ColumnType.DOUBLE)
java.lang.AssertionError: assertion failed

This fails because static init of ColumnType is called from within static init of DoubleColumnType, while DoubleColumnType.INSTANCE field is still null, so ColumnType.DOUBLE receives a new instance from here, instead of the singleton value that eventually ends up in DoubleColumnType.INSTANCE field.

Let's try the other way:

> import $ivy.`tech.tablesaw:tablesaw-core:0.43.1`
> assert(tech.tablesaw.api.ColumnType.DOUBLE eq tech.tablesaw.columns.numbers.DoubleColumnType.instance())
// yay no error! but...
> tech.tablesaw.columns.numbers.DoubleColumnType.DEFAULT_PARSER.columnType()
res2: ColumnType = null

Now static init of DoubleColumnType is called from within static init of ColumnType, while ColumnType.DOUBLE is still null, so we get null here.

This breaks tablesaw-parquet here because it's doing reference equality comparisons.

> import $ivy.`tech.tablesaw:tablesaw-core:0.43.1`
> import $ivy.`net.tlabs-data:tablesaw_0.43.1-parquet:0.10.1`
> val table = tech.tablesaw.api.Table.create("t").addColumns(tech.tablesaw.api.DoubleColumn.create("c", 1, 2, 3))
> new net.tlabs.tablesaw.parquet.TablesawParquetWriter().write(table, net.tlabs.tablesaw.parquet.TablesawParquetWriteOptions.builder("test.parquet").build())
java.lang.NullPointerException: Cannot invoke "tech.tablesaw.api.DoubleColumn.getDouble(int)" because "this.doubleColumns[colIndex]" is null
  net.tlabs.tablesaw.parquet.TableProxy.getDouble(TableProxy.java:225)

I am not sure what is the correct solution here. Should tablesaw-parquet not assume singleton instances and use equals instead of ==? I am guessing most things are generally working because in most cases DoubleColumnType.instance() will run first, causing DoubleColumnType.instance() and ColumnType.DOUBLE to have different instances, but nothing getting null value.

Possibly similar to #430.

@ccleva
Copy link
Contributor

ccleva commented Apr 23, 2023

Fixed on tablesaw-parquet side with v0.11.0.

@uosis
Copy link
Author

uosis commented Apr 24, 2023

Fixed on tablesaw-parquet side with v0.11.0.

While this is great and probably the most expedient solution (thanks @ccleva for a quick fix!), the underlying problem of broken init is still there.

I am not aware of any practical issues it is causing at this point however, so this is probably not a very high priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants