What went wrong?
This is an inconvenience rather than a bug per se: if columnStats is provided during an append, it must contain stats for all indexed columns; otherwise the write fails.
How to reproduce?
Different steps about how to reproduce the problem.
1. Code that triggered the bug, or steps to reproduce:
// Index data with columns 'a' and 'b'
df1
  .write
  .format("qbeast")
  .option("columnsToIndex", "a,b")
  .save(targetPath)

// Provide stats only for column 'a' when appending
df2
  .write
  .format("qbeast")
  .option("columnsToIndex", "a,b")
  .option("columnStats", """{"a_min": 1, "a_max": 2}""")
  .save(targetPath)
java.lang.IllegalArgumentException: b_min does not exist. Available: a_max, a_min
at org.apache.spark.sql.types.StructType.$anonfun$fieldIndex$1(StructType.scala:313)
at scala.collection.immutable.Map$Map2.getOrElse(Map.scala:236)
at org.apache.spark.sql.types.StructType.fieldIndex(StructType.scala:312)
at org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema.fieldIndex(rows.scala:187)
at org.apache.spark.sql.Row.getAs(Row.scala:373)
at org.apache.spark.sql.Row.getAs$(Row.scala:373)
at org.apache.spark.sql.catalyst.expressions.GenericRow.getAs(rows.scala:166)
at io.qbeast.spark.table.IndexedTableImpl.$anonfun$isNewRevision$2(IndexedTable.scala:159)
at io.qbeast.core.transform.LinearTransformer.makeTransformation(LinearTransformer.scala:43)
at io.qbeast.spark.table.IndexedTableImpl.$anonfun$isNewRevision$1(IndexedTable.scala:159)
at scala.collection.immutable.List.map(List.scala:297)
at io.qbeast.spark.table.IndexedTableImpl.isNewRevision(IndexedTable.scala:158)
at io.qbeast.spark.table.IndexedTableImpl.save(IndexedTable.scala:205)
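Until partial stats are supported, a caller could check the stats keys before writing. The following is only a minimal sketch, not part of the qbeast-spark API: the object ColumnStatsCheck and its methods are hypothetical, and the "<column>_min" / "<column>_max" key naming is inferred from the error message above, assuming every indexed column expects min/max stats.

```scala
// Hypothetical helper, not part of qbeast-spark.
object ColumnStatsCheck {

  // Stat keys assumed to be required for each indexed column:
  // "<column>_min" and "<column>_max" (naming inferred from the error message).
  def expectedKeys(columnsToIndex: Seq[String]): Seq[String] =
    columnsToIndex.flatMap(c => Seq(s"${c}_min", s"${c}_max"))

  // Keys that columnStats would need but that the caller did not provide.
  def missingKeys(columnsToIndex: Seq[String], providedKeys: Set[String]): Seq[String] =
    expectedKeys(columnsToIndex).filterNot(providedKeys.contains)
}
```

For the reproduction above, ColumnStatsCheck.missingKeys(Seq("a", "b"), Set("a_min", "a_max")) would report b_min and b_max, so the caller can either add those stats or drop the columnStats option before appending.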
2. Branch and commit id: main, f066acf
3. Spark version: 3.4.1
4. Hadoop version: 3.3.4
5. How are you running Spark? Locally
6. Stack trace: see the java.lang.IllegalArgumentException trace above.