Unable to overwrite a delta table #326

Open
Jiaweihu08 opened this issue May 3, 2024 · 3 comments
Labels: bug (Something isn't working)

Comments

@Jiaweihu08
Member

What went wrong?

Qbeast is not able to overwrite an existing delta table.

How to reproduce?

// Create a delta table
df.write.format("delta").save(tablePath)

// Overwrite it with qbeast
df.write.mode("overwrite").format("qbeast").option("columnsToIndex", "user_id").save(tablePath)

// The above would fail:
org.apache.spark.sql.AnalysisException: No space revision available with -1
  at org.apache.spark.sql.AnalysisExceptionFactory$.create(AnalysisExceptionFactory.scala:55)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.$anonfun$getRevision$1(DeltaQbeastSnapshot.scala:86)
  at scala.collection.immutable.Map$EmptyMap$.getOrElse(Map.scala:110)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.getRevision(DeltaQbeastSnapshot.scala:86)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.loadLatestRevision(DeltaQbeastSnapshot.scala:152)
  at io.qbeast.spark.internal.sources.QbeastDataSource.getTable(QbeastDataSource.scala:74)
  at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.getTableFromProvider(DataSourceV2Utils.scala:92)
  at org.apache.spark.sql.DataFrameWriter.getTable$1(DataFrameWriter.scala:281)
  at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:297)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:240)
  ... 47 elided

Branch and commit id: main, d37d238

Spark version: 3.5.0

Hadoop version: 3.3.4

How are you running Spark? local

Jiaweihu08 added the bug label on May 3, 2024

@osopardo1
Member

Mmm... Is it possible to overwrite tables in other formats? Let's say overwrite a JSON table with Parquet. Or Delta with Iceberg?

@Jiaweihu08
Member Author

> Mmm... Is it possible to overwrite tables in other formats? Let's say overwrite a JSON table with Parquet. Or Delta with Iceberg?

Different file formats can overwrite each other without problem, and delta overwrites qbeast mercilessly.

The bug shown here happens because qbeast detects an existing table at the path but finds no qbeast metadata in it, so the lookup of the latest revision fails.
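
To make the failure mode concrete, here is a minimal pure-Scala sketch of what the stack trace suggests: a revision lookup against an empty map whose `getOrElse` default throws. This is not the actual qbeast-spark code. `Revision`, `NoRevisionId`, `getRevision` as written here, and `findLatestRevision` are hypothetical stand-ins, and the meaning of `-1` as "no revision recorded" is an assumption taken from the error message. The second function shows one possible direction (returning an `Option` so the caller can treat a plain delta table as having no qbeast revision yet), not a confirmed fix.

```scala
// Simplified, hypothetical model of the lookup; not the qbeast-spark implementation.
object RevisionLookupSketch {

  // Hypothetical stand-in for a qbeast revision.
  final case class Revision(id: Long, columnsToIndex: Seq[String])

  // Assumption: -1 means "no revision recorded", as in the reported error message.
  val NoRevisionId: Long = -1L

  // Strict lookup: throws when the id is absent, similar in shape to the
  // Map.getOrElse frame that appears in the stack trace.
  def getRevision(revisions: Map[Long, Revision], id: Long): Revision =
    revisions.getOrElse(
      id,
      throw new NoSuchElementException(s"No space revision available with $id"))

  // Tolerant lookup: a table without qbeast metadata yields None
  // instead of an exception, so the caller can decide what to do.
  def findLatestRevision(revisions: Map[Long, Revision]): Option[Revision] =
    if (revisions.isEmpty) None else Some(revisions(revisions.keys.max))

  def main(args: Array[String]): Unit = {
    // A table written with plain delta carries no qbeast revisions.
    val plainDeltaTable = Map.empty[Long, Revision]

    // The strict lookup fails, mirroring the reported exception.
    val strict = scala.util.Try(getRevision(plainDeltaTable, NoRevisionId))
    println(s"strict lookup failed: ${strict.isFailure}")

    // The tolerant lookup reports the absence without throwing.
    println(s"tolerant lookup: ${findLatestRevision(plainDeltaTable)}")
  }
}
```

With a lookup of the second kind, an overwrite could in principle branch on `None` and start from a fresh revision, though whether that fits the real write path is for the maintainers to judge.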

@osopardo1
Member

Thanks for the clarification 👍
