
Enhance BQ Connector to Pushdown Predicates while Materializing View #10523

Closed
ayushbilala opened this issue on Jan 10, 2022 · 9 comments
Labels: enhancement (New feature or request)

@ayushbilala (Contributor)

The BQ connector in its current state does not make use of predicates during view materialization. So, if the table is 10 TB in size, the connector will try to materialize all 10 TB of data even when the SQL query submitted by the client contains filters.

Example:
SELECT col1, col2, col3 from someHugeBQTable where col1='someVal1';

As part of the materialization step, the BQ connector will submit a BQ job with the query below, without the WHERE clause (the SQL query creation happens here).

SELECT col1, col2, col3 from someHugeBQTable;

This materializes the entire table, which adds significantly to both query time and cost.
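For comparison, a predicate-aware materialization step could submit a query like the one below instead. This is only an illustrative sketch based on the example above; the exact SQL the connector generates would depend on how the pushed-down constraint is translated.

SELECT col1, col2, col3
FROM someHugeBQTable
WHERE col1 = 'someVal1';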

ayushbilala changed the title from "Enhance BQ Connector to Pushdown Predicates while Materialising View" to "Enhance BQ Connector to Pushdown Predicates while Materializing View" on Jan 10, 2022
ayushbilala self-assigned this on Jan 10, 2022
@hashhar (Member) commented on Jan 10, 2022

I believe the reason for the current implementation is that the view materialization cache is keyed on the generated SELECT SQL. Once we start pushing down predicates, the materialised cache will probably not be used at all unless the values passed to the predicates match exactly.
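To illustrate (a sketch using the example from this issue): two client queries that differ only in the predicate value would generate two different materialization statements, and therefore two different cache keys, so the second query could not reuse the first materialization.

-- generated for WHERE col1 = 'someVal1'
SELECT col1, col2, col3 FROM someHugeBQTable WHERE col1 = 'someVal1';

-- generated for WHERE col1 = 'someVal2' (different key, cache miss)
SELECT col1, col2, col3 FROM someHugeBQTable WHERE col1 = 'someVal2';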

However, the current behaviour is also suboptimal. As long as the sum of the data returned by the various queries against the view is less than the size of the entire view, the proposed behaviour might be cheaper. For example, three queries that each return 100 GB from a 10 TB view would materialize 300 GB in total instead of 10 TB.

I believe the only "perfect" solution to this would be for the Storage Read API to support reading views. cc @davidrabinowitz: any information on whether that's something that can be done?

cc: @ebyhr Do you have more ideas? Maybe we can bypass the Storage Read API for views and use the BigQuery client directly. But it's a very big lift since we'd need to implement something to convert the Avro data returned by the client and plug it into the page source.

hashhar added the enhancement (New feature or request) label on Jan 10, 2022
@ayushbilala (Contributor, Author)

I agree, but there is always a chance of serving stale data from the cache to the client if the underlying data keeps changing. If we keep the TTL too low, we might end up refreshing the cache all the time.

As far as I know, the Google BQ team is already working on adding support for reading simple views through the Storage Read API, but there won't be any support for complex views.

@cozos commented on Jan 12, 2022

@hashhar If the view materialization cache key is the main issue, can we modify the logic to read the cache based on the "original" query instead of the "optimized" (i.e. predicate-pushed-down) query? Is that too complicated a code change?

@ebyhr (Member) commented on Sep 28, 2022

@ayushbilala Are you still working on this? We added support for the query pass-through function and a query-based reader for views. Are they insufficient for your use case?

@myfjdthink

I tried the query function; this is my SQL query:

SELECT * FROM TABLE(
    bigquery.system.query(
      query => 'SELECT block_timestamp, transaction_hash
FROM `my.table` 
where block_timestamp >= "2022-10-09"
LIMIT 1000'
    )
  );

and it failed with the following error:

Error executing query:
SQL Error [65536]: Query failed (#20221010_085346_00084_9haf5): Unhandled type for LongTimestampWithTimeZone: timestamp(6) with time zone

cc @ebyhr

@ebyhr (Member) commented on Oct 10, 2022

@myfjdthink See #12346

@ebyhr (Member) commented on Dec 7, 2022

Let me close this issue because the query pass-through function and the query-based reader for views are sufficient in my opinion. Please feel free to reopen if it's still required.

ebyhr closed this as not planned on Dec 7, 2022
@cw1427 commented on Sep 20, 2023

I also encountered this error. How can it be resolved?

2023-09-19T10:15:37.275Z DEBUG stage-scheduler io.trino.execution.QueryStateMachine Query 20230919_101533_00021_cdqm7 failed
io.trino.spi.TrinoException: Unhandled type for LongTimestampWithTimeZone: timestamp(6) with time zone
at io.trino.plugin.bigquery.BigQueryQueryPageSource.appendTo(BigQueryQueryPageSource.java:215)
at io.trino.plugin.bigquery.BigQueryQueryPageSource.getNextPage(BigQueryQueryPageSource.java:148)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:299)
at io.trino.operator.Driver.processInternal(Driver.java:395)
at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
at io.trino.operator.Driver.tryWithLock(Driver.java:694)
at io.trino.operator.Driver.process(Driver.java:290)
at io.trino.operator.Driver.processForDuration(Driver.java:261)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
at io.trino.execution.executor.timesharing.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.timesharing.TimeSharingTaskExecutor$TaskRunner.run(TimeSharingTaskExecutor.java:565)
at io.trino.$gen.Trino_424____20230919_085020_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)

@ebyhr (Member) commented on Sep 20, 2023

@cw1427 The type is unsupported. See #12346. You need to convert to other types in the query table function or in BigQuery views.
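For example, one possible workaround (a sketch assuming a string representation of the timestamp is acceptable for your use case) is to cast the TIMESTAMP column to a supported type inside the inner BigQuery query:

SELECT * FROM TABLE(
    bigquery.system.query(
      query => 'SELECT CAST(block_timestamp AS STRING) AS block_timestamp, transaction_hash
FROM `my.table`
WHERE block_timestamp >= "2022-10-09"
LIMIT 1000'
    )
  );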
