New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve speed when listing columns in BigQuery #21920
base: master
Are you sure you want to change the base?
Conversation
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/DecimalTypeInfo.java
Outdated
Show resolved
Hide resolved
protected TypeInfo() {} | ||
|
||
@Override | ||
public abstract String toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these methods are not needed, remove toString, equals and hashCode. toString is only used in test but that's redundant - we only need a single roundtrip: String -> TypeInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toString
is used in a few exceptions. I'll remove others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where's the exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the IllegalArgumentException
will be suppressed in convertToTrinoType
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I can either remove it, or we could add some logging, to make it easier to debug why some columns are not available in Trino. LMK what's better.
d3fba84
to
f43ccd6
Compare
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/type/TestTypeInfoUtils.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Show resolved
Hide resolved
f43ccd6
to
6b34692
Compare
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/TestBigQueryType.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
a2bbea8
to
3654c2e
Compare
@ebyhr PTAL |
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/TestBigQueryType.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Outdated
Show resolved
Hide resolved
FROM %s.INFORMATION_SCHEMA.COLUMNS | ||
GROUP BY table_catalog, table_schema, table_name | ||
""".formatted(quote(remoteSchemaName))); | ||
String schemaName = client.toSchemaName(DatasetId.of(projectId, remoteSchemaName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to build DatasetId and call getDataset method (when is it different from remoteSchemaName)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it for consistency. This connector has a local to remote names mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #19860
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand it's for extension. Such change should go to the internal repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we'd have to revert #19860. It's out of the scope of this PR.
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryTypeManager.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/PrimitiveTypeInfo.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
cc6a799
to
bf0cf78
Compare
/test-with-secrets sha=bf0cf785188ed0cca97f6b18f4f750df035ebe75 |
The CI workflow run with tests that require additional secrets has been started: https://github.com/trinodb/trino/actions/runs/9218809344 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost good to me.
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
FROM %s.INFORMATION_SCHEMA.COLUMNS | ||
GROUP BY table_catalog, table_schema, table_name | ||
""".formatted(quote(remoteSchemaName))); | ||
String schemaName = client.toSchemaName(DatasetId.of(projectId, remoteSchemaName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand it's for extension. Such change should go to the internal repository.
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/type/TypeInfoUtils.java
Outdated
Show resolved
Hide resolved
protected TypeInfo() {} | ||
|
||
@Override | ||
public abstract String toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the IllegalArgumentException
will be suppressed in convertToTrinoType
, right?
* Unsupported type exception occurs only for data types | ||
* known to be not supported, like STRUCT, not all unknown tokens. | ||
*/ | ||
public class UnsupportedTypeException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name looks little confusing because STRUCT type is supported except for parsing. How about renaming to ParsingException or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not supported in the type
package. ParsingException
seems to be too broad, or I'd have to refactor the exception to use it instead of IllegalArgumentException
and make the typeName
parameter optional. I don't think it's worth doing.
if (!e.getTypeName().equals(STRUCT) || table.isEmpty() || table.get().getDefinition().getSchema() == null) { | ||
// ignore unsupported types | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to continue iteration even when table is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need the table to map STRUCT types. We should continue iteration to convert other types.
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/type/TestTypeInfoUtils.java
Outdated
Show resolved
Hide resolved
bf0cf78
to
8c16327
Compare
Description
A continuation of #21830. Test coverage of the type parser is at 99% of lines and 95% of branches. Original description below.
Using mini parser because BigQuery Java SDK doesn't support translating string to BigQuery type as far as I asked Google engineers.
This PR improves listing columns. Test with 169 tables is improved from 22s to 1.5s.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: