Fallback vectorization for FunctionExpr and BaseMacroFunctionExpr. #16366
This patch adds `FallbackVectorProcessor`, a processor that adapts non-vectorizable operations into vectorizable ones. It is used in `FunctionExpr` and `BaseMacroFunctionExpr`. As a result, all such expressions can now participate in vectorized queries.

In addition:
- Identifiers are updated to offer getObjectVector for ARRAY and COMPLEX in addition to STRING. ExprEvalObjectVector is updated to offer ARRAY and COMPLEX as well. Identifiers already did `return true` from `canVectorize`, so this enables them to live up to their claims.
- In SQL tests, cannotVectorize now fails tests if an exception is not thrown. This makes it easier to identify tests that can now vectorize.
- Fixes a null-matcher bug in StringObjectVectorValueMatcher that was uncovered by certain newly-vectorizable test cases.
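
For readers unfamiliar with the idea, here is a minimal sketch of what "adapting a non-vectorizable operation into a vectorizable one" can look like. It is not the actual `FallbackVectorProcessor` code: `FallbackVectorSketch`, `RowFunction`, and `evalVector` are hypothetical names made up for illustration, and the sketch assumes the adapter simply calls the row-at-a-time function once per row of the input vectors.

```java
public class FallbackVectorSketch {
    /** Hypothetical row-at-a-time operation; stands in for a function with no vector implementation. */
    interface RowFunction {
        Object apply(Object[] args);
    }

    /**
     * Hypothetical adapter: evaluates a row-based function over a batch of
     * input vectors by looping over the rows and collecting the results.
     */
    static Object[] evalVector(RowFunction fn, Object[][] inputs, int size) {
        Object[] out = new Object[size];
        Object[] args = new Object[inputs.length];
        for (int row = 0; row < size; row++) {
            // Gather the arguments for this row from each input vector.
            for (int i = 0; i < inputs.length; i++) {
                args[i] = inputs[i][row];
            }
            out[row] = fn.apply(args);
        }
        return out;
    }

    public static void main(String[] argv) {
        // A toy two-argument concatenation, evaluated row by row.
        RowFunction concat = a -> String.valueOf(a[0]) + String.valueOf(a[1]);
        Object[][] inputs = { {"a", "b", "c"}, {1L, 2L, 3L} };
        Object[] result = evalVector(concat, inputs, 3);
        System.out.println(java.util.Arrays.toString(result)); // [a1, b2, c3]
    }
}
```

The per-row loop does not make the function itself any faster, but it lets the function sit inside a batch-oriented pipeline, so the rest of the query can stay vectorized.
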
Benchmarks follow for `SqlExpressionBenchmark` queries 26 and 27. These two queries are:

In these cases fallback vectorization is not as compelling as proper vectorization, but it's better than unvectorized execution. The relative benefit is greater for query 27, likely because fallback vectorization for `CONCAT` enables the `long1 * double4` to vectorize as well. In general I would expect the benefit to be greater for more complex queries, due to this effect.
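
The knock-on effect described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the real expression classes: `CanVectorizeSketch`, `Column`, `Call`, and `tree` are hypothetical, the tree shape is a stand-in rather than the actual query 27, and the sketch assumes an expression tree vectorizes only if every node in it can.

```java
import java.util.List;

public class CanVectorizeSketch {
    interface Expr {
        boolean canVectorize();
    }

    /** A column reference; assumed to always be vectorizable. */
    static final class Column implements Expr {
        public boolean canVectorize() {
            return true;
        }
    }

    /** A function call; vectorizable if it has a native or fallback vector path and all arguments vectorize. */
    static final class Call implements Expr {
        final boolean hasNativeVectorImpl;
        final boolean fallbackAvailable;
        final List<Expr> args;

        Call(boolean hasNativeVectorImpl, boolean fallbackAvailable, List<Expr> args) {
            this.hasNativeVectorImpl = hasNativeVectorImpl;
            this.fallbackAvailable = fallbackAvailable;
            this.args = args;
        }

        public boolean canVectorize() {
            return (hasNativeVectorImpl || fallbackAvailable)
                && args.stream().allMatch(Expr::canVectorize);
        }
    }

    /** Hypothetical tree: a concat-like call (no native vector path) over a multiply (native vector path). */
    static Expr tree(boolean fallbackAvailable) {
        Expr multiply = new Call(true, fallbackAvailable, List.of(new Column(), new Column()));
        return new Call(false, fallbackAvailable, List.of(multiply, new Column()));
    }

    public static void main(String[] args) {
        System.out.println(tree(false).canVectorize()); // false: the multiply is dragged down too
        System.out.println(tree(true).canVectorize());  // true: the whole tree vectorizes
    }
}
```

Under that assumption, one non-vectorizable call forces the natively vectorizable multiplication onto the slow path as well, which is why a fallback for a single function can pay off more in larger expressions.
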