[Tech Request]: delay Ranges for partitioned table #16085

Open
2 tasks done
aunjgr opened this issue May 14, 2024 · 1 comment
Labels
kind/tech-request New feature or request phase/testing priority/p0 Critical feature that should be implemented in this version

@aunjgr
Contributor

aunjgr commented May 14, 2024

Is there an existing issue for the same tech request?

  • I have checked the existing issues.

Does this tech request not affect user experience?

  • This tech request doesn't affect user experience.

What would you like to be added ?

Ranges should be called as late as possible: when runtime filters exist, it should not be called until those filters have been handled. This deferral is already implemented for tables without partitions. It should also be implemented for partitioned tables.
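To illustrate why the timing of the Ranges call matters, here is a minimal sketch of the idea. It is a toy model, not MatrixOne code: `Block`, `ranges`, `plan_partitioned_scan`, and the min/max pruning rule are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical block descriptor with a min/max range of primary keys."""
    name: str
    min_pk: int
    max_pk: int


def ranges(blocks, runtime_keys=None):
    """Return the blocks that may contain a matching key.

    Without a runtime filter nothing can be pruned, so every block is returned.
    """
    if runtime_keys is None:
        return list(blocks)
    return [
        b for b in blocks
        if any(b.min_pk <= k <= b.max_pk for k in runtime_keys)
    ]


def plan_partitioned_scan(partitions, runtime_keys=None):
    """Call ranges once per partition, passing the runtime filter if one exists."""
    return {name: ranges(blocks, runtime_keys) for name, blocks in partitions.items()}


partitions = {
    "p0": [Block("p0_b0", 0, 99), Block("p0_b1", 100, 199)],
    "p1": [Block("p1_b0", 200, 299), Block("p1_b1", 300, 399)],
}

# Early call: the runtime filter has not arrived yet, so all blocks are kept.
early = plan_partitioned_scan(partitions)

# Delayed call: the runtime filter (assumed here to be a set of keys) prunes blocks.
late = plan_partitioned_scan(partitions, runtime_keys={150, 310})

early_count = sum(len(blocks) for blocks in early.values())
late_count = sum(len(blocks) for blocks in late.values())
print(early_count, late_count)
```

In this model the early call keeps all four blocks, while the delayed call keeps only the two blocks whose key ranges overlap the runtime filter. The request is to make the per-partition path behave like the delayed call.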

Why is this needed ?

The DML plan has a bug in main (and in the tagged branches). For prepared statements like

prepare stmt from insert into t1 values (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?)

whose parameter list has more than 3 rows, the deduplication plan produces a BlockFilter cpk in (null, null, null, null), because it generates the filter from a compile-time null vector. In theory this should break deduplication, i.e., it would never detect any duplicate keys. However, the current Reader code drops any pk filter that contains a null value. These two bugs cancel each other out. Result correctness is not affected, which is why the problem was not discovered earlier.
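The way the two bugs cancel out can be shown with a small toy model. This is only a sketch under stated assumptions, not the actual plan or Reader code: the function names, the `drop_null_filters` flag, and the use of `None` to stand in for SQL NULL are all hypothetical.

```python
def block_filter_matches(pk, in_list):
    """SQL-like IN: a NULL (None) element never compares equal to anything."""
    return any(v is not None and v == pk for v in in_list)


def reader_scan(rows, in_list, drop_null_filters):
    """Return the primary keys the reader hands to the dedup step.

    drop_null_filters models the Reader behaviour described above:
    a pk filter containing any NULL is discarded, giving a full scan.
    """
    if drop_null_filters and any(v is None for v in in_list):
        return list(rows)
    return [pk for pk in rows if block_filter_matches(pk, in_list)]


def find_duplicates(existing, new_keys, in_list, drop_null_filters):
    """Keys from new_keys that already exist among the scanned rows."""
    scanned = reader_scan(existing, in_list, drop_null_filters)
    return sorted(set(scanned) & set(new_keys))


existing = [1, 2, 3, 4, 5]
new_keys = [3, 9, 10, 11]             # key 3 is a real duplicate
folded = [None, None, None, None]     # filter folded from a compile-time null vector

# Bug 1 alone: the all-NULL filter matches nothing, so the duplicate is missed.
strict = find_duplicates(existing, new_keys, folded, drop_null_filters=False)

# Bug 1 plus bug 2: the Reader drops the filter and scans everything.
actual = find_duplicates(existing, new_keys, folded, drop_null_filters=True)

# Intended behaviour: the filter carries the real parameter values.
intended = find_duplicates(existing, new_keys, new_keys, drop_null_filters=True)

rows_scanned_actual = len(reader_scan(existing, folded, True))
rows_scanned_intended = len(reader_scan(existing, new_keys, True))
print(strict, actual, intended, rows_scanned_actual, rows_scanned_intended)
```

In this model the result is still correct when both bugs are present, but only because the filter is thrown away and every row is scanned. A filter built from the real values would reach the same result while reading far fewer rows.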

For performance reasons, we will not fix that bug directly. Instead, we should remove the incorrectly early-folded BlockFilter and use a runtime filter. That fix depends on this task: without it, partitioned tables would be slower, whether the statement is prepared or not.

Additional information

related to #16178

@aunjgr aunjgr added the kind/tech-request New feature or request label May 14, 2024
@aunjgr aunjgr changed the title [Tech Request]: [Tech Request]: delay Ranges for partitioned table May 14, 2024
@aunjgr aunjgr added this to the 1.2.1 milestone May 17, 2024
@aunjgr aunjgr added the priority/p0 Critical feature that should be implemented in this version label May 17, 2024
@qingxinhome
Contributor

This function has been implemented and verified for testing @aunjgr @aressu1985 @Ariznawlll @badboynt1 @daviszhen
