Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHOENIX-7313 All cell versions should not be retained during flushes … #1888

Merged
merged 8 commits into from
May 16, 2024

Conversation

kadirozde
Copy link
Contributor

…and minor compaction when maxlookback is disabled

…and minor compaction when maxlookback is disabled
@kadirozde
Copy link
Contributor Author

The last commit for fixing the same issue reported in https://issues.apache.org/jira/browse/PHOENIX-7232

@tkhurana
Copy link
Contributor

@kadirozde Can you also fix the issue to disable masking if TTL is set to forever. Thanks

@virajjasani
Copy link
Contributor

First build timed out after ~9 hr, and second one ran for only ~16 min.
Triggered new one: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1888/3/

@kadirozde
Copy link
Contributor Author

I tested this PR using local hbase (in addition to the new IT). It worked as expected.

When max lookback is disabled, the following behavior is verified

  1. All cell versions are preserved in memory.
  2. During flushes, extra cell versions are removed.
  3. During minor compactions, delete markers are preserved but extra cell versions are removed.

When max lookback is enabled, all cells are preserved during flushes and minor compaction.

@virajjasani
Copy link
Contributor

Almost every test is getting aborted after ~9 hr https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1888/

This usually means that some test is getting stuck forever and that's why the whole test suite is not getting completed.

this.maxLookbackWindowStart = maxLookbackInMillis == 0 ?
compactionTime : compactionTime - (maxLookbackInMillis + 1);
this.maxLookbackWindowStart = this.maxLookbackInMillis == 0 ?
compactionTime : compactionTime - (this.maxLookbackInMillis + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kadirozde It will be useful to have some logging here so that we know we are using Phoenix compactor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case we should log and why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood this comment. This was for CompactionScanner. I added logging for that.

Copy link
Contributor

@jpisaac jpisaac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1, Will need to check the TFs

@jpisaac
Copy link
Contributor

jpisaac commented May 14, 2024

@kadirozde I may have prematurely approved it. Can u add some IT tests for compacting shared Index tables.

@kadirozde
Copy link
Contributor Author

Tests are passing now. Only one test failed which passes when I run. I think it (ConcurrentMutationsIT.testSetIndexedColumnToNullAndValueAtSameTSWithStoreNulls2) is a flapper. @tkhurana, @virajjasani, I want to merge this if you do not have further comments.

@kadirozde kadirozde merged commit 0abdcdb into apache:master May 16, 2024
0 of 2 checks passed
@kadirozde kadirozde deleted the 7313 branch May 16, 2024 22:30
options.setMaxVersions(Integer.MAX_VALUE);
options.setMinVersions(Integer.MAX_VALUE);
} else {
options.setMinVersions(Math.max(Math.max(options.getMaxVersions(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intended to be options.setMaxVersions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was. Please see the next comment.

options.setMinVersions(Math.max(Math.max(options.getMaxVersions(),
store.getColumnFamilyDescriptor().getMaxVersions()), 1));
options.setMinVersions(Math.max(Math.max(options.getMinVersions(),
store.getColumnFamilyDescriptor().getMaxVersions()), 1));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is min versions supposed to be columnFamilyDescription max versions if it exists, or should this also be min? store.getColumnFamilyDescriptor().getMinVersions()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is borrowed from the previous implementation (before CompactionScanner). It is interesting that some tests for old indexes or SYCAT fails if getMaxVersions is not used above. I do not know the exact reason for it. This will be changed very likely again when we enable CompactionScanner for minor compaction. I am planning to get that change (for https://issues.apache.org/jira/browse/PHOENIX-7314) soon.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I could see how potentially we still want to retain all versions here for whatever reason... I did not try to understand it. But on line 369 above we are also calling options.setMinVersions so we call options.setMinVersions twice, but never call options.setMaxVersions. If the CFDescriptor can have a higher number for maxVersions than what options contains, then min will be greater than max.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that there is a typo here. The old code just had options.setMinVersions(Math.max(Math.max(options.getMinVersions(), store.getColumnFamilyDescriptor().getMaxVersions()), 1));. I wanted to have the same logic for max version and even though it was not necessary for existing tests. So my intention was to add options.setMaxVersions(Math.max(Math.max(options.getMaxVersions(), store.getColumnFamilyDescriptor().getMaxVersions()), 1)); However, because of the typo, I ended up not adding this logic for max version. The code is harmless but confusing. I am going to change this in the jira. Thank you for catching this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants