Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] 使用delete 删除的数据仍然可以查询到 #34551

Open
2 of 3 tasks
weilai201 opened this issue May 8, 2024 · 4 comments
Open
2 of 3 tasks

[Bug] 使用delete 删除的数据仍然可以查询到 #34551

weilai201 opened this issue May 8, 2024 · 4 comments

Comments

@weilai201
Copy link

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.4

What's Wrong?

--1. 先创建TEST1表
drop table IF EXISTS TEST1;
CREATE TABLE IF NOT EXISTS TEST1 (
ID int NULL COMMENT 'ID',
ID1 int NULL COMMENT 'ID1',
ID2 int NULL COMMENT 'ID2'
) UNIQUE KEY(
ID
) DISTRIBUTED BY HASH(ID) BUCKETS AUTO PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"enable_unique_key_merge_on_write" = "true",
"store_row_column" = "true"
);

-- 2. 表中插入一条记录
insert into TEST1(ID,ID1,ID2) values(1,1,1);
select * from TEST1;
查询结果
image

-- 3. 删除此记录
delete from TEST1 where ID=1;

--4. 设置部分更新
set enable_unique_key_partial_update=true;
set enable_insert_strict=false;

--5. 在此插入相同key值记录
insert into TEST1(ID,ID1) values(1,2);
select * from TEST1;

image

What You Expected?

上面第五步骤中期望结果为
image

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@weilai201
Copy link
Author

我看到2.1.2 版本中修复了类似问题,不知道能不能解决我的问题:#32574

@weilai201
Copy link
Author

使用2.1.2版本,此问题仍然存在

@Yukang-Lian
Copy link
Collaborator

Prerequisite Knowledge: During the process of updating partial columns, existing data may be read. If the key of the newly imported data for partial columns already exists, BE (Backend) will read the rows with the same key and update the data of the partial columns. If the key does not exist, BE will write the newly imported data of the partial columns and fill in the default values for other columns.
Since the storage engine used by Doris is similar to the LSM-tree structure, the delete statement does not actually delete the data but marks it for deletion. The delete statement in Doris writes the delete information into the delete predicate, marking the data for deletion with a predicate.

Reason: In step 3, the data with key=1 was deleted, so key=1 was written into the delete predicate. In step 5, when partially updating the columns with key=1, the old data was read without considering the effect of the delete predicate, resulting in the ID2 column showing the previously deleted value.

Solution: When reading and completing data for partial column updates, consider the delete predicate to solve the issue.

@ycycse
Copy link

ycycse commented May 16, 2024

Hi, I'll try this.

@Yukang-Lian Yukang-Lian assigned ycycse and unassigned zhannngchen and Yukang-Lian May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants