Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DT[TRUE] can lead to invalid key #3215

Open
1 of 2 tasks
mattdowle opened this issue Dec 13, 2018 · 4 comments
Open
1 of 2 tasks

DT[TRUE] can lead to invalid key #3215

mattdowle opened this issue Dec 13, 2018 · 4 comments
Labels

Comments

@mattdowle
Copy link
Member

mattdowle commented Dec 13, 2018

I didn't realize DT[TRUE] was a way to achieve a shallow copy. shallow copy was only intended for internal use. Thanks to @renkun-ken for highlighting this in #3214, and related #2254.

In v1.11.8 we see this :

DT = data.table(id = 1:5, key="id")
DT1 = DT[TRUE]
key(DT1)
[1] "id"
DT1[3, id:=6L]
key(DT1)
# NULL              # correct
DT$id
# [1] 1 2 6 4 5     # should be 1:5
key(DT)
# [1] "id"          # invalid key

It only occurs after DT[TRUE], iiuc, which hopefully folk have not discovered or relied on too much?! I hope the usage out there is like @renkun-ken described to add new columns to the shallow copy, not to change existing columns!

New test 1542.08 was added in PR #2313 ready for when this is fixed.

@mattdowle mattdowle added this to the 1.12.0 milestone Dec 13, 2018
@mattdowle mattdowle added the bug label Dec 13, 2018
@renkun-ken
Copy link
Member

renkun-ken commented Dec 13, 2018

Yes, setkey, changing existing columns should not be used on the shallow copy since columns themselves are not copied.

@jangorecki
Copy link
Member

If we won't allow to make shallow copy with dt[TRUE] this issue will be automatically resolved.

@mattdowle
Copy link
Member Author

Eventually. But in the meantime, we can't break @renkun-ken's workflow.
More detail here: #3214 (comment)

@mattdowle mattdowle modified the milestones: 1.12.2, 1.12.4 Feb 26, 2019
@jangorecki
Copy link
Member

the following code could be added to tests to ensure copy behaviour

DT = data.table(a=c(1,2), b=c("b","a"))
address(DT)
address(DT[])
address(DT[, .SD])
address(DT[TRUE])
sapply(DT, address)
sapply(DT[], address)
sapply(DT[, .SD], address)
sapply(DT[TRUE], address)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants