Rbind allow binding of different class attributes #5446

Open: wants to merge 21 commits into master

ben-schwen
Member

@ben-schwen ben-schwen commented Aug 25, 2022

Closes #5309 (shadowing the current approach for int64 and factor)
Closes #3911 (automatically allows mixing Date and IDate and adds the ignore.attr argument)
Closes #4934 (only carries the AsIs class over to the result if AsIs comes first in the binding, to stay consistent with do.call(rbind, list) in this case)
Closes #5391
Closes #5542
Towards #5486 (also needs #5569)

  • Mix/fill dates (Date and IDate) with atomic columns
  • Mix/fill POSIXct with atomic columns
  • Mix/fill ITime
  • Mix/fill AsIs
  • Add ignore.attr argument to rbindlist and rbind to manually deactivate the check for equal classes of binding columns
  • Tests
  • News
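
As a rough illustration of the checklist above, here is a minimal sketch of binding a Date column to an IDate column. It assumes a data.table build in which rbindlist accepts the ignore.attr argument proposed in this PR; the check on formals() is only there so the snippet also runs on builds without it. The helper try_bind and the table and column names are made up for the example.

```r
library(data.table)

# Two single-column tables whose column classes differ (Date vs IDate).
a = data.table(d = as.Date("2022-08-25"))
b = data.table(d = as.IDate("2022-08-26"))

# Hypothetical helper: return the bound table, or the error message if
# rbindlist refuses the input (for example because of a class mismatch).
try_bind = function(...) {
  tryCatch(rbindlist(...), error = function(e) conditionMessage(e))
}

# Default behaviour: depending on the build this either binds or errors.
default_res = try_bind(list(a, b))

# Assumption: ignore.attr is only available on builds that include this PR.
if ("ignore.attr" %in% names(formals(rbindlist))) {
  ignored_res = try_bind(list(a, b), ignore.attr = TRUE)
} else {
  ignored_res = NULL
}
```

Inspecting class(default_res) and class(ignored_res$d) on a given build shows which behaviour is in effect there.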

@codecov

codecov bot commented Aug 25, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.53%. Comparing base (898dce3) to head (7353bc6).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5446   +/-   ##
=======================================
  Coverage   97.53%   97.53%           
=======================================
  Files          80       80           
  Lines       14915    14926   +11     
=======================================
+ Hits        14547    14558   +11     
  Misses        368      368           


R/merge.R Outdated
@@ -97,7 +97,7 @@ merge.data.table = function(x, y, by = NULL, by.x = NULL, by.y = NULL, all = FAL
 # Perhaps not very commonly used, so not a huge deal that the join is redone here.
 missingyidx = y[!x, which=TRUE, on=by, allow.cartesian=allow.cartesian]
 if (length(missingyidx)) {
-  dt = rbind(dt, y[missingyidx], use.names=FALSE, fill=TRUE)
+  dt = rbind(dt, y[missingyidx], use.names=FALSE, fill=TRUE, ignore.attr=TRUE)
Member

Do we have a unit test that covers that? It looks like a breaking change, so it should be introduced softly, keeping a backward-compatible default for 1-2 versions before switching to the new behavior.

Member Author

We have one now, since #5857. I'm indifferent about whether we add the change to merge or not. It makes the code much shorter and hides less of what we want to achieve, but since many people rely on merge we should be careful about changes.

@jangorecki
Member

This PR resolves one of the issues on the 1.15.0 milestone. But it is much bigger than just fixing a regression: it adds a new feature. I would therefore prefer to shift it to 1.15.99, and for 1.15.0 push only regression fixes.

@MichaelChirico added this to the 1.15.0 milestone Dec 26, 2023
@MichaelChirico
Member

Agree with Jan, it would be nice to merge a minimal PR that fixes the regression and send that out as 1.15.0. I glanced at the diff here and didn't see an easy way to separate the regression fix from the new functionality. Is that possible, @ben-schwen? Or should we work on that as a standalone PR?

@ben-schwen
Member Author

ben-schwen commented Dec 26, 2023

#5309 has two issues in it:

  1. rbindlist(..., fill=TRUE, use.names=FALSE). This did not work previously and emitted a warning while changing the value of use.names to TRUE.
Warning message:
In rbindlist(l, use.names, fill, idcol) :
  use.names= cannot be FALSE when fill is TRUE. Setting use.names=TRUE.

This part of #5309 is already fixed by #5468.

  2. merge used rbind(dt, yy, use.names=FALSE, fill=FALSE) and manually filled yy before binding.
    There is a working version of merge without rbindlist(fill=TRUE, use.names=FALSE) in #5263 ("rbindlist support fill=TRUE with use.names=FALSE and use it in merge.R ToDo of #678"), but I would even roll back merge to the version before #5263 (since it was only a minor change).

Will add tests of #5263 and file regression PR.
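
To make the first point concrete, here is a minimal sketch that calls rbindlist with fill=TRUE and use.names=FALSE and records any warning instead of printing it. The tables x and y and the variable warn are invented for the example; whether a warning is captured depends on the data.table build, so no particular output is assumed.

```r
library(data.table)

# Two tables with different column counts and names.
x = data.table(a = 1L, b = 2L)
y = data.table(k = 3L)

# Collect a warning message (if any) rather than letting it print.
warn = NULL
res = withCallingHandlers(
  rbindlist(list(x, y), use.names = FALSE, fill = TRUE),
  warning = function(w) {
    warn <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

# On builds without positional fill, warn holds the "Setting use.names=TRUE"
# message and columns are matched by name; otherwise warn stays NULL and
# columns are matched by position.
print(names(res))
print(warn)
```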

@MichaelChirico
Member

Is this ready for review? If so, please resolve the R/merge.R conflict. Thanks!

@ben-schwen
Member Author

> Is this ready for review? If so, please resolve the R/merge.R conflict. Thanks!

Yes. We only need to decide whether or not to include the change in merge together with test 2253.31.
