New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Joining with by
: incorrect results when joining.
#2591
Comments
I'm going to take a look at this (if possible this week) and see if we can patch this. Will write back. |
Poking around, I suspect the culprit is the following on lines 885–898 in data.table.R:
The definition of |
I suspect this is the same bug as #2713 |
Reminder for myself: Try adding:
|
Note that the bug only occurs if the grouping column is sorted:
|
I have hunted the bug down to lines 135ff in
The |
I also tracked it down to this line, but i wasn't clear on the fix. IIRC
that if statement is working correctly, it's something on the R side that's
messed up
I could be mistaken, however...
…On Fri, Apr 13, 2018, 3:26 PM MarkusBonsch ***@***.***> wrote:
I have hunted the bug down to lines 135ff in src/dogroups.c:
if (LOGICAL(on)[0])
igrp = (length(grporder) && isNull(jiscols)) ? INTEGER(grporder)[INTEGER(starts)[i]-1]-1 : i;
else
igrp = (length(grporder)) ? INTEGER(grporder)[INTEGER(starts)[i]-1]-1 : (isNull(jiscols) ? INTEGER(starts)[i]-1 : i);
The on here is given as !missing(on) from R. If I remove the if condition
and leave only the second part, the bug is gone in this specific case. I
will create a PR, but someone who knows more about dogroups will have to
carefully review.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2591 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHQQdfaJeBsgZmpiEg4ZG35kvwUdkOoIks5toGE2gaJpZM4RvGbR>
.
|
To be clear, I am not clear on the fix either. I did not really run it through |
me neither, perhaps @arunsrinivasan has some insight since IINM he implemented |
Merged PR should have closed this but didn't because "Closes" needs to appear before each and every |
Related to #733
When attempting to use
by
in the same operation as a join, an error should be thrown telling the user this feature is not implemented yet, unless the user has specifiedby=.EACHI
.Not only would this be helpful for reducing user confusion, but its also important because there are cases where
by
will work (e.g. on columns in the main data.table, see #2581) but produce the wrong results:Here's an example of where use of
by
with a join will work but produce the incorrect results:Note the third row, when
value == 3
, theid
should bec
.The join returns the correct results when
nomatch=NA
:The text was updated successfully, but these errors were encountered: