Should rbindlist(..., fill=TRUE) return NA_logical_ in list columns? #4198

sritchie73 · 2020-01-25T11:35:31Z

When filling a list column, rbindlist departs from its behaviour for all other column types and returns NULL elements instead of NA:

> A = data.table(c1=0, c2=list(1:3))
> B = data.table(c1=1)
> rbind(A,B,fill=TRUE)
      c1     c2
   <num> <list>
1:     0  1,2,3
2:     1

Expected:

> A = data.table(c1=0, c2=list(1:3))
> B = data.table(c1=1)
> rbind(A,B,fill=TRUE)
      c1     c2
   <num> <list>
1:     0  1,2,3
2:     1  NA

Should we change this behaviour for list columns to fill the rows with NA values to match the behaviour of fill=TRUE for other column types?

sritchie73 · 2020-01-25T12:10:47Z

PR with a fix provided, in case we want to make the change.

jangorecki · 2020-01-25T14:26:29Z

Current behaviour seems fine to me.

> str(as.integer(NULL)[1L])
 int NA
> str(as.list(NULL)[1L])
List of 1
 $ : NULL

IMO it should not be NA because:

- it changes the type from a missing field (undefined) to a logical vector
- it changes the length from length 0 to length 1
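
A minimal base R sketch of the two objections above. Plain lists stand in for a data.table list column, and the names `filled_null` and `filled_na` are illustrative only, not part of any API:

```r
# Two candidate fills for the missing second row of a list column.
filled_null = list(1:3, NULL)  # current behaviour: nothing to point to
filled_na   = list(1:3, NA)    # proposed behaviour: a length-1 logical vector

# Length: NULL keeps the element empty, NA makes it length 1.
lengths(filled_null)  # 3 0
lengths(filled_na)    # 3 1

# Type: NULL has no vector type, NA is a logical vector.
is.null(filled_null[[2L]])  # TRUE
typeof(filled_na[[2L]])     # "logical"
```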

mattdowle · 2020-02-18T06:28:50Z

I'm leaning towards Jan's point. The current behaviour, an empty element, is actually a list's way of representing missing (there isn't any object to point to). We could construct an example where each item of the list was a logical vector, each item being the result of some computation. In such a case, 3 different states might need to be represented: length 0 logical, length 1 NA logical, and missing computation. If length 1 NA logical was used for missing, the last two couldn't be distinguished.
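
The three states can be sketched in base R as follows. This is only an illustration under assumed names (`results`, `empty`, `unknown`, `missing`), with a plain list standing in for the list column:

```r
# Each element is the (hypothetical) result of some computation.
results = list(
  empty   = logical(0),  # computation ran, returned a length-0 logical
  unknown = NA,          # computation ran, returned a length-1 NA logical
  missing = NULL         # computation never ran: no object at all
)

# With NULL as the fill, all three states can be told apart.
is_missing = vapply(results, is.null, logical(1))
is_missing

# If NA were used as the fill instead, "unknown" and "missing"
# would hold identical values.
identical(results$unknown, NA)
```

Note that `lengths()` alone does not separate `empty` from `missing` (both are 0); `is.null()` is what tells them apart.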

Would changing the print method suffice? Instead of nothing being printed, how about NULL? Printing NA could again imply a length 1 NA logical, whereas NULL would be unambiguous, consistent with what base R prints for empty list items, and would give a further visual reminder that it is a list column.

MichaelChirico · 2020-02-18T06:32:19Z

Also agree with Jan, in particular about length 0 --> length 1.

I'm using lengths(x)>0 a lot to filter rows by empty list columns.

I guess we could put logical() there instead, but is there any advantage of logical() over NULL?
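
A short sketch of the filtering idiom mentioned above, assuming the data.table package is installed and that `fill=TRUE` fills the list column with NULL as described in this thread. `DT` and `non_empty` are illustrative names:

```r
library(data.table)

A = data.table(c1 = 0, c2 = list(1:3))
B = data.table(c1 = 1)
DT = rbind(A, B, fill = TRUE)

# Keep only rows whose list-column element is non-empty.
# The filled row has a NULL element, so lengths() reports 0 for it.
non_empty = DT[lengths(c2) > 0L]
non_empty
```

A `logical()` fill would also have length 0, so this filter would behave the same either way; an NA fill would have length 1 and the filled row would no longer be dropped.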

sritchie73 · 2020-02-18T09:11:25Z

What if we just instead add a sentence to the documentation noting the behaviour in the case of a missing list column:

Current entry for the fill argument:

TRUE fills missing columns with NAs. By default FALSE. When TRUE, use.names is set to TRUE.

Proposed:

TRUE fills missing columns with NAs, or NULL for missing list columns. By default FALSE. When TRUE, use.names is set to TRUE.

mattdowle · 2020-02-19T01:30:36Z

Doc change looks good. Plus the print method change I suggested too?

sritchie73 mentioned this issue Jan 25, 2020

rbindlist(..., fill=TRUE) fills missing list columns with NA instead of NULL #4199

Closed

jangorecki added the question label Jan 25, 2020

sritchie73 mentioned this issue Feb 19, 2020

Improved handling of list columns with NULL entries #4250

Merged

jangorecki added the non-atomic column e.g. list columns, S4 vector columns label Apr 6, 2020

mattdowle mentioned this issue Jul 16, 2021

melt(na.rm=TRUE) should remove rows with missing list column #5053

Merged

jangorecki added the rbindlist label Dec 19, 2023

ben-schwen closed this as completed Jan 8, 2024
