unnest_wider fails on certain types #1504

Open
ArthurAndrews opened this issue Jun 26, 2023 · 3 comments
Labels: feature (a feature request or enhancement), rectangling 🗄️ (converting deeply nested lists into tidy data frames)

Comments

@ArthurAndrews

I've been struggling to use unnest_wider on certain results in a list column. I don't know why some types (tibbles) are correctly unnested into list columns, but others (lm) are not. Any advice? Or is this a bug?

For the third example here, I expect unnest_wider to create a list column of lm objects.

library(tidyverse)
packageVersion("tidyr")
#> [1] '1.3.0'

# works
x <- list(a = 1, b = "b")
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 2
#>       a b    
#>   <dbl> <chr>
#> 1     1 b    
#> 2     1 b

# works
x <- list(a = 1, b = "b", c = tibble(z = 1:3))
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 3
#>       a b     c               
#>   <dbl> <chr> <list>          
#> 1     1 b     <tibble [3 × 1]>
#> 2     1 b     <tibble [3 × 1]>

# doesn't work
x <- list(a = 1, b = "b", c = lm(mpg ~ wt, mtcars))
tibble(y = list(x, x)) |> unnest_wider(y)
#> Error in `unnest_wider()`:
#> ℹ In column: `y`.
#> ℹ In row: 1.
#> Caused by error in `list_sizes()`:
#> ! `x$c` must be a vector, not a <lm> object.
#> Backtrace:
#>      ▆
#>   1. ├─tidyr::unnest_wider(tibble(y = list(x, x)), y)
#>   2. │ └─tidyr:::col_to_wide(...)
#>   3. │   ├─tidyr:::with_indexed_errors(...)
#>   4. │   │ └─rlang::try_fetch(...)
#>   5. │   │   └─base::withCallingHandlers(...)
#>   6. │   └─purrr::map(...)
#>   7. │     └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>   8. │       ├─purrr:::with_indexed_errors(...)
#>   9. │       │ └─base::withCallingHandlers(...)
#>  10. │       ├─purrr:::call_with_cleanup(...)
#>  11. │       └─tidyr (local) .f(.x[[i]], ...)
#>  12. │         └─tidyr:::elt_to_wide(...)
#>  13. │           └─vctrs::list_sizes(x)
#>  14. └─vctrs:::stop_scalar_type(`<fn>`(`<lm>`), "x$c", `<env>`)
#>  15.   └─vctrs:::stop_vctrs(...)
#>  16.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2023-06-26 with reprex v2.0.2

@cstepper

I get the same issue when trying to split the result of a purrr::safely() column into two columns:

library(tidyverse)

# Sample list with problematic function
data <- tibble(
  input = list(
    a = 1:3,
    b = NULL,
    c = "abc"
  )
)

# Apply safely function
data <- data |> 
  mutate(
    output = map(
      input, 
      safely(sum)
    )
  )

str(data$output)
#> List of 3
#>  $ a:List of 2
#>   ..$ result: int 6
#>   ..$ error : NULL
#>  $ b:List of 2
#>   ..$ result: int 0
#>   ..$ error : NULL
#>  $ c:List of 2
#>   ..$ result: NULL
#>   ..$ error :List of 2
#>   .. ..$ message: chr "invalid 'type' (character) of argument"
#>   .. ..$ call   : language .Primitive("sum")(..., na.rm = na.rm)
#>   .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

# Unnest the result
data = data |> 
  unnest_wider(col = output)
#> Error in `unnest_wider()`:
#> ℹ In column: `output`.
#> ℹ In row: 3.
#> Caused by error in `list_sizes()`:
#> ! `x$error` must be a vector, not a <simpleError/error/condition> object.
#> Backtrace:
#>      ▆
#>   1. ├─tidyr::unnest_wider(data, col = output)
#>   2. │ └─tidyr:::col_to_wide(...)
#>   3. │   ├─tidyr:::with_indexed_errors(...)
#>   4. │   │ └─rlang::try_fetch(...)
#>   5. │   │   └─base::withCallingHandlers(...)
#>   6. │   └─purrr::map(...)
#>   7. │     └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>   8. │       ├─purrr:::with_indexed_errors(...)
#>   9. │       │ └─base::withCallingHandlers(...)
#>  10. │       ├─purrr:::call_with_cleanup(...)
#>  11. │       └─tidyr (local) .f(.x[[i]], ...)
#>  12. │         └─tidyr:::elt_to_wide(...)
#>  13. │           └─vctrs::list_sizes(x)
#>  14. └─vctrs:::stop_scalar_type(`<fn>`(`<smplErrr>`), "x$error", `<env>`)
#>  15.   └─vctrs:::stop_vctrs(...)
#>  16.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2023-08-21 with reprex v2.0.2

@cstepper

It would be great if this worked without the following detour:

# Unnest the result
data |> 
  mutate(
    output = map_depth(output, 2, as.character)
  ) |> 
  unnest_wider(col = output)
#> # A tibble: 3 × 3
#>   input        result error                                                     
#>   <named list> <chr>  <chr>                                                     
#> 1 <int [3]>    6       <NA>                                                     
#> 2 <NULL>       0       <NA>                                                     
#> 3 <chr [1]>    <NA>   "Error in .Primitive(\"sum\")(..., na.rm = na.rm): invali…

Created on 2023-08-21 with reprex v2.0.2

@hadley
Member

hadley commented Nov 1, 2023

The problem is that you need vectors of length 1, instead of arbitrary objects which we generally think of as scalars (e.g. a linear model or error object only ever represents a single model or a single error). So you can always fix this problem by wrapping that object inside another list:

library(tidyverse)

data <- tibble(input = list(a = 1:3, b = NULL, c = "abc"))

data |> 
  mutate(output = map(input, safely(sum))) |> 
  mutate(output = map(output, \(x) list(result = x$result, error = list(x$error)))) |> 
  unnest_wider(output)
#> # A tibble: 3 × 3
#>   input        result error     
#>   <named list>  <int> <list>    
#> 1 <int [3]>         6 <list [1]>
#> 2 <NULL>            0 <list [1]>
#> 3 <chr [1]>        NA <list [1]>

Created on 2023-11-01 with reprex v2.0.2

Another way of saying the same thing is that if you need a list-column to represent the unnested data (e.g. it's a list of linear models or a list of errors), then you'll currently need to make sure that your data is already a list.
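
As a minimal sketch of that advice applied to the lm example from the top of the thread: wrapping the model in `list()` gives `unnest_wider()` a length-1 vector to work with. The `vctrs::vec_is()` calls are only there to illustrate the vector/scalar distinction, and the final extraction is written defensively because the exact shape of each cell (the model itself, or a length-1 list holding it) may depend on the tidyr version.

```r
library(tidyr)
library(tibble)

fit <- lm(mpg ~ wt, mtcars)

# vctrs treats an lm object as a scalar, but a list holding it is a vector.
vctrs::vec_is(fit)
vctrs::vec_is(list(fit))

# Wrap the model in a list so that element `c` is a vector of length 1.
x <- list(a = 1, b = "b", c = list(fit))
out <- tibble(y = list(x, x)) |> unnest_wider(y)

# `c` is now a list-column. Pull the model back out of the first row,
# unwrapping the extra list layer if it is still present.
cell <- out$c[[1]]
model <- if (inherits(cell, "lm")) cell else cell[[1]]
class(model)
```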

It does seem like tidyr could handle this for you (perhaps with an explicit option).
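
Until something like that exists, the wrapping can be done generically as a pre-processing step. The sketch below uses a hypothetical helper, `wrap_scalars()`, which is not part of tidyr: it wraps any element that vctrs does not consider a vector in a list, and leaves `NULL`s and ordinary vectors alone. It assumes `vctrs::vec_is()` is an adequate test for the objects that trigger the `list_sizes()` error above; how tidyr itself would implement such an option is not specified in this thread.

```r
library(tidyr)
library(dplyr)
library(purrr)

# Hypothetical helper (not part of tidyr): wrap non-vector elements in a list,
# leaving NULLs and ordinary vectors untouched.
wrap_scalars <- function(x) {
  map(x, \(elt) if (is.null(elt) || vctrs::vec_is(elt)) elt else list(elt))
}

data <- tibble(input = list(a = 1:3, b = NULL, c = "abc"))

out <- data |>
  mutate(output = map(input, safely(sum))) |>
  mutate(output = map(output, wrap_scalars)) |>
  unnest_wider(output)

out
```

Unlike the `as.character()` detour, this keeps the original condition object in the `error` column, so its message and call remain available for later inspection.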

@hadley added the labels feature (a feature request or enhancement) and rectangling 🗄️ (converting deeply nested lists into tidy data frames) on Nov 1, 2023