Exposing cell content in list-columns? #892
Comments
Thanks. I think a way to move forward could indeed be the creation of a custom class that applies the desired formatting. If this is useful and stable, we might incorporate a variant in pillar.
I don't know anything about pillar & vctrs, so I don't know how stable the code below may be, but here is a simple proof of concept:
list_col <- function(x) {
  vctrs::new_vctr(x, class = "list_col")
}
formatter_list_element <- function(x, width) {
  start_txt <- "<"
  end_txt <- ">"
  ptype_txt <- vctrs::vec_ptype_abbr(x) # note: not working if element is not a vector (e.g. a function), do we care?
  # show "[length] values" only for non-empty elements, truncated to the remaining width
  context_text <- if (length(x) > 0) {
    paste0(" [", length(x), "] ",
           toString(x,
                    width = width - nchar(ptype_txt) - nchar(paste0("<[]>", length(x)))))
  } else {
    ""
  }
  paste0(start_txt, ptype_txt, context_text, end_txt)
}
format.list_col <- function(x, ..., width = 25, formatter = formatter_list_element) {
  res <- purrr::map_chr(x, ~ formatter(.x, width))
  format(res, justify = "left")
}
vec_ptype_abbr.list_col <- function(x) {
  "list-col"
}
pillar_shaft.list_col <- function(x, ...) {
  out <- format(x, width = 25) # how to define width?
  pillar::new_pillar_shaft_simple(out, min_width = 10) # what should min_width be?
}
## Example 1:
x <- list(1:2, TRUE, NA, NULL, 1.3, list(1, b = 2:10), matrix(1:9, nrow = 3))
y <- list_col(x)
tibble::tibble(x = x, y = y)
#> # A tibble: 7 x 2
#> x y
#> <list> <list-col>
#> 1 <int [2]> <int [2] 1, 2>
#> 2 <lgl [1]> <lgl [1] TRUE>
#> 3 <lgl [1]> <lgl [1] NA>
#> 4 <NULL> <NULL>
#> 5 <dbl [1]> <dbl [1] 1.3>
#> 6 <named list [2]> <named list [2] 1, 2:10>
#> 7 <int [3 × 3]> <int[,3] [9] 1, 2, 3, ....>
# note: display could be improved:
# - in console, colors are not consistent
# - matrix dims are weird
# - named lists don't show names
## Example 2:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
iris %>%
group_by(Species) %>%
summarise(range = list_col(list(range(Sepal.Length))),
quartiles = list_col(list(quantile(Sepal.Length))))
#> # A tibble: 3 x 3
#> Species range quartiles
#> <fct> <list-col> <list-col>
#> 1 setosa <dbl [2] 4.3, 5.8> <dbl [5] 4.3, 4.8, 5, ....>
#> 2 versicolor <dbl [2] 4.9, 7> <dbl [5] 4.9, 5.6, 5.9....>
#> 3 virginica <dbl [2] 4.9, 7.9> <dbl [5] 4.9, 6.225, 6....>
Created on 2021-06-09 by the reprex package (v2.0.0)
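The notes under Example 1 mention two gaps: named lists do not show their names, and `vctrs::vec_ptype_abbr()` does not work for non-vector elements such as functions. Below is a minimal base-R sketch of how a formatter might cover both cases. The helpers `cell_label()`, `cell_contents()` and `format_cell()` are hypothetical names, not part of pillar or vctrs, and the labels come from `typeof()` or `class()` rather than `vec_ptype_abbr()`, so they are longer than the abbreviations shown above.

```r
# Hypothetical helpers for illustration only; not part of pillar or vctrs.

# Short label for any R object, including non-vectors.
cell_label <- function(x) {
  if (is.null(x)) return("NULL")
  if (is.function(x)) return("fn")
  if (is.object(x)) return(class(x)[[1]])
  typeof(x)
}

# Comma-separated contents, prefixing named elements with "name = ".
cell_contents <- function(x) {
  # Nothing to show for empty elements or for non-vectors (functions, environments, ...)
  if (length(x) == 0 || !(is.atomic(x) || is.list(x))) return("")
  vals <- unname(as.character(x))
  nms <- names(x)
  if (!is.null(nms)) {
    named <- !is.na(nms) & nzchar(nms)
    vals[named] <- paste0(nms[named], " = ", vals[named])
  }
  paste(vals, collapse = ", ")
}

# Combine label, length and contents, truncating to `width` characters.
format_cell <- function(x, width = 25) {
  label <- cell_label(x)
  contents <- cell_contents(x)
  if (!nzchar(contents)) return(paste0("<", label, ">"))
  out <- paste0("<", label, " [", length(x), "] ", contents, ">")
  if (nchar(out) > width) {
    out <- paste0(strtrim(out, width - 4), "...>")
  }
  out
}

format_cell(list(1, b = 2:10), width = 40)
#> [1] "<list [2] 1, b = 2:10>"
format_cell(function(x) x)
#> [1] "<fn>"
```

Plugging something like `format_cell()` in as the `formatter` argument of `format.list_col()` would be one way to try it inside a tibble; that combination is untested here.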
Nice! @hadley: What do you think?
Seems like a reasonable idea, but I'd want to see a fuller exploration of what would be displayed for types other than atomic vectors.
Default outputs for non-atomic vectors
Redefining
formatter_list_element <- function(x, width) {
  ptype_txt <- pillar::obj_sum(x)
  context_text <- if (length(x) > 0) {
    paste0(" ", toString(x, width = width - nchar(ptype_txt) - 3L))
  } else {
    ""
  }
  paste0("<", ptype_txt, context_text, ">")
}
to benefit from the dimensions and ptype extracted by
pillar_shaft.list_col <- function(x, ...) {
  out <- format(x, width = 50)
  pillar::new_pillar_shaft_simple(out, min_width = 10)
}
we get the following for types other than atomic vectors:
> x <- list(a = matrix(1:9, nrow = 3), b = array(1:27, dim = c(3, 3, 3)), c = list(z = 1, zz = list(1, 2)))
> y <- list_col(x)
> tibble::tibble(x = x, y = y)
# A tibble: 3 x 2
x y
<named list> <list-col>
1 <int [3 × 3]> <int [3 × 3] 1, 2, 3, 4, 5, 6, 7, 8, 9>
2 <int [3 × 3 × 3]> <int [3 × 3 × 3] 1, 2, 3, 4, 5, 6, 7, 8, 9, 1....>
3 <named list [2]> <named list [2] 1, list(1, 2)>
The first 2 rows are not that different from what
> str(x)
List of 3
$ a: int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
$ b: int [1:3, 1:3, 1:3] 1 2 3 4 5 6 7 8 9 10 ...
$ c:List of 2
..$ z : num 1
..$ zz:List of 2
.. ..$ : num 1
.. ..$ : num 2
the list looks quite different since it is compacted into a single row for the display of the tibble.
For fun, I tried a list of class
> iris %>%
+ group_nest(Species) %>%
+ rowwise() %>%
+ summarise(lm = list(lm(Sepal.Length ~ Petal.Length, data = data))) %>%
+ mutate(lm = list_col(lm))
`summarise()` has ungrouped output. You can override using the `.groups` argument.
# A tibble: 3 x 1
lm
<list-col>
1 <lm c(`(Intercept)` = 4.21316822303424, Petal.Length = 0.542292597103803), c(`1` = 0.1276221410....>
2 <lm c(`(Intercept)` = 2.40752310536045, Petal.Length = 0.828280961182994), c(`1` = 0.6995563770....>
3 <lm c(`(Intercept)` = 1.05965909090909, Petal.Length = 0.995738636363637), c(`1` = -0.734090909....>
That could certainly be improved, but it shows that it should be possible to deal with various classes of non-atomic vectors.
Improved outputs via methods for
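One possible shape for per-class displays, given that the `lm` rows above are dominated by deparsed coefficients, is an S3 generic with a method per class. The sketch below is only an assumption about how that could look: `cell_summary()` is a hypothetical generic that neither pillar nor vctrs exports, and showing the model formula for `lm` objects is just one plausible choice.

```r
# Hypothetical generic; neither pillar nor vctrs exports `cell_summary()`.
cell_summary <- function(x, ...) {
  UseMethod("cell_summary")
}

# Fallback: first class (or base type) and length only.
cell_summary.default <- function(x, ...) {
  label <- if (is.object(x)) class(x)[[1]] else typeof(x)
  paste0("<", label, " [", length(x), "]>")
}

# For linear models, show the formula instead of the deparsed components.
cell_summary.lm <- function(x, ...) {
  paste0("<lm ", paste(deparse(stats::formula(x)), collapse = " "), ">")
}

fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
cell_summary(fit)
#> [1] "<lm Sepal.Length ~ Petal.Length>"
cell_summary(1:3)
#> [1] "<integer [3]>"
```

Classes without a dedicated method fall back to the compact default, so new methods can be added one class at a time.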
Thanks. I think the easiest way to start is to expand the contents only for elements where
When displaying tibbles with list-columns, it would be nice to be able to give a glimpse of the content within each cell.
For example, if the width is sufficient, instead of:
it would be great to have something like what str() produces:
I guess this could be done by defining one's own class and pillar method, but I think that it would be useful for any tibble.
Perhaps whether to expose the content or not could be set with a global formatting option.
A motivation is that it could play well with dplyr::summarise() when using functions that do not output scalars:
A difficulty is that any kind of content can be nested within a cell, not just vectors, but perhaps specific displays could be set up for the main classes.
This is probably an issue for pillar, but the motivation is the display of tibbles.
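The global formatting option suggested above could be sketched with base R's `options()` and `getOption()`. The option name `listcol.expose_contents` and the helpers `expose_cells()` and `summarise_cell()` are made up for this illustration; pillar does not define them.

```r
# Hypothetical option name; pillar does not define it.
expose_cells <- function() {
  isTRUE(getOption("listcol.expose_contents", FALSE))
}

# Show the contents of atomic, non-empty cells only when the option is on.
summarise_cell <- function(x) {
  header <- paste0(typeof(x), " [", length(x), "]")
  if (expose_cells() && is.atomic(x) && length(x) > 0) {
    paste0("<", header, " ", toString(x), ">")
  } else {
    paste0("<", header, ">")
  }
}

old <- options(listcol.expose_contents = TRUE)
summarise_cell(c(4.3, 5.8))
#> [1] "<double [2] 4.3, 5.8>"
options(old) # restore the previous (unset) state
summarise_cell(c(4.3, 5.8))
#> [1] "<double [2]>"
```

Defaulting to `FALSE` keeps the current compact display unless a user opts in.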