C implementations of vec_case_when() and vec_case_match() #1622

Open
DavisVaughan opened this issue Aug 22, 2022 · 2 comments

DavisVaughan commented Aug 22, 2022

And possibly vec_if_else() because it would be nice for, say, ggplot2 to be able to use this.

Consider whether we can figure out some kind of 1:1 interface that doesn't always require a list for values and haystacks, so that it can nicely replace plyr::mapvalues() (tidyverse/dplyr#7027). The list approach is very powerful and general because it allows for 1:m and m:1 replacements, but it is not always needed.
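
To illustrate the 1:1 case only: the following is a minimal base R sketch, not a vctrs or dplyr API. The helper name `map_values_1to1` and its arguments `from`, `to`, and `default` are hypothetical, and it performs no type checking of the kind a vctrs implementation would.

```r
# Hypothetical helper (not part of vctrs or dplyr): 1:1 value replacement
# where `from` and `to` are plain parallel vectors rather than lists.
map_values_1to1 <- function(x, from, to, default = NA) {
  stopifnot(length(from) == length(to))

  # Position of each element of `x` in `from`, or NA when there is no match
  loc <- match(x, from)

  # Look up the replacement; unmatched elements fall back to `default`
  out <- to[loc]
  out[is.na(loc)] <- default
  out
}

map_values_1to1(
  c("x", "y", "z"),
  from = c("xx", "y"),
  to = c("x", "yy"),
  default = NA_character_
)
#> [1] NA   "yy" NA
```

A 1:m or m:1 replacement cannot be expressed with two parallel vectors like this, which is where the list-based interface is still needed.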

@DavisVaughan DavisVaughan added feature a feature request or enhancement dplyr labels Aug 22, 2022
@DavisVaughan DavisVaughan self-assigned this Aug 22, 2022
olivroy commented May 8, 2024

Is this still planned? I saw this was the proposed solution for replacing splicing with dplyr::recode(). Using recode() is slowing down my code because of lifecycle, so I wondered if I could rely on a faster vctrs implementation sometime in the future.

Cf. tidyverse/dplyr#6623 (comment)

The benchmarks show that recreating the formulas can be a bit expensive:

# manually created
a_formula <- c("xx" ~ "x", "y" ~ "yy")

dplyr::case_match(
  c("x", "y", "z"),
  "zz" ~ "a",
  !!!a_formula,
  .default = NA_character_
)
#> [1] NA   "yy" NA

# What I have 
a_list <- c("xx" = "x", "y" = "yy")
dplyr::recode(
  c("x", "y", "z"),
  !!!a_list,
  .default = NA_character_
)
#> [1] NA   "yy" NA

# programmatically recreated
a_formula_from_list <- purrr::map2(
  names(a_list),
  unname(a_list),
  rlang::new_formula
)
dplyr::case_match(
  c("x", "y", "z"),
  "zz" ~ "a",
  !!!a_formula_from_list,
  .default = NA_character_
)
#> [1] NA   "yy" NA

bench::mark(
  recode = dplyr::recode(
    c("x", "y", "z"),
    !!!a_list,
    .default = NA_character_
  ),
  casematch_program = {
    a_formula_from_list <- purrr::map2(
      names(a_list),
      unname(a_list),
      rlang::new_formula
    )
    dplyr::case_match(
      c("x", "y", "z"),
      "zz" ~ "a",
      !!!a_formula_from_list,
      .default = NA_character_
    )
  },
  casematch_regular = dplyr::case_match(
    c("x", "y", "z"),
    "zz" ~ "a",
    !!!a_formula,
    .default = NA_character_
  )
)
#> # A tibble: 3 × 6
#>   expression            min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>       <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 recode             802µs    888µs      988.        0B     8.43
#> 2 casematch_program  366µs    385µs     2355.     1.3KB    10.4 
#> 3 casematch_regular  295µs    311µs     2860.    1.05KB    10.4
# programmatically recreating the values can become expensive

Created on 2024-05-07 with reprex v2.1.0

DavisVaughan (Member, Author) commented

Unfortunately it is taking us longer than expected to find time for a vctrs release, but this is definitely still something I want to add, as I think a lot of people (particularly ggplot2) would like a low-level, type-stable vec_if_else() that doesn't need dplyr.
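
For readers wondering what "type stable" could mean here, below is a rough sketch built only from existing exported vctrs functions. It is not the planned `vec_if_else()`: the name `if_else_sketch` is hypothetical, it is written in R rather than C, and it assumes the output type is the common type of `true` and `false` and that missing conditions yield missing values.

```r
library(vctrs)

# Hypothetical sketch, not the planned vec_if_else() implementation.
if_else_sketch <- function(condition, true, false) {
  stopifnot(is.logical(condition))
  size <- vec_size(condition)

  # The output type depends only on the types of `true` and `false`,
  # never on the values in `condition`
  ptype <- vec_ptype_common(true, false)
  true <- vec_recycle(vec_cast(true, ptype), size)
  false <- vec_recycle(vec_cast(false, ptype), size)

  # Start from an all-missing vector so that NA conditions stay missing
  out <- vec_init(ptype, size)
  is_true <- which(condition)
  is_false <- which(!condition)
  out <- vec_assign(out, is_true, vec_slice(true, is_true))
  out <- vec_assign(out, is_false, vec_slice(false, is_false))
  out
}

# Integer and double combine to double: values 1, 2.5, NA
if_else_sketch(c(TRUE, FALSE, NA), 1L, 2.5)
```

Incompatible inputs, such as an integer `true` with a character `false`, fail in `vec_ptype_common()` instead of being silently coerced.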
