Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax for rx_or() #16

Open
dmi3kno opened this issue Mar 14, 2019 · 1 comment
Open

Syntax for rx_or() #16

dmi3kno opened this issue Mar 14, 2019 · 1 comment

Comments

@dmi3kno
Copy link

dmi3kno commented Mar 14, 2019

Right now we have rx_or implementation which compares .data and value

##### Do not run
rx() %>% 
  rx_find("a") %>%
  rx_or("b") # or at best rx_or(rx_find("b"))

In the comments you mentioned:

##### Do not run
  # Not sure if I like this. I would prefer:
  # find(value = "foo") %>%
  #   or() %>%
  #   find("bar")
  # Rather than having to nest a rule inside of or(), maybe use glue?

Might the solution be similar to how now (in dev branch) we organized rx_one_of():

###### Do not run
rx() %>%
  rx_find("gr") %>%
  either_of(rx_find("a"), rx_find("e")) %>%
  rx_find("y")

In a sense, this is rx_one_of with (?:a|b) instead of [ab] and limited to two arguments only. I actually believe nothing prevents us from allowing more arguments, if we go down this route. I think going this route will add consistency to the package.

@tylerlittlefield
Copy link
Member

tylerlittlefield commented Mar 16, 2019

Adding rx_either_of and stumbled upon the inherent eagerness of the | alternator:

rx_either_of <- function(.data = NULL, ..., rep = NULL, mode = "greedy") {
  if (!inherits(.data, "rx_string")) stop("This function is not to be used as first element of the pipe! Please start pipe with constructor funcion rx()")
  san_args <- sapply(list(...), sanitize)
  san_args_peeled <- peel_set(san_args)
  res <- paste0(.data, "(?:", paste0(san_args_peeled, collapse = "|"), ")", parse_rep_mode(rep, mode))
  new_rx(res)
}

library(RVerbalExpressions)

# Alternation is eager!
rx() %>% 
  rx_either_of("GetValue", "Get", "Set", "SetValue") %>% 
  stringr::str_extract_all("Get, GetValue, Set or SetValue", .) %>% 
  .[[1]]
#> [1] "Get"      "GetValue" "Set"      "Set"

# Avoid eagerness with order of values
rx() %>% 
  rx_either_of("GetValue", "Get", "SetValue", "Set") %>% 
  stringr::str_extract_all("Get, GetValue, Set or SetValue", .) %>% 
  .[[1]]
#> [1] "Get"      "GetValue" "Set"      "SetValue"

# Avoid eagerness with word boundaries
rx() %>% 
  rx_word_edge() %>% 
  rx_either_of("GetValue", "Get", "Set", "SetValue") %>% 
  rx_word_edge() %>% 
  stringr::str_extract_all("Get, GetValue, Set or SetValue", .) %>% 
  .[[1]]
#> [1] "Get"      "GetValue" "Set"      "SetValue"

Should rx_either_of have an eager option which turns on word_boundaries? I'd prefer to not add more arguments but curious what you think. If we do decide to go with eager, should it be set to true? I think this is a rare case, so I'd prefer it to be false if we add the argument.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants