Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in data_filter() when rendering Rmd with lapply() #379

Open
etiennebacher opened this issue Mar 15, 2023 · 4 comments
Open

Error in data_filter() when rendering Rmd with lapply() #379

etiennebacher opened this issue Mar 15, 2023 · 4 comments

Comments

@etiennebacher
Copy link
Member

etiennebacher commented Mar 15, 2023

Reprex:

  • save this in "vignettes/reprex.Rmd"
---
output: rmarkdown::html_document
---

```{r}
dat <- data.frame() # dataset doesn't matter here
test <- c("a", "b", "c")
x <- "a"

data_filter(dat, x %in% test)
```
  • remove all objects from global env
  • run lapply("vignettes/reprex.Rmd", rmarkdown::render)

Originally posted by @etiennebacher in easystats/easystats#356 (comment)

@etiennebacher etiennebacher changed the title Error in data_filter() when rendering Rmd with purrr::map Error in data_filter() when rendering Rmd with lapply() Mar 15, 2023
@etiennebacher
Copy link
Member Author

Also errors with purrr::map()

@strengejacke
Copy link
Member

Now that we have .dynEval() and the experience with data_modify(), we could probably revise the code for data_filter(), to make it work with string variables?

@etiennebacher
Copy link
Member Author

Simpler reprex:

library(datawizard)
lapply(1, \(x) {
  dat <- data.frame() # dataset doesn't matter here
  test <- c("a", "b", "c")
  x <- "a"
  
  data_filter(dat, x %in% test)
})
#> Error: Filtering did not work. Please check the syntax of your conditions.

Created on 2023-09-30 with reprex v2.0.2

@etiennebacher
Copy link
Member Author

The problem is that the .dynEval() call here is problematic:

eval_symbol <- .dynEval(symbol, ifnotfound = NULL)

x is already defined in the function (it's the name of the dataframe argument) and will be (incorrectly) evaluated by .dynEval(). Even if we could start the search for x in the parent environment to avoid this, we'd still need to evaluate x in its dataframe first to get its values, and evaluate test in another environment. This is probably why the tidyverse prefers when we use .data$ or .env$ explicitly, it removes the risk of evaluating something in the wrong environment.

We should probably do something similar but this doesn't seem so simple. Also, the fact that data_filter() accepts both unquoted and quoted expressions doesn't help. Overall I think that data_filter() should be completely rewritten because the code looks like a lot of patches, but it would be good to clarify the design it should have first.

@IndrajeetPatil @strengejacke

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants