Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

str_like case sensitivity #1490

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

edward-burn
Copy link

Relates to issue #1488

This pr would change behaviour to the following

library(dbplyr)
 
lf_postgres <- lazy_frame(a = TRUE, b = 1, c = 2, d = "zzzyzzz", 
                 con = simulate_postgres())
lf_redshift <- lazy_frame(a = TRUE, b = 1, c = 2, d = "zzzyzzz", 
                          con = simulate_redshift())
lf_mssql <- lazy_frame(a = TRUE, b = 1, c = 2, d = "zzzyzzz", 
                 con = simulate_mssql()) 
lf_snowflake <- lazy_frame(a = TRUE, b = 1, c = 2, d = "zzzyzzz", 
                       con = simulate_snowflake()) 
lf_spark <- lazy_frame(a = TRUE, b = 1, c = 2, d = "zzzyzzz", 
                           con = simulate_spark_sql()) 

# error for backends without ilike
lf_postgres |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = TRUE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` ILIKE 'Y')
lf_redshift |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = TRUE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` ILIKE 'Y')
lf_mssql |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = TRUE))
#> Error in `stringr::str_like()`:
#> ! Backend does not support case insensitve {.fn str_like}.
#> ℹ Set ignore_case = FALSE for case sensitive match.
#> ℹ Note, using `tolower()` to cast string and pattern to lower case would also achieve a case insenitive match.
#> Backtrace:
#>      ▆
#>   1. ├─base::tryCatch(...)
#>   2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   3. │   ├─base (local) tryCatchOne(...)
#>   4. │   │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   5. │   └─base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#>   6. │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   7. │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   8. ├─base::withCallingHandlers(...)
#>   9. ├─base::saveRDS(...)
#>  10. ├─base::do.call(...)
#>  11. ├─base (local) `<fn>`(...)
#>  12. ├─global `<fn>`(input = base::quote("stony-moth_reprex.R"))
#>  13. │ └─rmarkdown::render(input, quiet = TRUE, envir = globalenv(), encoding = "UTF-8")
#>  14. │   └─knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
#>  15. │     └─knitr:::process_file(text, output)
#>  16. │       ├─knitr:::handle_error(...)
#>  17. │       │ └─base::withCallingHandlers(...)
#>  18. │       ├─base::withCallingHandlers(...)
#>  19. │       ├─knitr:::process_group(group)
#>  20. │       └─knitr:::process_group.block(group)
#>  21. │         └─knitr:::call_block(x)
#>  22. │           └─knitr:::block_exec(params)
#>  23. │             └─knitr:::eng_r(options)
#>  24. │               ├─knitr:::in_input_dir(...)
#>  25. │               │ └─knitr:::in_dir(input_dir(), expr)
#>  26. │               └─knitr (local) evaluate(...)
#>  27. │                 └─evaluate::evaluate(...)
#>  28. │                   └─evaluate:::evaluate_call(...)
#>  29. │                     ├─evaluate (local) handle(...)
#>  30. │                     │ └─base::try(f, silent = TRUE)
#>  31. │                     │   └─base::tryCatch(...)
#>  32. │                     │     └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  33. │                     │       └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  34. │                     │         └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  35. │                     ├─base::withCallingHandlers(...)
#>  36. │                     ├─base::withVisible(value_fun(ev$value, ev$visible))
#>  37. │                     └─knitr (local) value_fun(ev$value, ev$visible)
#>  38. │                       └─knitr (local) fun(x, options = options)
#>  39. │                         ├─base::withVisible(knit_print(x, ...))
#>  40. │                         ├─knitr::knit_print(x, ...)
#>  41. │                         └─knitr:::knit_print.default(x, ...)
#>  42. │                           └─evaluate (local) normal_print(x)
#>  43. │                             ├─base::print(x)
#>  44. │                             └─dbplyr:::print.tbl_lazy(x)
#>  45. │                               ├─dplyr::show_query(x) at dbplyr/R/tbl-lazy.R:49:3
#>  46. │                               └─dbplyr:::show_query.tbl_lazy(x)
#>  47. │                                 └─dbplyr::remote_query(x, cte = cte, sql_options = sql_options) at dbplyr/R/explain.R:5:3
#>  48. │                                   └─dbplyr::db_sql_render(remote_con(x), x, cte = cte, sql_options = sql_options) at dbplyr/R/remote.R:99:3
#>  49. │                                     ├─dbplyr::db_sql_render(con, sql, ..., sql_options = sql_options) at dbplyr/R/db.R:76:5
#>  50. │                                     ├─dbplyr:::`db_sql_render.Microsoft SQL Server`(...) at dbplyr/R/db.R:80:3
#>  51. │                                     ├─base::NextMethod() at dbplyr/R/backend-mssql.R:635:3
#>  52. │                                     └─dbplyr:::db_sql_render.DBIConnection(con, sql, ..., sql_options = sql_options)
#>  53. │                                       ├─dbplyr::sql_render(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/db.R:84:3
#>  54. │                                       └─dbplyr:::sql_render.tbl_lazy(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/sql-build.R:90:3
#>  55. │                                         ├─dbplyr::sql_render(...) at dbplyr/R/sql-build.R:101:3
#>  56. │                                         └─dbplyr:::sql_render.lazy_query(...) at dbplyr/R/sql-build.R:90:3
#>  57. │                                           ├─dbplyr::sql_build(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:118:3
#>  58. │                                           └─dbplyr:::sql_build.lazy_select_query(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:43:3
#>  59. │                                             └─dbplyr::translate_sql_(op$where, con = con, context = list(clause = "WHERE")) at dbplyr/R/lazy-select-query.R:177:3
#>  60. │                                               └─base::lapply(...) at dbplyr/R/translate-sql.R:142:3
#>  61. │                                                 └─dbplyr (local) FUN(X[[i]], ...)
#>  62. │                                                   ├─dbplyr::escape(eval_tidy(x, mask), con = con) at dbplyr/R/translate-sql.R:149:7
#>  63. │                                                   └─rlang::eval_tidy(x, mask) at dbplyr/R/translate-sql.R:149:7
#>  64. └─stringr::str_like(d, "Y", ignore_case = TRUE)
#>  65.   └─rlang::abort(bullets) at dbplyr/R/backend-.R:290:7
lf_snowflake |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = TRUE))
#> Error in `stringr::str_like()`:
#> ! Backend does not support case insensitve {.fn str_like}.
#> ℹ Set ignore_case = FALSE for case sensitive match.
#> ℹ Note, using `tolower()` to cast string and pattern to lower case would also achieve a case insenitive match.
#> Backtrace:
#>      ▆
#>   1. ├─base::tryCatch(...)
#>   2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   3. │   ├─base (local) tryCatchOne(...)
#>   4. │   │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   5. │   └─base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#>   6. │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   7. │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   8. ├─base::withCallingHandlers(...)
#>   9. ├─base::saveRDS(...)
#>  10. ├─base::do.call(...)
#>  11. ├─base (local) `<fn>`(...)
#>  12. ├─global `<fn>`(input = base::quote("stony-moth_reprex.R"))
#>  13. │ └─rmarkdown::render(input, quiet = TRUE, envir = globalenv(), encoding = "UTF-8")
#>  14. │   └─knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
#>  15. │     └─knitr:::process_file(text, output)
#>  16. │       ├─knitr:::handle_error(...)
#>  17. │       │ └─base::withCallingHandlers(...)
#>  18. │       ├─base::withCallingHandlers(...)
#>  19. │       ├─knitr:::process_group(group)
#>  20. │       └─knitr:::process_group.block(group)
#>  21. │         └─knitr:::call_block(x)
#>  22. │           └─knitr:::block_exec(params)
#>  23. │             └─knitr:::eng_r(options)
#>  24. │               ├─knitr:::in_input_dir(...)
#>  25. │               │ └─knitr:::in_dir(input_dir(), expr)
#>  26. │               └─knitr (local) evaluate(...)
#>  27. │                 └─evaluate::evaluate(...)
#>  28. │                   └─evaluate:::evaluate_call(...)
#>  29. │                     ├─evaluate (local) handle(...)
#>  30. │                     │ └─base::try(f, silent = TRUE)
#>  31. │                     │   └─base::tryCatch(...)
#>  32. │                     │     └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  33. │                     │       └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  34. │                     │         └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  35. │                     ├─base::withCallingHandlers(...)
#>  36. │                     ├─base::withVisible(value_fun(ev$value, ev$visible))
#>  37. │                     └─knitr (local) value_fun(ev$value, ev$visible)
#>  38. │                       └─knitr (local) fun(x, options = options)
#>  39. │                         ├─base::withVisible(knit_print(x, ...))
#>  40. │                         ├─knitr::knit_print(x, ...)
#>  41. │                         └─knitr:::knit_print.default(x, ...)
#>  42. │                           └─evaluate (local) normal_print(x)
#>  43. │                             ├─base::print(x)
#>  44. │                             └─dbplyr:::print.tbl_lazy(x)
#>  45. │                               ├─dplyr::show_query(x) at dbplyr/R/tbl-lazy.R:49:3
#>  46. │                               └─dbplyr:::show_query.tbl_lazy(x)
#>  47. │                                 └─dbplyr::remote_query(x, cte = cte, sql_options = sql_options) at dbplyr/R/explain.R:5:3
#>  48. │                                   └─dbplyr::db_sql_render(remote_con(x), x, cte = cte, sql_options = sql_options) at dbplyr/R/remote.R:99:3
#>  49. │                                     ├─dbplyr::db_sql_render(con, sql, ..., sql_options = sql_options) at dbplyr/R/db.R:76:5
#>  50. │                                     └─dbplyr:::db_sql_render.DBIConnection(con, sql, ..., sql_options = sql_options) at dbplyr/R/db.R:80:3
#>  51. │                                       ├─dbplyr::sql_render(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/db.R:84:3
#>  52. │                                       └─dbplyr:::sql_render.tbl_lazy(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/sql-build.R:90:3
#>  53. │                                         ├─dbplyr::sql_render(...) at dbplyr/R/sql-build.R:101:3
#>  54. │                                         └─dbplyr:::sql_render.lazy_query(...) at dbplyr/R/sql-build.R:90:3
#>  55. │                                           ├─dbplyr::sql_build(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:118:3
#>  56. │                                           └─dbplyr:::sql_build.lazy_select_query(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:43:3
#>  57. │                                             └─dbplyr::translate_sql_(op$where, con = con, context = list(clause = "WHERE")) at dbplyr/R/lazy-select-query.R:177:3
#>  58. │                                               └─base::lapply(...) at dbplyr/R/translate-sql.R:142:3
#>  59. │                                                 └─dbplyr (local) FUN(X[[i]], ...)
#>  60. │                                                   ├─dbplyr::escape(eval_tidy(x, mask), con = con) at dbplyr/R/translate-sql.R:149:7
#>  61. │                                                   └─rlang::eval_tidy(x, mask) at dbplyr/R/translate-sql.R:149:7
#>  62. └─stringr::str_like(d, "Y", ignore_case = TRUE)
#>  63.   └─rlang::abort(bullets) at dbplyr/R/backend-.R:290:7
lf_spark |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = TRUE))
#> Error in `stringr::str_like()`:
#> ! Backend does not support case insensitve {.fn str_like}.
#> ℹ Set ignore_case = FALSE for case sensitive match.
#> ℹ Note, using `tolower()` to cast string and pattern to lower case would also achieve a case insenitive match.
#> Backtrace:
#>      ▆
#>   1. ├─base::tryCatch(...)
#>   2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   3. │   ├─base (local) tryCatchOne(...)
#>   4. │   │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   5. │   └─base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#>   6. │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   7. │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   8. ├─base::withCallingHandlers(...)
#>   9. ├─base::saveRDS(...)
#>  10. ├─base::do.call(...)
#>  11. ├─base (local) `<fn>`(...)
#>  12. ├─global `<fn>`(input = base::quote("stony-moth_reprex.R"))
#>  13. │ └─rmarkdown::render(input, quiet = TRUE, envir = globalenv(), encoding = "UTF-8")
#>  14. │   └─knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
#>  15. │     └─knitr:::process_file(text, output)
#>  16. │       ├─knitr:::handle_error(...)
#>  17. │       │ └─base::withCallingHandlers(...)
#>  18. │       ├─base::withCallingHandlers(...)
#>  19. │       ├─knitr:::process_group(group)
#>  20. │       └─knitr:::process_group.block(group)
#>  21. │         └─knitr:::call_block(x)
#>  22. │           └─knitr:::block_exec(params)
#>  23. │             └─knitr:::eng_r(options)
#>  24. │               ├─knitr:::in_input_dir(...)
#>  25. │               │ └─knitr:::in_dir(input_dir(), expr)
#>  26. │               └─knitr (local) evaluate(...)
#>  27. │                 └─evaluate::evaluate(...)
#>  28. │                   └─evaluate:::evaluate_call(...)
#>  29. │                     ├─evaluate (local) handle(...)
#>  30. │                     │ └─base::try(f, silent = TRUE)
#>  31. │                     │   └─base::tryCatch(...)
#>  32. │                     │     └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  33. │                     │       └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  34. │                     │         └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  35. │                     ├─base::withCallingHandlers(...)
#>  36. │                     ├─base::withVisible(value_fun(ev$value, ev$visible))
#>  37. │                     └─knitr (local) value_fun(ev$value, ev$visible)
#>  38. │                       └─knitr (local) fun(x, options = options)
#>  39. │                         ├─base::withVisible(knit_print(x, ...))
#>  40. │                         ├─knitr::knit_print(x, ...)
#>  41. │                         └─knitr:::knit_print.default(x, ...)
#>  42. │                           └─evaluate (local) normal_print(x)
#>  43. │                             ├─base::print(x)
#>  44. │                             └─dbplyr:::print.tbl_lazy(x)
#>  45. │                               ├─dplyr::show_query(x) at dbplyr/R/tbl-lazy.R:49:3
#>  46. │                               └─dbplyr:::show_query.tbl_lazy(x)
#>  47. │                                 └─dbplyr::remote_query(x, cte = cte, sql_options = sql_options) at dbplyr/R/explain.R:5:3
#>  48. │                                   └─dbplyr::db_sql_render(remote_con(x), x, cte = cte, sql_options = sql_options) at dbplyr/R/remote.R:99:3
#>  49. │                                     ├─dbplyr::db_sql_render(con, sql, ..., sql_options = sql_options) at dbplyr/R/db.R:76:5
#>  50. │                                     └─dbplyr:::db_sql_render.DBIConnection(con, sql, ..., sql_options = sql_options) at dbplyr/R/db.R:80:3
#>  51. │                                       ├─dbplyr::sql_render(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/db.R:84:3
#>  52. │                                       └─dbplyr:::sql_render.tbl_lazy(sql, con = con, ..., sql_options = sql_options) at dbplyr/R/sql-build.R:90:3
#>  53. │                                         ├─dbplyr::sql_render(...) at dbplyr/R/sql-build.R:101:3
#>  54. │                                         └─dbplyr:::sql_render.lazy_query(...) at dbplyr/R/sql-build.R:90:3
#>  55. │                                           ├─dbplyr::sql_build(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:118:3
#>  56. │                                           └─dbplyr:::sql_build.lazy_select_query(query, con = con, sql_options = sql_options) at dbplyr/R/sql-build.R:43:3
#>  57. │                                             └─dbplyr::translate_sql_(op$where, con = con, context = list(clause = "WHERE")) at dbplyr/R/lazy-select-query.R:177:3
#>  58. │                                               └─base::lapply(...) at dbplyr/R/translate-sql.R:142:3
#>  59. │                                                 └─dbplyr (local) FUN(X[[i]], ...)
#>  60. │                                                   ├─dbplyr::escape(eval_tidy(x, mask), con = con) at dbplyr/R/translate-sql.R:149:7
#>  61. │                                                   └─rlang::eval_tidy(x, mask) at dbplyr/R/translate-sql.R:149:7
#>  62. └─stringr::str_like(d, "Y", ignore_case = TRUE)
#>  63.   └─rlang::abort(bullets) at dbplyr/R/backend-.R:290:7

# like used for all
lf_postgres |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = FALSE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` LIKE 'Y')
lf_redshift |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = FALSE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` LIKE 'Y')
lf_mssql |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = FALSE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` LIKE 'Y')
lf_snowflake |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = FALSE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` LIKE 'Y')
lf_spark |> 
  dplyr::filter(stringr::str_like(d, "Y", ignore_case = FALSE))
#> <SQL>
#> SELECT `df`.*
#> FROM `df`
#> WHERE (`d` LIKE 'Y')

Created on 2024-04-04 with reprex v2.0.2

If ignore_case is true, an error is thrown for dbms without ILIKE. If ignore_case is false, LIKE is used consistently across dbms
Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, it's much appreciated!

As well as the changes below, could you please add a bullet to the top of NEWS.md? It should briefly describe the change and end with (@yourname, #issuenumber).

R/backend-.R Show resolved Hide resolved
R/backend-.R Outdated
cli_abort(c(
"Backend does not support case insensitve {.fn str_like}.",
i = "Set ignore_case = FALSE for case sensitive match.",
i = "Note, using `tolower()` to cast string and pattern to lower case would also achieve a case insenitive match."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
i = "Note, using `tolower()` to cast string and pattern to lower case would also achieve a case insenitive match."
i = "Or use `tolower()` on both arguments to achieve a case insensitive match."

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have updated as suggested

@@ -103,9 +103,9 @@ test_that("lead and lag translate n to integers", {

test_that("can translate case insensitive like", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
test_that("can translate case insensitive like", {
test_that("can't translate case insensitive like", {

?

Could you please also add a test for case sensitive like? Or maybe come up with a more encompassing test name and test both sides of this function in the same test?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the tests. I'm not sure how to add more tests for case sensitive like here beyond checking for the expected errors. Where ILIKE is being used, for example for postgres, there are tests of it in the backend specific tests

tests/testthat/test-backend-.R Outdated Show resolved Hide resolved
R/backend-.R Outdated
@@ -280,11 +280,16 @@ base_scalar <- sql_translator(
str_trim = sql_str_trim,
str_c = sql_paste(""),
str_sub = sql_str_sub("SUBSTR"),
# https://docs.getdbt.com/sql-reference/like is typically case sensitive
str_like = function(string, pattern, ignore_case = TRUE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if ignore_case should default to FALSE? I filed an issue in stringr to make sure to fix it there too: tidyverse/stringr#543.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I do see that but for now I have left this as is in this pr because it seems that would first need to change in stringr before changing here in dbplyr? I also have commented on that issue with another idea (stringr exporting a str_ilike function)

Perhaps for this pr it can be kept as is, and then changed in the future to reflect changes as and when they happen in stringr? But equally I'm happy to change things here if you prefer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to just make the change now — it's technically out of sync but I don't think this function is important enough to manage the updates across multiple package updates.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I have changed the default to FALSE in my latest commit

(I also opened a pr on stringr to introduce str_ilike tidyverse/stringr#544, but I imagine it would be best to wait for a future version of stringr to be released with that, if you´re happy with it, before adding it in here in dbplyr)

@edward-burn
Copy link
Author

@hadley @mgirlich just fyi I noticed this pr had a merge conflict so I have resolved that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants