Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorise replacement function #462

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Vectorise replacement function #462

wants to merge 14 commits into from

Conversation

hadley
Copy link
Member

@hadley hadley commented Oct 4, 2022

This breaks 7 revdeps but I think it's worth it because the performance improvement is immense.

@hadley
Copy link
Member Author

hadley commented Oct 4, 2022

It also breaks rmarkdown 😬

Error in `if (grepl("^[.][.]", in_file)) ...`:
! the condition has length > 1
---
Backtrace:
     ▆
  1. └─pkgdown::build_article("from-base", new_process = FALSE)
  2.   └─pkgdown:::render_rmarkdown(...) at pkgdown/R/build-articles.R:281:2
  3.     ├─rlang::inject(rmarkdown::render(!!!args)) at pkgdown/R/rmarkdown.R:52:4
  4.     └─rmarkdown::render(...) at rlang/R/eval.R:195:2
  5.       └─output_format$post_processor(...)
  6.         └─rmarkdown (local) base(metadata, input_file, output_file, ...)
  7.           └─rmarkdown:::process_images(output_str, image_relative)
  8.             └─rmarkdown:::process_html_res(...)
  9.               └─stringr::str_replace_all(html, reg, process_img_src)
 10.                 └─stringr:::str_transform_all(string, pattern, replacement) at stringr/R/replace.r:91:4
 11.                   └─rmarkdown (local) replacement(old_flat) at stringr/R/replace.r:201:4
 12.                     └─rmarkdown (local) processor(img_src, src)

@hadley
Copy link
Member Author

hadley commented Oct 4, 2022

rmarkdown patch in rstudio/rmarkdown#2416. Breaking rmarkdown is a big deal, so I don't think we should merge this until after rmarkdown is released, so it probably won't make it for this release.

@DavisVaughan
Copy link
Member

DavisVaughan commented Oct 4, 2022

I guess pkgdown is failing because rmarkdown is broken? Very meta - ah yes i see that in your backtrace

Copy link
Member

@DavisVaughan DavisVaughan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the broken packages are like rmarkdown? They didn't expect multiple strings at once?

R/replace.R Outdated Show resolved Hide resolved
R/replace.R Outdated

# unchop list into a vector, apply replacement(), and then rechop back into
# a list
old_flat <- vctrs::vec_unchop(old, idx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how much you value readability over performance, you don't actually need to pass idx here since you are unchopping in the order the input was provided

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dev vctrs:

xs <- list("a", c("b", "c"), c("d", "e", "f"))

x <- list_unchop(xs)
x <- toupper(x)
x
#> [1] "A" "B" "C" "D" "E" "F"

sizes <- list_sizes(xs)
vec_partition(x, sizes)

Comment on lines +197 to +199
# minor optimisation to avoid problems with the many replacement
# functions that use paste
new_flat <- character()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternate idea is to actually apply replacement() and then check if length(new_flat) == 1 && length(old_flat) == 0. In that special case you could issue a warning (possibly even mentioning paste0()) and "fix" it for them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could do that, I just don't think it's worth the effort since we do know what the correct answer is in the 0 element case.

R/view.R Outdated Show resolved Hide resolved
tests/testthat/_snaps/replace.md Outdated Show resolved Hide resolved
@hadley
Copy link
Member Author

hadley commented Oct 5, 2022

Yeah, the failures seem to be cases that expected a single string. So there's always a trivial fix, which is to vectorise with vapply() or similar. When I do the revdep fixes, I'll try and vectorise properly so they can benefit from the performance boost.

@hadley
Copy link
Member Author

hadley commented Oct 20, 2022

Fixed in rmarkdown 2.17, but I think this will have to wait until the next release since otherwise installing stringr will break older (i.e. most) rmarkdown installed.

#Conflicts:
#	NEWS.md
#	R/replace.R
#	man/str_replace.Rd
#	revdep/README.md
#	revdep/cran.md
#	revdep/problems.md
#	tests/testthat/_snaps/replace.md
#	tests/testthat/test-replace.R
@hadley hadley added the breaking change ☠️ API change likely to affect existing code label Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change ☠️ API change likely to affect existing code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants