NA handling in unite #203
Well, I think |
I think in some cases the omit NA option could be useful. My
So I was trying to combine the columns into one and remove the NA to see things better.
The solution is not hard, just not quite as tidy. |
This is not the requested solution, but a clean way to get the desired result is:
library(tidyverse)
df <- tribble(
~ID, ~d1, ~d2,
1, "G", "G",
2, "A", "G",
3, "A", "A",
4, "G", "A",
5, NA, NA,
6, "G", "G")
df %>%
replace_na(list(d1 = "", d2 = "")) %>%
unite(new, d1, d2, remove = FALSE, sep = "")
#> # A tibble: 6 × 4
#> ID new d1 d2
#> * <dbl> <chr> <chr> <chr>
#> 1 1 GG G G
#> 2 2 AG A G
#> 3 3 AA A A
#> 4 4 GA G A
#> 5 5
#> 6 6 GG G G
I'm not convinced that
library(tidyr)
example <- tibble::tibble(x = c('foo', 'foo bar', 'foo bar baz'))
example %>% separate(x, c('foo', 'bar', 'baz'), fill = 'right') # without `fill = 'right'` same result with a message
#> # A tibble: 3 x 3
#> foo bar baz
#> * <chr> <chr> <chr>
#> 1 foo <NA> <NA>
#> 2 foo bar <NA>
#> 3 foo bar baz
example %>%
separate(x, c('foo', 'bar', 'baz'), fill = 'right') %>%
unite(x, foo:baz, sep = ' ')
#> # A tibble: 3 x 1
#> x
#> * <chr>
#> 1 foo NA NA
#> 2 foo bar NA
#> 3 foo bar baz
If
There are actually two feature requests in this thread:
2. seems like the more useful option, so I will implement that. @alexpghayes the plan is to extract a general helper for turning the vectorised functions that power many tidyr functions into a tibblicious version.
Minimal reprex
library(tidyr)
df <- expand_grid(x = c("a", NA), y = c("b", NA))
unite(df, z, c("x", "y"), remove = FALSE)
#> # A tibble: 4 x 3
#> z x y
#> <chr> <chr> <chr>
#> 1 a_b a b
#> 2 a_NA a <NA>
#> 3 NA_b <NA> b
#> 4 NA_NA <NA> <NA>
Created on 2019-03-07 by the reprex package (v0.2.1.9000)
Note that you'll need
library(tidyr)
df <- expand_grid(x = c("a", NA), y = c("b", NA))
df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
#> # A tibble: 4 x 3
#> z x y
#> <chr> <chr> <chr>
#> 1 a_b a b
#> 2 a a <NA>
#> 3 b <NA> b
#> 4 "" <NA> <NA>
Created on 2019-03-07 by the reprex package (v0.2.1.9000)
Hi @hadley, I am having trouble getting na.rm = TRUE to work within the unite() function. I tried the following:
> library("tidyr")
> df <- expand.grid(x = c("a", NA), y = c("b", NA))
> df
x y
1 a b
2 <NA> b
3 a <NA>
4 <NA> <NA>
> df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
Error: `TRUE` must evaluate to column positions or names, not a logical vector
Call `rlang::last_error()` to see a backtrace
Which gives me this error:
Error: `TRUE` must evaluate to column positions or names, not a logical vector
Call `rlang::last_error()` to see a backtrace
Backtracing error:
> rlang::last_error()
<error>
message: `TRUE` must evaluate to column positions or names, not a logical vector
class: `rlang_error`
backtrace:
1. tidyr::unite(., "z", x:y, na.rm = TRUE, remove = FALSE)
10. tidyselect::vars_select(colnames(data), ...)
11. tidyselect:::bad_calls(bad, "must evaluate to { singular(.vars) } positions or names, \\\n not { first_type }")
12. tidyselect:::glubort(fmt_calls(calls), ..., .envir = .envir)
13. tidyr::unite(., "z", x:y, na.rm = TRUE, remove = FALSE)
Call `rlang::last_trace()` to see the full backtrace
> rlang::last_trace()
x
1. \-df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
2. +-base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
4. \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
5. \-global::`_fseq`(`_lhs`)
6. \-magrittr::freduce(value, `_function_list`)
7. +-base::withVisible(function_list[[k]](value))
8. \-function_list[[k]](value)
9. +-tidyr::unite(., "z", x:y, na.rm = TRUE, remove = FALSE)
10. \-tidyr:::unite.data.frame(., "z", x:y, na.rm = TRUE, remove = FALSE)
11. \-tidyselect::vars_select(colnames(data), ...)
12. \-tidyselect:::bad_calls(bad, "must evaluate to { singular(.vars) } positions or names, \\\n not { first_type }")
13. \-tidyselect:::glubort(fmt_calls(calls), ..., .envir = .envir)
@kasperav you probably have not installed the development version of tidyr. |
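For anyone hitting the same error, one way to tell whether the installed tidyr supports the argument is to inspect the arguments of unite(). This is only a minimal sketch: the helper name has_na_rm is made up for illustration, and it assumes that a build without na.rm simply lacks that formal argument.

```r
library(tidyr)

# Hypothetical helper (not part of tidyr): TRUE if the installed
# unite() generic exposes an na.rm argument.
has_na_rm <- function() {
  "na.rm" %in% names(formals(tidyr::unite))
}

packageVersion("tidyr")  # version that is actually loaded
has_na_rm()
```

If this returns FALSE, na.rm = TRUE is passed through `...` and treated as a column selection, which is consistent with the "must evaluate to column positions or names" error above.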
@hadley you are right! I have no luck with installing the dev version, so I'll wait for this to be implemented in a CRAN version of tidyr :) |
FWIW, I found the behavior where unite takes two NA values and produces an empty string to be very confusing and unexpected. Seems clear to me that uniting two NA values should produce an NA value. I'm guessing this is clearer to people who have used |
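If an all-NA row should stay NA rather than become an empty string, one possible workaround is to convert the empty strings back afterwards. The sketch below assumes a tidyr version where unite() accepts na.rm, and uses dplyr::na_if(), which is not mentioned elsewhere in this thread.

```r
library(tidyr)
library(dplyr)

df <- expand_grid(x = c("a", NA), y = c("b", NA))

out <- df %>%
  unite("z", x:y, na.rm = TRUE, remove = FALSE) %>%
  # na.rm = TRUE leaves "" when every input is NA; map that back to NA
  mutate(z = na_if(z, ""))

out
```

With this data, z should come out as a_b, a, b and NA. The caveat is that a row whose inputs are all genuinely empty strings would also be turned into NA, so this only fits when "" is not a meaningful value.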
I have a use case where I need to use
|
What version are you using? That's not the result I get (on 1.0.2.9000)
suppressPackageStartupMessages(require(tidyverse))
df_notwork <- expand_grid(x = c("a", NA), y = c(NA, NA))
df_notwork %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
#> # A tibble: 4 x 3
#> z x y
#> <chr> <chr> <lgl>
#> 1 "a" a NA
#> 2 "a" a NA
#> 3 "" <NA> NA
#> 4 "" <NA> NA
Created on 2020-02-25 by the reprex package (v0.3.0)
I am using a newer version.
|
|
Oh, sorry I saw that you were loading Even with updating tidyr using the GitHub version, I still have that issue. Maybe it is another out-of-date package?
|
Interesting. I'm not sure why we are getting different results. Regardless, it looks to me as if your Yes, I would try updating your other packages and see if that solves it. But since both |
It appears that my version of
|
Hello, I've updated to all the latest versions of the packages (tidyr 1.0.2.900, tidyselect 1.0.0) and I'm still getting the same error. I tried Lindsay's df_notwork, and get the same version as what she has prior to the updates. Any help would be appreciated! |
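@anjaollodart - perhaps you can try updating additional packages that tidyr depends on. It's just a guess, but the need to separately update tidyselect from tidyr was surprising to me, so maybe there is another package dependency that has the same issue.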
Dear Lindsay,
I solved this issue on my own when I used my own data frame (not one that
was in the example). And as soon as I did it, in 10-15 minutes, I deleted
the comment because the issue was not about this function.
It is strange that GitHub still put this comment through.
Thank you,
Julia
…On Thu, Apr 2, 2020 at 4:02 PM Lindsay (Carr) Platt < ***@***.***> wrote:
@anjaollodart <https://github.com/anjaollodart> - perhaps you can try
updating additional packages that tidyr depends on. It's just a guess,
but the need to separately update tidyselect from tidyr was surprising to
me, so maybe there is another package dependency that has the same issue.
|
Consider the following df:
When uniting d1 and d2:
tidyr::unite(df, new, d1, d2, remove = FALSE, sep = "")
Row 5 gives NANA instead of the expected NA.