Improve error message when pivot_longer() is applied on data with a "name" column #1494

Open
DanChaltiel opened this issue Apr 14, 2023 · 3 comments
Labels: feature (a feature request or enhancement), pivoting ♻️ (pivot rectangular data to different "shapes")

Comments

@DanChaltiel

Hi,

When you apply pivot_longer() to a data frame that already has a column named "name", the error message is confusing:

library(tidyverse)
tibble(name=c("x", "y"), a=c("a", "b"), b=c("c", "d")) %>% 
  pivot_longer(-name)
#> Error in `pivot_longer()`:
#> ! Names must be unique.
#> x These names are duplicated:
#>   * "name" at locations 1 and 2.
#> i Use argument `names_repair` to specify repair strategy.

Created on 2023-04-14 with reprex v2.0.2

Would it be possible to check whether data already has a column with the same name as names_to, and hint that the user should change names_to to fix the problem? The current hint about names_repair doesn't help much here.
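For reference, a minimal sketch of the workaround such a hint would point to: passing an explicit names_to that does not collide with an existing column. The replacement name "variable" is an arbitrary choice for illustration.

```r
library(tidyr)

# Same data as in the reprex: "name" is an identifier column we want to keep.
df <- tibble::tibble(name = c("x", "y"), a = c("a", "b"), b = c("c", "d"))

# The default names_to = "name" would clash with the existing column,
# so pick a different name for the new column ("variable" is arbitrary).
long <- pivot_longer(df, -name, names_to = "variable")

names(long)
#> [1] "name"     "variable" "value"
```

The kept column comes first, followed by the names_to column and then the values_to column.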

@hadley
Member

hadley commented Nov 1, 2023

A bit like #1367, but I think the root cause is the default values for names_to. I'm not sure what we should do about this.

@hadley added the "feature" and "pivoting ♻️" labels on Nov 1, 2023
@DanChaltiel
Author

Here is roughly how I would formulate it:

library(tidyverse)

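pivot_longer2 = function(data, cols, names_to = "name", ...){
  # Existing column names that collide with the requested names_to value(s)
  wrong_names = names(data)[names(data) %in% names_to]
  if(length(wrong_names)>0){
    # Fail early with a message that points at names_to rather than names_repair
    cli::cli_abort("{.arg names_to}={.val {wrong_names}} is already used in {.arg data} column names.")
  }
  # No collision: delegate to tidyr, forwarding the tidyselect expression
  pivot_longer(data=data, cols={{cols}}, names_to=names_to, ...)
}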

tibble(name=c("x", "y"), a=c("a", "b"), b=c("c", "d")) %>% 
  pivot_longer2(-name)
#> Error in `pivot_longer2()`:
#> ! `names_to`="name" is already used in `data` column names.
tibble(name2=c("x", "y"), a=c("a", "b"), b=c("c", "d")) %>% 
  pivot_longer2(-name2)
#> # A tibble: 4 x 3
#>   name2 name  value
#>   <chr> <chr> <chr>
#> 1 x     a     a    
#> 2 x     b     c    
#> 3 y     a     b    
#> 4 y     b     d

Created on 2023-11-02 with reprex v2.0.2
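One possible refinement of the check above, sketched under assumptions: it only looks at columns that are not being pivoted (pivoted columns are dropped from the output, so they should not clash), and it also covers values_to. The helper name check_names_to_clash is hypothetical and not part of tidyr. It uses tidyselect::eval_select() to resolve cols.

```r
library(tidyr)

# Hypothetical helper (not a tidyr function): returns the names of kept
# columns that would collide with the new names_to / values_to columns.
check_names_to_clash <- function(data, cols,
                                 names_to = "name", values_to = "value") {
  # Resolve the tidyselect expression to the names of the pivoted columns
  pivoted <- names(tidyselect::eval_select(rlang::enquo(cols), data))
  # Only columns that survive the pivot can clash with the new columns
  kept <- setdiff(names(data), pivoted)
  # ".value" and NA are special markers in names_to, not output column names
  new_cols <- setdiff(c(names_to, values_to), c(".value", NA))
  intersect(kept, new_cols)
}

df <- tibble::tibble(name = c("x", "y"), a = c("a", "b"), b = c("c", "d"))
clash <- check_names_to_clash(df, -name)
clash
#> [1] "name"
```

A non-empty result could then feed an error message like the one in pivot_longer2(), naming names_to or values_to as the argument to change.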

@jpiaskowski

One solution could be to include a note in the documentation that having a column called "name" will result in an error. At least that tells users what the problem is, so we spend less time scratching our heads and wondering what we did wrong. (This just happened to me today; fortunately, renaming the column is a very easy fix.)
