line number in problems not correct after commented rows. #295

Open
jimhester opened this issue Jan 26, 2021 · 1 comment · May be fixed by #448
Labels: feature (a feature request or enhancement), readr 📖 (issues related to readr compatibility)

Comments

@jimhester
Collaborator

This happens because we aren't keeping track of how many lines were skipped.

@jimhester jimhester added the readr 📖 Issues related to readr compatibility label Jan 26, 2021
@jimhester jimhester changed the title from "line number in problems not correct after commented columns." to "line number in problems not correct after commented rows." Sep 14, 2021
@s-andrews

I think I hit this too, but it plays into a slightly larger issue which makes interpreting the output of problems more difficult.

For example:


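library(tidyverse)
library(vroom)

# Write a two-row CSV whose column A holds a non-numeric value in its second row.
tibble(
  A = c(1, "two"),
  B = c(1, 2)
) %>%
  write_csv("import_bug.csv")

# Force A to be parsed as double so that "two" triggers a parsing problem.
data <- vroom(
  "import_bug.csv",
  col_types = cols(A = col_double())
)

problems(data)

data %>% slice(3)

data %>% slice(2)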

Here problems reports a problem on row 3, but when you look at row 3 in the data there's no problem. That's because the reported number counts the header line: the failure is actually in row 2 of the tibble, which is line 3 of the file.

# Same data, but with a leading row that will be skipped as a comment.
tibble(
  A = c("#skip", 1, "two"),
  B = c(0, 1, 2)
) %>%
  write_csv("import_bug2.csv")

data2 <- vroom(
  "import_bug2.csv",
  col_types = cols(A = col_double()),
  comment = "#"
)

problems(data2)

data2 %>% slice(3)
data2 %>% slice(2)

In this instance problems still reports the issue on row 3, but now 3 is neither the row in the tibble (row 2) nor the line in the file (line 4).

Ideally, problems would report both the row (the row in the returned tibble) and the line (the line in the parsed file) where the parsing failure occurred.

Note that the row numbers reported by problems are also thrown off by the skip_empty_rows and skip arguments.
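As a possible stopgap, the row-to-line mapping can be rebuilt from the raw file. The sketch below is not part of vroom or readr: data_row_lines is a hypothetical helper written in base R. It assumes one record per line (no quoted embedded newlines), that a comment only counts when it starts the line, that skip is applied to raw lines before comments are dropped, and that the header is the first line that survives. The file it writes mirrors the second example above.

```r
# Hypothetical helper (not a vroom/readr function): for every data row that
# ends up in the tibble, return the 1-based file line it came from.
data_row_lines <- function(path, comment = "", skip = 0,
                           skip_empty_rows = TRUE, col_names = TRUE) {
  lines <- readLines(path, warn = FALSE)
  line_no <- seq_along(lines)

  # Drop the first `skip` raw lines.
  keep <- line_no > skip

  # Drop lines that start with the comment prefix (assumption: whole-line comments only).
  if (nzchar(comment)) {
    keep <- keep & !startsWith(lines, comment)
  }

  # Drop lines that are empty or contain only whitespace.
  if (skip_empty_rows) {
    keep <- keep & nzchar(trimws(lines))
  }

  kept <- line_no[keep]

  # The first surviving line is the header, so it is not a data row.
  if (col_names && length(kept) > 0) {
    kept <- kept[-1]
  }
  kept
}

# Same content as import_bug2.csv, written to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("A,B", "#skip,0", "1,1", "two,2"), path)

row_lines <- data_row_lines(path, comment = "#")
row_lines     # file line of each tibble row: 3 4
row_lines[2]  # tibble row 2 (the failing one) sits on file line 4
```

Indexing the result by tibble row gives the file line, and match(line, row_lines) goes the other way, so both numbers can be shown side by side whichever one problems happens to report.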

@jimhester jimhester added the feature a feature request or enhancement label Nov 9, 2021
@sbearrows sbearrows self-assigned this Apr 7, 2022
@sbearrows sbearrows linked a pull request Jun 22, 2022 that will close this issue