read_delim parsing issue with compressed file #1534

Open
rvalieris opened this issue Mar 23, 2024 · 0 comments
Reading the attached file with read_delim produces the following parsing warning:

Attached file: f.log.gz

r$> a = read_delim('f.log.gz', delim=' | ',col_names=F,col_types='cccc')
Warning message:
One or more parsing issues, call `problems()` on your data frame for details, e.g.:
  dat <- vroom(...)
  problems(dat)

r$> problems(a)
# A tibble: 1 × 5
    row   col expected  actual    file
  <int> <int> <chr>     <chr>     <chr>
1  1494     3 4 columns 3 columns ""

r$> a[1494,]
# A tibble: 1 × 4
  X1    X2              X3                                        X4
  <chr> <chr>           <chr>                                     <chr>
1 15:26 07 | 三特東喰赤 いのけん(+33) まあぷ(-10) 陸奥陽之助(-23)       NA

However, the indicated line does have 4 columns (note the ` | ` delimiter left unsplit inside the X2 value), and if I decompress the file before calling read_delim, it parses fine.

I could not reduce the file any further and still reproduce the issue, so the problem does not seem to be tied to that specific line.
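
For anyone hitting the same thing, the decompress-first workaround mentioned above can be sketched roughly as follows. This is only a minimal sketch: `gunzip_to_temp` is a hypothetical helper written for illustration (it is not part of readr), it uses base R connections to copy the decompressed bytes unchanged, and the `read_delim` call simply repeats the arguments from the report.

```r
# Hypothetical helper (not part of readr): decompress a .gz file to a
# temporary plain-text file, byte for byte, using base R connections.
gunzip_to_temp <- function(path) {
  tmp <- tempfile(fileext = ".log")
  src <- gzfile(path, open = "rb")   # reads decompressed bytes
  dst <- file(tmp, open = "wb")      # plain output file
  on.exit({
    close(src)
    close(dst)
  })
  repeat {
    chunk <- readBin(src, what = "raw", n = 65536L)
    if (length(chunk) == 0L) break   # end of input
    writeBin(chunk, dst)
  }
  tmp
}

# Usage with the file from this report (skipped if the file is absent).
if (file.exists("f.log.gz")) {
  plain <- gunzip_to_temp("f.log.gz")
  a <- readr::read_delim(plain, delim = " | ", col_names = FALSE,
                         col_types = "cccc")
  print(readr::problems(a))
}
```

Copying raw bytes rather than going through `readLines`/`writeLines` avoids any re-encoding of the non-ASCII text, so the temporary file should match what a command-line decompression would produce.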

Env info:
Linux
R 4.3.3
readr 2.1.5
