Error with dec=',', sep = 'auto' but detected as ',' #4483

Closed
MichaelChirico opened this issue May 24, 2020 · 2 comments · Fixed by #4495

@MichaelChirico
Member

Possibly related: #2750

fread('A,B,C\n1,+,4\n2,-,5\n3,-,6\n', dec=',', verbose=TRUE)

Stopped early on line 2. Expected 3 fields but found 3. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<1,+,4>>

With dec=',', the text +,4 is read as the single number +.4.

But by that point we've already detected sep=','. I'm not sure what the correct behavior is for a manually specified dec=','.

In any case, the error message is strange: "Expected 3 fields but found 3".
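
To make the +,4 misread concrete, here is a minimal sketch of a decimal parser with a configurable decimal mark. It is a toy and not fread's actual parser; the name parse_decimal and its signature are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy parser (hypothetical, not fread's): reads [+-]digits[dec digits] from s.
   Returns the number of characters consumed, or 0 if no digit was seen. */
int parse_decimal(const char *s, char dec, double *out) {
    assert(s != NULL && out != NULL);
    const char *p = s;
    double sign = 1.0, value = 0.0, scale = 0.1;
    int ndigits = 0;

    if (*p == '+' || *p == '-') {
        if (*p == '-') sign = -1.0;
        p++;
    }
    while (isdigit((unsigned char)*p)) {
        value = value * 10.0 + (*p - '0');
        p++;
        ndigits++;
    }
    if (*p == dec) {
        /* Only consume the decimal mark when at least one digit follows it. */
        const char *q = p + 1;
        int nfrac = 0;
        while (isdigit((unsigned char)*q)) {
            value += (*q - '0') * scale;
            scale /= 10.0;
            q++;
            nfrac++;
        }
        if (nfrac > 0) {
            p = q;
            ndigits += nfrac;
        }
    }
    if (ndigits == 0) return 0;   /* a bare sign is not a number */
    *out = sign * value;
    return (int)(p - s);
}
```

With dec=',' this toy consumes all three characters of "+,4" as one number, so the comma never gets a chance to act as a separator. With dec='.' it consumes nothing, and "+" stays a plain text field.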

@st-pasha
Contributor

st-pasha commented May 25, 2020

When dec=',' and sep=',', we should not attempt to use the float parser at all, unless the field is quoted.
Otherwise we'll end up reading lines like 1,2,3,4 as two numbers 1.2 and 3.4, which is rather insane...
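
A rough sketch of that failure mode, assuming a greedy reader that always tries the number parser first. The helper names skip_number and count_numeric_fields are hypothetical and only model the behaviour described above; they are not taken from fread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical greedy scanner: advance past digits[dec digits].
   Returns p unchanged if the text does not start with a digit. */
static const char *skip_number(const char *p, char dec) {
    const char *start = p;
    while (isdigit((unsigned char)*p)) p++;
    if (p == start) return start;
    if (*p == dec && isdigit((unsigned char)p[1])) {
        p++;                                   /* swallow the decimal mark */
        while (isdigit((unsigned char)*p)) p++;
    }
    return p;
}

/* Count the numeric fields a greedy reader sees on one line.
   Returns -1 if a field is not numeric. */
int count_numeric_fields(const char *line, char sep, char dec) {
    assert(line != NULL);
    int nfields = 0;
    const char *p = line;
    while (*p != '\0') {
        const char *end = skip_number(p, dec);
        if (end == p) return -1;               /* not a number */
        nfields++;
        p = end;
        if (*p == sep) p++;                    /* step over the separator */
        else if (*p != '\0') return -1;        /* junk after the number */
    }
    return nfields;
}
```

Under these assumptions, count_numeric_fields("1,2,3,4", ',', ',') gives 2 fields (1.2 and 3.4), while the same line with dec='.' gives 4.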

@MichaelChirico
Member Author

> rather insane...

I actually found a file on a Spanish government website for which that's the correct parsing 😂 but yes, it is indeed insane as a general rule.

This issue is slightly different. We already block sep==dec:

if (args.sep == dec) STOP(_("sep == dec ('%c') is not allowed"), dec);

What happens here is that, since sep detection comes first, we currently don't enforce sep != dec when sep starts out as 'auto' but ends up detected as ','.
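
To illustrate the gap, here is a small self-contained sketch. Everything in it is a stand-in: SEP_AUTO as '\0', the toy detect_sep, and resolve_sep are assumptions made for the example, and the real detection in fread is far more involved. It only shows that a check on the sep argument as supplied cannot catch a sep that is filled in later, and that repeating the check after detection would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SEP_AUTO '\0'   /* assumption: 'auto' is represented as NUL in this sketch */

/* Toy stand-in for sep detection: first candidate that occurs in the sample. */
static char detect_sep(const char *sample, const char *candidates) {
    for (const char *c = candidates; *c != '\0'; c++)
        if (strchr(sample, *c) != NULL) return *c;
    return '\0';
}

/* Returns true if the (sep, dec) pair is accepted and stores the resolved
   sep in *sep_out; returns false where the real code would raise an error. */
bool resolve_sep(const char *sample, char sep_arg, char dec, char *sep_out) {
    assert(sample != NULL && sep_out != NULL);
    if (sep_arg == dec) return false;   /* check on the argument: blind to SEP_AUTO */
    char sep = (sep_arg == SEP_AUTO) ? detect_sep(sample, ",|;\t ") : sep_arg;
    if (sep == dec) return false;       /* same check, repeated once sep is known */
    *sep_out = sep;
    return true;
}
```

Without the second check, the input from the original report would pass with sep=',' and dec=',' at the same time.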

Maybe the best option is to change this line:

char seps[]=",|;\t "; // default seps in order of preference. See ?fread.

to

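char seps[] = ",|;\t "; // default seps in order of preference. See ?fread.
if (dec == ',') memmove(seps, seps + 1, strlen(seps)); // drop the leading ','; C cannot initialize a char[] from a conditional expression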

i.e. if the user has manually specified dec=',', prevent sep='auto' from finding sep=','.
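
A compilable sketch of that idea, under assumptions: standard C does not accept a conditional expression as the initializer of a char array, so this version fills a writable buffer instead. DEFAULT_SEPS and candidate_seps are names invented here, not identifiers from fread.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Default seps in order of preference, as in the line quoted above. */
static const char DEFAULT_SEPS[] = ",|;\t ";

/* Copy the candidate separators into buf, leaving out the one equal to dec.
   buf must have room for sizeof DEFAULT_SEPS bytes.
   Returns the number of candidates written (excluding the terminating NUL). */
size_t candidate_seps(char dec, char *buf) {
    assert(buf != NULL);
    size_t n = 0;
    for (const char *c = DEFAULT_SEPS; *c != '\0'; c++)
        if (*c != dec) buf[n++] = *c;   /* preference order is preserved */
    buf[n] = '\0';
    return n;
}
```

With dec=',' the buffer holds the four candidates "|;\t ", so auto-detection can never settle on ','. With dec='.' all five defaults remain.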

@mattdowle mattdowle added this to the 1.14.1 milestone Aug 20, 2021
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023