Fails to parse record with unescaped parenthesis #79

Open
marvin-yorke opened this issue Mar 5, 2015 · 10 comments

@marvin-yorke

Given the record

16681;6;Orehovyj boulevard, ul. Musy Dzhalilja (odd side);20;out;55.6141571054;37.7460757208;800;34;34;0;0;0;0;0;1

the library fails to parse the 3rd field with the following error:

Unexpected delimiter. Expected ';' (0x3B), but got '(' (0x28)

Is there any way to parse this data without altering it (e.g. by adding quotes)?

@davedelong
Owner

Thanks for reporting this. I added a unit test to parse the exact text you provided, and it seems to have no problem with it. I tried parsing it both as an in-memory string and as a file written to disk (which is similar to what your code was doing). Both tests pass without modification to the parser, so I'm not sure what the issue is here.

@davedelong
Owner

Are the URLs you're parsing remote (coming in over a network connection) or local file URLs? Do you have an example of either that I could try?

@davedelong davedelong self-assigned this Mar 7, 2015
@marvin-yorke
Author

Hi Dave,
I'm downloading an archive from the server, unpacking it into the Documents directory, and supplying a URL to the file in the Documents directory.
You can find the files I'm using in the following archive: http://metro4all.org/data/msk.zip
The file in which I've encountered the problem is portals_ru.csv

davedelong added a commit that referenced this issue Mar 7, 2015
@davedelong
Owner

Thanks @marvin-yorke. I incorporated the portals.csv file into the unit tests, but they're still passing on my machine. 😕

@marvin-yorke
Author

Hm, OK, I've cloned the repo and run the tests, and they pass on my machine too. I should have mentioned that the original case was observed on iOS, not OS X. Could this make any difference? Also, I installed the library from CocoaPods, not from GitHub, although there's no major difference from the latest code. Anyway, I'll try again with my iOS app and let you know the results.

@marvin-yorke
Author

I've checked the issue again, and here's the line that breaks the parsing:
17530;2;"Крокус Экспо" (павильон 1, 2);215;both;55.8235522598;37.3855503584;800;56;0;0;0;400;950;23;0
It turns out that it's not the parentheses that cause the issue, but the quotes. Now I'm not sure whether it's a parser problem or whether my data is malformed. What do you think?

@davedelong
Owner

Yes, that is a problem with the data. When the parser encounters a field that starts with ", it assumes the field ends with the corresponding closing ". And then since the next character after the closing " isn't a delimiter (;), it aborts with an error.
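
To make that rule concrete, here is a minimal Python sketch of a strict field splitter. It is not CHCSVParser's actual code: `split_record` and `ParseError` are hypothetical names, and the error wording is my own. It only illustrates why a field that opens with `"` must close right before a delimiter.

```python
class ParseError(Exception):
    """Raised when a record violates the strict quoting rule (hypothetical name)."""


def split_record(line, delimiter=";", quote='"'):
    """Split one record into fields; a field that starts with a quote
    must end with a closing quote followed by a delimiter or end of record."""
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == quote:
            # Quoted field: scan to the closing quote.
            i += 1
            chars = []
            while True:
                if i >= n:
                    raise ParseError("unterminated quoted field")
                if line[i] == quote:
                    if i + 1 < n and line[i + 1] == quote:
                        # A doubled quote is a literal quote character.
                        chars.append(quote)
                        i += 2
                        continue
                    i += 1  # consume the closing quote
                    break
                chars.append(line[i])
                i += 1
            # Only a delimiter (or end of record) may follow the closing quote.
            if i < n and line[i] != delimiter:
                raise ParseError(
                    "expected %r after closing quote, got %r" % (delimiter, line[i])
                )
            fields.append("".join(chars))
        else:
            # Unquoted field: everything up to the next delimiter.
            j = line.find(delimiter, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            break
        i += 1  # skip the delimiter
    return fields
```

Under this rule the parentheses in the first record are harmless, while `"Крокус Экспо" (павильон 1, 2)` fails because text follows the closing quote.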

@marvin-yorke
Author

Then could you please help me work out how to correct my data?

@danieljfarrell

The solution seems pretty clear: don't start a field with quoted text; or, if a field starts with quoted text, wrap the whole field in quotes.
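
If the file can be preprocessed, wrapping the field can be automated. The following is a rough Python sketch, not part of the library; `requote_line` is a hypothetical helper, and it assumes that no field contains the `;` delimiter, so a plain split is safe. It also doubles the inner quotes, which is the usual CSV convention for a literal quote inside a quoted field.

```python
import csv
import io


def requote_line(line, delimiter=";"):
    """Rewrite one record so that any field containing a quote is wrapped
    in quotes, with its inner quotes doubled."""
    # Assumption: fields never contain the delimiter, so split() is safe.
    fields = line.split(delimiter)
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=delimiter,
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only the fields that need it
        lineterminator="\n",
    )
    writer.writerow(fields)
    return out.getvalue()[:-1]  # drop the trailing newline


fixed = requote_line('17530;2;"Крокус Экспо" (павильон 1, 2);215;both')
print(fixed)
```

Records without quotes pass through unchanged, so the helper can be applied to every line of the file.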

@h3dkandi

Is there a property that can turn off this behavior, or some workaround that doesn't require me to edit the file I'm parsing?

Edit: added one, and it seems to work fine now :)
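
For comparison only, this is what "turning off" quote handling looks like in Python's standard `csv` module. It says nothing about which options CHCSVParser exposes; `read_unquoted` is a hypothetical helper name. With `csv.QUOTE_NONE` the reader treats `"` as an ordinary character, so the file parses unmodified, at the cost that a field can then never contain the delimiter.

```python
import csv


def read_unquoted(lines, delimiter=";"):
    """Parse records with quote processing disabled: quotes stay in the
    field text and only the delimiter separates fields."""
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_unquoted([
    '17530;2;"Крокус Экспо" (павильон 1, 2);215;both;55.8235522598;'
    '37.3855503584;800;56;0;0;0;400;950;23;0'
])
print(rows[0][2])
```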
