Customise cell parsing for declarative decoder #55

Open
HowlingEverett opened this issue Jul 17, 2023 · 0 comments

HowlingEverett commented Jul 17, 2023

Is your feature request related to a problem?

I am a fan of Codable-style declarative CSV parsing, but I'm running into its limits with my current use case. I'm parsing a nutrient database (a UK public health source). For each nutrient quantity, the dataset gives either a floating-point value or a special code: "N" means "significant but unmeasured quantity" and "Tr" means "trace amounts".

Here's an example subset of an input:

Water (g),Protein (g),Fat (g),Carbohydrate (g),Energy (kJ) (kJ),Starch (g),Total sugars (g),Glucose (g)
76.7,2.9,15.2,0.8,625,Tr,0.8,0.1
9.7,1.3,1.2,Tr,67,0.0,Tr,0.0
84.2,0.2,0.1,Tr,7,0.0,Tr,0.0
93.4,4.0,0.7,0.4,100,Tr,0.3,0.1
8.5,6.1,8.7,N,N,N,N,N

In my use case, I'd basically like to ignore N or Tr values (perhaps defaulting them to 0 in the parsed type), but the parser throws an error and stops as soon as it encounters a cell that can't be parsed as a Double.
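
To make the intended mapping concrete, here is a minimal sketch of the per-cell rule applied to the last sample row above. The helper name `nutrientValue(from:)` is hypothetical and is not part of any library; it only illustrates the behaviour being asked for.

```swift
import Foundation

/// Hypothetical helper: converts one nutrient cell to a Double,
/// treating the dataset's "N" and "Tr" codes as 0.
func nutrientValue(from cell: String) -> Double? {
    let trimmed = cell.trimmingCharacters(in: .whitespaces)
    if trimmed == "N" || trimmed == "Tr" { return 0 }
    // Returns nil for any other text that is not a valid number.
    return Double(trimmed)
}

// Last row of the sample input above.
let sampleRow = "8.5,6.1,8.7,N,N,N,N,N"
let sampleValues = sampleRow.split(separator: ",").map { nutrientValue(from: String($0)) }
```

Returning an optional keeps genuinely malformed cells distinguishable from the two known codes, so only "N" and "Tr" are silently mapped to 0.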

Describe the solution you'd like

Similar to the customisation point for the Decimal parser, it'd be great if we could customise the parsing of types such as Double to handle edge cases in our input data. In my case, I'd convert any value that isn't "N" or "Tr" to a Double, and return 0.0 for those two codes.
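
Until such a customisation point exists, one possible approximation within plain Codable is a wrapper type that decodes the cell as text and applies the rule itself. The sketch below is untested against this library: it assumes the CSV decoder hands each cell to `singleValueContainer()` as a String, and the names `LenientDouble` and `NutrientRow` are hypothetical. `JSONDecoder` stands in for the CSV decoder only so the example runs on its own.

```swift
import Foundation

/// Hypothetical wrapper: a Double that decodes from a text cell and
/// maps the codes "N" and "Tr" to 0.0.
struct LenientDouble: Decodable, Equatable {
    let value: Double

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Assumption: the decoder can provide the raw cell as a String.
        let text = try container.decode(String.self)
        if text == "N" || text == "Tr" {
            value = 0.0
        } else if let number = Double(text) {
            value = number
        } else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected a number, \"N\" or \"Tr\", got \"\(text)\"")
        }
    }
}

/// Hypothetical row type covering two of the sample columns.
struct NutrientRow: Decodable {
    let water: LenientDouble
    let starch: LenientDouble
}

// JSONDecoder is a stand-in here; the values are strings, as CSV cells would be.
let json = Data(#"{"water": "76.7", "starch": "Tr"}"#.utf8)
let decoded = try JSONDecoder().decode(NutrientRow.self, from: json)
```

The cost of this approach is that every field becomes a `LenientDouble` rather than a plain `Double`, which is exactly the kind of friction a decoder-level strategy would avoid.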

Describe alternatives you've considered

I've been able to resolve my issues using the imperative parser, or by pre-processing the CSV whenever I parse it, but it ceases to be a nice declarative interface at that point (and requires loading the whole thing into memory, as my old SwiftCSV implementation did).

The Decimal parser option works, but results in Decimal values - in my case I want simple Doubles.

HowlingEverett added the enhancement (New feature or request) label Jul 17, 2023