Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error message if parsing fails #79

Open
benswick opened this issue Jul 14, 2017 · 6 comments
Open

Error message if parsing fails #79

benswick opened this issue Jul 14, 2017 · 6 comments

Comments

@benswick
Copy link
Contributor

Currently, when parsing fails a NULL value is returned and my software can only log a "something went wrong" message. It would be great to be able to log the line number and line contents that caused the failure.

I apologize for not reading the bottom of the readme about creating issues before a pull request. Pull request #78 shows a minimally invasive way to implement this. It returns a JSONError value with an error message in the string field.

@realnol
Copy link

realnol commented Jul 20, 2017

you should print log

@mbelicki
Copy link

This is actually quite nice solution (extending JSONValue union with error case).

I don't understand however why you are creating an error message in parson. In my opinion the library should try not handle any end user communication. It should return structure with line and column numbers and preferably a type of error. With this information the application which is using the library can form an error message in whatever way it finds suitable.

Please note that there are also some minor style issues in your patch:

  • char* name instead of char *name;
  • lack of consequence in checking null pointers: ptr == NULL vs !ptr;
  • although guaranteed to be 1, parson is using sizeof(char) in string allocations (which arguably influences readability).

@benswick
Copy link
Contributor Author

@mbelicki I don't understand however why you are creating an error message in parson. In my opinion the library should try not handle any end user communication. It should return structure with line and column numbers and preferably a type of error. With this information the application which is using the library can form an error message in whatever way it finds suitable.

The API for that might look something like this:

typedef struct json_error_t JSON_Error;

enum json_error_type {
    JSONError_Unidentified = -1,
};
typedef int JSON_Error_Type;

JSON_Error * json_value_get_error(const JSON_Value *value);

int    json_error_get_type(const JSON_Error *error);
size_t json_error_get_line_number(const JSON_Error *error);
size_t json_error_get_column(const JSON_Error *error);
/* Offset from the beginning of the file or string */
size_t json_error_get_offset(const JSON_Error *error);

In the future, new json_error_type enumeration values could be added to express the variety of errors that are possible.

I have cleaned up the style inconsistencies in PR #78. I also changed remove_comments() to retain newlines so line numbers remain correct.

@kgabis
Copy link
Owner

kgabis commented Jul 22, 2017

Sorry for taking so long to respond, but I've been quite busy recently.
I agree that there should be some information why parsing failed and where, returning NULL provides too little feedback.

Instead of extending existing function a nicer solution would be to add a new one. I really don't want people to use a different API based on a compiler flag.

Example:

enum json_parse_error_t {
    JSON_PARSE_ERROR_NONE = 0,
    JSON_PARSE_ERROR_MISSING_BRACKET,
    ...
};
typedef struct json_parse_error_t {
    enum json_parse_error_t error;
    unsigned int line;
    unsigned int line_offset;
} JSON_Parse_Error;

typedef struct json_parse_option_t {
    int accept_comments;
} JSON_Parse_Options;

JSON_Value* json_parse(const char *string, JSON_Parse_Options parse_options, JSON_Parse_Error *err); /* err is optional and can be NULL */

It's only a proposal though.

The real issue is that the API feels a bit dated and inconsistent in some places. It's been growing gradually over last 5 years and a cleanup would be nice at some point in future (parson 2.0?). I cannot guarantee any timeline, but it's something I keep at the back of my mind.

Anyway, I'd rather not accept your pull request (#78), because I don't want json_parse_string to behave differently based on a compiler flag and I think a new json_parse function is a better solution. Also, it's going to be necessary to rewrite parts of parsing code to remove comments and parse string in a single pass (it also gets rid of a malloc(sizeof(str_parsed)) that serves as a temp buffer) to get actual offset within the parsed line.

I'm going to keep this issue open because other people might have a similar problem and it needs to be resolved at some point.

@benswick
Copy link
Contributor Author

I don't mind the closed pull request. It was just a starting point for what this could look like. I should have just referenced the branch on my fork.

I would suggest including the offset of the error from the beginning of the string as well. It makes it easy for applications to directly show the text that caused the failure. With only the line and offset they would have to rescan the string to count lines.

What kind of ABI stability are you going for with parson? Adding an option to the proposed JSON_Parse_Options struct would change its size.

Some option APIs:

  • Make the options struct opaque and treat a NULL options argument as using the defaults.
typedef struct json_parse_option_t JSON_Parse_Options;

JSON_Parse_Options * json_options_init(void);

void json_options_allow_comments(JSON_Parse_Options *options, int allow_comments);

void json_options_free(JSON_Parse_Options *options);
  • Make the options global and add functions like void set_json_option_allow_comments(int allow); or a function like void set_json_option(enum JSON_Option_Type option, int value). Not multi-threading friendly.
  • Require apps to recompile to use the new version. Anyone compiling it as a library would probably dislike this option.
  • If all the options are binary flags it could hold an unsigned int with each option corresponding to a bit. I could imagine wanting to set nesting and capacity limits through options, though.

A few more awkward APIs I've seen.

  • Pass a string and parse the options out.
  • The user could pass the size of the struct (as an argument or the first member of the struct) and a new version of the library would know which version it is by the size. If the library was compiled with a smaller version it could use only what it understands or return an error.

Technically, the proposed JSON_Parse_Error struct has the same issue. But I'm having a hard time coming up with plausible changes to it other than adding the absolute string offset. Maybe using size_t to support really large files.

@benswick
Copy link
Contributor Author

I've implemented the proposed API with a few adjustments on my parse-errors-v2 branch 880a058 to see what it would look like.

Changes:

  • The line and line_offset variables are size_t in case anyone uses ridiculously large files.
  • Added a byte_offset variable that is from the start of the string or file so error messages can easily include the text that caused the error.
  • Added a string_is_file_path option so json_parse() can parse strings or files.
  • Added a parse_outer_value() function that rejects trailing text after the first value. It is only used by json_parse() so users of the existing functions do not get new behavior. It could be made optional or removed if that level of conformance is not desired.

Basically, I added a JSON_Parse_Error* argument to all of the parse functions. Whenever they would return NULL they also set the type of error.

While updating the tests, I noticed some test_suite_3 failures were for different reasons than the comments indicated. These three have "control character" comments after them.

  • "[\"\\\"]" is the string ["\"] and returns an unterminated string error.
  • "[\"\"\"]" is the string ["""] and returns a missing comma error.
  • "[\"\0\"]" is the string [" and returns an unterminated string error.

I added tests so each error type has at least one test. I also added tests for the three Parse Validation failures in issue #53 that are fixed by using parse_outer_value(). If a test fails it prints the error type that was returned to make it easier to see if the parsing code was wrong or if the expected error type was wrong.

Repository owner deleted a comment from yarf May 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants