Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Colander and None during deserialization? #276

Open
tsauerwein opened this issue Oct 28, 2016 · 22 comments
Open

Colander and None during deserialization? #276

tsauerwein opened this issue Oct 28, 2016 · 22 comments

Comments

@tsauerwein
Copy link

tsauerwein commented Oct 28, 2016

I am confused about the way Colander handles None values (or not handles None values) during the deserialization. For example:

class Person(colander.MappingSchema):
    name = colander.SchemaNode(colander.String(), missing=None)

schema = Person()
print(schema.deserialize({'name': None}))
# {'name': None}

In this case Colander accepts a None value, and keeps it because missing is set to None. This sounds reasonable to me.

Another example using a sequence:

class NoneSequence(colander.MappingSchema):
    foo = colander.SchemaNode(colander.String())
    bar = colander.SchemaNode(
        colander.Sequence(), colander.SchemaNode(colander.String()),
        missing=None)

schema = NoneSequence()
deserialized = schema.deserialize({'foo': '1', 'bar': None})
# colander.Invalid: {'bar': '"None" is not iterable'}

In this case Colander does not accept None.

Yet another example using a nested schema:

class SchemaA(colander.MappingSchema):
    val = colander.SchemaNode(colander.String())


class SchemaB(colander.MappingSchema):
    a = SchemaA(missing=None)


class SchemaC(colander.MappingSchema):
    b = SchemaB()

schema = SchemaC()
print(schema.deserialize({"b": {"a": None}}))
# colander.Invalid: {'b.a': '"None" is not a mapping type: Does not implement dict-like functionality.'}

Again None is not accepted. I have some more examples with nested sequences and tuples here: https://gist.github.com/tsauerwein/93c88ba3c37f936dea113a27447c0db5

What is the idea? Why does Colander accept None in some cases during the deserialization, and in some cases not? Do I have to replace all None values with colander.null before calling deserialize?

Looking the tests it seems that this is even the expected behavior:
https://github.com/Pylons/colander/blob/cb1645d/colander/tests/test_colander.py#L935
https://github.com/Pylons/colander/blob/cb1645d/colander/tests/test_colander.py#L690

@weilu
Copy link

weilu commented Dec 19, 2016

I ran into the same issue. Any explanation would be great!

@delijati
Copy link

There was already a solution to that #258

@ngoonee
Copy link

ngoonee commented Sep 19, 2017

It wasn't merged though. Does that mean I have to search-and-replaces None values with colander.null currently?

@tsauerwein
Copy link
Author

Does that mean I have to search-and-replaces None values with colander.null currently?

That's what we ended up doing.

@ngoonee
Copy link

ngoonee commented Sep 20, 2017

Ah, that's what I was afraid of. Not hard to do but I'm dreading having to remember to come back to this 2 years from now....

@delijati
Copy link

in #258 a contributor had to sign and did not do it ...

@ngoonee
Copy link

ngoonee commented Sep 20, 2017

I understand, not trying to apportion any blame. Stuff happens.

@leplatrem
Copy link

I would be willing to contribute a PR. But before starting, which approach would the maintainers recommend to solve the None deserialization issue? Same as #258 or a recursive replace of None to colander.null before deserializing?

In #258 @mmerickel mentioned the allow_empty option. Currently, it only applies to the String schema type. Should we generalize it to all of them?

@RangelReale
Copy link
Contributor

I'm having this exact same problem, I have a SequenceSchema that is optional, and I use the missing parameter. If I don't send the field in the json, it works, but if I send null (None), I get the '"None" is not iterable' message.

@stevepiercy
Copy link
Member

@mmerickel is redoing the work in #258 the preferred way forward to resolve this issue for non-strings? Per your comment, strings are already handled via allow_empty.

If that's all it takes, then I am happy to do the copy-paste from the code in the PR #258 to get this into Colander.

@mmerickel
Copy link
Member

mmerickel commented Aug 24, 2020

#258 wouldn't work here and looking at its impl again I wouldn't accept it as-is. It's converting none explicitly to an empty string and in the case of a SequenceSchema you'd want it to deserialize to something else.

WRT some other comments in this thread, I can't actually explain the scenarios in which colander.null is intended to be used as I was not the original developer of this library and have never needed to use it myself. The semantics around it are pretty confusing to me.

@RangelReale None and "missing" are different concepts entirely in schema design, even if you would prefer them to be treated the same in your case. If you want to handle None, I would suggest using a preparer to delete the value entirely. At which point the "missing" semantics will kick in and you can specify a default value there such as colander.drop or None or anything else.

@stevepiercy
Copy link
Member

Thanks for the quick reply @mmerickel. Also after reviewing your earlier comment about the proposed implementation treating "" as None and vice versa, I would agree that the proposal in #258 needs to be modified further. I missed that the first time through.

For this issue, I would suggest that developers carefully review the documentation of The Null and Drop Values. This is densely packed with easy to misunderstand details. There are also two tables which illustrate Serialization Combinations and Deserialization Combinations.

I think this issue can be closed, as it seems to be a matter of not understanding how Colander handles None during deserialization, which I think is answered and resolved in the documentation I mention above. Please do speak up if that is not accurate.

@RichardJones1337
Copy link

I'm also getting this issue with nested schema and handling 'null' values.
Say I have the schema:

class SchemaA(colander.MappingSchema):
    val = colander.SchemaNode(colander.String())

class SchemaB(colander.MappingSchema):
    a = SchemaA(missing=None)
    otherval = colander.SchemaNode(colander.String(), missing=None)

deserialising the json:

{
    "a": null,
    "otherval": "asdf"
}

gives the error:

{'a': '"None" is not a mapping type: Does not implement dict-like functionality.'}

based on the documentation:

If the appstruct value computed by the type's deserialize method is colander.null and the schema node has an explicit missing attribute (the node's constructor was supplied with an explicit missing argument), the missing value will be returned. Note that when this happens, the missing value is not validated by any schema node validator: it is simply returned.

I would expect a to be deserialised to None

@stevepiercy
Copy link
Member

@RichardJones1337 you need to replace "null" with colander.null.

from pprint import pprint
import colander


class SchemaA(colander.MappingSchema):
    val = colander.SchemaNode(colander.String())


class SchemaB(colander.MappingSchema):
    a = SchemaA(missing=None)
    otherval = colander.SchemaNode(colander.String(), missing=None)


schema = SchemaB()
deserialized = schema.deserialize({"a": colander.null, "otherval": "asdf"})

pprint(deserialized)

=> {'a': None, 'otherval': 'asdf'}

Like I said, those docs are densely packed with easy to overlook details.

@RichardJones1337
Copy link

RichardJones1337 commented Sep 11, 2020

This presents inconsistent behavior with other SchemaNode instances.
the above example, with a json of:

{
    'a': {'val': null},
    'otherval': null
}

would result in a (correct) outcome of {'a': {'val': None}, 'otherval': None}, so why are nested schema treated differently?

Also, for the conveinience of everyone here, please could you provide a reference to the location in the documents where it states that passing None as the value to a nested MappingSchema needs to be handled differently?

@stevepiercy
Copy link
Member

Using your example in my code above, I get:

Traceback (most recent call last):
  File "/Users/stevepiercy/projects/colander/col.py", line 17, in <module>
    'a': {'val': null},
NameError: name 'null' is not defined

I think you are confusing a JSON object with a Python dict. Although they might look the same with their curly braces, they are very different things.

For documentation references, please see my earlier comment.

@RichardJones1337
Copy link

I am fully aware of the differences between JSON and a python dict, ok let's skip the bit where I am using colander to deserialise json requests, and just jump straight to the point where I'm deserialising the dict I've extracted from the json request,

from pprint import pprint
import colander

class SchemaA(colander.MappingSchema):
    val = colander.SchemaNode(colander.String(), missing=None)

class SchemaB(colander.MappingSchema):
    a = SchemaA(missing=None)
    otherval = colander.SchemaNode(colander.String(), missing=None)

schema = SchemaB()
deserialized = schema.deserialize({"a": {"val": None}, "otherval": "asdf"})
pprint(deserialized)

deserialized_2 = schema.deserialize({"a": None, "otherval": "asdf"})
pprint(deserialized_2)

produces

{'a': {'val': None}, 'otherval': 'asdf'}

colander.Invalid: {'a': '"None" is not a mapping type: Does not implement dict-like '
      'functionality.'}

(I have trimmed some lines for brevity)
What I want to understand is why 'otherval': None is handled gracefully and returns the value of missing, but 'a': None does not return the missing value and instead throws an exception

@stevepiercy
Copy link
Member

In this new example, you inserted a missing=None into the val schemaNode from the previous example. I almost missed that, you sneaky devil. 😈

None is not a mapping type (such as a dict, with a key/value pair). It is the absence of a type, except for None itself. Instead you would need to provide an empty dict {}. The exception message attempts to describe that, but maybe could be improved to be more explicit.

deserialized3 = schema.deserialize({"a": {}, "otherval": "asdf"})
pprint(deserialized3)

=> {'a': {'val': None}, 'otherval': 'asdf'}

Colander is doing the right thing here by enforcing a mapping type.

@RichardJones1337
Copy link

Thanks for the explanation, I'm still not sure why something like {"a": None} doesn't handle None values gracefully like other SchemaNodes, but I'll just work around it

@stevepiercy
Copy link
Member

stevepiercy commented Sep 11, 2020

It could, but IMO it would be harmful. For the sake of discussion, let's consider its impact.

Let's say I deserialize an arbitrary non-mapping object to None, even though I have declared it as a colander.MappingSchema . I would want to know that I have a None, str, list, set, range, or other non-mapping object for a, so that I can handle it properly, and raise an exception.

@Wim-De-Clercq
Copy link

Wim-De-Clercq commented Jan 26, 2021

So, let's say there's a business model with tickets, and an assignee. An assignee is an id to a person + a role like 'tester' or 'developer'.
Tickets, however, can exist without an assignee yet.

How do you write a schema to validate both jsons like this

{
  "ticket-name": "",
  "assignee": null    
}
{
  "ticket-name": "",
  "assignee": {
    "person_id": 1,
    "role": "tester"
  }    
}

That json null will always become a python None. And I can't see any way to make a schema that allows this. Unless i manually preprocess the json-dict to turn None into colander.null. But that's just silly, no?

(i know realistically "assignee" would probably be a list and a sequenceSchema, but just for the sake of this example. Is this not possible at all?)

import colander


class Assignee(colander.Schema):
    person_id = colander.SchemaNode(colander.Integer)
    role = colander.SchemaNode(colander.String)


class Ticket(colander.Schema):
    name = colander.SchemaNode(colander.String)
    assignee = Assignee(missing=None)


Ticket().deserialize({'name': 'x', 'assignee': None})

@stevepiercy
Copy link
Member

In that pseudo-code, you are telling Colander to assign None to... what? person_id or role? Colander interprets the Assignee schema as consisting of two nodes that must have values, and has no idea how to apply None as you provide.

For your use case, I think preprocessing the first JSON object, expanding it to include its nodes from the Colander schema and setting Assignee's nodes with missing=None, would work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants