Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primary key optimization doesn't work (null attribtue values) with nested includes #708

Open
tyler-baetz opened this issue Sep 25, 2019 · 4 comments
Labels

Comments

@tyler-baetz
Copy link

I ran into this issue when upgrading to a newer version of django-rest-framework-json-api and it seems like a bug to me, unless there is now a different way that this is meant to be accomplished that I'm not seeing in the documentation. The problem is when providing an include parameter for nested children of a resource (for example, ?include=articles.author), those "grandchild" objects are present in the included but they have no attributes.

I was able to recreate this with a simple setup:
Models

class Page(models.Model):
    pass


class Author(models.Model):
    first_name = models.TextField()
    last_name = models.TextField()


class Article(models.Model):
    page = models.ForeignKey(Page, related_name='articles', on_delete=models.CASCADE)
    author = models.ForeignKey(Author, related_name='articles', on_delete=models.CASCADE)
    title = models.TextField()

Serializers

class AuthorSerializer(serializers.ModelSerializer):
    class Meta:
        model = Author
        fields = ('id', 'first_name', 'last_name')


class ArticleSerializer(serializers.ModelSerializer):
    author = AuthorSerializer(many=False, read_only=True)

    included_serializers = {
        'author': AuthorSerializer,
    }

    class Meta:
        model = Article
        fields = ('id', 'title', 'author')


class PageSerializer(serializers.ModelSerializer):
    articles = ArticleSerializer(many=True, read_only=True)

    included_serializers = {
        'articles': ArticleSerializer,
    }

    class Meta:
        model = Page
        fields = ('id', 'articles')

The view is just a simple ModelViewSet.

class PageViewSet(ModelViewSet):
    queryset = Page.objects.all()
    serializer_class = PageSerializer

The output of a query /page/?include=articles.author is this:

{
    "data": [
        {
            "type": "Page",
            "id": "1",
            "attributes": {},
            "relationships": {
                "articles": {
                    "data": [
                        {
                            "type": "Article",
                            "id": "1"
                        }
                    ]
                }
            }
        }
    ],
    "included": [
        {
            "type": "Article",
            "id": "1",
            "attributes": {
                "title": "Some Article"
            },
            "relationships": {
                "author": {
                    "data": {
                        "type": "Author",
                        "id": "1"
                    }
                }
            }
        },
        {
            "type": "Author",
            "id": "1",
            "attributes": {
                "first_name": null,
                "last_name": null
            }
        }
    ]
}

The first_name and last_name fields for the included Author are present but have null values, when data exists in the database for this object. This issue seems to have been introduced in django-rest-framework-json-api v2.6.0, as the fields are returned correctly when using 2.5.0.

@sliverc
Copy link
Member

sliverc commented Sep 27, 2019

An include articles.author should work with the configuration you have. There are also tests for it.

When looking at the history between version 2.5.0 and 2.6.0 it is not obvious which commit could have broken this. I am also using this type of include in some of my projects and there I don't see the issue you are seeing.

Hence some questions:

  • Are you be able to pin point the exact commit which broke your build? (e.g. using git-bisect or so)?
  • Are you able to add a test within the DJA code base which reproduces this issue?

Thanks.

@tyler-baetz
Copy link
Author

Okay, I was able to get this to work after looking at the example code. If I change my ArticleSerializer in my example here to use a relations.ResourceRelatedField for author instead of an AuthorSerializer, it serializes the author correctly in the response. This is how the serializers are set up for the example tests, but I've usually just defined related fields in my projects the way I have it in my example, referencing the field's ModelSerializer directly (I think because the django-rest-framework documentation has it this way). I'm not entirely sure what the difference is in the output (if any) between using a ResourceRelatedField and referencing the resource's ModelSerializer, so I will probably just update my projects to use ResourceRelatedField.

I think I did find where this stopped working when using the ModelSerializer as the related field - 22c4587. This commit made changes to serializers.ModelSerializer._get_field_representation, where the if check there used to be:

if isinstance(field, ModelSerializer) and hasattr(field, field.source + "_id"):

it is now:

if (...) isinstance(field, ModelSerializer) and hasattr(instance, field.source + '_id'):

Before this change was made, a ModelSerializer like the author field would fail this check, so field.to_representation(attribute) would be returned. After this was changed, the author field passes both of these checks, and the OrderedDict would be returned which contained only the id and type fields, which is what I was seeing.

It seems like this code may be working as intended and I just need to update my serializers to use ResourceRelatedFields instead of referencing the ModelSerializer.

@tyler-baetz
Copy link
Author

tyler-baetz commented Sep 30, 2019

Actually I'm thinking there is still a bug here. Another change that was made in that commit was adding a check for the field's source being present in the includes. So the full if-check there is:

    is_included = field.source in get_included_resources(request)
    if not is_included and \
            isinstance(field, serializers.ModelSerializer) and \
            hasattr(instance, field.source + '_id'):

That get_included_resources returns a list of strings for the requested includes, split on commas, or the list of default includes if none were requested. So even with ModelSerializer relation fields, this still works just for the top-level includes. However, in my example this would return ['articles.author'], and author is not in this. If we combine the includes list:

    is_included = field.source in ','.join(get_included_resources(request))

this restores the functionality and allows nested includes to be serialized correctly when using ModelSerializer.

There may be some other concerns with doing this though - I can see a scenario where a model and its child models both have fields with the same name, especially when it comes to related users (both models having a created_by, or last_updated_by, for example). These fields would be calling field.to_representation on both models in this case, even if only one of them was requested in the include. This code change doesn't check if the requested resource is the exact one being evaluated - for example, created_by and article.created_by would evaluate as True for both models.

@sliverc
Copy link
Member

sliverc commented Oct 1, 2019

I have overlooked this when I looked at your example that you use a serializer instead of a ResourceRelatedField.

A serializer in rest framework would actually embed the relation whereas a PrimaryKeyRelatedField which ResourceRelatedField derives from would only represent it as its primary key. And that what actually happens in JSON API spec that per default relation is represented as primary key and only optionally would it be included when requested.

All in all I still think embedding a serializer should work as you do it for DRF compatibility although it is not the the preferred way.

Marking this as a bug therefore as you also outline there are issue with related fields. We have to think about it how to implement it though. The way how primary key optimization is done currently is complicated and I would prefer to use use_pk_only_optimization of PrimaryKeyRelatedField or similar instead.

Any contributions are welcome.

@sliverc sliverc changed the title Nested relationship paths are not serializing fields correctly Primary key optimization doesn't work (null attribtue values) with nested includes Oct 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants