Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPTIONS fetches and shows all possible foreign keys in choices field #3751

Closed
mlissner opened this issue Dec 16, 2015 · 24 comments
Closed

OPTIONS fetches and shows all possible foreign keys in choices field #3751

mlissner opened this issue Dec 16, 2015 · 24 comments
Labels
Milestone

Comments

@mlissner
Copy link
Contributor

I have an object with foreign keys and when I load a OPTIONS request for that object type, it shows me a truly insane JSON object with every possible value for the foreign key (there are millions!)

So, for example, I have this object:

class OpinionCluster(models.Model):
    """A class representing a cluster of court opinions."""
    docket = models.ForeignKey(
        Docket,
        help_text="The docket that the opinion cluster is a part of",
        related_name="clusters",
    )

Which is serialized with:

class OpinionClusterSerializer(serializers.HyperlinkedModelSerializer):
    docket = serializers.HyperlinkedRelatedField(
        many=False,
        view_name='docket-detail',
        queryset=Docket.objects.all(),
        style={'base_template': 'input.html'},
    )

When I load this with OPTIONS, I get back something that contains:

    "docket": {
        "type": "field",
         "required": true,
         "read_only": false,
         "label": "Docket",
         "choices": [
            {
                "display_name": "4: United States v. Goodwin",
                "value": "http://127.0.0.1:8000/api/rest/v3/dockets/4/"
            },
            {
                "display_name": "5: Quality Cleaning Products v. SCA Tissue of North America",
                "value": "http://127.0.0.1:8000/api/rest/v3/dockets/5/"
            },

....millions more....

I know that there's a way to disable listing these items in the form of the HTML view, but in the OPTIONS request we need better default functionality than displaying millions of records.

mlissner added a commit to freelawproject/courtlistener that referenced this issue Dec 16, 2015
@mlissner
Copy link
Contributor Author

I made an attempt at resolving this by overwriting the metadata class as mentioned above, but that doesn't work because all you can do at that stage is remove data that's already been pulled from the DB. Meaning: If you do this, you still fetch millions of objects from the DB, you just don't pass those values to the front end.

The fix for this, unless I'm mistaken, is one of:

  • Remove the OPTIONS request from the API (Sad, but would work)
  • Make the endpoints read_only (Sad, breaks the API)
  • Some sort of fix in DRF

Thanks for any help or other ideas.

@mlissner mlissner changed the title OPTIONS shows all possible foreign keys in choices field OPTIONS fetches and shows all possible foreign keys in choices field Dec 16, 2015
mlissner added a commit to freelawproject/courtlistener that referenced this issue Dec 16, 2015
@tomchristie tomchristie added this to the 3.3.3 Release milestone Dec 17, 2015
@tomchristie
Copy link
Member

Milestoning to raise the priority.

@mordyovits
Copy link

Agreed. This is odd behavior, treating FK fields as a choices field. Makes OPTIONS a bear trap.

I think the fix is to alter the conditional at https://github.com/tomchristie/django-rest-framework/blob/master/rest_framework/metadata.py#L140 to not do the choices thing on FK fields, and then possibly to do something else for them.

@cvn
Copy link

cvn commented Feb 4, 2016

+1

@xordoquy xordoquy modified the milestones: 3.3.3 Release, 3.3.4 Release, 3.4.0 Release Feb 11, 2016
mlissner added a commit to freelawproject/courtlistener that referenced this issue Mar 22, 2016
This is another fix for encode/django-rest-framework#3751, which is an upstream issue that causes the OPTIONS requests to list out every possible FK for an item as possible choices. That's fine and all, except when you have thousands or millions of FKs on an item, as we pretty much always do.
@fleur
Copy link

fleur commented Mar 24, 2016

(This may better be submitted as a separate issue. Let me know if I should.)

Also note that if one of those choice fields is a settings.AUTH_USER_MODEL, e.g. a owned_by field, it puts actual account data in there. A censored snipped of the returned JSON:

"owned_by":{
  "type":"field",
  "required":true,
  "read_only":false,
  "label":"Owned by",
  "choices":[
        {"display_name":"engineering+admin@<us>.com","value":"18"}, 
        {"display_name":"engineering+notanadmin@<us>.com","value":"19"},...

IMHO, the default shouldn't do this considering the security risk.

@crgwbr
Copy link

crgwbr commented Mar 29, 2016

Is there any known work around while this issue is being sorted out? I'm having the security issue @fleur described with a ForeignKey to AUTH_USER_MODEL.

@mlissner
Copy link
Contributor Author

I've made these fields read_only. Works for our implementation, but sucks otherwise. Now there seems to be two reasons to fix this:

  1. Information leakage.
  2. Performance.

@charettes
Copy link
Contributor

In order to work around this limitation you'll need to implement your own SimpleMetadata subclass:

from rest_framework.metadata import SimpleMetadata
from rest_framework.relations import RelatedField

class NoRelatedFieldChoicesMetadata(SimpleMetadata):
    def get_field_info(self, field):
        related_field = isinstance(field, RelatedField)
        # Remove the related fields choices since they can end up with many values.
        if related_field:
            field.queryset = field.queryset.none()
        field_info = super(NoRelatedFieldChoicesMetaData, self).get_field_info(field)
        # Remove the empty `choices` list.
        if related_field:
            field_info.pop('choices')
        return field_info

You can set it per serializer using the metadata_class attribute or per-project using the REST_FRAMEWORK['DEFAULT_METADATA_CLASS'] setting.

@mlissner
Copy link
Contributor Author

@charettes beware, however, that this doesn't fix the performance issue. From my earlier comment:

I made an attempt at resolving this by overwriting the metadata class as mentioned above, but that doesn't work because all you can do at that stage is remove data that's already been pulled from the DB. Meaning: If you do this, you still fetch millions of objects from the DB, you just don't pass those values to the front end.

@kevin-brown
Copy link
Member

There is definitely a performance issue here when a field has a large number of possible options. I honestly have no idea what the best way to fix it would be without breaking backwards compatibility.

...information leakage...

This is something I keep hearing whenever this comes up: that DRF is leaking information that it shouldn't be. And I guess it's true, you aren't intending for that information to be public. I mean, why should anyone be able to get a list of the users that are possible values for the field?

But that's usually a sign that you are not restricting your querysets down to only the objects that should be allowed. All objects that DRF lists in the OPTIONS request are valid inputs to the field, so if there are objects showing up in the list then you probably aren't filtering them out for the field.

@charettes
Copy link
Contributor

@mlissner note that I use queryset.none() to avoid any queryset from being performed.

@xordoquy
Copy link
Collaborator

I mean, why should anyone be able to get a list of the users that are possible values for the field?

This is a common request to have all the related values within the option to let front end apps fill choice fields.
Note that it doesn't show read only fields.

@mlissner
Copy link
Contributor Author

@charettes That actually does look promising. You're right, that may be a workaround for the performance aspect of this issue.

charettes added a commit to charettes/django-rest-framework that referenced this issue Mar 29, 2016
…h metadata.

Listing related fields can leak sensitive data and result in poor performance
when dealing with large result sets.

Large result sets should be exposed by a dedicated endpoint instead.
charettes added a commit to charettes/django-rest-framework that referenced this issue Mar 29, 2016
…tadata.

Listing related fields can leak sensitive data and result in poor performance
when dealing with large result sets.

Large result sets should be exposed by a dedicated endpoint instead.
@callmewind
Copy link

Hi,
I'm also affected by this bug.
Meanwhile the fix arrives to PIP, I overrided the SimpleMetadata class with @charettes's fix in a6732e2. It solves the problem for related fields, but is missing to check also for serializers.ManyRelatedField.

craigds pushed a commit to koordinates/django-rest-framework that referenced this issue Apr 22, 2016
…tadata.

Listing related fields can leak sensitive data and result in poor performance
when dealing with large result sets.

Large result sets should be exposed by a dedicated endpoint instead.
@scripterkaran
Copy link

+1

@tomchristie
Copy link
Member

Fixed via #4021.

@callmewind
Copy link

Hi @tomchristie, I checked again #4021 and is still missing the case with ManyRelatedField.
In rest_framework/metadata.py:140
if (not field_info.get('read_only') and not isinstance(field, serializers.RelatedField) and hasattr(field, 'choices')):
should be
if (not field_info.get('read_only') and not isinstance(field, serializers.RelatedField) and not isinstance(field, serializers.ManyRelatedField) and hasattr(field, 'choices')):

@charettes
Copy link
Contributor

@callmewind, you're right, I assumed ManyRelatedField was a subclass of RelatedField. @tomchristie I can submit another PR if you don't have time to deal with it.

@tomchristie
Copy link
Member

Thanks for that! :)

@tomchristie
Copy link
Member

Many to Many now also resolved.

@vil-s
Copy link

vil-s commented Jun 7, 2016

When can we get a release for this? This is quite a severe information security concern.

@tomchristie
Copy link
Member

@unklphil Note that users are only able to see this information if they also have permission to set those fields in the first place.

The 3.4.0 release will be out sometime in the next few weeks.

If you want to disable this behavior prior to that set REST_FRAMEWORK['DEFAULT_METADATA_CLASS'] = None, or use a custom metadata class that implements the behavior as per the related pull requests.

@vellamike
Copy link

vellamike commented Mar 19, 2017

Now that this has been fixed, is there a way to get the choices if I want this behaviour(listing all the choices)? Or is there perhaps a way to indicate from OPTIONS where the URI listing the choices might be found?

For my API OPTIONS returns :

    "actions": {
        "POST": {
            "date_created": {
                "type": "datetime",
                "required": false,
                "read_only": true,
                "label": "Date created"
            },
            "owner": {
                "type": "field",
                "required": false,
                "read_only": true,
                "label": "Owner"
            },
            "language": {
                "type": "field",
                "required": true,
                "read_only": false,
                "label": "Language"
            },
        }
    }
}

And I would like a way for the frontend API to know where to look for the URIs with which it can populate language

@michael-k
Copy link
Contributor

michael-k commented Mar 19, 2017

This is what I would try:

Derive a class from SimpleMetadata. Undo commit 05b0c2a in your implementation. It should also give you a hint on how to return an URI. And don't forget to change the metadata class in your settings.

from rest_framework.metadata import SimpleMetadata
from rest_framework.serializers import ManyRelatedField


class CustomMetadata(SimpleMetadata):

    def get_field_info(self, field):
        field_info = super().get_field_info(field)

        if (not field_info.get('read_only') and
            isinstance(field, ManyRelatedField) and
                 hasattr(field, 'choices')):
            field_info['choices'] = [
                {
                    'value': choice_value,
                    'display_name': force_text(choice_name, strings_only=True)
                }
                for choice_value, choice_name in field.choices.items()
            ]
 
        return field_info

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests