Filtering orgs & groups don't work while using multiple values #518

Open
wanam opened this issue Jan 27, 2023 · 0 comments

wanam commented Jan 27, 2023

I'm trying to harvest a CKAN data source, https://dati.comune.roma.it/catalog, using the filter configuration below:

{
"organizations_filter_include":["atac-s-p-a-azienda-per-la-mobilita","roma-servizi-per-la-mobilita"]
}

The CKAN harvester converts the above configuration into a URL that it uses to gather the datasets' metadata, but the syntax of the resulting URL does not seem to work properly. Here is the generated URL:
https://dati.comune.roma.it/catalog/api/3/action/package_search?rows=100&start=0&sort=id+asc&fq=organization%3Aatac-s-p-a-azienda-per-la-mobilita+OR+organization%3Aroma-servizi-per-la-mobilita

Sending this request returns all the datasets in the remote data catalog (~340), while it should return only ~11 datasets.

I'm not sure whether this is a CKAN query compatibility issue; it is reproducible on CKAN 2.9.

The correct URL format should be:
https://dati.comune.roma.it/catalog/api/3/action/package_search?rows=100&start=0&sort=id+asc&fq=organization%3A(atac-s-p-a-azienda-per-la-mobilita OR roma-servizi-per-la-mobilita)
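To compare the two filter strings side by side, here is a minimal sketch that builds both query strings with the standard library. The names `BASE`, `ORGS`, and `search_url` are mine and are not part of the harvester; the parameters are copied from the URLs above. Note that `urlencode` also percent-encodes the parentheses, which should be equivalent to the hand-written form.

```python
from urllib.parse import urlencode

# Endpoint and organization names taken from the URLs in this issue.
BASE = "https://dati.comune.roma.it/catalog/api/3/action/package_search"
ORGS = ["atac-s-p-a-azienda-per-la-mobilita", "roma-servizi-per-la-mobilita"]


def search_url(fq):
    """Build a package_search URL for the given fq string (illustrative helper)."""
    params = {"rows": 100, "start": 0, "sort": "id asc", "fq": fq}
    return BASE + "?" + urlencode(params)


# Generated today: the field prefix is repeated for every value.
current_fq = " OR ".join("organization:%s" % name for name in ORGS)

# Proposed: a single field prefix applied to a parenthesised group.
grouped_fq = "organization:(%s)" % " OR ".join(ORGS)

print(search_url(current_fq))
print(search_url(grouped_fq))
```

The first printed URL should match the one the harvester generates; the second is the grouped form with `(` and `)` encoded as `%28` and `%29`.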

Here is a quick fix I'm using on my CKAN instance:

        org_filter_include = self.config.get('organizations_filter_include', [])
        org_filter_exclude = self.config.get('organizations_filter_exclude', [])
        if org_filter_include:
            # Group the values so the field prefix applies to every name:
            # organization:(a OR b)
            fq_terms.append(
                'organization:(%s)' % ' OR '.join(org_filter_include))
        elif org_filter_exclude:
            # Negate the whole group: -organization:(a OR b)
            fq_terms.append(
                '-organization:(%s)' % ' OR '.join(org_filter_exclude))

The same applies to groups and tags.
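Since the same pattern repeats per field, one way to avoid duplicating it would be a small helper. This is only a sketch: `build_fq_term` is a hypothetical function, not something in the harvester, and I'm assuming `groups_filter_include` / `groups_filter_exclude` follow the same naming as the organization keys. The group name `mobilita` is made up for the example.

```python
def build_fq_term(field, include=None, exclude=None):
    """Return one grouped fq term for `field`, or None if no filter is set.

    Hypothetical helper: include takes precedence over exclude, mirroring
    the if/elif in the quick fix.
    """
    if include:
        return "%s:(%s)" % (field, " OR ".join(include))
    if exclude:
        return "-%s:(%s)" % (field, " OR ".join(exclude))
    return None


# Example harvester config; the groups value is invented for illustration.
config = {
    "organizations_filter_include": [
        "atac-s-p-a-azienda-per-la-mobilita",
        "roma-servizi-per-la-mobilita",
    ],
    "groups_filter_exclude": ["mobilita"],
}

fq_terms = []
# (index field name, config key prefix); assumed naming, adjust as needed.
for field, key in (("organization", "organizations"), ("groups", "groups")):
    term = build_fq_term(
        field,
        include=config.get(key + "_filter_include", []),
        exclude=config.get(key + "_filter_exclude", []),
    )
    if term:
        fq_terms.append(term)

print(fq_terms)
```

Tags could be handled by adding another `(field, key)` pair to the loop, assuming a matching config key exists.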
