Block harvest_source_list API endpoint on catalog #4725

Open
FuhuXia opened this issue May 1, 2024 · 1 comment
Labels
bug Software defect or bug

Comments

@FuhuXia
Member

FuhuXia commented May 1, 2024

The harvest_source_list endpoint from ckanext-harvest includes deleted harvest sources in its results, but anonymous users are not supposed to see deleted packages. The API also does not support pagination: to show all of the catalog's harvest sources, we have to set a very high limit (2000?) so that all current (active) and deleted (inactive) sources come back in one API call, which is very slow.

I think we should block this API endpoint and guide users to the alternative APIs:

  1. Call this API to get all harvest sources in paginated results:
    https://catalog.data.gov/api/action/package_search?fq=(dataset_type:harvest)&fl=id,name,url,organization&rows=1000

  2. Get details on a specific source with this API. You can use either id or name:
    https://catalog.data.gov/api/action/harvest_source_show?id=energy-json
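As a rough sketch of how a client could walk the paginated results from alternative 1, the loop below requests one page at a time. It assumes the standard CKAN `start` and `rows` parameters and the usual `result.count` / `result.results` response shape; the function names `fetch_page` and `iter_harvest_sources` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://catalog.data.gov/api/action/package_search"


def fetch_page(start, rows):
    """Fetch one page of harvest sources (assumes CKAN's start/rows paging)."""
    params = urllib.parse.urlencode({
        "fq": "(dataset_type:harvest)",
        "fl": "id,name,url,organization",
        "rows": rows,
        "start": start,
    })
    with urllib.request.urlopen(f"{BASE_URL}?{params}") as resp:
        return json.load(resp)["result"]


def iter_harvest_sources(fetch=fetch_page, rows=1000):
    """Yield every harvest source, requesting `rows` records per call."""
    start = 0
    while True:
        result = fetch(start, rows)
        page = result["results"]
        yield from page
        start += len(page)
        # Stop on an empty page or once the reported total has been reached.
        if not page or start >= result["count"]:
            break
```

The `fetch` argument is injectable so the paging logic can be exercised without hitting the live catalog.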

How to reproduce

https://catalog.data.gov/api/action/harvest_source_list

Search for active: false in the result.
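To check this without searching by eye, a small helper along these lines could separate the two groups. It assumes the `result` field of the response is a list of dicts, each carrying a boolean `active` field (as the `active: false` entries suggest); `split_by_active` is a hypothetical name.

```python
def split_by_active(sources):
    """Split harvest source records into (active, inactive) lists.

    Assumes each record is a dict with a boolean "active" field;
    records missing the field are treated as inactive.
    """
    active = [s for s in sources if s.get("active")]
    inactive = [s for s in sources if not s.get("active")]
    return active, inactive
```

A non-empty second list in an anonymous request would confirm that deleted sources are being exposed.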

Sketch

We have a list of blocked API endpoints in the nginx config:

https://github.com/GSA/catalog.data.gov/blob/8dda50797980f40d6921aa3e299087ddfe31d8c9/proxy/nginx-common.conf#L27-L44

@FuhuXia FuhuXia added the bug Software defect or bug label May 1, 2024
@gujral-rei

Redirect the call to the package search API.

Projects
Status: 📔 Product Backlog
Development

No branches or pull requests

2 participants