Implements pagination #63

Open
niconoe opened this issue May 19, 2020 · 4 comments

Comments

@niconoe
Contributor

niconoe commented May 19, 2020

Hi @sckott!

For a current project I'm working on with @damianooldoni, we'll need to access a long list of results from the name_usage() API call.

Instead of implementing the pagination/looping in our client code (like in this quick&dirty example: https://gist.github.com/niconoe/b9dcb6c468b996b6f77e18f51516e840), we were wondering if you'd be interested in receiving a PR to implement it in pygbif itself. That would be similar to what Damiano did (for rgbif) in ropensci/rgbif#291 and ropensci/rgbif#295.

The plan would be to:

  • Make the mechanism generic so it can be used not only for name_usage() but also other functions that deal with paginated results from the GBIF API.
  • Either add new functions that wrap the existing ones and add pagination (all_name_usages() for example) or change the existing functions. To avoid breaking the API, we could add an optional parameter (that defaults to False) to tell pygbif to handle the pagination. So for example: name_usage() and name_usage(handle_pagination=False) would keep the existing behaviour, but name_usage(handle_pagination=True) would take care of the pagination and return all results. I have a slight preference for the first option (new functions) because I find the API clearer, but it's up to you!
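
To make the "generic mechanism" idea concrete, here is a minimal sketch. It assumes the wrapped function accepts `limit` and `offset` keyword arguments and returns a dict with `results` and `endOfRecords` keys, as GBIF's paginated responses do. The names `fetch_all` and `fake_name_usage` are hypothetical and not part of pygbif; the stub stands in for a real API call so the example runs without network access.

```python
def fetch_all(fetch, page_size=100, max_records=None, **kwargs):
    """Call `fetch` repeatedly, following offset-based pagination.

    `fetch` is any callable that accepts `limit` and `offset` keyword
    arguments and returns a dict with "results" (list) and
    "endOfRecords" (bool).
    """
    results = []
    offset = 0
    while max_records is None or offset < max_records:
        limit = page_size
        if max_records is not None:
            # Do not request more than what is still needed.
            limit = min(page_size, max_records - offset)
        resp = fetch(limit=limit, offset=offset, **kwargs)
        page = resp["results"]
        results.extend(page)
        # Stop on the last page, or if the server returns nothing.
        if resp["endOfRecords"] or not page:
            break
        offset += len(page)
    return results


# Hypothetical stand-in for a paginated pygbif function (no network access).
_FAKE_RECORDS = [{"key": i} for i in range(250)]


def fake_name_usage(limit=100, offset=0, **kwargs):
    page = _FAKE_RECORDS[offset : offset + limit]
    return {
        "results": page,
        "endOfRecords": offset + limit >= len(_FAKE_RECORDS),
    }


all_records = fetch_all(fake_name_usage)
first_120 = fetch_all(fake_name_usage, max_records=120)
```

Because the helper only relies on the `limit`/`offset`/`endOfRecords` contract, the same loop could in principle serve any function that exposes GBIF pagination in that shape.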

Just tell us what you think, if you're interested we hope to start working on a PR soon!

@sckott
Collaborator

sckott commented May 19, 2020

👋 thanks for this. Adding pagination handling sounds good.

I lean towards adding the functionality to the existing methods as I don't love the idea of adding a bunch of new methods - if we went with new methods, I imagine we'd have to add a new method for every current method?
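
One way to add the option to existing methods without duplicating each of them is a wrapper that introduces an opt-in `handle_pagination` keyword. The following is only a sketch under assumptions: `with_pagination` and `name_usage_stub` are hypothetical names, and the stub mimics a function that takes `limit`/`offset` and returns a dict with `results` and `endOfRecords`.

```python
import functools


def with_pagination(func, page_size=100):
    """Wrap `func` so it accepts an opt-in `handle_pagination` keyword."""

    @functools.wraps(func)
    def wrapper(*args, handle_pagination=False, **kwargs):
        if not handle_pagination:
            # Existing behaviour: a single request, passed through untouched.
            return func(*args, **kwargs)
        results = []
        offset = 0
        while True:
            # In pagination mode, any caller-supplied limit/offset is overridden.
            page_kwargs = {**kwargs, "limit": page_size, "offset": offset}
            resp = func(*args, **page_kwargs)
            page = resp["results"]
            results.extend(page)
            if resp["endOfRecords"] or not page:
                return results
            offset += len(page)

    return wrapper


# Hypothetical stand-in for an existing paginated function.
def name_usage_stub(limit=100, offset=0):
    records = list(range(230))
    return {
        "results": records[offset : offset + limit],
        "endOfRecords": offset + limit >= len(records),
    }


name_usage = with_pagination(name_usage_stub)
single_page = name_usage()  # dict for one page, as before
everything = name_usage(handle_pagination=True)  # list of all records
```

Note that in this sketch the return type changes with the flag (a response dict versus a flat list), which is one of the design questions a real PR would need to settle.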


p.s. pygbif is using the https://github.com/psf/black formatter now - so make sure you run it before pushing changes up - there are lots of text editor integrations, a command line tool, etc.

@Becheler

Any advancement concerning the limit of 300 records using occ.search?

@sckott
Collaborator

sckott commented Jan 28, 2022

No - note that this library is now maintained by the GBIF team - hopefully they'll chime in here to indicate if that's something they're interested in or not

@Becheler

Ok. I came up with a bit of code that solved my problem: not very clean, but functional enough!

from pygbif import occurrences
from tqdm import tqdm


def paginated_search(max_limit, *args, **kwargs):
    """Fetch up to max_limit occurrence records by paging through occurrences.search.

    In its current version, pygbif cannot return more than 300 occurrences
    in a single search call: this works around part of the problem.
    Do not pass limit or offset in kwargs; they are set here.
    """
    per_page = 100
    results = []
    offset = 0
    # The progress bar counts records, not bytes
    progress_bar = tqdm(total=max_limit, unit="rec")
    while offset < max_limit:
        # Never request more than what is still needed to reach max_limit
        page_size = min(per_page, max_limit - offset)
        resp = occurrences.search(*args, **kwargs, limit=page_size, offset=offset)
        page = resp["results"]
        results.extend(page)
        progress_bar.update(len(page))
        # Stop when GBIF reports the last page (or returns an empty page)
        if resp["endOfRecords"] or not page:
            break
        offset += page_size
    progress_bar.close()
    return results  # list of dicts

3 participants