load-db-hpoa_by_pub should stream output #8

Open
cmungall opened this issue Oct 24, 2023 · 0 comments


cmungall commented Oct 24, 2023

Currently this loader only generates output at the end. The reason is that it needs to aggregate by publication before it can yield anything. However, the strategy is still pretty dumb, and very inconvenient:

    if self.group_by_publication:
        for pub in by_pub.values():
            yield pub

Instead, it should:

  1. load all hpoa as one TSV
  2. aggregate by pub
  3. index these one at a time, yielding results
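The three steps above could be sketched roughly as follows. This is a minimal illustration, not the loader's actual code: the function names, the `index_one` callback, the `reference` column name, and the `#` comment-line convention are all assumptions made for the example. The point is that only the cheap aggregation step has to see the whole TSV; the expensive per-publication indexing then runs lazily, one publication at a time, so results appear as soon as each one is ready.

```python
import csv
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def group_rows_by_publication(
    tsv_lines: Iterable[str], pub_column: str = "reference"
) -> Dict[str, List[dict]]:
    """Steps 1 and 2: read the whole TSV and bucket rows by publication.

    `pub_column` is an assumed column name; adjust it to the real header.
    """
    # Skip comment lines (assumed to start with '#') before the header row.
    data_lines = (line for line in tsv_lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    by_pub: Dict[str, List[dict]] = defaultdict(list)
    for row in reader:
        by_pub[row[pub_column]].append(row)
    return dict(by_pub)


def stream_publications(
    tsv_lines: Iterable[str],
    index_one: Callable[[str, List[dict]], T],
    pub_column: str = "reference",
) -> Iterator[T]:
    """Step 3: index one publication at a time, yielding each result.

    `index_one` is a hypothetical callback standing in for whatever
    per-publication indexing the loader performs.
    """
    by_pub = group_rows_by_publication(tsv_lines, pub_column)
    for pub_id, rows in by_pub.items():
        # Each result is handed to the caller before the next publication
        # is processed, so output is streamed rather than emitted at the end.
        yield index_one(pub_id, rows)
```

A caller could then write each result out as it arrives, for example `for result in stream_publications(open(path), index_one): write(result)`, where `path` and `write` are likewise placeholders.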

@julesjacobsen

cmungall changed the title from "load-db-hpoa_by_pub should stream outpout" to "load-db-hpoa_by_pub should stream output" on Oct 25, 2023