Documenting the guarantee that fingerprinter won't emit duplicate tokens for the same field. #1078

Open
fgregg opened this issue Aug 11, 2022 · 1 comment

Comments

@fgregg
Contributor

fgregg commented Aug 11, 2022

Right now this is true because we are careful to make sure that every predicate returns unique keys.

It would be safer, and sometimes more efficient, to move the deduplication code into the fingerprinter itself.

Something like:

block_keys = {block_key + pred_id
              for pred_id, predicate in predicates
              for block_key in predicate(instance, target=target)}
for block_key in block_keys:
    yield block_key, record_id
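
For readers who want to try the idea in isolation, here is a minimal, self-contained sketch. It is not the library's actual fingerprinter: `fingerprint`, `first_letters`, and the `":" + str(i)` predicate ids are hypothetical names and conventions assumed for illustration. It only shows that collecting keys in a set removes duplicates even when a predicate is careless.

```python
def first_letters(instance, target=False):
    # Hypothetical predicate that does NOT deduplicate its own keys:
    # "bob barker" yields ["b", "b"].
    return [word[0] for word in instance["name"].split()]


def fingerprint(records, predicates, target=False):
    # Assumed convention: each predicate gets a string id such as ":0".
    labelled = [(":" + str(i), predicate) for i, predicate in enumerate(predicates)]
    for record_id, instance in records:
        # The set comprehension drops duplicate keys per record, so
        # uniqueness no longer depends on every predicate being careful.
        block_keys = {block_key + pred_id
                      for pred_id, predicate in labelled
                      for block_key in predicate(instance, target=target)}
        for block_key in block_keys:
            yield block_key, record_id


records = [(1, {"name": "bob barker"}), (2, {"name": "ann lee"})]
# Set iteration order is arbitrary, so sort for a stable result.
pairs = sorted(fingerprint(records, [first_letters]))
```

Record 1 contributes a single `"b:0"` key here even though the predicate returned `"b"` twice.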
@fgregg
Contributor Author

fgregg commented Aug 11, 2022

One reason we did it this way is to reduce large Cartesian products for compound predicates. Making this happen later would clean that up.
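
To make the Cartesian product concern concrete, here is a small illustrative sketch. It assumes a compound predicate combines the keys of its sub-predicates with `itertools.product`; `compound_keys` and the sample key lists are hypothetical, not the project's code.

```python
import itertools


def compound_keys(key_lists):
    # Hypothetical compound predicate: one joined key per combination
    # of sub-predicate keys.
    return [":".join(combo) for combo in itertools.product(*key_lists)]


# Two sub-predicates that each return a duplicate key.
raw = [["a", "a", "b"], ["x", "x", "y"]]
# The same keys after each sub-predicate deduplicates its output.
unique = [sorted(set(keys)) for keys in raw]

n_raw = len(compound_keys(raw))        # 3 * 3 combinations
n_unique = len(compound_keys(unique))  # 2 * 2 combinations
```

Both versions cover the same set of distinct compound keys, but the one fed with duplicates builds more than twice as many combinations, and the gap grows with every additional sub-predicate.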
