Processing time high on large datasets #52

Open
PriyaNagesh opened this issue Mar 4, 2020 · 1 comment

Comments

@PriyaNagesh

Processing 50,000 rows takes a long time. Is there any way to reduce it?

@ensslen

ensslen commented Oct 22, 2021

Joining 200k records to 700k records (0.3 GB to 3.8 GB) on 11 columns runs for 90 minutes on my i7 laptop and then crashes with an OperationalError: malformed MATCH expression. This is on Python 3.9.0 with fuzzymatcher==0.0.5.

If it had output a result in that time, I'd consider that reasonable.
