Add more stats about import process #22

Open
orangejulius opened this issue Oct 1, 2015 · 2 comments

Comments

@orangejulius
Member

It would be cool to get a report of exactly how many addresses were parsed during an import.

Something like this at the end of the import (or maybe for each file):
Total records in file: 900707
Records skipped due to missing data: 50383
Records skipped by deduplicator: 20837
Total records imported: 829487
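A rough sketch of how those four counters could be collected and printed, in case it helps the discussion. Everything here is hypothetical: `tallyRecords`, `hasRequiredFields`, `recordKey`, and the field names (`number`, `street`, `lat`, `lon`) are illustrative assumptions, not the importer's actual code or record schema.

```javascript
// Hypothetical sketch: none of these names come from the importer itself.

// Assumed rule for "missing data": all four fields must be present and non-empty.
function hasRequiredFields(record) {
  return ['number', 'street', 'lat', 'lon'].every(
    (field) => record[field] !== undefined && record[field] !== null && record[field] !== ''
  );
}

// Assumed duplicate key: same house number, street, and coordinates.
function recordKey(record) {
  return [record.number, record.street, record.lat, record.lon].join('|').toLowerCase();
}

// Walk the records once and count what happened to each one.
function tallyRecords(records) {
  const stats = { total: 0, skippedMissingData: 0, skippedDuplicate: 0, imported: 0 };
  const seen = new Set();
  for (const record of records) {
    stats.total += 1;
    if (!hasRequiredFields(record)) {
      stats.skippedMissingData += 1;
      continue;
    }
    const key = recordKey(record);
    if (seen.has(key)) {
      stats.skippedDuplicate += 1;
      continue;
    }
    seen.add(key);
    stats.imported += 1;
  }
  return stats;
}

// Render the counters in the format proposed above.
function formatReport(stats) {
  return [
    `Total records in file: ${stats.total}`,
    `Records skipped due to missing data: ${stats.skippedMissingData}`,
    `Records skipped by deduplicator: ${stats.skippedDuplicate}`,
    `Total records imported: ${stats.imported}`,
  ].join('\n');
}
```

With counters like these, the three outcome counts always add up to the total, which makes the report easy to sanity-check.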

@glifchits

Wanted to say +1 to this! The importer is running for me, but the output is so nondescript that I have no idea if it's doing anything at all.

openaddresses_1   | 2016-11-11T16:48:49.533Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:48:49.921Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:48:59.541Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:48:59.923Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:09.549Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:09.925Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:19.560Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:19.927Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:29.570Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:29.929Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:39.582Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:39.931Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:49.592Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:49.931Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:49:59.604Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0
openaddresses_1   | 2016-11-11T16:49:59.934Z - verbose: [openaddresses] Number of bad records: 1
openaddresses_1   | 2016-11-11T16:50:09.613Z - verbose: [address-deduplicator]  total=1000, duplicates=0, uniques=0, timeSpentPaused=0

@orangejulius
Member Author

Hey @glifchits,
In your case it looks like something is definitely stalled. The only logging we have right now is a line printed every 10 seconds that lists how many records have been imported so far (yours stays at a constant 1000). Try restarting, and if you're using the address-deduplicator, turn it off: it's not that helpful anymore and is often a cause of stalled imports. We'll gladly help you debug further, either here (actually, preferably in a new issue) or in our Gitter chat if that works better: https://gitter.im/pelias/pelias
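
For anyone who wants to check their own logs for the same symptom, here is a minimal sketch that pulls the `total=` value out of each `[address-deduplicator]` line and flags a run of identical values. The function names and the "three identical samples" threshold are assumptions made for illustration; nothing like this exists in the importer.

```javascript
// Hypothetical helpers, not part of the importer.

// Extract the `total=` counter from each address-deduplicator log line.
function extractTotals(logLines) {
  const totals = [];
  for (const line of logLines) {
    const match = /\[address-deduplicator\].*?\btotal=(\d+)/.exec(line);
    if (match) {
      totals.push(Number(match[1]));
    }
  }
  return totals;
}

// Assumed heuristic: the import looks stalled when the most recent
// `minRepeats` samples all report the same total.
function looksStalled(totals, minRepeats = 3) {
  if (totals.length < minRepeats) {
    return false;
  }
  const recent = totals.slice(-minRepeats);
  return recent.every((value) => value === recent[0]);
}
```

Since the stats line is printed every 10 seconds, three identical samples correspond to roughly 20 to 30 seconds without progress. A slow but healthy import could trip that, so a larger `minRepeats` may be the safer choice.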
