Standardizer #51

Open
groverpr opened this issue Aug 14, 2019 · 0 comments
groverpr commented Aug 14, 2019

Thanks for creating this useful library.
I have a feature request. There are use cases where we need to match a large number of addresses across two different datasets. To match them properly, the addresses in both datasets need to be standardized into a single standard format.

Current solution: Using expand_address, we can get a list of possible expansions for an address. One way to compare two addresses is to expand both and check whether any element of the first list matches any element of the second. This works, but it becomes very inefficient as the datasets grow.
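To make the current approach concrete, here is a minimal sketch of the "expand both and look for a common element" comparison. `toy_expand` and its `AMBIGUOUS` table are hypothetical stand-ins I made up so the snippet runs on its own; they are not the library's `expand_address` and only imitate the idea that one input can yield several candidate strings.

```python
import itertools
from typing import Callable, List

# Hypothetical expansion table: one abbreviation can map to several readings.
AMBIGUOUS = {"st": ["street", "saint"], "str": ["street"]}


def toy_expand(address: str) -> List[str]:
    """Stand-in for an expansion function: returns every candidate normalization."""
    tokens = address.lower().replace(".", " ").replace(",", " ").split()
    options = [AMBIGUOUS.get(tok, [tok]) for tok in tokens]
    return [" ".join(combo) for combo in itertools.product(*options)]


def addresses_match(a: str, b: str,
                    expand: Callable[[str], List[str]] = toy_expand) -> bool:
    """True if the two expansion lists share at least one element."""
    return not set(expand(a)).isdisjoint(expand(b))


print(addresses_match("12 Main St.", "12 main street"))
print(addresses_match("12 Main St", "14 Main St"))
```

Run pairwise over two datasets of sizes n and m, this needs on the order of n × m such comparisons, which is where the inefficiency comes from.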

Requirement: What would work better here is a standardize_address function that returns a single standardized address instead of a list of all possible expansions. For example, st., st, St, ST., and str in an address should all be rewritten as "street". Such a function would be directly useful for many tasks and would be a nice addition to this library.
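As a rough illustration of what I have in mind (not a proposed implementation), the sketch below maps each token to one canonical form and then uses the standardized string as a dictionary key to join two datasets. The `CANONICAL` table, the function body, and the sample data are all assumptions for the example; in particular, always reading "st" as "street" ignores cases like "saint", which a real standardizer would have to resolve.

```python
from typing import Dict, List, Tuple

# Hypothetical canonical table: every variant maps to exactly one form.
CANONICAL = {"st": "street", "str": "street", "street": "street"}


def standardize_address(address: str) -> str:
    """Return one standardized string instead of a list of expansions."""
    tokens = address.lower().replace(",", " ").split()
    cleaned = (tok.rstrip(".") for tok in tokens)
    return " ".join(CANONICAL.get(tok, tok) for tok in cleaned)


def match_datasets(left: List[str], right: List[str]) -> List[Tuple[str, str]]:
    """Join two address lists on their standardized form using a dict index."""
    index: Dict[str, str] = {standardize_address(a): a for a in left}
    return [(index[key], b) for b in right
            if (key := standardize_address(b)) in index]


variants = ["12 Main st.", "12 Main st", "12 Main St", "12 Main ST.", "12 Main str"]
print({standardize_address(v) for v in variants})
print(match_datasets(["12 Main St.", "7 Oak Str"], ["7 oak street", "99 Elm St"]))
```

With a single key per address, matching becomes one pass over each dataset plus hash lookups, rather than comparing expansion lists for every pair.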
