Country name cleaning failed example #939

Open
yibenhuang opened this issue Aug 24, 2022 · 0 comments
Labels
type: bug Something isn't working


yibenhuang commented Aug 24, 2022

Describe the bug
Hi, I just found that the country name "Virgin Islands (British)" fails to be cleaned to the correct name: clean_country returns NaN for it instead of "British Virgin Islands".

To Reproduce

import pandas as pd
from dataprep.clean import clean_country

df = pd.DataFrame({"country": ["Virgin Islands (British)", "Virgin Islands (U.S.)"]})
clean_country(df, column="country", output_format="name")

Output:

                    country                 country_clean
0  Virgin Islands (British)                           NaN
1     Virgin Islands (U.S.)  United States Virgin Islands

Expected behavior
country_converter, the project this feature is based on, converts both names correctly, as shown below.

import country_converter as coco

names = ["Virgin Islands (British)", "Virgin Islands (U.S.)"]
cc = coco.CountryConverter()

cc.convert(names=names, to="name_short")
# Output: ['British Virgin Islands', 'United States Virgin Islands']
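
Until this is fixed, one possible workaround is to rewrite names with a trailing qualifier into the "qualifier first" form before cleaning. The sketch below is only an illustration: `move_qualifier_to_front` and `QUALIFIER_PREFIX` are hypothetical names, not part of dataprep or country_converter, and the mapping covers only the two qualifiers from this report.

```python
import re

# Hypothetical mapping (an assumption for this sketch): trailing qualifier -> leading prefix.
QUALIFIER_PREFIX = {
    "british": "British",
    "u.s.": "United States",
}

# Matches "<base> (<qualifier>)" with the qualifier at the end of the string.
_PAREN = re.compile(r"^(?P<base>.+?)\s*\((?P<qual>[^)]+)\)\s*$")


def move_qualifier_to_front(name: str) -> str:
    """Rewrite 'Virgin Islands (British)' as 'British Virgin Islands'.

    Names without a known trailing qualifier are returned unchanged.
    """
    match = _PAREN.match(name)
    if match is None:
        return name
    prefix = QUALIFIER_PREFIX.get(match.group("qual").strip().lower())
    if prefix is None:
        return name
    return f"{prefix} {match.group('base')}"


names = ["Virgin Islands (British)", "Virgin Islands (U.S.)", "France"]
normalized = [move_qualifier_to_front(n) for n in names]
print(normalized)
```

The rewritten values could then be placed in the DataFrame column before calling clean_country. Whether clean_country recognizes "British Virgin Islands" is an assumption here and has not been verified.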

Desktop (please complete the following information):

  • OS: macOS
  • Browser: Chrome
  • Platform: Jupyter Notebook
  • Platform Version: 6.4.12
  • Python Version: 3.10.5
  • Dataprep Version: 0.4.5
yibenhuang added the type: bug label on Aug 24, 2022