Handle different languages #53

Open
pmayd opened this issue Feb 20, 2024 · 3 comments · May be fixed by #58
Labels
development internal issues for the dev team

Comments

@pmayd
Collaborator

pmayd commented Feb 20, 2024

The wrapper downloads data in the German format (commas as decimal separators, etc.), so pandas does not recognize the columns as numeric and we would have to convert them manually. Would it be possible to pass arguments similar to pd.read_csv(decimal=",", thousands="."), and to have missing values returned as NaN instead of "-"? If I remember correctly, Michael mentioned an English API endpoint that delivers data in a format that works better with pandas.
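For reference, a minimal sketch of the pandas side of this request. The sample string and its column names are made up for illustration and are not actual API output; only the `pd.read_csv` parameters (`sep`, `decimal`, `thousands`, `na_values`) are real pandas options.

```python
import io

import pandas as pd

# Hypothetical sample in German number format; not actual API output.
raw = "Jahr;Wert\n2020;1.234,5\n2021;-\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",          # assumed delimiter for this sample
    decimal=",",      # German decimal separator
    thousands=".",    # German thousands separator
    na_values=["-"],  # treat "-" as a missing value (NaN)
)

print(df.dtypes)
print(df)
```

With these options the value column should come back as a float column with NaN for the missing entry, rather than as strings.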

TODOs:

  • compare Postman/raw data from API with our package when downloading the same table with language de and en
  • implement handling for these languages
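The comparison in the first TODO could be sketched roughly as follows. Everything here is an assumption for illustration: the two sample strings, the column names, the `PARSE_OPTIONS` mapping and the `parse` helper are hypothetical, and the real `de` and `en` responses may look different (in particular, it is not confirmed that `en` uses a dot as decimal separator).

```python
import io

import pandas as pd

# Hypothetical samples of the same table in both languages; not real API output.
RAW_DE = "Jahr;Wert\n2020;1.234,5\n2021;-\n"
RAW_EN = "year;value\n2020;1234.5\n2021;-\n"

# Assumed number formats per language, expressed as pd.read_csv keyword arguments.
PARSE_OPTIONS = {
    "de": {"decimal": ",", "thousands": "."},
    "en": {"decimal": ".", "thousands": ","},
}


def parse(raw: str, language: str) -> pd.DataFrame:
    """Parse a raw CSV string using the number format assumed for `language`."""
    return pd.read_csv(
        io.StringIO(raw),
        sep=";",
        na_values=["-"],
        **PARSE_OPTIONS[language],
    )


df_de = parse(RAW_DE, "de")
df_en = parse(RAW_EN, "en")

# Column labels are translated, so align them before comparing the values.
# DataFrame.equals treats NaNs in the same position as equal.
same_values = df_de.set_axis(df_en.columns, axis=1).equals(df_en)
print(same_values)
```

If both languages parse to the same values, the remaining differences should be limited to the translated labels.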
@pmayd pmayd added the development internal issues for the dev team label Feb 20, 2024
@pmayd pmayd assigned bergnerjonas and unassigned PiaSchroeder Apr 30, 2024
@pmayd
Collaborator Author

pmayd commented May 23, 2024

@bergnerjonas here is what we discussed:

  • it may be that Regionalstatistik does not support en, so we can drop it there
  • when there is no data, we should not make additional API requests like /find or /catalogue to get the label for a given code
  • we can try to parse the repeating code column names to get the right names, but we should investigate first how the switch to en is supported in different databases

@bergnerjonas
Collaborator

Both GENESIS and Zensus currently support switching to 'en' and return proper translations. I have removed the support for Regionalstatistik.

@bergnerjonas bergnerjonas linked a pull request May 27, 2024 that will close this issue
@pmayd
Collaborator Author

pmayd commented May 27, 2024

Very good, I will review your PR this week!
