Fetch all variables as 'character' format #26

Open
yannikbuhl opened this issue Feb 14, 2024 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@yannikbuhl
Collaborator

As of now, we let {readr} guess the column types based on the CSV that we get from the API.

However, it is recommended that we read all variables as 'character': in the German system of unique identifiers for municipalities, Länder, etc., many codes start with a leading zero, which is lost if the column is parsed as numeric. Users will then have to convert variables to the type they want, which is a small amount of extra work, but the result will be less error-prone.
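
A minimal sketch of what reading everything as character could look like with {readr}, assuming the API response is available as CSV text. The sample data and column names are made up for illustration and are not taken from the actual API.

```r
library(readr)

# Hypothetical CSV payload; "ags" stands in for a municipality code column.
csv_text <- "ags,name,value\n01001000,Town A,123\n02000000,Town B,456\n"

# Read every column as character instead of letting readr guess the types.
as_char <- read_csv(
  I(csv_text),
  col_types = cols(.default = col_character())
)

# The user converts the columns they need afterwards.
as_char$value <- as.integer(as_char$value)
```

The code column keeps its leading zero because it is never parsed as a number, and the conversion of the remaining columns is left to the user.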

@yannikbuhl yannikbuhl added the enhancement New feature or request label Feb 14, 2024
@yannikbuhl yannikbuhl self-assigned this Feb 14, 2024
@ColdCactus

Maybe provide an option that downloads datasets with the AGS value as character and left-padded with leading zeros?

Or, and this is a bit of a hack, the code could check (some of) the unique values of each numeric variable against the list of valid AGS/GVIS municipality IDs. If more than 90% of the values are valid AGS/GVIS codes and, conversely, 90% of all valid AGS/GVIS codes appear in the variable (for country-wide data), it seems plausible that this variable is a municipality code. Then only this variable would be kept as character, and all others converted to numeric.

If performance becomes a problem, all variables with decimal values can be excluded from this check, since no valid AGS/GVIS code contains a comma or decimal point.
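
A rough sketch of that check in base R. `looks_like_ags()`, its arguments, and the 8-digit padding width are hypothetical choices for illustration, and `valid_codes` is assumed to be a character vector of valid codes obtained elsewhere.

```r
# Hypothetical helper: does a numeric column look like a municipality code column?
looks_like_ags <- function(x, valid_codes, threshold = 0.9, width = 8) {
  if (!is.numeric(x)) return(FALSE)
  vals <- unique(x[!is.na(x)])
  # Cheap pre-check: valid codes never have a fractional part.
  if (length(vals) == 0 || any(vals %% 1 != 0)) return(FALSE)
  # Restore the leading zeros before comparing against the reference list.
  padded <- formatC(as.integer(vals), width = width, format = "d", flag = "0")
  share_valid <- mean(padded %in% valid_codes)    # share of values that are valid codes
  share_covered <- mean(valid_codes %in% padded)  # share of valid codes present in the column
  share_valid > threshold && share_covered >= threshold
}

# Tiny stand-in for a real list of valid codes.
valid_codes <- c("01001000", "01002000", "02000000")

looks_like_ags(c(1001000, 1002000, 2000000), valid_codes)  # TRUE
looks_like_ags(c(92550, 1892122, 3.5), valid_codes)        # FALSE
```

The coverage condition only makes sense for country-wide data, so a real implementation would probably need a way to relax it for regional datasets.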

@yannikbuhl
Collaborator Author

Thank you very much, @ColdCactus, for your comments. We'll consider this when implementing a solution. Keeping a list of valid AGS/GVIS codes introduces some maintenance burden, since the codes change fairly regularly, although it is probably the more elegant solution.
