Values of "1939-" and "2014+" converted to 0, 18 but looks like a census (not tidycensus) issue #526

Open
zross opened this issue Jun 9, 2023 · 2 comments

Comments

@zross

zross commented Jun 9, 2023

@walkerke initially we thought this was an issue with tidycensus, but it looks like it's related to the census API. But I think it's worth letting you know about.

We were interested in the median age of housing (table B25035, median year structure built) and used this tidycensus code:

dat <- get_acs(
  geography = "tract",
  state = "TX",
  table = "B25035",
  survey = "acs5",
  year = 2020,
  geometry = TRUE)

We noticed that some of the median values were "0" and "18" rather than the expected four-digit year.

So we downloaded the data as a CSV directly from the Census here and noticed that the "0" and "18" seemed to correspond to the character values of "1939-" and "2014+".

We assumed this was a tidycensus issue, but we then constructed the API call manually, and hitting the API directly with this call also returns the 0 and 18 values.

https://api.census.gov/data/2020/acs/acs5?get=B25035_001E%2CB25035_001M%2CNAME&for=tract%3A%2A&in=state%3A48

ACSDT5Y2020.B25035-Data.csv

@walkerke
Owner

Thanks @zross for the heads up! I'll think about how to handle this; I see a couple directions. One would be to convert to NA; the other would be to manually top- and bottom-code by converting to 1939 and 2014 (or whatever it is for a particular ACS year), respectively. I see arguments for and against both options (though keeping it as-is isn't an option, I think).
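
The two directions above could be sketched in base R roughly as follows. This is only an illustration, not tidycensus code: `recode_year_built()` and its arguments are hypothetical, and the mapping of 0 to "1939-" and 18 to "2014+" is an assumption taken from the 2020 5-year values reported in this issue. Other ACS years may use different codes.

```r
# Hypothetical helper (not part of tidycensus): recode the values the API
# returns for B25035 in place of "1939-" and "2014+".
# Assumption: 0 stands for "1939-" and 18 for "2014+" (2020 ACS 5-year).
recode_year_built <- function(estimate,
                              method = c("na", "code"),
                              bottom_code = 0, top_code = 18,
                              bottom_year = 1939, top_year = 2014) {
  method <- match.arg(method)
  is_bottom <- !is.na(estimate) & estimate == bottom_code
  is_top <- !is.na(estimate) & estimate == top_code
  if (method == "na") {
    # Option 1: treat the open-ended categories as missing
    estimate[is_bottom | is_top] <- NA
  } else {
    # Option 2: bottom- and top-code to the boundary years
    estimate[is_bottom] <- bottom_year
    estimate[is_top] <- top_year
  }
  estimate
}

x <- c(1975, 0, 18, 2003, NA)
recode_year_built(x, "na")
recode_year_built(x, "code")
```

Applied to the `estimate` column returned by `get_acs()`, either method would leave ordinary four-digit years untouched.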

@zross
Author

zross commented Jun 12, 2023

I can't imagine a situation where this would be on purpose -- we did some searching to see if it was intentional and came up with nothing but it's also surprising that this escaped notice.

I'm a bit mixed. It's not really your job to fix Census-related issues, right? And then you'd be on the hook for these changes. I wonder if just adding a warning for users when they pull that variable might be most appropriate. I don't know what is best here.
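
A warning-only approach might look something like the sketch below. `warn_if_year_coded()` is a hypothetical function, not something tidycensus provides, and the codes 0 and 18 are again assumed from the 2020 5-year data discussed in this issue.

```r
# Hypothetical check (not part of tidycensus): warn when B25035 estimates
# contain the values that appear to stand in for "1939-" and "2014+".
# Assumption: those values are 0 and 18, as seen in the 2020 ACS 5-year data.
warn_if_year_coded <- function(estimate, codes = c(0, 18)) {
  n_coded <- sum(estimate %in% codes)
  if (n_coded > 0) {
    warning(n_coded, " B25035 estimate(s) equal ",
            paste(codes, collapse = " or "),
            "; these may represent open-ended year categories.",
            call. = FALSE)
  }
  # Return the count invisibly so callers can inspect it if they want
  invisible(n_coded)
}
```

This leaves the data exactly as the API returned it and only flags the suspicious rows, so no decision about NA versus top- and bottom-coding has to be made on the user's behalf.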
