Potential missing columns in json compared to csv #207

Closed
geneorama opened this issue Nov 9, 2022 · 1 comment

@geneorama
Member

When every value in a column is missing, the column may not be imported at all when using the JSON endpoint. This could be considered a Socrata issue rather than an RSocrata issue, since the column isn't present in the JSON in the first place and RSocrata is correctly importing what it receives.

In this example school2_secondary_address, school3_secondary_address, and school4_secondary_address are dropped.

library(RSocrata)

# Same dataset, requested through the JSON and CSV endpoints
df_json <- read.socrata("https://data.cityofchicago.org/resource/kqmn-byj8.json")
df_comma <- read.socrata("https://data.cityofchicago.org/resource/kqmn-byj8.csv")

dim(df_json) ## 1021, 17
dim(df_comma) ## 1021, 20


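# Count NAs per column in each import and line the counts up by column name;
# all = TRUE keeps columns that appear in only one of the two data frames
merge(data.frame(col = colnames(df_json),
                 NAs_JSON = sapply(df_json, function(x) sum(is.na(x)))),
      data.frame(col = colnames(df_comma),
                 NAs_comma = sapply(df_comma, function(x) sum(is.na(x)))),
      by = "col", all = TRUE)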


#                          col NAs_JSON NAs_comma
# 1                 definition       13         0
# 2                     grades       30         0
# 3            primary_address        1         0
# 4                school_name        0         0
# 5         school2_definition      985         0
# 6             school2_grades      987         0
# 7               school2_name      984         0
# 8    school2_primary_address      986         0
# 9  school2_secondary_address       NA      1021
# 10        school3_definition     1014        21
# 11            school3_grades     1014        21
# 12              school3_name     1014        21
# 13   school3_primary_address     1014        21
# 14 school3_secondary_address       NA      1021
# 15        school4_definition     1018        21
# 16            school4_grades     1018        21
# 17              school4_name     1018        21
# 18   school4_primary_address     1018        21
# 19 school4_secondary_address       NA      1021
# 20            second_address     1015        21

Also, missing values are handled differently: the NA counts per column differ between the JSON and CSV imports. This too is more likely a Socrata issue than an RSocrata issue.
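
To illustrate the column drop described above without calling the API, here is a minimal base R sketch. It assumes the JSON payload omits keys whose value is null, which is what the output above suggests. The records, the column names in `csv_columns`, and the padding step are all made up for illustration; none of this is RSocrata code.

```r
# Hypothetical records mimicking a sparse JSON payload: a key whose value is
# null is simply absent from the record.
records <- list(
  list(school_name = "A", primary_address = "1 Main St"),
  list(school_name = "B"),
  list(school_name = "C", primary_address = "3 Oak Ave")
)

# Hypothetical CSV header, including a column that is empty in every row.
csv_columns <- c("school_name", "primary_address", "secondary_address")

# Build a data frame from the union of keys actually present in the records.
present <- unique(unlist(lapply(records, names)))
rows <- lapply(records, function(rec) {
  vals <- lapply(present, function(k) {
    if (is.null(rec[[k]])) NA_character_ else rec[[k]]
  })
  names(vals) <- present
  as.data.frame(vals, stringsAsFactors = FALSE)
})
df_sparse <- do.call(rbind, rows)

# secondary_address never appears in any record, so it is not a column here.
names(df_sparse)

# Possible workaround: pad with all-NA columns for every name in the reference
# header that the sparse import lacks, then restore the header's column order.
missing_cols <- setdiff(csv_columns, names(df_sparse))
df_padded <- df_sparse
df_padded[missing_cols] <- NA_character_
df_padded <- df_padded[csv_columns]
```

With the real data, the same padding idea could use `colnames(df_comma)` as the reference header, at the cost of a second request.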

@geneorama
Member Author

Never mind, this is an exact duplicate of #184
