Bug reading datetimes (floating datetime format) #231

Open
alixepstein opened this issue Apr 24, 2024 · 5 comments

@alixepstein

When I view my data on Socrata, this column is a datetime type and I can see the date and time down to the second. When I load it with read.socrata, it comes in as character type, and although the date is there, the times are all 00:00:00.
Example:

sessiondf <- read.socrata("https://data.cambridgema.gov/resource/99xk-ybak.json", email = SocrataUsername, password = SocrataPW)

class(sessiondf$starttime)
[1] "character"
head(sessiondf$starttime)
[1] "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z"
[5] "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z"
tail(sessiondf$starttime)
[1] "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z" "2024-04-15T00:00:00.000Z"
[5] "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z"

@geneorama
Member

It's difficult to troubleshoot without access, but the trailing Z looks suspicious in this example "2024-04-16T00:00:00.000Z". Can you provide more detail on the format?

@levyj do you have any insight into this?

posixify seems to be missing the pattern for the trailing Z, but I'd like to confirm that this is formatted as a date time, it's a valid format, and I'm not missing anything else.

Thanks

@levyj
Member

levyj commented Apr 24, 2024

I think it is real. It indicates UTC time zone.

https://en.wikipedia.org/wiki/ISO_8601


I would like to note that if I am right, I just out-dorked @geneorama on a date/time issue! 😲

@geneorama
Member

@levyj I was not sure if that's an expected format for how Socrata returns timestamps. I didn't think Socrata supported time zones or even specified UTC.

I did misremember the ISO-8601 format as having the Z in the middle (replacing the T), so thanks for confirming that it's a valid representation.

I see three possibilities.

  1. This is text that happens to be in ISO8601 format, but it isn't stored as a timestamp and should be treated as text.
  2. This is just a new issue that nobody has noticed or I've forgotten.
  3. This is a new bug because Socrata has changed their supported formats and there may be a slow trickle of bug reports as other new formats are discovered.

If it's number 3 I'm most inclined to suggest that the user convert the column manually with something like as.POSIXct("2020-11-14",format="%Y-%m-%d")

(I would suggest doing the conversion manually anyway because it may be a while before we can patch this.)
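For strings shaped like the ones in the report, a manual conversion that keeps the time of day could look roughly like the sketch below. The sample values and the `parse_iso_utc` helper name are hypothetical, and the sketch assumes the trailing "Z" really does mean UTC, as discussed above; it uses only base R.

```r
# Hypothetical sample values in the same shape as the strings in the report
starttime <- c("2020-01-01T00:00:00.000Z", "2024-04-16T13:45:10.000Z")

# Hypothetical helper (not part of RSocrata): parse ISO 8601 strings ending in "Z"
parse_iso_utc <- function(x) {
  # Strip the trailing "Z" and supply the zone explicitly instead
  x <- sub("Z$", "", x)
  # %OS reads seconds including a fractional part on input
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")
}

parsed <- parse_iso_utc(starttime)
class(parsed)
format(parsed, "%Y-%m-%d %H:%M:%S")
```

Applied to the column from the report, this would be something like `sessiondf$starttime <- parse_iso_utc(sessiondf$starttime)`.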

@alixepstein
Author

alixepstein commented Apr 25, 2024 via email

geneorama changed the title from "Bug reading datetimes" to "Bug reading datetimes (floating datetime format)" on Apr 25, 2024
@geneorama
Member

@alixepstein thanks for letting us know! I updated the issue title to reflect this additional information, which will help with tracking in the future.

This is still an issue, even though you've found a way around it. I think the right way to resolve it would be to do client-side time zone conversions.
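One possible reading of a client-side time zone conversion is sketched below. It assumes a floating datetime carries no time zone of its own, so the caller has to say which zone the wall-clock value belongs to. The `parse_floating` helper and the `America/New_York` default are assumptions made for illustration, not RSocrata behavior.

```r
# Hypothetical helper: interpret a floating timestamp as wall-clock time in `tz`
parse_floating <- function(x, tz = "America/New_York") {
  # Drop a trailing "Z" if present, since a floating value is assumed to have no zone
  x <- sub("Z$", "", x)
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%OS", tz = tz)
}

# Hypothetical sample value
local_time <- parse_floating("2024-04-16T13:45:10.000")

# The same instant, displayed in the supplied zone and then in UTC
format(local_time, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
format(local_time, "%Y-%m-%d %H:%M:%S", tz = "UTC", usetz = TRUE)
```

Whether the zone should come from the dataset's metadata, the user's locale, or an explicit argument is a design choice this sketch leaves open.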
