Faster OpenAQ (using openaq-fetches) #122

Open
zmoon opened this issue Jun 13, 2023 · 1 comment
zmoon commented Jun 13, 2023

Some initial timings suggest we can gain speed by using the s3:// URL scheme instead of the current HTTPS URLs (roughly 159 ms vs. 311 ms per read in the runs below):

In [8]: %timeit -n 10 pd.read_json("https://openaq-fetches.s3.amazonaws.com/realtime/2013-11-26/2013-11-26.ndjson",
   ...:  lines=True)
311 ms ± 37.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [9]: %timeit -n 10 pd.read_json("s3://openaq-fetches/realtime/2013-11-26/2013-11-26.ndjson", lines=True)
159 ms ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
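To make the two URL forms in the timings easy to swap, a small helper along these lines could build either one for a given day. This is only a sketch: `realtime_url` and `BUCKET` are hypothetical names, and the key pattern is assumed from the single file timed above, so it may not hold for other dates.

```python
from datetime import date

BUCKET = "openaq-fetches"  # bucket name taken from the URLs timed above


def realtime_url(day: date, scheme: str = "s3") -> str:
    """Build the URL of one daily realtime NDJSON file (hypothetical helper)."""
    stamp = day.isoformat()  # e.g. "2013-11-26"
    # Assumption: every day follows the key pattern of the one file timed above.
    key = f"realtime/{stamp}/{stamp}.ndjson"
    if scheme == "s3":
        return f"s3://{BUCKET}/{key}"
    if scheme == "https":
        return f"https://{BUCKET}.s3.amazonaws.com/{key}"
    raise ValueError(f"unsupported scheme: {scheme!r}")
```

The result can be passed to `pd.read_json(url, lines=True)` as in the timings. Reading `s3://` URLs with pandas needs the `s3fs` package, and depending on the local credentials setup it may be necessary to pass `storage_options={"anon": True}` for anonymous access.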

This is similar to #113.

Originally posted by @zmoon in #116 (comment)

@zmoon zmoon added the enhancement New feature or request label Jun 13, 2023
@zmoon zmoon added this to the v0.3 milestone Jun 13, 2023

zmoon commented Jun 13, 2023

I was also experimenting at some point and found some ways to speed up the JSON parsing, the datetime parsing in particular.
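The comment does not say which speedups were found. One common approach, shown here only as an illustration and not necessarily the one meant above, is to turn off date inference in `pd.read_json` and convert the timestamps in a single `pd.to_datetime` call with an explicit format. The sample records and the nested `date.utc` field are assumptions made for this sketch, and `read_ndjson_explicit_dates` is a hypothetical name.

```python
import io

import pandas as pd

# Made-up sample records; the nested "date" -> "utc" layout is an assumption.
SAMPLE = "\n".join(
    [
        '{"date": {"utc": "2013-11-26T00:00:00.000Z"}, "parameter": "pm25", "value": 12.0}',
        '{"date": {"utc": "2013-11-26T01:00:00.000Z"}, "parameter": "pm25", "value": 15.5}',
    ]
)


def read_ndjson_explicit_dates(source) -> pd.DataFrame:
    """Read NDJSON without date inference, then parse timestamps in one pass."""
    # convert_dates=False stops read_json from guessing which columns are dates.
    df = pd.read_json(source, lines=True, convert_dates=False)
    # Each "date" entry is a dict; pull out the UTC timestamp string.
    utc_strings = [record["utc"] for record in df["date"]]
    # An explicit format avoids per-value format inference.
    df["time"] = pd.to_datetime(
        utc_strings, format="%Y-%m-%dT%H:%M:%S.%fZ", utc=True
    )
    return df.drop(columns="date")


df = read_ndjson_explicit_dates(io.StringIO(SAMPLE))
```

Whether this is actually faster for the OpenAQ files would need to be checked with `%timeit` on real data, as in the timings above.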
