Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create download_vpts_aloft() function #553

Closed
6 of 13 tasks
peterdesmet opened this issue Apr 4, 2023 · 5 comments · Fixed by #579
Closed
6 of 13 tasks

Create download_vpts_aloft() function #553

peterdesmet opened this issue Apr 4, 2023 · 5 comments · Fixed by #579
Assignees
Labels
VPTS CSV Related to VPTS CSV format
Milestone

Comments

@peterdesmet
Copy link
Collaborator

peterdesmet commented Apr 4, 2023

download_vpts_aloft(
  date_min = NULL,
  date_max = NULL,
  radars = NULL,
  directory = ".",
  overwrite = FALSE,
  format = "csv" # also hdf5
  source = "baltrad", # also ecog-04003
)
  • Decide on controlled values for format. Either csv/hdf5 (widely applicable) or monthly, daily, hdf5
  • Should we make use of an S3 dependency to download files?
  • How should we warn for unrecognized radars?
  • How should we warn for dates without data?

Todo

  • Update bucket endpoint from https://lw-enram.s3-eu-west-1.amazonaws.com to s3://aloft
  • Adapt to new directory structure at bucket
  • Add new parameter source = BALTRAD
  • radars parameter will remain the same (5 letter code: bejab)
  • directory parameter can remain the same
  • overwrite parameter can remain the same
  • Update documentation
  • Update tests
  • Deprecate download_vpfiles() to download_vpts_aloft(format = "hdf5")
@PietrH
Copy link
Collaborator

PietrH commented May 17, 2023

I'd like to work on this, here are my thoughts:

Q/A

Decide on controlled values for format. Either csv/hdf5 (widely applicable) or monthly, daily, hdf5

I prefer csv/hdf5, and return in the same directory structure as the bucket

Should we make use of an S3 dependency to download files?

Let's try to do it without the dependency at first and reevaluate.

How should we warn for unrecognized radars?

radars: {missing_radars} not found, found {number_of_radars} other radars

How should we warn for dates without data?

No radars found for all dates, radars found for {first_date_found} to {last_date_found}

Tests

  • Can we use the existing tests from download_vpfiles()?
  • Testing the directory structure
  • Should we test the output files themselves using snapshots?
  • test the new warnings
  • test overwriting
  • test format

Branch

Has work already started on this? Is there a branch I can continue on?

@PietrH
Copy link
Collaborator

PietrH commented May 17, 2023

If a radar is missing, stop.

@PietrH PietrH self-assigned this May 17, 2023
@PietrH
Copy link
Collaborator

PietrH commented May 17, 2023

jsonlite::fromJSON("https://raw.githubusercontent.com/enram/aloftdata.eu/main/_data/OPERA_RADARS_DB.json")$odimcode
  • Allow downloading multiple radars
  • download_vpfiles() as an example
  • Error if radar doesn't exist based on json file
  • Try writing function so it'll download whatever it can
  • Use progress to show how many files have already been downloaded. Silence with progress=FALSE.
  • Message for each file downloaded (cf. download_vpfiles). Silence with verbose=FALSE.

@PietrH
Copy link
Collaborator

PietrH commented May 17, 2023

We have decided to build a function list_vpts_aloft() that returns a vector of urls that are known to exist, given the filtering parameters originally envisioned for download_vpts_alof()

list_vpts_aloft(
  date_min = NULL,
  date_max = NULL,
  radars = NULL,
  # directory = ".", This parameter is removed
  # overwrite = FALSE, This parameter is removed
  format = "csv" # also hdf5
  source = "baltrad", # also ecog-04003
)

Checking if a file exists can be done using the aws.s3 dependency via: aws.s3::get_bucket_df(bucket = "s3://aloft", prefix="baltrad/monthly", region = "eu-west-1", max = 2000) or much slower using httr: urls[!furrr::future_map_lgl(urls, ~httr::http_error(httr::HEAD(.x)))]

@PietrH PietrH linked a pull request May 17, 2023 that will close this issue
17 tasks
@PietrH PietrH mentioned this issue May 17, 2023
17 tasks
@adokter adokter reopened this Jul 17, 2023
@adokter adokter added this to the 1.0.0 milestone Jul 17, 2023
@peterdesmet
Copy link
Collaborator Author

peterdesmet commented Dec 14, 2023

Note: I think it might be better to create a generic download_files() function that is provided a vector of URLs (e.g. generated by list_vpts_aloft()), see #648

@adokter adokter closed this as completed in 3314710 Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VPTS CSV Related to VPTS CSV format
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants