VPTS objects as data.frame #568

Open
peterdesmet opened this issue May 16, 2023 · 0 comments
peterdesmet commented May 16, 2023

This issue summarizes the outcomes of a discussion of May 16, 2023.

Goal

Get rid of the bioRad objects vp and vpts. Instead, make them VPTS data frames, so they are compatible with functions outside bioRad, such as those in dplyr.

Design decisions

  1. VPTS objects are tibbles (data frames) with an extra class, vpts:
    > class(vpts)
    [1] "vpts"        "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"
    
  2. print(vpts) shows summary information and the default tibble preview. The summary information is for the whole data frame, not per radar. Users can reduce that info by running filter(radar == "bejab") beforehand:
    # Irregular time series of vertical profiles (class vpts)
    # Radars: bejab, bewid
    # Time range: 2023-02-01 00:00:00 / 2023-03-01 00:00:00 (2 days)
    # Height range: 0 / 4000
    # A tibble: 85,700 × 26
       radar datetime            height     u       v       w    ff    dd sd_vvp gap     eta  dens    dbz
       <chr> <dttm>               <dbl> <dbl>   <dbl>   <dbl> <dbl> <dbl>  <dbl> <lgl> <dbl> <dbl>  <dbl>
     1 bejab 2023-02-01 00:00:00      0 NA    NA       NA     NA     NA     4.04 TRUE   99.0  9.00  -5.61
     2 bejab 2023-02-01 00:00:00      0 NA    NA       NA     NA     NA     4.01 TRUE  111.  10.1   -5.11
     3 bejab 2023-02-01 00:00:00      0 NA    NA       NA     NA     NA     3.44 TRUE  120.  10.9   -4.79
     4 bejab 2023-02-01 00:00:00    200  4.50 -0.502   66.3    4.53  96.4   2.82 FALSE  58.0  5.27  -7.93
     5 bejab 2023-02-01 00:00:00    200  2.73 -2.19    47.7    3.50 129.    2.58 FALSE  61.9  5.63  -7.65
     6 bejab 2023-02-01 00:00:00    200  1.13 -1.77   -44.4    2.10 147.    3.40 FALSE  68.2  6.20  -7.23
     7 bejab 2023-02-01 00:00:00    400  1.58  0.373   -4.74   1.62  76.7   2.43 FALSE  26.6  2.42 -11.3 
     8 bejab 2023-02-01 00:00:00    400  2.17 -0.0270  -6.36   2.17  90.7   2.42 FALSE  20.6  1.87 -12.4 
     9 bejab 2023-02-01 00:00:00    400  1.64  0.339   -3.28   1.67  78.3   2.16 FALSE  26.5  2.41 -11.3 
    10 bejab 2023-02-01 00:00:00    600  1.13  0.827    0.327  1.40  53.8   1.29 FALSE  10.8  0    -15.2 
    # ℹ 85,690 more rows
    # ℹ 13 more variables: dbz_all <dbl>, n <dbl>, n_dbz <dbl>, n_all <dbl>, n_dbz_all <dbl>, rcs <dbl>,
    #   sd_vvp_threshold <dbl>, vcp <dbl>, radar_latitude <dbl>, radar_longitude <dbl>,
    #   radar_height <dbl>, radar_wavelength <dbl>, source_file <chr>
    # ℹ Use `print(n = ...)` to see more rows
    
  3. A VPTS object can contain multiple radars (unlike the current vpts). Functions that require a single radar return an error suggesting to run filter(radar == "radar") on the data frame first.
  4. A single vertical profile is also a VPTS object. Functions that are specific to a single VP (like plot.vp()) can be integrated into the vpts functions, returning something specific (e.g. this plot) when the object contains only a single radar/timestamp. Alternatively, those VPTS objects could have an additional class:
    > class(vpts)
    [1] "vp"    "vpts"        "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"
    
  5. When reading a single hdf5 vp file, metadata properties ($radar, $datetime, $attributes) could be added as attributes, so they are available to the user for inspection. As soon as the VPTS object contains multiple radars/timestamps, those attributes would be discarded.
  6. A check_vpts() function could be added to verify that a VPTS object meets the criteria (e.g. expected column names and column types).
  7. We could alter dplyr functions, so they don't create invalid VPTS objects. See https://dplyr.tidyverse.org/reference/dplyr_extending.html
  8. Rather than trying to clean a VPTS when reading, the regularize_vpts() function could be expanded, so that it can:
     • Remove duplicate rows
     • Regularize timestamps
     • Regularize height range across radars
     • Regularize height interval across radars
  9. All functions that accept the current vp, c(vp) or vpts as input should be adapted to work with VPTS data frames.
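
To make decisions 1, 3, 4 and 6 more concrete, here is a minimal sketch in base R. It is not an agreed implementation: new_vpts(), require_single_radar() and is_single_profile() are hypothetical helper names, the body of check_vpts() is only a guess at what such a check might do, and the set of required columns (radar, datetime, height) is an assumption based on the print preview above. A plain data.frame is used to keep the example dependency-free; the same helpers should work on a tibble.

```r
# Hypothetical helpers, not part of bioRad. Column requirements are assumed.

# Assumed required columns, each with a predicate that tests its type.
vpts_col_checks <- list(
  radar    = is.character,
  datetime = function(col) inherits(col, "POSIXct"),
  height   = is.numeric
)

# Decision 1: prepend the "vpts" class to a data frame (or tibble).
new_vpts <- function(x) {
  stopifnot(is.data.frame(x))
  class(x) <- union("vpts", class(x))
  x
}

# Decision 6: stop with an informative error if columns are missing or
# have an unexpected type; otherwise return TRUE invisibly.
check_vpts <- function(x) {
  if (!inherits(x, "vpts")) {
    stop("`x` must have class \"vpts\".", call. = FALSE)
  }
  missing_cols <- setdiff(names(vpts_col_checks), names(x))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  ok <- vapply(
    names(vpts_col_checks),
    function(nm) vpts_col_checks[[nm]](x[[nm]]),
    logical(1)
  )
  if (!all(ok)) {
    stop("Unexpected type in column(s): ", paste(names(ok)[!ok], collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Decision 3: error with a filter() suggestion when several radars are present.
require_single_radar <- function(x) {
  if (length(unique(x$radar)) != 1) {
    stop(
      "This function requires a single radar. ",
      "Use `filter(radar == \"radar\")` on the data frame first.",
      call. = FALSE
    )
  }
  invisible(x)
}

# Decision 4: a single vertical profile has one radar and one timestamp.
is_single_profile <- function(x) {
  length(unique(x$radar)) == 1 && length(unique(x$datetime)) == 1
}

# Small made-up example with two radars and one timestamp.
vpts_demo <- new_vpts(data.frame(
  radar = c("bejab", "bejab", "bewid"),
  datetime = as.POSIXct("2023-02-01 00:00:00", tz = "UTC"),
  height = c(0, 200, 0),
  stringsAsFactors = FALSE
))

check_vpts(vpts_demo)

# Base R equivalent of filter(radar == "bejab").
bejab <- vpts_demo[vpts_demo$radar == "bejab", ]
require_single_radar(bejab)
is_single_profile(bejab)
```

Calling require_single_radar(vpts_demo) on the unfiltered object stops with the filter() suggestion, which is the behaviour described in decision 3. Whether the single-profile case gets its own "vp" class or is detected at run time as above is still open.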