FR: optional .interval argument to fill_gaps #302

Open
warnes opened this issue Sep 28, 2023 · 0 comments

I am joining multiple time series collected at different intervals, ranging from monthly to yearly. Consequently, I need to harmonize the intervals before performing the join.

At the moment, I don't see a documented method for setting the desired interval, either directly, or when calling fill_gaps.

StackOverflow shows a mechanism for overriding the interval by explicitly changing the object attribute (see https://stackoverflow.com/a/75981369), but I prefer to use documented interfaces whenever possible.

For my current code, it would be very helpful to have an additional optional .interval argument to fill_gaps that performs this step.

Perhaps something like these:

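# Override the interval attribute of a tsibble; `...` is passed to new_interval()
set_interval <- function(object, ...)
{
  attr(object, 'interval') <- tsibble::new_interval(...)
  object
}

# Wrapper around tsibble::fill_gaps() that optionally resets the interval first
fill_gaps_interval <- function(.data, ..., .full = FALSE, .start = NULL, .end = NULL, .interval = NULL)
{
  if (!is.null(.interval))
  {
    # e.g. .interval = c(quarter = 1) becomes set_interval(object = .data, quarter = 1)
    .interval <- as.list(.interval)
    .interval$object <- .data
    .data <- do.call(set_interval, .interval)
  }

  # Re-dispatch the original call to fill_gaps(), minus the .interval argument
  call <- match.call()
  call$.data <- .data
  call$.interval <- NULL
  call[[1L]] <- quote(tsibble::fill_gaps)
  eval(call, parent.frame())
}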

Reproducible example:

> library(tidyverse)
> library(tsibble)

> df1 <- tsibble(quarter = yearquarter(as_date(c('2020-1-1','2021-1-1','2022-3-1'))),
+                   amount = c(5, 2, 1))
Using `quarter` as index variable.

> df2 <- tsibble(quarter = yearquarter(as_date(c('2022-1-1','2022-4-1','2022-7-1'))),
+                   amount = c(5, 2, 1))
Using `quarter` as index variable.

> ###
> # Existing functionality
> ###
> 
> interval(df1)
<interval[1]>
[1] 4Q

> # --> Fills 4Q interval
> df1 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 4 x 2 [4Q]
  quarter amount
    <qtr>  <dbl>
1 2020 Q1      5
2 2021 Q1      2
3 2022 Q1      1
4 2023 Q1     NA

> # --> Fills 1Q interval
> interval(df2)
<interval[1]>
[1] 1Q

> df2 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA

> ###
> # Desired functionality: Fill to individual quarter 
> ##
> df1 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1      5
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1      2
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      1
10 2022 Q2     NA
11 2022 Q3     NA
12 2022 Q4     NA
13 2023 Q1     NA

> df2 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA
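
For context, here is a minimal sketch of the downstream use case described at the top (harmonize both series to a 1Q interval, then join). It applies the undocumented attribute override directly rather than any existing tsibble interface, and the names `rng_start`, `rng_end`, `df1_q`, `df2_q`, and the `_1`/`_2` suffixes are illustrative only.

```r
library(tsibble)
library(dplyr)

# Same data as in the example above, rebuilt so this block runs on its own
df1 <- tsibble(
  quarter = yearquarter(as.Date(c("2020-01-01", "2021-01-01", "2022-03-01"))),
  amount  = c(5, 2, 1),
  index   = quarter
)
df2 <- tsibble(
  quarter = yearquarter(as.Date(c("2022-01-01", "2022-04-01", "2022-07-01"))),
  amount  = c(5, 2, 1),
  index   = quarter
)

rng_start <- yearquarter(as.Date("2020-01-01"))
rng_end   <- yearquarter(as.Date("2023-01-01"))

# Workaround (not a documented interface): force df1 from 4Q to 1Q
attr(df1, "interval") <- new_interval(quarter = 1)

# Fill both series over the same range at the same 1Q interval
df1_q <- fill_gaps(df1, .start = rng_start, .end = rng_end)
df2_q <- fill_gaps(df2, .start = rng_start, .end = rng_end)

# Join as plain tibbles, then restore the tsibble structure
joined <- full_join(
  as_tibble(df1_q), as_tibble(df2_q),
  by = "quarter", suffix = c("_1", "_2")
)
joined <- as_tsibble(joined, index = quarter)
```

With a `.interval` argument on `fill_gaps`, the attribute assignment above would no longer be needed.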