Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CEP] date_add function for case search #31325

Open
snopoke opened this issue Mar 25, 2022 · 9 comments
Open

[CEP] date_add function for case search #31325

snopoke opened this issue Mar 25, 2022 · 9 comments
Labels
CEP: done CEP CommCare Enhancement Proposal

Comments

@snopoke
Copy link
Contributor

snopoke commented Mar 25, 2022

Abstract
Support basic date arithmetic in case search query language. Specifically adding / subtracting static values from dates to enable calculating values like '1 month ago'.

Motivation
The Case Search Query language is used in Case List Explorer and in Applications via Case Search expressions.

In Case List Explorer it would be useful to be able to generate date ranges dynamically so that you could have saved reports which report on data in sliding date ranges. e.g. monthly report of cases closed in that month.

In Applications it is possible to do some date calculations by using the application xpath engine however these need to be done at the integer level (days since epoch) and so it is impossible to reliably compute month based intervals.

Specification
This CEP proposes a new Case Search query function to support adding and subtracting intervals from dates.

date_add(today(), 'month', -1)
date_add('2020-01-01', 'day', 1)

The function accepts 3 arguments:

  • A date value (or an expression that evaluates to a date)
  • An interval name. One of 'day', 'week', 'month', 'year'
  • An integer value (positive or negative)

Impact on users
Allow users to create queries with dynamic dates.

Impact on hosting
NA

Backwards compatibility
Since this is a new function there are no comparability issues

Release Timeline
NA

Open questions and issues
None at this time

@snopoke snopoke added the CEP CommCare Enhancement Proposal label Mar 25, 2022
@snopoke snopoke changed the title [CEP] case search date arithmetic [CEP] date_add function for case search Mar 25, 2022
@millerdev
Copy link
Contributor

Is the logic of the date_add() function delegated to a built-in library or will we implement it? Is/will there be a specification that can be referenced? Consider questions like

  • What is returned by date_add('2020-02-29', 'year', 1)?
  • What does "month" mean? That is, is it 30 days, or is it different depending on the first argument? For example, does date_add('2020-02-29', 'month', -1) return 2020-01-29 (replace month), 2020-01-30 (subtract 30 days), or 2020-01-31 (same number of days from end of month)?

@snopoke
Copy link
Contributor Author

snopoke commented Mar 25, 2022

The actual logic would be done in python and make use of standard tools.

For intervals that are fixed e.g. day, week the timedelta object in the Python standard library can be used. This has better performance than the relativedetla library.

For intervals that are not fixed, month, year the relativedelta package will be used.

What is returned by date_add('2020-02-29', 'year', 1)?

This would be the same date in the previous year or the closest valid date:

> (date(2020, 2, 29) + relativedelta(years=1)).isoformat()
> '2021-02-28'

What does "month" mean? That is, is it 30 days, or is it different depending on the first argument? For example, does date_add('2020-02-29', 'month', -1) return 2020-01-29 (replace month), 2020-01-30 (subtract 30 days), or 2020-01-31 (same number of days from end of month)?

This would be the same date in the previous month (or the closet valid date):

> (date(2020, 2, 29) + relativedelta(months=-1)).isoformat()
> '2021-01-29'

> (date(2020, 3, 31) + relativedelta(months=-1)).isoformat()
> '2020-02-29'

@millerdev
Copy link
Contributor

For intervals that are fixed e.g. month, week the timedelta object in the Python standard library can be used.

timedelta does not support month. Did you mean day?

@snopoke
Copy link
Contributor Author

snopoke commented Mar 25, 2022

For intervals that are fixed e.g. month, week the timedelta object in the Python standard library can be used.

timedelta does not support month. Did you mean day?

yes, thanks. Updated comment.

@millerdev
Copy link
Contributor

Sounds good. Does it make sense for the date_add() docs to reference relativedelta and possibly timedelta? It would be simplest for users if the docs reference a single date library (rather than two), and it would be surprising if timedelta behaves differently than relativedelta for day and week offsets.

@millerdev
Copy link
Contributor

millerdev commented Mar 25, 2022

If we are referencing relativedelta would it make sense to use plural names? 'days', 'weeks', 'months', 'years'

Elaboration: the singular names have different meanings for relativedelta.

@snopoke
Copy link
Contributor Author

snopoke commented Mar 25, 2022

Does it make sense for the date_add() docs to reference relativedelta and possibly timedelta? It would be simplest for users if the docs reference a single date library (rather than two), and it would be surprising if timedelta behaves differently than relativedelta for day and week offsets.

I don't think it makes sense for the docs to reference the python documentation. These functions will mostly be used by app builders or CommCare users using reports. Linking to the python docs would likely only add confusion, particularly as we are only offering some of the functionality that is available in python.

would it make sense to use plural names? 'days', 'weeks', 'months', 'years'

I don't have preference, we could accept both. In relativedelta singular names replace that field in the date with the given argument. If we exposed that functionality in the query DSL it would be as a separate function so I don't think there's any danger of accepting both. This is how SQL intervals work as well interval '1 hour' interval '2 hours' (interval '1 hours' is also valid)

@millerdev
Copy link
Contributor

millerdev commented Mar 25, 2022

I'd prefer to only accept one form (plural) given the very important distinction in relativedelta and the fact that we plan to use that in the implementation and may point very technical questions toward those docs since they will almost certainly be more complete than our own documentation.

Edit: It's also nice to be strict with syntax so there are no ambiguities in the future. If you pass a non-plural value it should raise an error. Then if we ever want to support different meanings (like relativedelta does), we could add that in the future without backwards compatibility issues.

Otherwise 👍 on this CEP.

@esoergel
Copy link
Contributor

I like this idea a lot. One thing I would also consider (separately) is that a lot of real-world use cases are around calendar date ranges, like "last month" or " this week (Monday-today)". That is, users might want the sliding scale, but relative to the start of the current period, not necessarily the day the report is run. This spec definitely gets us closer and I don't think it would step on the toes of any future fixed date range logic, just something to think about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CEP: done CEP CommCare Enhancement Proposal
Projects
None yet
Development

No branches or pull requests

3 participants