Sealed hierarchy for arbitrary precision dates and better static typing #177

Open
wkornewald opened this issue Jan 10, 2022 · 3 comments

Comments

@wkornewald
Contributor

wkornewald commented Jan 10, 2022

Originally suggested in PR #173 which contained both the sealed hierarchy and YearMonth. The YearMonth feature request is in #168. The implementation of YearMonth (and Year) is related to this proposal, but maybe for the discussion here we should focus on the sealed hierarchy aspect.

This is just a very quick example of how we could add support for an arbitrary precision date that can be a Year or YearMonth or LocalDate or ZonedDateTime (which could itself be a sealed class of OffsetDateTime and RegionDateTime - see #175).

In the medical space, at least, official specifications commonly allow flexible date precision, so you have to support at least these operations:

  • parsing
  • machine-formatting back to the same precision string (so parsing will result in the same object)
  • human-formatting (displaying in the UI, with different precision levels)
  • sorting/comparing in a natural way - at least how humans would sort them
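As a rough illustration of the first two bullets, here is a minimal sketch of a parse/format round trip that keeps the precision of the input. The names `PartialDate`, `YearOnly`, `YearAndMonth`, `FullDate`, `parsePartialDate`, `formatPartialDate` and `pad` are hypothetical stand-ins, not part of kotlinx-datetime, and the sketch assumes non-negative years and skips range validation.

```kotlin
// Hypothetical stand-ins for the proposed types; not part of kotlinx-datetime.
sealed interface PartialDate
data class YearOnly(val year: Int) : PartialDate
data class YearAndMonth(val year: Int, val month: Int) : PartialDate
data class FullDate(val year: Int, val month: Int, val day: Int) : PartialDate

// Left-pads a number with zeros to the given length.
fun pad(value: Int, length: Int): String = value.toString().padStart(length, '0')

// Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD" and keeps the precision of the input.
// Assumes non-negative years; range checks (e.g. month in 1..12) are omitted.
fun parsePartialDate(text: String): PartialDate {
    val parts = text.split("-").map { it.toInt() }
    return when (parts.size) {
        1 -> YearOnly(parts[0])
        2 -> YearAndMonth(parts[0], parts[1])
        3 -> FullDate(parts[0], parts[1], parts[2])
        else -> throw IllegalArgumentException("Unsupported precision: $text")
    }
}

// Machine format: writes back exactly the fields that were parsed.
fun formatPartialDate(value: PartialDate): String = when (value) {
    is YearOnly -> pad(value.year, 4)
    is YearAndMonth -> "${pad(value.year, 4)}-${pad(value.month, 2)}"
    is FullDate -> "${pad(value.year, 4)}-${pad(value.month, 2)}-${pad(value.day, 2)}"
}
```

With these definitions, `formatPartialDate(parsePartialDate("2016-06"))` yields the same string that was parsed, and parsing that string again yields an equal object.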

You might also want to treat the data differently based on its type. Here it helps a lot to have a sealed interface, so you can easily match all possible cases (which is also useful for implementing the formatting and sorting etc. functions).

Using a sealed hierarchy would allow writing code in a more statically safe way because you can enforce with types that at least a certain precision level is provided. You can also exhaustively match on all subtypes, which makes working with arbitrary precision dates much more comfortable.

The latest example code is here: https://github.com/Kotlin/kotlinx-datetime/compare/master...wkornewald:feature/yearmonth-and-arbitraryprecisiondate?expand=1

To start the discussion, here's a copy of the code, which is a strict precision hierarchy ranging from Year to ZonedDateTime and it only contains the types that humans usually work with:

```kotlin
public sealed interface AtLeastYear : Comparable<AtLeastYear> {

    public val year: Int

    override fun compareTo(other: AtLeastYear): Int {
        val result = toComparisonInstant().compareTo(other.toComparisonInstant())
        return if (result == 0) hierarchyLevel.compareTo(other.hierarchyLevel) else result
    }

    public fun toComparisonInstant(): Instant

    public companion object {

        public fun parse(value: String): AtLeastYear {
            TODO()
        }
    }
}

public val AtLeastYear.hierarchyLevel: Int get() = when (this) {
    is Year -> 0
    is YearMonth -> 1
//    is LocalDate -> 2
//    is ZonedDateTime -> 3
}

public data class Year(override val year: Int) : AtLeastYear {

    override fun toComparisonInstant(): Instant =
        "${year.formatLen(4)}-01-01T00:00:00Z".toInstant()
}

public sealed interface AtLeastYearMonth : AtLeastYear {

    public val month: Month

    public companion object {

        public fun parse(value: String): AtLeastYearMonth {
            TODO()
        }
    }
}

public data class YearMonth(override val year: Int, override val month: Month) : AtLeastYearMonth {

    override fun toComparisonInstant(): Instant =
        "${year.formatLen(4)}-${month.number.formatLen(2)}-01T00:00:00Z".toInstant()
}

public sealed interface AtLeastDate : AtLeastYearMonth {
    // ...
}

// class LocalDate : AtLeastDate
// sealed class ZonedDateTime : AtLeastDate
```

Of course one could also implement this outside of kotlinx.datetime, but having it here in this lib makes more sense because it works out of the box without any indirections, wrapping or custom types and it's helpful for everyone who must tackle this kind of problem.
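To make the static-typing claim concrete, here is a small self-contained sketch that does not depend on kotlinx-datetime. It reuses the names from the proposal but simplifies `month` to an `Int`; `Date`, `quarterOf` and `describe` are hypothetical and only serve the illustration.

```kotlin
// Simplified version of the proposed hierarchy (month is an Int here).
sealed interface AtLeastYear { val year: Int }
sealed interface AtLeastYearMonth : AtLeastYear { val month: Int }

data class Year(override val year: Int) : AtLeastYear
data class YearMonth(override val year: Int, override val month: Int) : AtLeastYearMonth
// Hypothetical stand-in for LocalDate.
data class Date(override val year: Int, override val month: Int, val day: Int) : AtLeastYearMonth

// Accepts any value with at least month precision; passing a Year does not compile.
fun quarterOf(value: AtLeastYearMonth): Int = (value.month - 1) / 3 + 1

// Exhaustive match: adding a new subtype makes this `when` fail to compile
// until the new case is handled.
fun describe(value: AtLeastYear): String = when (value) {
    is Year -> "year ${value.year}"
    is YearMonth -> "month ${value.month} of ${value.year}"
    is Date -> "day ${value.day} of month ${value.month} of ${value.year}"
}
```

Calling `quarterOf(Year(2016))` is rejected by the compiler, which is the kind of guarantee meant by "at least a certain precision level".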

@wkornewald changed the title from "A sealed hierarchy would allow better static typing" to "Sealed hierarchy for arbitrary precision dates and better static typing" on Jan 10, 2022
@dkhalanskyjb
Contributor

First, some personal communication. @wkornewald, you've been very active with this project, and our conversations yielded some very insightful notions. Thank you a lot for this!

I feel though that we need to establish some expectations about user contributions. You've been mostly trying to propose some complete technical decisions, and then I would go on to argue to get the rationale behind them. It must surely seem from your side that I'm thick-headed and unable to see the beauty of your proposals, but the reality is more subtle.

In our internal decision-making, we consider a broad range of things, such as precedents for naming things, discoverability, the API shapes in the Kotlin ecosystem as a whole, ease of internalizing a whole library, the interconnection between different concepts, etc. Sometimes, one of us proposes a solution for some problem, but after long debates, we stumble upon a more elegant and clean solution for the very same things.

So, the problems are the root of our decision-making. A complete, fleshed-out implementation is often much less useful to us than a thorough description of the real problems that it solves. You have a lot of domain knowledge that includes datetime things, you understand how to solve the problems you've faced, but these solutions are not useful to us on their own, as we can't just incorporate them without first searching for a simpler way, using our domain knowledge. Maybe it turns out the problem can be solved with the existing abstractions. Maybe the problem is just theoretical and a solution for it won't be useful to anyone ever. Maybe we recognize the problem, and in the end, will converge to something very similar to what you're proposing. Maybe we'll implement something entirely unlike it. No way to know beforehand.

Therefore, when we argue about what may feel like minutiae even though the abstraction is obviously logical and beneficial, it doesn't necessarily mean we disagree with it; it's just that we are capable of providing the abstractions ourselves, but we need to know to what end.


Now, for technical things.

> so you have to support at least these operations

Three of the four things listed are for parsing and formatting, which does not necessarily require a separate class for each combination. Look at Python, or Swift: they're doing just fine, despite each having only one object to represent datetimes.

The fourth thing is sorting "at least how humans would sort them", which, I think, is not a viable goal. Humans are very adaptable, computers are not. As a human, if I was tasked to find the earliest date in the list "June 1st, June 17th, June, June 2nd, June 28th", I would point to "June 1st", not "June", as there's obviously no date earlier than that. If I was tasked to find the latest date, I would point to "June 28th", assuming that it's unlikely that the date that's just "June" is later than that. If I was tasked with sorting these things, I would sort them by assuming that "June" means a non-existent "June 0th". So, no, I am wary about providing any one way of sorting these, as such sorting could be misapplied.
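To illustrate the ambiguity, here is a minimal sketch in which two equally defensible conventions place a bare "June" at opposite ends of the same list. `JuneDate`, `missingDayFirst`, `missingDayLast` and `sampleDates` are hypothetical names made up for this example.

```kotlin
// Hypothetical model: a date in June with an optional day; null means just "June".
data class JuneDate(val day: Int?)

// Convention 1: a missing day sorts as "June 0th", before every concrete day.
val missingDayFirst: Comparator<JuneDate> = compareBy<JuneDate> { it.day ?: 0 }

// Convention 2: a missing day sorts after every concrete day.
val missingDayLast: Comparator<JuneDate> = compareBy<JuneDate> { it.day ?: Int.MAX_VALUE }

// The list from the paragraph above: June 1st, June 17th, June, June 2nd, June 28th.
val sampleDates: List<JuneDate> =
    listOf(JuneDate(1), JuneDate(17), JuneDate(null), JuneDate(2), JuneDate(28))
```

Sorting `sampleDates` with `missingDayFirst` puts `JuneDate(null)` first, while `missingDayLast` puts it last, so "earliest" and "latest" depend entirely on which convention the caller happened to pick.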

Also, what are the use cases for sorting here? Which problem does it solve? Some specific examples, please, like "we use this mixed-data representation for X, which we have to output as Y, where the order is obviously Z".

> You might also want to treat the data differently based on its type.

This seems logical, but we're not going after beautiful abstractions at all costs, as overloading a programmer with too many entities is confusing when all they want to do is solve some common task. What task requires such casing? Does anyone do anything nontrivial with such data, that is, something besides parsing and showing?

In the PR, I saw you implying logic like "the vaccination date is stored; if only the year Y is known and it's the first three months of the year Y+1, it's fine not to revaccinate; if the month is also known, then if half a year has not passed since these year + month, one also doesn't need to revaccinate; otherwise, suggest revaccination". Fair! Any other suggestions?
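For reference, that rule could be sketched roughly as follows. All names (`VaccinationDate`, `KnownYear`, `KnownYearMonth`, `CurrentMonth`, `needsRevaccination`) are hypothetical, and treating "still within year Y" the same as "the first three months of Y+1" is an assumption the quoted rule does not state explicitly.

```kotlin
// Hypothetical types: a vaccination date known to year or to year-month precision.
sealed interface VaccinationDate
data class KnownYear(val year: Int) : VaccinationDate
data class KnownYearMonth(val year: Int, val month: Int) : VaccinationDate

// "Today", reduced to the precision the rule needs.
data class CurrentMonth(val year: Int, val month: Int)

fun needsRevaccination(last: VaccinationDate, now: CurrentMonth): Boolean = when (last) {
    is KnownYear -> {
        // Assumption: any time up to the end of year Y counts as recent enough.
        val notAfterVaccinationYear = now.year <= last.year
        // From the rule: the first three months of year Y+1 are still fine.
        val earlyNextYear = now.year == last.year + 1 && now.month <= 3
        !(notAfterVaccinationYear || earlyNextYear)
    }
    is KnownYearMonth -> {
        // From the rule: revaccinate once half a year has passed since year + month.
        val monthsElapsed = (now.year - last.year) * 12 + (now.month - last.month)
        monthsElapsed >= 6
    }
}
```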

> Using a sealed hierarchy would allow writing code in a more statically safe way because you can enforce with types that at least a certain precision level is provided.

If we introduced Year and YearMonth, then Pair<T, (T) -> YearMonth>, for example, would be sufficient for this.
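The thread does not spell this suggestion out, so the following is only one possible reading: a value of any type `T` travels together with a function that views it at year-month precision. `YearMonth` and `Date` are simplified stand-ins here, and `WithYearMonth`, `yearMonthOf`, `dateToYearMonth` and `dateEntry` are hypothetical names.

```kotlin
// Simplified stand-ins; not the kotlinx-datetime types.
data class YearMonth(val year: Int, val month: Int)
data class Date(val year: Int, val month: Int, val day: Int)

// A value of arbitrary type T, paired with a way to view it at year-month precision.
typealias WithYearMonth<T> = Pair<T, (T) -> YearMonth>

// Applies the stored projection to the stored value.
fun <T> yearMonthOf(entry: WithYearMonth<T>): YearMonth = entry.second(entry.first)

// Example projection for the Date stand-in.
val dateToYearMonth: (Date) -> YearMonth = { YearMonth(it.year, it.month) }
val dateEntry: WithYearMonth<Date> = Pair(Date(2016, 6, 10), dateToYearMonth)
```

Under this reading, any function that accepts a `WithYearMonth<T>` is guaranteed that year-month precision can be derived, without a shared supertype.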

> You can also exhaustively match on all subtypes, which makes working with arbitrary precision dates much more comfortable.

This is true. Do you have any particular use cases in mind?

@wkornewald
Contributor Author

> First, some personal communication. @wkornewald, you've been very active with this project, and our conversations yielded some very insightful notions. Thank you a lot for this!

> I feel though that we need to establish some expectations about user contributions. You've been mostly trying to propose some complete technical decisions, and then I would go on to argue to get the rationale behind them. It must surely seem from your side that I'm thick-headed and unable to see the beauty of your proposals, but the reality is more subtle.

Thank you for the explanation. Maybe our discussion would be much easier if we did it on the phone. 😄

> so you have to support at least these operations

> Three of the four things listed are for parsing and formatting, which does not necessarily require a separate class for each combination. Look at Python, or Swift: they're doing just fine, despite each having only one object to represent datetimes.

Well, Python's API is pretty old and it has multiple objects, too:

  • datetime which is a mixture of LocalDateTime, Instant and ZonedDateTime
  • date
  • time

If you look at more modern APIs like the [Temporal](https://tc39.es/proposal-temporal/docs/) proposal, you also find PlainYearMonth and other objects for varying precision levels.

Even if those objects are simple, it makes a lot of sense to provide a common type that can be used across different libraries without everyone reinventing their own abstraction and then translating between different objects. If someone builds a calendar picker and wants to allow varying precision levels, then it's surely nicer if the community can build on something official that is then compatible with other community libraries out of the box (see how Rust has handled async-related APIs).

> The fourth thing is sorting "at least how humans would sort them", which, I think, is not a viable goal. Humans are very adaptable, computers are not. As a human, if I was tasked to find the earliest date in the list "June 1st, June 17th, June, June 2nd, June 28th", I would point to "June 1st", not "June", as there's obviously no date earlier than that. If I was tasked to find the latest date, I would point to "June 28th", assuming that it's unlikely that the date that's just "June" is later than that. If I was tasked with sorting these things, I would sort them by assuming that "June" means a non-existent "June 0th". So, no, I am wary about providing any one way of sorting these, as such sorting could be misapplied.

> Also, what are the use cases for sorting here? Which problem does it solve? Some specific examples, please, like "we use this mixed-data representation for X, which we have to output as Y, where the order is obviously Z".

We have a list of medical data with varying date precisions. The user wants to see all of them as a timeline, sorted by date. So you have to somehow handle dates like "June 2016" and "2016". A pretty sensible default behavior (that can still be overridden for special cases) is to sort the objects like this:

  • "2016"
  • "June 2016"
  • "June 10th, 2016"
  • "June 10th, 2016 15:00:00"
  • "June 10th, 2016 16:00:00"
  • "June 11th, 2016"

Though I can understand if you find this too controversial.
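For what it's worth, the default order listed above can be sketched as "compare by the start of the covered period, then by precision level". The sketch below is self-contained and ignores time zones; `TimelineDate`, `OnlyYear`, `OnlyMonth`, `OnlyDay`, `DayAndTime`, `sortKey` and `timelineOrder` are hypothetical names.

```kotlin
// Hypothetical timeline entries with increasing precision.
sealed interface TimelineDate
data class OnlyYear(val year: Int) : TimelineDate
data class OnlyMonth(val year: Int, val month: Int) : TimelineDate
data class OnlyDay(val year: Int, val month: Int, val day: Int) : TimelineDate
data class DayAndTime(
    val year: Int, val month: Int, val day: Int, val hour: Int, val minute: Int
) : TimelineDate

// Start of the period each value covers, followed by its precision level:
// [year, month, day, hour, minute, level].
fun sortKey(value: TimelineDate): List<Int> = when (value) {
    is OnlyYear -> listOf(value.year, 1, 1, 0, 0, 0)
    is OnlyMonth -> listOf(value.year, value.month, 1, 0, 0, 1)
    is OnlyDay -> listOf(value.year, value.month, value.day, 0, 0, 2)
    is DayAndTime -> listOf(value.year, value.month, value.day, value.hour, value.minute, 3)
}

// Lexicographic comparison of the keys: earlier start first, and for equal
// starts the less precise value first.
val timelineOrder: Comparator<TimelineDate> = Comparator { a, b ->
    sortKey(a).zip(sortKey(b))
        .map { (x, y) -> x.compareTo(y) }
        .firstOrNull { it != 0 } ?: 0
}
```

Sorting the six example entries with `timelineOrder` reproduces the order of the list above, and a caller who needs a different convention can supply another `Comparator`.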

> You might also want to treat the data differently based on its type.

> This seems logical, but we're not going after beautiful abstractions at all costs, as overloading a programmer with too many entities is confusing when all they want to do is solve some common task. What task requires such casing? Does anyone do anything nontrivial with such data, that is, something besides parsing and showing?

Even showing the data can be non-trivial. In a real-world example, we have to treat ZonedDateTime specially because the UI requires showing date and time in separate widgets and/or different styles. Depending on the type, you might even want to show "1 hour ago" or "sometime in 2016". Being able to use a when expression to deal with all possible cases safely is a good thing.
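A minimal sketch of such type-dependent display, with hypothetical names (`EventTime`, `InYear`, `InMonth`, `AtMoment`, `displayText`) and a deliberately simplified notion of "now" as epoch seconds:

```kotlin
// Hypothetical event times with different precision.
sealed interface EventTime
data class InYear(val year: Int) : EventTime
data class InMonth(val year: Int, val month: Int) : EventTime
data class AtMoment(val epochSeconds: Long) : EventTime

// Picks a wording per precision; the `when` is exhaustive over the sealed hierarchy.
fun displayText(value: EventTime, nowEpochSeconds: Long): String = when (value) {
    is InYear -> "sometime in ${value.year}"
    is InMonth -> "sometime in ${value.year}-${value.month.toString().padStart(2, '0')}"
    is AtMoment -> {
        // Whole hours elapsed; a real UI would also handle minutes, days, etc.
        val hours = (nowEpochSeconds - value.epochSeconds) / 3600
        if (hours == 1L) "1 hour ago" else "$hours hours ago"
    }
}
```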

> Using a sealed hierarchy would allow writing code in a more statically safe way because you can enforce with types that at least a certain precision level is provided.

> If we introduced Year and YearMonth, then Pair<T, (T) -> YearMonth>, for example, would be sufficient for this.

I don't understand your Pair example. What is T there and how does that make it possible to define a fun formatStyled(x: AtLeastYear) which knows how to display (with styles) a ZonedDateTime vs. LocalDateTime vs. YearMonth vs. Year?

Or how about a date picker where the doctor can enter different date precisions or modify an existing date?

Also, Pair and other generic types have a big problem with readability and type erasure. How can I even safely do an is check or type cast?
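To show what I mean by type erasure, here is a small sketch. `isSomePair` and `isYear` are hypothetical helpers, and `Year` and `YearMonth` are simplified stand-ins.

```kotlin
// Simplified stand-ins for the proposed types.
sealed interface AtLeastYear
data class Year(val year: Int) : AtLeastYear
data class YearMonth(val year: Int, val month: Int) : AtLeastYear

// Generic arguments are erased at runtime, so only a star-projected check compiles.
fun isSomePair(value: Any): Boolean = value is Pair<*, *>

// The following is rejected by the compiler because the type arguments are erased:
// fun isYearMonthPair(value: Any): Boolean =
//     value is Pair<YearMonth, (YearMonth) -> YearMonth>

// A concrete subtype of a sealed hierarchy, by contrast, can be checked directly.
fun isYear(value: Any): Boolean = value is Year
```

So with `Pair` the check can only tell that something is a pair, not what it contains, whereas `is Year` also smart-casts the value for the code that follows.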

@wkornewald
Contributor Author

BTW, this really is just a technical improvement suggestion to build more APIs around sealed (it often results in better APIs). It's far from my most important issue, though.

I find it much more important to have a proper ZonedDateTime like in my draft PR.

Of secondary importance, perhaps, would be YearMonth, Year, and MonthDay, just to have good standard types that others can build upon.

3 participants
@wkornewald @dkhalanskyjb and others