Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Partial Order design proposal #276

Draft
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

AlexKnauth
Copy link
Member

@AlexKnauth AlexKnauth commented Feb 8, 2023

A PartialOrder interface intended to be privately implemented.

Comparison operators like =~, <~, >~, and so on.

Equality within tolerance for testing with within.

Markdown-rendered design: https://github.com/AlexKnauth/rhombus-prototype/blob/partial-order/design/partial-order/partial-order.md

@rocketnia
Copy link

rocketnia commented Feb 9, 2023

(This comment is long but mostly resolves itself, with a few exceptions. These are marked with backwards footnotes that look like "[bf_]." At the end is a list of references to the unresolved points.)

Ooh, I take it this is meant primarily to be a generic partial order interface so that multiple parametrically polymorphic types with innate partial orderings can compose with each other like ListOfVectors<a> = List<Vector<a>>. I was surprised a bit by the way a partial order was implemented here at first and thought maybe it was more of a total order interface or an equality interface, but if partial orders are what it's meant for, that would certainly justify the name. (Pedantically, I'd like the name to be DecidablePartialOrderElement 😛 or maybe PartiallyComparable.[bf1])

Some of my doubts at first were due to a few conflicting elements in the design. I'm going to think through several of them out loud to show how a lot of them resolve themselves:

  • The NaN case isn't reflexive. In math, partial orders are reflexive. I think that's okay; this doesn't have to be a perfect representation of a mathematical partial order to accomplishes the goal of making partially ordered parametrically polymorphic types composable. A type with NaN in it just doesn't happen to be one of those.
  • Do mutable boxes really have an innate partial order that justifies departure from equal-always?? I suppose the idea might be that they do have at least some sense of innate equality to other mutable and immutable boxes with the same contents. All right, then. I see you're including a hash function in "Option B" and "Option C" too, so I guess we'll see hashes indexed by this. :) I like it.[bf4]
  • "behaves compatibly with .= on real numbers": I wonder if this .= can be merged with this if it's compatible.[bf2]
  • Comparison using within isn't transitive. But that's actually just fine. I like that. (More about this below.)
  • Short-circuiting might need more attention, but there actually is a way to do it using this interface. (More about this below.)
  • I think passing in a recur argument could be a lot easier than it is in Racket. (More about this below.)

I like the fact that in this design, checking for =~ takes only a single traversal. If it had to do two traversals (by combining the results of two applications <=~ in different directions), then a value consisting of lexically ordered containers nested N levels deep might perform 2^N traversals of its innermost elements. I think I would have made that mistake, and this design avoids it.

I get the impression this rather neatly accomplishes most of the important parts of what I was calling business=? in that very long #16 conversation. What I was proposing for business=? was a little different: It was to be Rhombus's default notion of comparison so that numbers would behave like they do in other spreadsheets, calculators, and popular programming languages the user might have encountered before learning Rhombus. I assumed business=? would be mostly be an extension of = (perhaps literally an update to = itself) that made it consistent with equal-always? except on types that specified specific behavior=? functionality.

It looks like "Option A" here doesn't coincide with equal? or equal-always? by default, and I find it unclear whether the other options change that. Keeping them independent seems fine in its own way (nice and opt-in), but it sounds like one of your goals listed at the top is for =~ to do equal-now comparison, so I wonder if I'm missing something that makes it coincide with equal? more often.[bf3]

If the intention weren't for this to be used for partial ordering, this seems like a great design for Equatable as well. 😄 Especially "Option B" and "Option C" which have the hash method. Let's let some low-level Equatable method return a -1, 0, or 1 result when recur is a partial ordering! That way we can manipulate recur to use partial ordering for some layers of a nested data structure and Equatable comparison for others.

Why it's okay that within lacks transitivity

I increasingly believe the point of recur is to make a generic comparison operation even more useful by providing access to a "predicately typed" version of the the parametricity free theorem for each type that implements it. The free theorem of a type of containers allows any relation on containers' elements to be extended into a relation on the containers themselves. When this is "predicately typed," the relation on the containers is a predicate, essentially a generic for/and predicate where the loop body is the recur. A for/and utility is useful even for checking for conditions that aren't anything like equivalences or orders.

We already get one for/and from equal-always?/recur and equal?/recur. From the comparison interface you're proposing here, we get another one.

The new one would be like the one we get from equal?/recur, but with a different default behavior for numbers and anything else that has an innate partial order that differs from equal?. Also, in place of using generalized Racket booleans, it would use use negative numbers, 0, positive numbers, and NaN.

From an operational perspective, these for/and results are loop control directives. NaN usually breaks out of this loop with a result of NaN, 0 continues, and positive or negative numbers taint the loop in respective ways. If a loop is tainted with both signs, it usually breaks out with a result of NaN. If the loop ends before it breaks out, then the loop's overall result is a number which indicates which sign the loop was tainted with by the end, or 0 if it wasn't tainted at all.

(Hmm, this could be generalized to a lattice. That could be useful for representing more sophisticated partial knowledge. But let's not.)

Lexicographically ordered containers are a little funky. In terms of this loop, they behave like challenge runs in which a fully untainted subtraversal of their first element unlocks a bonus subtraversal of their second.

(Aside) Lexicographically ordered containers in theory

Lexicographically ordered containers remind me of Sigma types in dependent type theory.

Values of a Sigma type consist of what I'll call a discriminant value (typically called fst) and a contents value (typically called snd). The type of the contents can depend on the value of the discriminant. For instance, the discriminant could be a "requisition type" of 'buy-pencils or 'buy-fish-food, and the contents could be "requisition details" consisting of either a brand of pencil or a specification of fish dietary restrictions. Presumably we don't ever want to feed pencils to the fish, and when static types are involved, they rule that out.

Lexicographically ordered containers are similar. Once we know that the first elements are somehow equivalent, we proceed to compare the second element. In the Sigma type situation, comparing the discriminant type can teach us something about what type the contents value is and how to compare it, so we might want to compare the discriminants first and only proceed if we extract consistent comparison strategy information from them.

That comparison makes me wonder whether the use of recur for these types should be a little different.

As far as the discriminant value goes, I wonder if its expression would have to implement some generic unification interface to allow it to appear at the type level in the contents value's type. If it does, maybe it would already have a notion of comparison that's more specific than recur and hence should be used instead.

As far as the contents value goes, its type could depend on the successful comparison result of the discriminant (including, for instance, a full rundown of the details that both discriminants have in common). Hence, we might want its comparison to take that comparison result as an argument alongside its other arguments.

But I think we should assume that for all the lexicographically ordered containers we'll define in the near future, all of those types end up being subtypes of what recur compares and are consistent with it. That's easier. 😄

Short-circuiting

Conceptually, a <=~ b could short-circuit as soon as it finds a false result for a_i <=~ b_i anywhere in its traversal.

In order to actually do that within the constraints of this interface, it has to supply a creative recur function.

First, that recur function compares a_i and b_i the usual way, and it uses that result in most cases. However, if it's a positive number (indicating a >~ b), recur returns NaN instead in order to short-circuit.

Similar techniques apply to all the other boolean comparisons like a >~ b and a =~ b.

Making it easier to customize recur

For one thing, I notice the recur-taking methods aren't public. Why not? They are kinda odd, but I believe they'd be for/and-style utilities that would have some real use.[bf6] For starters, they could be used to implement short circuiting for all those comparison operators, not to mention within. 🙂

For the same reason, I think an equal-mode-hash-code/recur function should be exposed in Racket. The design doc here mentions "But options B and C would probably require building this into the core of Racket so that they can access hash-proc methods with their recur arguments," and I think this is a prime example of how a lack of exposing this stands in the way of libraries.[bf5]

If they are exposed someday, I think they might benefit from a macro that has some things in common with for/and and with-handlers. A very rough sketch:

compare(my_equal?: |equal-always?/recur|):
  ~each:
    a: as
    b: bs
  | string?: ~is |string-ci=?|
  | number?: ~is |=|
  | mlist?:
    ~key |mlist->list|
    andmap(my_equal?, a, b)

That code would behave like the following Racket code. It performs a comparison between as and bs that's consistent with string-ci=? on strings, = on numbers, a composition of andmap and mlist->list on mutable lists, and equal-always?/recur on everything else:

(let my-equal? ([a as] [b bs])
  (or (eq? a b)
    (let ()
      (define a-is-string? (string? a))
      (define b-is-string? (string? b))
      (cond
        [(and a-is-string? b-is-string?) (string-ci=? a b)]
        [(or a-is-string? b-is-string?) #f]
        [else
          (define a-is-number? (number? a))
          (define b-is-number? (number? b))
          (cond
            [(and a-is-number? b-is-number?) (= a b)]
            [(or a-is-number? b-is-number?) #f]
            [else
              (define a-is-mlist? (mlist? a))
              (define b-is-mlist? (mlist? b))
              (cond
                [(and a-is-mlist? b-is-mlist?)
                 (define a-key (mlist->list a))
                 (define b-key (mlist->list b))
                 (andmap my-equal? a-key b-key)]
                [(or a-is-mlist? b-is-mlist?) #f]
                [else (equal?/recur a b my-equal?)])])])))))

If recur and mode arguments are somehow made available to users, I think good names for them would probably be:

  • recur: on_item, or maybe on_element
  • mode: on_state_identity, or maybe now?. I think an on_... name could be for the best, to emphasize that this is a handler for visiting a certain entity in the traversal.

Suggestions

Most of this comment has turned out to be positive and explanatory. Here's a list of my more open-ended suggestions:

  • [bf1] Call PartialOrder something else that describes the value rather than the algebra it belongs to. One choice might be PartiallyComparable.
  • [bf2] Can .= be merged with =~? Furthermore, in the spirit of the business=? idea I've mentioned, I'd be happy to see =~ be the default notion of equality in Rhombus, which might affect what we call it.
  • [bf3] Does this coincide with equal? a lot or not? It sounds like it's a goal for it to coincide a lot of the time, but it might be opt-in. Is that right? I don't know if the doc is completely clear on this in its current form.
  • [bf4] I do prefer having a hash function for this. I believe this is a point against "Option A".
  • [bf5] Let's have Racket expose equal-mode-hash-code/recur functionality. Then update the part of this doc that mentions not having access to it.
  • [bf6] I think the interface for passing in recur and mode arguments should not only be public, it should have a dedicated for/and-like, with-handlers-like syntax for better usability. And the names of recur and mode could be in the form on_... to emphasize that they handle different entities in the traversal.

@AlexKnauth
Copy link
Member Author

Re: ListOfVectors<a> = List<Vector<a>>

This would only be the case if ListOfVectors and List had a common supertype that implemented PartialOrder for them. In that case it wouldn't be "multiple" partial orderings but just one, on the supertype.

Re: [bf2] .= and =~

I don't want to merge .= and =~. In Racket I've taken a liking to type-specific equality predicates like =, string=?, and so on when I know what type the inputs are. If I used a generic equality predicate like equal? or equal-always?, I won't catch as many errors in my code soon enough to debug them quickly. So I don't want to get rid of .=.

Re: short-cirtuiting for <=~ via recur producing NaN

Ah, that's pretty clever! I like it. I'll have to change one thing in the definition of product-compare/recur to allow this to work properly, analogous to the guideline "when calling the third argument to 'recur' on pieces, pass the pieces in the same order they came in" from the Honest Custom Equality docs.

Re: [bf3] =~ vs equal? vs "equal-now"

Yes =~ is not the same as Racket's equal?, and that's true for options A, B, and C. But for options B and C it's not quite independent either.

The way I'm envisioning B and C working, the equal-mode+hash for a datatype would use Equatable when the mode is false, and PartialOrder when the mode is true. That seems like it would make =~ and equal? the same for that type, if everything the type contains and calls "recur" on also behaves the same way under =~ and equal?.

But if the type contains numbers and calls "recur" on them, =~ will call PartialOrder methods with "recur" arguments that behave in a =~ way on those, while equal? will call PartialOrder methods with "recur" arguments that behave in an equal? way there. This is similar to how equal? and impersonator-of? are different in Racket, even though they go through the same equal+hash paths on most types. It's very similar to how check-equal? and check-within ... 0 are different as well.

Re: ordering for Equatable as well

I'm worried that an interface intended to be an ordering with "always" will devolve into an ordering used as "now". I expect that many users will access mutable data in their ordering methods, especially when the only alternative that keeps "always" behavior would be to return NaN a lot which doesn't seem very useful.

Re: [bf6] and recur argument, mode argument, for/and-like, with-handlers-like stuff

I don't really understand what you mean for most of that.

I guess there could be different for/and-like macros for product orders vs lexicographic orders. The product order one would stop at NaN, including conflicting positive and negative directions producing NaN. While the lexicographic one would stop at the first non-zero value.

But... I still don't understand how recur, mode, or with-handlers-like stuff fits in or even what the uses cases of this would be.

@rocketnia
Copy link

rocketnia commented Feb 10, 2023

@AlexKnauth

Re: ListOfVectors<a> = List<Vector<a>>

This would only be the case if ListOfVectors and List had a common supertype that implemented PartialOrder for them. In that case it wouldn't be "multiple" partial orderings but just one, on the supertype.

Oh, I just meant ListOfVectors was a structural type alias like:

(define (list-of-vectors/c element/c)
  (listof (vector-immutableof element/c)))

As a type alias, it could have a for/and-like utility for its own predicately typed free theorem, which would likewise be defined as an alias of some composition of other such utilities:

(define (list-of-vectors=? a b element=?)
  ; We compare the outer list.
  (equal-always?/recur a b
    (lambda (a b)
      ; We compare each vector in the list.
      (equal-always?/recur a b
        (lambda (a b)
          ; We compare the elements.
          (element=? a b))))))

Oops, I could have actually used for/and there. ...Actually, for/and itself is a little bit more tedious to use, but since I've been talking about for/and so much, here it is for comparison:

(define (list-of-vectors=? a b element=?)
  ; We compare the outer list.
  (and (equal-always? (length a) (length b))
    (for/and ([a (in-list a)] [b (in-list b)])
      ; We compare each vector in the list.
      (and (equal-always? (vector-length a) (vector-length b))
        (for/and ([a (in-vector a)] [b (in-vector b)])
          ; We compare the elements.
          (element=? a b))))))

I don't want to merge .= and =~. [...] I won't catch as many errors [...]

Okay, fair reason. :)

It's also probably easier on Racket's optimizer (although I have hopes that static information could help the generic one expand to similar code).

The way I'm envisioning B and C working, the equal-mode+hash for a datatype would use Equatable when the mode is false, and PartialOrder when the mode is true.

Hmm, I'm not sure what your goal is with this. Why would gen:equal-mode+hash delegate to a comparison that's specifically intended to be used for partial ordering? I think it should call Equatable with whatever its mode is.

Re: ordering for Equatable as well

I'm worried that an interface intended to be an ordering with "always" will devolve into an ordering used as "now". I expect that many users will access mutable data in their ordering methods, especially when the only alternative that keeps "always" behavior would be to return NaN a lot which doesn't seem very useful.

I didn't say Equatable would be for "always"-style partial orderings, or for partial orderings at all. It would just use the same -1, 0, 1, NaN API (and typically always return 0 or NaN). That way, when the recur that's passed to it is a partial ordering, it would extend that partial ordering into another partial ordering, which could then be made use of in a PartialOrder implementation.

That kind of thing could be done with equal-always?/recur now, but since that one only returns booleans, it can't reach a result of -1, 0, 1, or NaN in just one pass and hence has the 2^N issue. That's why I think this would be an upgrade to the equality interface.

Meanwhile, returning NaN from comparisons is useful because it helps programs fail faster, like you were saying .= does. That was one of the things that came up at one point in my business=? line of thought. I was saying a certain benefit of adding NaN-like incomparable cases to the generic equality operation was that people could finally get error messages if they tried to use things in the key of a map that weren't designed to be compared, like mutable objects, lambdas, instances of an abstract data type whose author hadn't yet decided how it should be compared, or instances of an experimental type that was meant to someday coincide with another type but hadn't been merged upstream yet.

(Did I just say NaN helps programs fail faster? 😄 Must be opposite day.)

Since I'd like procedures and user-defined abstract data types to not be allowable as keys by default, I actually prefer a pretty steep departure from equal?/equal-always?. I can understand wanting to sometimes treat structural types like lists and vectors as though they implicitly derive a partial ordering based on the product of their elements' orderings, but that ordering makes sense for any Equatable type, and that's one of the things Equatable should allow us to do by passing recur.

Re: [bf6] and recur argument, mode argument, for/and-like, with-handlers-like stuff

I don't really understand what you mean for most of that.

I guess there could be different for/and-like macros for product orders vs lexicographic orders. The product order one would stop at NaN, including conflicting positive and negative directions producing NaN. While the lexicographic one would stop at the first non-zero value.

But... I still don't understand how recur, mode, or with-handlers-like stuff fits in or even what the uses cases of this would be.

I was talking about types whose PartialOrder instances were lexicographic orders, but I wasn't talking about any type whose values could be compared in both a partially ordered way and a totally ordered way. (I wasn't yet, anyway, and I won't get started right now. It's a useful topic, though.)

Use cases for recur? We're kind of swimming in them right here. 🙂 The reason recur is useful is because it lets you implement:

  • chaperone-of?, impersonator-of?, and variants and new implementations thereof.
  • The cycle-tolerant behavior of equal?.
  • The optimization where equal? and equal-always? don't compare values if they're already known to be eq?.
  • New generic comparison operations that serve different purposes but inherit some of their implementations from existing ones.
  • check-within and variants thereof.
  • Short-circuiting ordered comparisons in terms of PartialOrder's methods.
  • Ad hoc short-circuiting, zipping traversals like "for this list of argument type predicates and this list of actual argument values, check that each argument has the correct type."
  • Structural type definitions like the list-of-vectors/c example above, with their own type-specific comparator combinators like list-of-vectors=?.
  • Products and sums of of partial orders (or other relations) in terms of generic equality.

Don't check eq? before that because cases like NaN can refuse to be equal to themselves.
@AlexKnauth AlexKnauth marked this pull request as draft February 14, 2023 04:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants