Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: clip function #5194

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

FEAT: clip function #5194

wants to merge 2 commits into from

Conversation

hiiamboris
Copy link
Collaborator

What would you think if you saw in someone's code: v: max min a b min max a b v?
Probably that poor fellow has gone bananas.
However v: clip v a b is easy to understand: "clip value V between A and B".

Thought I'd PR this function which I find very useful in many projects, and this particular design is the result of it's long usage, and doesn't interfere with any other planned features that I'm aware of.

>> ? clip
USAGE:
     CLIP a b c

DESCRIPTION:
     Return A if it's within [B,C] range, otherwise nearest to it range boundary.
     CLIP is a function! value.

ARGUMENTS:
     a            [scalar!]
     b            [scalar!]
     c            [scalar!]

RETURNS:
     [scalar!]

For the ease of understanding of it's meaning and use, as well as docstring formulation, docstring implies that A is the point, and [B,C] is the segment.

Yet the function has a nice property: one doesn't have to remember the order of arguments, because it doesn't matter! If A,B,C are scalars, it chooses one of A,B,C that is between the other two. If A,B,C are pairs, it applies the same logic to both X and Y axes. If types are mixed, result is at the mercy of min/max functions, most likely type promotion first, then the same logic follows.

An illustration:

view [
	base coal cyan 200x200 "Click me^/and drag around"
	all-over on-over [
		if event/down? [
			a: event/offset
			b: 0x0
			c: face/size 
			face/draw: compose [
				pen red   circle (clip a b c) 10
				pen green circle (clip b c a) 15
				pen blue  circle (clip c b a) 20
			]
		]
	]
]

@hiiamboris
Copy link
Collaborator Author

hiiamboris commented Sep 2, 2022

Although one thing we should consider is the range! datatype, in which case it could have a binary form instead: clip range value, clip -1..1 x - smth like that. Usually we put value first, range second, however in this case value is most likely an expression, whereas range is something known (named), and clip value range order doesn't usually work (sometimes it's the other way around though, and that's why having argument order irrelevant is a great property).

One option is to accept either a range or a block with expressions that reduces to [A B] segment, but the extra allocation or reduce call is not great here. Even though clip [expr1 expr2] value-expr is readable, I also don't like that it introduces an asymmetry into the argument order, whereas implementation doesn't have it.

Of course, we could just write clip range/1 range/2 expression with some minor repetition.

@dockimbel
Copy link
Member

dockimbel commented Sep 2, 2022

Just for the record, I've recently added a feature to View allowing to define a bounding box when using dragging mode:

view compose/deep [
    panel coal cyan 200x200 "Click me^/and drag around" [
        base 40x40 coal loose draw [
            pen red   circle 20x20 10
            pen green circle 20x20 15
            pen blue  circle 20x20 20
        ] options [bounds: (object [min: -20x-20 max: 180x180])]
    ]
]

That feature implementation could benefit from such clip function. Also clip usefulness/code-size ratio is very high, so I would accept it. Let's see if others have some comments about it.

Copy link
Member

@dockimbel dockimbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good!

@meijeru
Copy link
Contributor

meijeru commented Sep 2, 2022

A remark: the docstring should perhaps be "... its range boundary" not "... it range boundary"

@hiiamboris
Copy link
Collaborator Author

It's "nearest to it (to A) range boundary". Perhaps replace it with A?

@meijeru
Copy link
Contributor

meijeru commented Sep 2, 2022

Or rather "the boundary nearest to A" ?

@greggirwin
Copy link
Contributor

I will post thoughts in a bit. I also had some comments for the clock PR, but that got merged before I could comment so it was less of a rush. :^)

@greggirwin
Copy link
Contributor

Arg Naming

There is some precedent for a b c, and I've used them myself, but it's not always a win. e.g. as-rgba uses them, which unfortunately makes 2 of them match associated by names. i.e. ais actuallyr`. Not too confusing, until...

>> as-rgba 1.1 2 3 4
*** Script Error: as-rgba does not allow float! for its a argument
*** Where: as-rgba
*** Near : 1.1 2 3 4
*** Stack:  

Of course we all love how it makes code shorter. I have a divisible? a b, but it's divide value1 value2. ValueN is used much more widely, and does keep us from numerator denominator, augend addend, and worse (also more confusing for normal people). Here the short names do make all the logic fit into a single, short line, but that line has little meaning to human eyes. This will always be a hard choice, but I prefer writing for human understanding, even if we say "Who's going to look at it? It just works." It's a precedent and a rule we should always keep in mind. Don't write code just so the author, or a few people immediately grok it, but so almost anyone can. Then the deep voodoo code really stands out, rattling its metaphorical tail at the unwary.

Arg Order

one doesn't have to remember the order of arguments, because it doesn't matter!
in this case value is most likely an expression, whereas range is something known (named), and clip value range order doesn't usually work (sometimes it's the other way around though, and that's why having argument order irrelevant is a great property).

Sometimes we don't care about the order, but most of the time order is important. Often crucially. Here's my old limit func for comparison:

limit: function [
	"Returns a value constrained between two boundaries (inclusive)"
	value
	bound1
	bound2
][
	lower: min bound1 bound2
	upper: max bound1 bound2
	max lower min upper value
]

Here my old laziness comes out, as there are no type specs, so the internal comparison will leak out on errors. :^\

				pen red   circle (clip a b c) 10
				pen green circle (clip b c a) 15
				pen blue  circle (clip c b a) 20

Thanks for the super cool example @hiiamboris. 👍 What my brain says, when I read the above code, is "Wait, what's being clipped? Why is the order different?" I get that it's an example to show the lack of ordering, but it's not compelling, because it's confusing code compared to, say:

view [
	base coal cyan 200x200 "Click me^/and drag around"
	all-over on-over [
		if event/down? [
			center: clip event/offset 0x0 face/size 
			face/draw: compose [
				pen red   circle (center) 10
				pen green circle (center) 15
				pen blue  circle (center) 20
			]
		]
	]
]

I'm open to compelling arguments of course. What we also have to reconcile is what the doc string says with whether we promote order independence.

I'll take another step, though, and ask what if we want the query version of clip/limit? Does order matter then?

between?: func [
	"Returns TRUE if value is between the two boundaries (inclusive)"
	value
	bound1
	bound2
][
	all [
		value >= min bound1 bound2
		value <= max bound1 bound2
	]
]

Next up: The Domain...

@greggirwin
Copy link
Contributor

I did a quick search for things I have related to this domain (as I see it):

between? limit
find-min find-max
bounds range
longer? longer-of
shorter shorter-of
linear-interpolate make-linear-interpolater
near?

Bounds-Range(old stuff)

I also found an experimental limit I did, which relates to @hiiamboris' question about arity and ranges. Basically:

limit: function [
	"Returns a value constrained between two boundaries (inclusive)."
	value [number! scalar! series!] "For series, upper bound is the length, series is modified"
	bound [number! scalar! block! ] "Upper bound (lower implied as zero) or block of lower/upper bounds"
	; For bounds, types must be compatible with value for comparisons
][ ...

with a question about the implied lower bound for dates (01-Jan-0000 ?)

Side Note: make-linear-interpolater takes the range values and makes a single-arity func with those baked in. For use cases like spaces and I will guess most clip/limit uses, that's not as useful.

I want to bring aggregators into the mix too, but that's peripheral to this domain, with some conceptual overlap. So I won't. :^)

@hiiamboris
Copy link
Collaborator Author

hiiamboris commented Sep 2, 2022

Thanks, @greggirwin

"Wait, what's being clipped? Why is the order different?"

That's exactly the moment one should experience to free oneself from the notion of order necessity. So, expected, and explained. But for most users there's a docstring which they can consult with or memorize, and your "compelling example" is what I would write when possible.

I'm still using clip [min max] value design in spaces (didn't have time to switch). But it requires a reduce call, which is eating at me ;) Moreover, I found myself writing things like this sometimes:
maybe space/origin: clip [origin 0x0] box - scrollers - csz
Weird, isn't it? I'm clipping some value between origin and zero to... get new origin?
It was initially:
maybe space/origin: clip [box - scrollers - csz 0x0] origin
But having put the longest expression first I just didn't like it, and rewrote as above.
If by design the arguments are equal though, and I know it, then I can write freely:
maybe space/origin: clip 0x0 origin (box - scrollers - csz)
And I'm free to interpret it as clip min value max now. Even though actually it's clip max value min ;)

Another bad example:
new: clip [(visible-size - cspace/size) 0x0] new
Which I prefer as:
new: clip 0x0 new (visible-size - cspace/size)
So as you can see, it's tradeoffs either way, and why input is so welcome.

I'll take another step, though, and ask what if we want the query version of clip/limit? Does order matter then?

Yes it does. And we have within?, which I hate because I can never recall the argument order and whether it accepts size or coordinate. The only human-friendly interface to within? in my opinion is value within? range, e.g. value within? 0x0..10x10. Then it's readable and you don't need to re-read the function's spec.

Inclusive within? can be written using clip as value = clip value low high. But within? excludes the upper coordinate, so it may be written as value = clip value low high - 1, but only as long as pairs are integer-based and we know that low and high are ordered. If they're floats, out of luck - need another test for margin inequality, which for pairs is trickier because you have to test against max low high in case limits are reversed or unordered (e.g. 10x0 and 0x10).

@greggirwin
Copy link
Contributor

Agreed 100% on tradeoffs.

That's exactly the moment one should experience to free oneself from the notion of order necessity.

Except that order is necessary most of the time. So this is the broader question in Red's design. One of those things we may have to try and see how it works in the real world.

@dockimbel dockimbel self-requested a review September 2, 2022 22:42
@hiiamboris
Copy link
Collaborator Author

hiiamboris commented Sep 3, 2022

I just thought, if there's no graceful way we can resolve the 2-arity/3-arity problem, maybe we can have both, e.g. clip value low high and clip2 range value? Precedents are atan/atan2 and their long variants, where I understand 2 is historical, but perhaps it stands for the number of arguments.

Another thought: 2-arity clip could accept two ranges, returning an intersection (as another range), maybe an empty one if there's no intersection.

@hiiamboris
Copy link
Collaborator Author

FYI other practical examples of clip usage (and why value should not come first, rather let's think of it as "we apply range to clip some value"):

repeat i 3 [rgb/:i: clip [0 255] to integer! m + do triple/:i]
fg-hsl/3: clip [0% 100%] fg-hsl/3 + (amnt - 1 / 2 * sign)  
size: clip [min max] size
offset: clip [0x0 limit] offset
clip [0 requested] either dir < 0 [from][csize/:axis - from]
repeat i nx [TWj+1: TWj+1 + clip [W1/:i W2/:i] W/:i]
W/:i: to float! clip [W1/:i W2/:i] size/x
H/:i: to float! clip [H2/:i H1/:i - 1] size/y
r: clip [0 r] requested
co: clip [0 len] co + n 
pos: skip text maybe field/caret/offset: co: clip [0 len] co 
max-org: clip [min-org 0] view-width - cx - cw - cmargin 
clip [min-org max-org] field/origin
if burn? [disp: clip [0x0 20x20] disp + (random 3x3) - 2x2]         

@hiiamboris
Copy link
Collaborator Author

Re: argument naming. Of course I thought of that. But I don't see how value1 value2 value3 are any better. Using anything more specific, like value low high has the drawback of instilling an assumption that low is less than high, and a doubt whether it will error otherwise or not. value margin1 margin2? Maybe, but that further pushes the assumption of arguments inequality. In the end I don't really care much, and I'm okay with any choice.

@greggirwin
Copy link
Contributor

On atan2: https://en.wikipedia.org/wiki/Atan2#History_and_motivation "The single-argument arctangent function cannot distinguish between diametrically opposite directions." So the goal was different there, though it vaguely aligns with the interface in my new limit function, where a scalar arg implies a lower bound.

two ranges, returning an intersection (as another range),

That seems like a very specialized case, which will make the common cases much harder to reason about.

"we apply range to clip some value"

Indeed.

I don't see how value1 value2 value3 are any better.

They at least hint at the order, and are explicitly meaningless. a b c could have meaning, as in the to-rgba example. If we set a new convention of single letter var names, ordered alphabetically, I think we'll end up with a real mess of things in some cases.

low/high min/max issues are why I went with bound1/2. Meaningful, but not ordered. Nothing but compromise choices here.

@greggirwin
Copy link
Contributor

Let's also consider how dependent/constrained types might be represented. I don't remember where @rebolek's experiment was, but I'm thinking of how you'd define them (progress percent! [0% 100%] or progress [0% 100%] with the type inferred), and how they'll be reflected when interrogating systems that use them. This aligns with your examples of applying ranges, as the range is the data as much as the value. Metadata really.

@greggirwin
Copy link
Contributor

This is why I have so much mezz code that I haven't PR'd. Even the simplest things lead to deep conversations, and we all need to work through them in a timely manner or thoughts get stale.

@hiiamboris
Copy link
Collaborator Author

On atan2: https://en.wikipedia.org/wiki/Atan2#History_and_motivation "The single-argument arctangent function cannot distinguish between diametrically opposite directions." So the goal was different there, though it vaguely aligns with the interface in my new limit function, where a scalar arg implies a lower bound.

Open ranges are a must have if we have closed ones, so provided we figure a good syntax for that, there will be no need in specifying just lower or higher bound. And that page confirms that 2 stands for binary, thanks.

@greggirwin
Copy link
Contributor

It's interesting that I recently tinkered on another func where the arg order raised questions for me.

fill-block-template: function [
	"Fill in a block template, with rule keys containing values."
	template [block!] "Parse rules template to fill in using input rules."
	rules    [block!] "Key-value block of [<template-value> [<values to take from>]]"
	/local =key
][ ...

@luce80
Copy link

luce80 commented Sep 5, 2022

Just 2 thoughts for the record:
OpenGL has a function Clamp(x,min,max)
I usually use: v: min max a x b because it looks similar to a <= x <= b

@greggirwin
Copy link
Contributor

Yes, clamp is an historical name, but not one I personally like. To me it means "to fasten" more than "to limit" in active form. Other words: restrict confine constrain. I like limit because it is shorter, easier to type, already associated with bounds/boundaries and gives us leverage in what we call the "limits", and also being more dimensionally neutral. Clip is UI/graphics/region oriented, and not normally used in 1D/numerical contexts.

@dockimbel
Copy link
Member

I prefer named arguments that imply an order, as it would make it easier to remember and read for everyone. That does not preclude people from using it in different order if they want to confuse others. ;-) I let you choose the names.

For the function name, clip is fine, as often the clipping will be done on 2D data, eventually on 3D (colors) or 4D (IPv4 addresses). limit is used often as a variable name (and a useful one), so as a function call, it would be confusing to me (and probably many others).

@hiiamboris
Copy link
Collaborator Author

hiiamboris commented Mar 13, 2023

After using ternary clip a b c for a while, I confirm that I prefer it more than the block version clip [a b] c that forced the order.

I can confirm segment intersection need comes up from time to time, e.g. if I want to extract text fragment [S..E] and it has a list of bold ranges [A..B C..D ...] then I might write using binary clip:

map-each range list [when 0 < span? range: clip range fragment (range - fragment/1)]

to both filter out non-intersecting ranges and offset those that do intersect. Same applicable to N-dimensional points.

As to whether we should use binary clip low..high value or ternary clip low high value it comes down to:

  1. Whether range construction will be a cheap operation (like as-pair).
    Can the range fit into a single cell? It will have to be able to carry two pairs/points.
    How big a point can be? For 3D that's 3x64=192bits minimum. Two 3D points = 384bits + header.
    Are we going to support N-dimensional points as vectors of fixed length? Then they will be series.
    Range will have to carry two full cells of payload, totaling 3 cells.
    Depends on how optimized allocator and GC will be for such tiny chunks.
  2. Whether we can find meaningful behavior for things like clip 1..2 3..4 (no intersection) and clip 1 2 (no range).
    Should clip 1..2 3..4 return 2..2 or 3..3? or none (extra check to test - not good)?
    Should clip 1 2 throw an error? Or return value closest to zero (as Gregg's limit does I suppose)?

But in this example I test specifically for zero intersection and throw it away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants