Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-argument distance #311

Merged
merged 3 commits into from May 21, 2024

Conversation

lognaturel
Copy link
Member

@lognaturel lognaturel commented May 10, 2024

Adds a multi-argument variant to distance. This makes it easier to compute the distance between multiple points within a form's primary instance. Currently to achieve this you'd have to introduce a calculate which concatenates those points together and then call the distance function with a reference to that calculate as the argument.

It also makes it possible to filter a nodeset based on distance as described in this forum post. This is not currently possible because the single-argument variant that exists today only accepts a nodeset.

I also considered accepting a literal string which would meet the requirement but it feels less intuitive and convenient than accepting a variable number of arguments.

I can't actually come up with a usage for more than two arguments so another option would be to only allow two. But I don't think we lose anything by adding flexibility, especially since we already compute distance between more than two points with the other input types.

I considered adding this variant to area as well but I'd rather wait until we have a known use for it.

@eyelidlessness @seadowg does this seem like an ok spec addition? I think it's unlikely to be controversial so I'm inclined to Make It So without much discussion if it seems reasonable to you. I've also shared on the forum.

Copy link
Member

@eyelidlessness eyelidlessness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine! I think the addition (as emphasized):

a single node-set of geopoints or a single geoshape value or a single geotrace value or a sequence of geopoint values

... resolves some ambiguity, but maybe adds a bit too.

It seems that (geopoint|string) arg* still allows for variadic node-set arguments, so long as each node-set resolves to a single geopoint value. This is what I'd expect! And I'd approach implementation with something like (pseudocode):

if (isVariadic(args)) {
  // Note: not `flatMap`
  geopoints = args.map(asGeopoint)
} else {
  geopoints = asString(args[0]).split(';').map(asGeopoint)
}

Does that sound reasonably close to the intent?

I think maybe longer term, it might help to break up signature overloads into separate definition lines. But that's reasonably out of scope here, and I'll have a couple other thoughts on this section's organization when I organize notes from getodk/web-forms#110.


Aside: I kind of have to just trust that our published rendering will format this appropriately. Both this signature and the area signature above it seem to confuse GitHub's markdown renderer, but that's true on the default branch rendering as well.

@lognaturel
Copy link
Member Author

maybe adds a bit too

Can you elaborate, I'm not seeing it!

(geopoint|string) arg* still allows for variadic node-set arguments, so long as each node-set resolves to a single geopoint value

Yes, exactly.

Does that sound reasonably close to the intent?

Yes! I think the single-arg case may need a separate branch for when the single argument is a nodeset of geopoints (there won't be semicolons) but that's an implementation detail.

I kind of have to just trust that our published rendering will format this appropriately

I have run it and verified that it does.

@lognaturel
Copy link
Member Author

From @tiritea at getodk/javarosa#761 (comment) :

it actually seems more intuitive to me that the (singular) argument to the existing XPath distance() function should be able to support either a node/nodeset (for example, containing a geotrace value), or a literal string of the same. Many functions behave this way...

I don’t disagree if our target audience were developers. For a form designer writing a form I think having to concat two values that are points with a semicolon in between feels weird and surprising — I have these two locations, let me do something with them! A lot of folks who use points don’t know about traces or shapes.

That said, I imagine most folks will use documentation and examples to use this functionality so I don’t feel too strongly either way. Also happy to do both.

@tiritea
Copy link

tiritea commented May 14, 2024

A special flavor of distance() which supports two geopoint arguments is probably useful enough, and minimally supplemental - and not to mention more user-friendly - to warrant it. And it is easy enough to readily identify the new 'flavor' (for implementation purposes) from simply counting the number of arguments.

But that's independent of a desire to support string literals/expressions, to bring the function more in line with some its its compatriots... :-)

@tiritea
Copy link

tiritea commented May 14, 2024

For a form designer writing a form I think having to concat two values that are points with a semicolon in between feels weird and surprising

Somewhat an aside, but it may be worth noting that it still remains an option - especially for things that are potentially 'unintuitive' - that they can ultimately be accommodated by either by XForms/XPath, or just by XLSForm semantic sugar of existing function. Full disclosure: I tend to lean towards not implementing new XForm/XPath functions - for things that can, or perhaps should, be implemented with (minor additional?) functionality, and let XLSForm semantic sugar worry about making it the most user-friendly.

Just another (unsolicited) $0.02 :-) lol

@lognaturel
Copy link
Member Author

A special flavor of distance() which supports two geopoint arguments is probably useful enough, and minimally supplemental - and not to mention more user-friendly - to warrant it. And it is easy enough to readily identify the new 'flavor' (for implementation purposes) from simply counting the number of arguments.

👍 And do you feel strongly about two geopoint arguments as opposed to N?

that's independent of a desire to support string literals/expressions, to bring the function more in line with some its its compatriots...

That's a good point and I was surprised in the forum thread that brought this up that a literal string didn't work. I believe that the way the spec is as-published suggests both literal values and references to them should be supported. Does that sound right? If so, we can go ahead and make the necessary changes to the implementations that need them.

they can ultimately be accommodated by either by XForms/XPath, or just by XLSForm semantic sugar of existing function

Yes, that's a good point and I'll think about where we can codify that practice. In this case, I agree with your earlier point that this is minimally supplemental and therefore attractive. I also think that XLSForm sugar in this case would have a high probability of leading to really confusing errors for users like an error related to a concat function call they didn't make.

Thanks!

@@ -191,7 +191,7 @@ For convenience, the functions are categorized based on their main usage. Some f
<a id="fn:indexed-repeat" href="#fn:indexed-repeat">`indexed-repeat(node-set arg, node-set repeat1, number index1, [node-set repeatN, number indexN]{0,2})`</a> | string | Returns a single node value from a node-set by selecting the 1-based index of a repeat node-set that this node is a child of. It does this up to 3 repeat levels deep. E.g. `indexed-repeat(//node, /path/to/repeat, //index1, /path/to/repeat/nested-repeat, //index2)` is meant to be a shortcut for `//repeat[position()=//index1]/nested-repeat[position()=index2]/node` in native XPath syntax.
<a id="fn:Geographic-Functions" href="#fn:Geographic-Functions">**Geographic Functions**</a>|||
<a id="fn:area" href="#fn:area">`area(node-set ns|geoshape gs)`</a> | number | Returns the calculated area in m2 of either a node-set of geopoints or a geoshape value (not a combination of both) on Earth. It takes into account the circumference of the Earth around the Equator but does not take altitude into account.
<a id="fn:distance" href="#fn:distance">`distance(node-set ns|geoshape gs|geotrace gt)`</a> | number | Returns the distance in meters of either a node-set of geopoints or a single geoshape value or a single geotrace value (not a combination of these) on Earth, in the sequence provided by the points in the parameter. It takes into account the circumference of the Earth around the Equator and does not take altitude into account.
<a id="fn:distance" href="#fn:distance">`distance(node-set ns|geoshape gs|geotrace gt|(geopoint|string) arg*)`</a> | number | Returns the distance in meters of either a single node-set of geopoints or a single geoshape value or a single geotrace value or a sequence of geopoint values (not a combination of these) on Earth. It takes into account the circumference of the Earth around the Equator and does not take altitude into account. The distance is computed in the order that the points are provided. The multiple-argument variant can take a combination of literal string geopoint values and references to geopoint nodes.
Copy link
Member

@seadowg seadowg May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like everyone else already understands this except for me, but what does "distance" actually calculate here? I'm guessing it's the total of the distance between subsequent points in the "sequence" being passed. If so, that was not obvious to me from this description. My instinct would be that a "distance" function only ever takes two args and then calculate the distance between them.

I think it'd also be useful to be more specific on how "distance" is calculated. I'm guessing there's an equation/algorithm that can be referenced instead of:

It takes into account the circumference of the Earth around the Equator and does not take altitude into account

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My instinct would be that a "distance" function only ever takes two args and then calculate the distance between them.

Yes, I agree that would be the typical assumption and easier for most people to understand which is why I think this addition is a good one.

I've added clarification on the algorithm, hopefully that is clearer now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's great!

@lognaturel lognaturel requested a review from seadowg May 20, 2024 16:00
@lognaturel lognaturel merged commit baa331e into getodk:gh-pages May 21, 2024
@lognaturel lognaturel deleted the multi-arg-distance branch May 21, 2024 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants