Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial geo-traits implementation #1011

Merged
merged 13 commits into from
Jun 21, 2023
Merged

Conversation

kylebarron
Copy link
Collaborator

@kylebarron kylebarron commented Apr 20, 2023

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

As discussed on the discord, in #838, and originally in #67, this is an initial bare-bones implementation of geometry access traits for discussion. As @frewsxcv suggested, for now this is a separate crate in the workspace to let us iterate on some ideas.

Future work

Minimize cloning in the geo-types implementation

I got lost in lifetime errors so to get things to compile I added cloning to the geo-types implementations. I.e. for MultiPoint each point is cloned when accessed.

fn y(&self) -> Self::T;

/// Returns a tuple that contains the x/horizontal & y/vertical component of the coord.
fn x_y(&self) -> (Self::T, Self::T);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's reasonable to have a default implementation for this so implementers don't need to implement it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. I figure one possible approach to these traits is to have a lot of default methods that currently live in geo-types, like

pub fn dot(self, other: Self) -> T {
self.x() * other.x() + self.y() * other.y()
}

readme = "../README.md"
keywords = ["gis", "geo", "geography", "geospatial"]
description = "Geospatial traits"
rust-version = "1.65"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bumped this to 1.65 for this crate because the traits use GATs, which came in 1.65

@michaelkirk
Copy link
Member

Thank you for moving the traits discussion forward with some concrete code! I think these proposed trait implementations looks reasonable in the abstract. I'd love to see how some algorithms can be implemented in terms of these traits, and what kind of changes/concessions/benefits arise when rubber meets the road.

I think seeing how they can be used will be a necessary measure of how "good" these trait definitions are.

MultiPoint(&'a MP),
MultiLineString(&'a ML),
MultiPolygon(&'a MY),
GeometryCollection(&'a GC),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any thoughts on adding the other types offered in geo-types? Namely Triangle, Line, and Rect

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely something to discuss. I didn't implement them yet because I wasn't sure the pros/cons of how hard that might be for some data structures to implement vs the usefulness in adapting for geo algorithms. If a data structure doesn't natively have a Rect type, should it allow a polygon to fill in as a Rect?

geo-traits/src/line_string.rs Outdated Show resolved Hide resolved
[features]

[dependencies]
geo-types = "0.7.9"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making geo-types optional? In theory if someone wanted to just implement geo-traits for their own geometry types, they shouldn't need access to geo-types, I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually this depends on CoordTrait. Maybe that should live in this crate someday 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think making geo-types optional definitely makes sense but in this initial implementation I made a hard dependency for simplicity

use std::slice::Iter;

pub trait PolygonTrait<'a>: Send + Sync {
type ItemType: 'a + LineStringTrait<'a>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe RingType would be more descriptive? 🤷🏻

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the name ItemType in each trait then? I.e. CoordType in LineStringTrait?

use std::iter::Cloned;
use std::slice::Iter;

pub trait PolygonTrait<'a>: Send + Sync {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we'll want to specify guarantees around winding order for implementors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know... some stores might not have that information? I think GEOS objects don't have a known winding order?

@kylebarron
Copy link
Collaborator Author

I'd love to see how some algorithms can be implemented in terms of these traits, and what kind of changes/concessions/benefits arise when rubber meets the road.

I agree things will become clearer with code implementation. Should I make a new stacked PR to iterate or add edits to this PR?

@frewsxcv
Copy link
Member

My vote would be to merge as-is, and then iterate on it with subsequent PRs. Then it's easier for others to collaborate too

@kylebarron
Copy link
Collaborator Author

kylebarron commented Jun 16, 2023

Any changes I should make in this PR? Unclear what should be discussed further; maybe add rect/line/triangle?

@frewsxcv
Copy link
Member

Let's wait to hear what @michaelkirk thinks! If he approves of the plan then yeah we can merge it

@michaelkirk
Copy link
Member

My concern is that we may be putting the cart before the horse a bit. My understanding of this work is that ultimately the purpose of it is to be able to define algorithms in terms of these traits. geo-types would be one implementor, but people might want to bring their own types which are better suited for their use cases for whatever reasons, and write their own impl geo_traits::* for MyGeometryType - like an Arrow backed geometry. Right?

I don't think we need to come up with a perfect and complete design before merging. I'm saying "show me that the code you want to merge works at all". Yes it compiles, but if the point of these traits is to be able to implement algorithms, then lets see what it looks like to implement an algorithm with them.

My hope is that if this is a reasonably good design for geometry traits, that this will not be an unreasonably hard task. If it turns out to be hard to do, then maybe this is not a good design for a geometry traits and we all will learn something.

@@ -0,0 +1,64 @@
use geo_types::{Coord, CoordNum, Point};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧑‍🚒 hot take: It was a mistake to have geo_types::Coordinate separate from geo_types::Point they are redundant, and I'm guessing this only happened because of ideas copied over from Java where a "light weight" representation required splitting the two concepts.

Rust doesn't have the overhead of "primitive type" vs "object". Regardless of how many methods or whatever we add to it, geo-types::Coordinate<f64> will always be the same size as [f64; 2].

The cost of our current approach is frequent confusion about whether a method can/should accept a Coordinate, a Point, or both.

Maybe geo-traits could be an opportunity to fix that wart. We could delete CoordTrait, and have only PointTrait. Both geo-types::Coordinate and geo_types::Point would implement PointTrait, allowing any methods defined in terms of PointTrait to accept both, and finally resolving the question "Should this be a Point or a Coord?"

This is a half baked armchair suggestion, so no changes demanded, but think about it!

In fact, this might be a relatively nice way to start showing the value of a trait system proposal - by simplifying some of our existing APIs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, no judgement on the original geo_types::Coordinate vs geo_types::Point decision. It's gotten us pretty far, and it seemed totally reasonable to me at first too, until years of stubbing my toe on it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be fully supportive of making Coord and Point both implement a shared PointTrait! I admit I do get confused when to use coord vs point.

@frewsxcv
Copy link
Member

@michaelkirk For the purposes of a proof-of-concept, do you see the implementations you desire living in geo or geo-traits?

@michaelkirk
Copy link
Member

I was assuming our existing algorithms in geo would be modified to work in terms of geo-traits where possible.

Did you have something else in mind?

@michaelkirk
Copy link
Member

For the purposes of a proof-of-concept

I guess I'm a little uncertain about the usage of "proof of concept". I think that if there is code in georust/geo we should expect that people will use it.

If we want a place to experiment with crazy stuff that we don't want people to use I don't think it should live in this repository.

@frewsxcv
Copy link
Member

I think that if there is code in georust/geo we should expect that people will use it.

Curious where this concern comes from! And how them using unpublished code would affect the lives of us as maintainers. As long as we are unpaid volunteers supporting their efforts, we don't owe them any guarantees about how we run our work. Unlike many technical decisions, I feel pretty strongly about unpaid labor haha

@frewsxcv
Copy link
Member

frewsxcv commented Jun 16, 2023

I was assuming our existing algorithms in geo would be modified to work in terms of geo-traits where possible.

Did you have something else in mind?

I ask because if they live in the geo/ directory then that muddies the water in terms of separating our draft code and keeping everything in the geo/ directory deployable.

@michaelkirk
Copy link
Member

michaelkirk commented Jun 16, 2023

I agree that we should keep geo/ deployable at all times (modulo a version bump in Cargo.toml or whatever).

Can you elaborate on how you'd like to see geo-traits development proceed @frewsxcv? All I can tell is that you seem to not be quite on board with my vision for it. 😅

@frewsxcv
Copy link
Member

I see us iterating on it in a separate crate directory akin to the other crate directories. If a pull request change to geo-traits is an improvement, we approve it and merge it in. There's no need to increase the barrier to contribution with a separate repository or ensuring the full vision is there. We can do that when we're ready to publish.

@b4l
Copy link
Member

b4l commented Jun 16, 2023

Alternatively, we could agree to work on a traits branch analogous to a development branch for a while until we figure out if it is the way to go. There we would also have the liberty to directly refactor the existing algorithms and put the code in the right place and have a nice diff.
Mostly working with 3d data nowadays and the stalled effort to introduce another coordinate dimension in geo-types prevents me from using it which I find quite unfortunate.
Just providing options here and hope we can find space to be bold at times and traits may as well offer some possibility to support somehow arbitrary dimension support.

Edit: Also +1 for merging the coord and point traits as long as they don't diverge and what about using std::ops::Index instead of 'point(i: usize)` and the likes to access individual members of multi types?

@michaelkirk
Copy link
Member

Another concern I have is less objective, but subjectively, I suspect my position as a maintainer gets easier (and thus more delightful) when there are clear delineations between "this code is thought to work" vs. "this code is a prototype, and I'm not sure to what extent it is intended to work". I'm willing to accept that this is an idiosyncrasy founded on my own personal trauma doing open source development on The Internet (🤣😭) and may not be relevant to other people, but it's still my preference.

I see us iterating on it in a separate crate directory akin to the other crate directories.

Ok, I'm with you here, but more specifically, if we merged these trait definitions, what's next? I'm assuming implementing some algorithms with these traits would be the next step (right?).

So how would we go about that? My best guess is you mean we'd be duplicating the existing algorithms from geo into geo-traits and editing that copy to work with geo-traits instead of geo-types? Or did you intend something else?

we could agree to work on a traits branch analogous to a development branch for a while until we figure out if it is the way to go.

(Thanks for chiming in @b4l!)

I think long running branches can be fraught, typically because they can get out of sync, but I do think it's preferable to duplicating/re-implementing the algorithms in geo-traits. At least with a long running branch, version control can give us some hints as to where conflicts arise (and hopefully it won't be that long running anyway, right? 😬). The alternative of duplicating the implementations means we are relying on our own fragile minds to reason about/remember when conflicts occur.

There we would also have the liberty to directly refactor the existing algorithms and put the code in the right place and have a nice diff.

👍👍 Having a diff against the current geo algorithm implementation is the primary thing I'm looking for when trying to evaluate how "good" these trait implementations are.


What do other people think about merging traits work into a traits branch until we're comfortable releasing something usable and relevant to geo?

I'd especially like to hear from @kylebarron, who seems most affected by this decision in the near term.

@kylebarron
Copy link
Collaborator Author

I'm happy to have this merge onto a traits branch, and ongoing PRs can have that as the base branch. We could also have an ongoing open draft PR from the traits branch to the main branch to make it easy for people to see and comment on the state of the two branches at large

@rmanoka
Copy link
Contributor

rmanoka commented Jun 17, 2023

This an excellent contribution; thanks @kylebarron ! I'm also quite in favour of exploring the trait-based approach for geo. Specifically w.r. to this PR, I would also prefer we merge this as-is/quickly into a separate branch until we identify a plan to switch to it (either incrementally or in one release). This would help have smaller PRs to help push this out.

Particularly we should try to understand:

  1. Port effort. Is the changes needed per algo. straight-forward or needs knowledge of algo? Do we need to port all algos in one release or can it be done incrementally?
  2. Dependencies. geo-traits abstracts over geo-types, and hence the new dependency chain is: geo -> geo-traits -> geo-types. In the long run, geo won't even directly depend on geo-types. But, are there any dependency issues to understand? Will every breaking bump of geo-traits need bump of geo? This is fine, but would we then have to impl. the algos for different versions of geo-traits via complicated macros?
  3. Performance. Would be good to benchmark some algos on this approach to ensure the compiler understands that we're just wrapping around [f64; 2] or similar.

Pls. add more, and we could track this in the PR that merges the traits branch to main. Reg. @michaelkirk valid concern on the feature branch diverging: since the traits are just wrapping types, at least until we finalize the details of the traits, only changes in geo-types should affect the difference between the branches. So, may be worth the risk?

@urschrei
Copy link
Member

(apologies for the drive-by comment)
I also prefer the idea of a separate branch for now. Yes, there's some overhead to keeping it in sync, but as has been pointed out it hopefully won't be too long-running.

@frewsxcv
Copy link
Member

I have a preference for merging into main, but I am also okay with a separate traits branch.

@frewsxcv frewsxcv changed the base branch from main to traits June 19, 2023 01:07
@frewsxcv
Copy link
Member

I just updated the base branch of this pull request to be a new traits branch. Anyone else have blockers before merging?

@kylebarron
Copy link
Collaborator Author

kylebarron commented Jun 19, 2023

Seems fine on my side; thanks for making the branch!

@frewsxcv
Copy link
Member

bors r+

🎉 🎉 🎉 🎉 🎉

@bors
Copy link
Contributor

bors bot commented Jun 21, 2023

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@bors bors bot merged commit 1b9557f into georust:traits Jun 21, 2023
1 check passed
@frewsxcv frewsxcv mentioned this pull request Jun 21, 2023
2 tasks
@michaelkirk
Copy link
Member

🥳
Thanks for your flexibility @frewsxcv.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants