Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change SphericalMath::PointXYZ to implement BigDecimal instead of Float #213

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

keithdoggett
Copy link
Member

Summary

Resolves #212

Valid polygons will sometimes fail the linear ring test in spherical factories if they have a high point density. The test erroneously detects that arcs of the polygon intersect causing it to fail the is_simple? condition. After some testing, I found out the issue has to do with floats lacking enough precision in the spherical coordinate system.

To fix this, I cast the floats to BigDecimals before any calculations are done on them. This has so far worked for all of the cases I've found of the issue.

Other Information

I attempted to integrate the BigMath library into the class to handle trigonometric and sqrt calculations but it made the tests take about 240x longer. I didn't notice any increase in accuracy by using it so removed it.

@jasonhr13
Copy link

Would love if this could get merged!

@keithdoggett
Copy link
Member Author

If you need to use the fix immediately and you're using a Gemfile/Bundler, you can add this to the Gemfile to build from the branch.

gem 'rgeo', git: "git://github.com/Kdoggett887/rgeo.git", branch: "fix-212"

Just make sure to change it back once it's merged in.

If you use the branch and are still getting errors, post in the original issue and I can take a look.

@keithdoggett
Copy link
Member Author

Instead of forcing BigDecimal to be used, uses_decimals can be passed into spherical_factory as an option. If it's used, PointXYZ will use decimals as its data type instead of floats.

Latitudes and longitudes on points will still be represented as floats.

usage:
factory = RGeo::Geographic.spherical_factory(uses_decimals: true)

@jdalt
Copy link

jdalt commented Apr 6, 2020

@keithdoggett I've been bitten by this issue several times, and I really appreciate your efforts to fix the issue. I was wondering what the performance implications of this change are. I created a test to validate that this patch fixes my issue and noticed the test taking another 1.3s (0.3s vs 1.6s).

I was also wondering whether there might be an alternative approach where we continue to use floats but only trigger #intersects_arc? when the float passes a certain configurable threshold.

@keithdoggett
Copy link
Member Author

@jdalt thanks for the reply and testing out the code. Definitely appreciate the feedback.

You're right that the performance with decimals instead of floats is not nearly as good. I tried to alleviate that somewhat with my last update to make the option to use decimals configurable at the factory level, but it's still slower in cases you need to use it and the way that option is passed around isn't the cleanest.

Your approach sounds interesting, but I need a little bit of clarification. When you say the float passes a certain configurable threshold, do you mean distance from another point or how precise a coordinate the float is representing?

@jdalt
Copy link

jdalt commented Apr 9, 2020

@keithdoggett My idea was to use a threshold for detecting arc intersection. I'm not certain this would be a viable solution though since it sacrifices correctness, and I'm realizing these rounding errors actually let several invalid polygons into my dataset.

Ultimately I've worked around this by passing uses_lenient_assertions: true to my spherical factory. I'm using a different factory to validate whether the geometry is simple and valid. I only need the spherical coordinates for certain area calculations. The performance cost was too much for my app, unfortunately. I only tested with ruby 2.5, perhaps ruby 2.6 performance will be better.

@keithdoggett
Copy link
Member Author

Gotcha. At this point, it does seem to be a trade-off between correctness and performance.

One theory that I haven't gotten around to testing is that the inaccuracies arise from converting from a spherical system to cartesian in PointXYZ. There's a comment in the class that indicates that this was done to improve calculation speed and increase stability on some calculations, but this may be inadvertently creating validation errors.

I think that the solution you posed is the best at this point, though. Do the validations in a projection or some other factory and then spherical calculations in a factory with lenient assertions.

@jdalt
Copy link

jdalt commented Apr 14, 2020

I'd be interested to see dummy examples of the following:

  1. the projected coordinate system catches a false intersection that doesn't really exist in spherical coordinates
  2. the projected coordinate system doesn't catch an intersection that only exists when calculated in spherical coordinates

I would imagine that one of these happens when the geometry is very small (the dense packing problem) and one happens when things are very large. It would be nice to develop some guidance as to when these are important.

For both of these edge case situations, the spherical factory with decimals would be necessary. For everything else you might as well use a CAPI backed projected coordinate system. Knowing the heuristic for which factory to select is key.

@waissbluth
Copy link
Contributor

hey there, did anybody find another workaround for this?

@keithdoggett
Copy link
Member Author

@waissbluth we haven't made any progress on this issue specifically, although there have been discussions about potential solutions that we'll try to implement at some point in the future.

But we have been reworking the validity handling for version 3 (#271) so that errors won't be raised on instantiation and validity will only be checked on an as-needed basis (ex. poly.area). This wouldn't fix the issue of incorrectly identifying invalid polygons, but it would reduce some of the friction with the current method of aggressively raising errors.

There is still the option to pass the uses_lenient_assertions option into the spherical factory and it should bypass the check. Is there a reason why that wouldn't work for your use case?

@waissbluth
Copy link
Contributor

There is still the option to pass the uses_lenient_assertions option into the spherical factory and it should bypass the check. Is there a reason why that wouldn't work for your use case?

Thank you @keithdoggett, this works:

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |config|
  config.default = RGeo::Geos.factory_generator(uses_lenient_assertions: true)
end

Surprisingly (?), this also solves the issue:

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |config|
  config.default = RGeo::Geos.factory_generator
end

Which is surprising since I thought that was the default behavior and therefore adding that could should not have an effect on this bug?

For reference, this is the geometry that is causing trouble: SRID=4326;POLYGON((-77.59076683799999 37.430905558,-77.59080505999999 37.430928995,-77.590813457 37.430920288,-77.59086806000001 37.430953768,-77.59084707 37.430975542,-77.590852529 37.43097889,-77.590829437 37.431002839,-77.590834896 37.431006188,-77.590822301 37.431019251,-77.59081684 37.431015903,-77.590793748 37.431039851,-77.59073368200001 37.431003021,-77.59075467700001 37.430981249,-77.590732835 37.430967856,-77.590749629 37.430950438,-77.59073324800001 37.430940393,-77.59076683799999 37.430905558))

@keithdoggett
Copy link
Member Author

Surprisingly (?), this also solves the issue:

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |config|
  config.default = RGeo::Geos.factory_generator
end

The SpatialFactoryStore by default will use RGeo::Geos.factory_generator for columns where geographic is falsy and RGeo::Geographic.spherical_factory(srid: 4326) for columns where geographic is true. So your change actually makes it so that all columns are processed by a GEOS factory instead of spherical factory.

You could do something like this

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |store|
  store.default = RGeo::Geos.factory_generator

  store.register(RGeo::Geographic.spherical_factory(srid: 4326, uses_lenient_assertions: true), {sql_type: 'geography'})
end

To send all geographic columns to the specified spherical factory and all geometry columns to the default factory.

Let me know if that helps.

@BuonOmo BuonOmo deleted the branch rgeo:main July 7, 2022 15:38
@BuonOmo BuonOmo closed this Jul 7, 2022
@BuonOmo BuonOmo reopened this Jul 7, 2022
@BuonOmo BuonOmo changed the base branch from master to main July 7, 2022 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some "valid" polygons are treated as invalid by rgeo
5 participants