Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable rectangular search for SDSS query_region #2659

Closed
emirkmo opened this issue Feb 8, 2023 · 9 comments · Fixed by #2663
Closed

Re-enable rectangular search for SDSS query_region #2659

emirkmo opened this issue Feb 8, 2023 · 9 comments · Fixed by #2663

Comments

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

#2477 Removed rectangular search from query_region in SDSS and converted it to a cone search. However, a drawback was not significantly discussed:

  • The rectangular search can use a much larger "radius" i.e. field size. The limit for a cone search is 3 arcmin while the limit for a rectangular search can be as large as 12 arcmin (or more).

With the sparse fields we usually query for extragalactic transients, this does not usually get close to row-size limits.

The changes in #2477 were made without Depreciation warnings as well, which I understand with the pre-release model.. however it would have been nice to at least preserve the old endpoint in function such as query_rectangle_region or in a parameter such as rectangular: bool = False (where setting it to True uses the old query_region) so that backwards compatibility can be maintained.

Anyway this is a Feature Request to add this back in, as the old version still works (can be tested by pinning astroquery to e.g., v0.4.6, but making the change to the endpoint being HTTPS, a recent fix from #2654):

from astroquery import sdss
sdss.conf.skyserver_baseurl = sdss.conf.skyserver_baseurl.replace("http://","https://")
working_sdss_table = sdss.SDSS.query_region(<pars>,radius=<large radius multiple times over the MAX_CROSSID_RADIUS>) 

In fact in SDSS rectangular search there is only a suggestion, not a hard limit, on "radius" of 0.2 degrees, which seems to be not enforced by radius but by rows returned. See e.g.,: https://skyserver.sdss.org/dr17/SearchTools/rect

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

See relevant discussion at the end of the PR #2477

@weaverba137
Copy link
Member

I think my take not radically different from comments already made on #2477, and apologies if I'm duplicating, but the items below all sort of fit together. Being explicit rather than implicit now will save time in the future.

  1. Zeroth, let's make sure we take time to convert the discussion into specific requirements, potentially revisiting old requirements. Since we are not dealing with an issue that has appeared in a tagged version, we can take that time.
  2. I'm glad to see that no one is suggesting a 100% reversion of Convert query_region to cone search #2477, because there is are lot of infrastructure improvements in there. We can further modify the public-facing API while keeping that infrastructure.
  3. I have no problem reintroducing a rectangular search, which would then be based on the infrastructure of Convert query_region to cone search #2477, and a SQL query as previously suggested. However we should have a specific requirement on what a "rectangle" means, because the original version did not correct for cos(dec), see Make SDSS query region a circle #585 and linked issues there.
  4. That brings me to the definition of query_region(). I recall there was a specification for this, and here it is. When I read that carefully it sounds like query_region() should support both radial and rectangular searches. Let's take the time to revisit that and decide if that's really what we want.
  5. Following up on item 3, in other classes, what does query_region() do in practice? My reading of Make SDSS query region a circle #585 was that users were surprised that SDSS.query_region() performed a rectangle search, which implies that in other classes, a radial search is typically performed. Maybe other classes all support both radial and rectangular. I'm going to have to ask other experts to go through to check how this works in practice; it would take me too long to do that.
  6. For completeness, another possible reading of Make SDSS query region a circle #585 is that users were surprised that the keyword radius performed a rectangular search. I can't be certain either way.
  7. There is another possible ambiguity in existing API requirements. Here it says "Query a region around a coordinate.", but I'm pretty sure that multiple coordinates are actually desired, and that's the path that Convert query_region to cone search #2477 took. And in the previous tagged version, a rectangular search was performed around each of a possible set of coordinates. This would also be worth comparing to other classes in astroquery.
  8. When we perform these searches around multiple coordinates, both radial and rectangular, we should make it clear that it is a radius or rectangle around each point, because without that clarification, one could assume that it is a region that circumscribes every point. Again, it would be worth checking that no other classes assume this. It's easy enough to change the SDSS method documentation to say this, but I think it should also be explicitly stated in the API specifications.
  9. Finally I note for the record that the SDSS class appears to be unique in having a query_crossid() method. There is no existing API specification for that method that I am aware of. Is that because SDSS (the project, not the class) is unique in providing such a service?

@keflavich
Copy link
Contributor

Thanks for the careful thought @weaverba137 .

Your points 3, 4, and 5 are right; query_region(..., radius=<something>) should do a projected circle (cone) search. Most query_region tools do this; all should. The surprising behavior was that radius was yielding a rectangle or square search; keywords width and height should trigger those (as per the linked API).

On (6), generally yes, query_region should be a region around a single coordinate, though some services support vectorized queries (e.g., SIMBAD) in which a list of coordinates can be sent. Whether query_region takes a vector of coordinates depends on the upstream service, since some do not support multiple coordinate searches. #682 is our open issue to see whether other services can be modified to support vector queries.

(7) - agree

(8) - Yes, SDSS (the project) is unique in providing this service. Most astroquery-supported services do not have unique identifiers on which to perform cross-matches. So we do not have a general API for it.

@pllim
Copy link
Member

pllim commented Feb 9, 2023

Personally, I don't care if it is cone or rectangle. The problem on my side is that the cone one has an arbitrary 3 arcmin limit on radius that is not documented (and does not seem to be just a default, I get a server error when radius exceeds 3 arcmin). If there is a way to workaround this limit without resorting back to SQL or doing multiple queries, then I will be happy enough. I cannot speak for others though.

@weaverba137
Copy link
Member

Maybe I'm misreading what you mean, but I would not characterize a server-side limit as arbitrary. The astroquery team did not make that choice out of thin air. Yes, it could be documented better, but it is not the case that it is not documented at all.

@weaverba137
Copy link
Member

@keflavich, thank you that is very useful information. I would still like to arrive at a consensus of what "rectangle" actually means, but I think restoring rectangle search functionality could proceed while that decision is still being made.

@keflavich
Copy link
Contributor

Indeed, I bet there is quite a bit of ambiguity in what 'rectangle' means across different services, since it could mean either a projected rectangle with a specific size in angular units, or it could mean "what you get if you look in the range RA +/- width, Dec +/- height". The latter is perhaps less correct because of the failure to account for cos(dec), but it is easier to implement and therefore likely used in some cases. For now, the most important thing is to be clear in the documentation which is being used; we can work on unification across services after that.

@emirkmo
Copy link
Author

emirkmo commented Feb 10, 2023

I think it is important to point out that rectangle search is an SDSS specific endpoint. So I would expect rectangle to mean what it means in SDSS: a range in ra and a range in dec.

How a "radius" is converted into a rectangle is ultimately something that should be as direct as possible as I would rather calculate what that means myself, using my own projection.

But if it's documented, it's ok if a correction for what astronomers expect when they mean 5x5arcmin around coordinates x,y etc. is automatically done. Just need to know.

@bsipocz
Copy link
Member

bsipocz commented Feb 10, 2023

I think it is important to point out that rectangle search is an SDSS specific endpoint. So I would expect rectangle to mean what it means in SDSS: a range in ra and a range in dec.

The thing is that it wasn't a specific endpoint on the server side, but a thing that only existed in astroquery. In fact this motivated the whole upgrade to use the specific cone-search (or whatever area search the server supports) endpoint.
We have a mixture of radius or height/width kwarg combination for services to do one or the other and for the case of sdss we unfortunately used radius but under the hood constructed a rectangular search and that lead to the numerous bug reports (that it's not a cone, it's not a real radius, no cos(dec) is being done, etc.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants