Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

searchsorted by applies by function to item being searched #9429

Open
RauliRuohonen opened this issue Dec 21, 2014 · 6 comments
Open

searchsorted by applies by function to item being searched #9429

RauliRuohonen opened this issue Dec 21, 2014 · 6 comments
Labels
domain:search & find The find* family of functions status:help wanted Indicates that a maintainer wants help on an issue or pull request

Comments

@RauliRuohonen
Copy link

Consider this:

using Base.Dates: Second

immutable tsp # time series point
    t :: DateTime
    y :: Float64
end

t0 = now()
S = [tsp(t0+Second(10), 42), tsp(t0+Second(13), -1), tsp(t0+Second(17), 31)]

searchsortedfirst(S, t0+Second(12), by=x -> x.t)
searchsortedfirst(S, tsp(t0+Second(12), NaN), by=x -> x.t)

The first search call fails and the second is not ideal because one has to use a dummy value (NaN here) to fill in the parts of the item being searched that are not part of the key. I think that in searchsortedfirst(S, x, by=f) it would make more sense to apply f() to the items of S but not to the item x. If the searchsorted functions worked like that, then the current behavior is easily obtained by using searchsortedfirst(S, f(x), by=f). Inversion of f() is not necessarily as easy.

@kmsquire
Copy link
Member

I think that makes sense. Care to try fixing this and making a pull request? No worries if not.

@ihnorton ihnorton added the status:help wanted Indicates that a maintainer wants help on an issue or pull request label Dec 30, 2014
@krcools
Copy link

krcools commented Nov 20, 2015

I encountered the exact same problem in trying to find a column in a column-lex-sorted Matrix:

A = abs(rand(Int,3,7)) % 10
B = sortcols(A)
x = B[:,5]
searchsortedfirst(collect(1:7), x, by = i->B[:,i], lt = lexless)

This fails like the OP describes. The alternative is even more clumsy here as it would require the addition of an extra column containing the search target to B.

I don't have experience with pull request but am willing to look into it.

@RauliRuohonen
Copy link
Author

This came up at #19295 (comment). I guess #19295 covers this now?

@nalimilan nalimilan added the domain:search & find The find* family of functions label Nov 29, 2017
@haampie
Copy link
Contributor

haampie commented Jan 17, 2019

#19295 is now closed, but this issue still exists.

Let me give another example where it is very annoying:

struct Coord{Tv}
  x::Tv
  y::Tv
end

norm2(p::Coord) = p.x * p.x + p.y * p.y

function function_from_another_package()
  ps = [Coord(randn(), randn()) for i = 1 : 10]
  I = sortperm(ps, by = norm2)
  return ps, I
end

function find_first_coord_outside_unit_circle()
  ps, I = function_from_another_package()

  # You can't pass the distance 1 as you'd expect
  searchsortedfirst(I, 1.0, by = i -> norm2(ps[i])) # nope...

  # You can't pass a coord on the unit circle either...
  searchsortedfirst(I, Coord(1.0, 0.0), by = i -> norm2(ps[i])) # nope...

  # The only solution seems to be the following -- but this 
  # requires me looking into the internals of searchsortedfirst
  searchsortedfirst(I, 1.0, lt = (p, q) -> norm2(ps[p]) < q)
end

@haampie
Copy link
Contributor

haampie commented Jan 18, 2019

@tkf
Copy link
Member

tkf commented Apr 10, 2020

If we have #31553 (or any solution to #19198), I think we can just write searchsortedfirst((for f.(xs)), y).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:search & find The find* family of functions status:help wanted Indicates that a maintainer wants help on an issue or pull request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants