Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call-site autowrap function or macro? #142

Open
rofinn opened this issue Oct 16, 2020 · 11 comments
Open

Call-site autowrap function or macro? #142

rofinn opened this issue Oct 16, 2020 · 11 comments

Comments

@rofinn
Copy link
Member

rofinn commented Oct 16, 2020

While I don't think this is a good long term solution, I think having a temporary call-site autowrap function or macro for function calls that don't work on NameDimsArrays (or KeyedArrays in AxisKeys.jl) would help improve adoption and reduce friction. General logic would include:

  1. Extract/unwrap parent array
  2. Call the function f
  3. Use a list of dims to specify how the result should be rewrapped.
    • Default could be : meaning the result should be a scalar?
    • AxisKeys.jl should be able to extend this in some way and error if the resulting dims don't match the input (e.g., filtering)
@oxinabox
Copy link
Member

oxinabox commented Oct 16, 2020

I think there is something sound to this idea.
Though its hard because there are so many ways you might want to rewrap results.
Just look at all the ones in https://github.com/invenia/NamedDims.jl/blob/master/src/functions.jl

@rofinn
Copy link
Member Author

rofinn commented Oct 16, 2020

Yeah, I feel like making users specify the dims for rewrapping might be the best approach for that. Otherwise, it seems like we'd fall down the rabbit hole of building a rule base for determining best guesses based on inputs and outputs. Looks like you've already started doing that a bit in functions.jl, but at least that's only looking a static set of functions.

@oxinabox
Copy link
Member

functions.jl doesn't guess 😂 , it knows based on that static set of functions.

If we can write this macro you propose and it is nice, i suspect we can write something similar to use internally that might simplify the code in functions.jl.

@rofinn
Copy link
Member Author

rofinn commented Oct 19, 2020

One plus side to having this be a call-site function is that we don't need to define explicit wrappers with signatures that match the original function. Seems like a lot of those edge cases are because of inconsistencies with how dims is used. My guess is that in a lot of cases we can just take args and kwargs and make decisions on what is being passed through (e.g., if kwargs has dims=:, args are a NamedDimsArrays)?

@mcabbott
Copy link
Collaborator

mcabbott commented Oct 19, 2020

Here's one attempt...

function apply(f, args...; kw...)
    names = ()
    clean = map(args) do a
        a isa NamedDimsArray || return a
        names = dimnames(a)
        parent(a) # edit -- this got lost somehow!
    end
    names === () && return f(args...; kw...)
    dropped = ()
    cleankw = map(kw.itr, Tuple(kw.data)) do s,v
        s === :dims || return s => v
        d = NamedDims.dim(names, v)
        dropped = (dropped..., d...)
        :dims => d
    end
    out = f(clean...; cleankw...)
    if ndims(out) == length(names)
        NamedDimsArray(out, names)
    elseif ndims(out) == length(names) - length(dropped)
        final = NamedDims.remaining_dimnames_after_dropping(names, dropped)
        NamedDimsArray(out, final)
    else
        out
    end
end

using NamedDims
nda = NamedDimsArray(rand(2,3), (:x, :y))
ndb = apply(sum, log, nda, init=100.0, dims=:x)
ndc = apply(dropdims, ndb, dims=:x)

@code_warntype apply(prod, nda.data, dims=2) # Any

@rofinn
Copy link
Member Author

rofinn commented Oct 19, 2020

Thanks, I think there are two additions that we'll want to add.

  1. I think we want to return parent(a) in the clean args
  2. I think we may want to specify a return type which would:
    • Allow AxisKeys.jl to extend the function for KeyedArrays
    • Allow us to specify a corresponding return type declaration for type stability
    • Provides a short circuit fallback if someone calls apply(f, NamedDimsArray{(:x, :y)}, args...; kwargs...)

@mcabbott
Copy link
Collaborator

mcabbott commented Oct 19, 2020

Sorry 1. got lost somehow, between variants.

I don't really see why this can't be made type-stable, it's all known right? Agree you could supply a type, although by the time you're writing NamedDimsArray{(:x, :y)}, perhaps you may as well re-wrap yourself, with a simpler applier with return f(clean...; cleankw...). And NamedDimsArray{(:x, :y)}(::Any) still isn't a happy outcome.

How many functions would this be useful for? I suppose it's for random packages... here's one where "like dropdims" is the wrong guess:

using GroupSlices
apply(groupslices, nda, dims=:y) # wrongly applies :x name

@rofinn
Copy link
Member Author

rofinn commented Oct 20, 2020

I don't really see why this can't be made type-stable, it's all known right?

The fact that FunctionWrappers.jl exists suggests that it's a harder problem than expected. While we can know all our input arguments we don't know what the result of f will be which I think is the main problem.

How many functions would this be useful for? I suppose it's for random packages...

Yeah, a simple example is that some operations need to work on a DenseMatrix for efficiency reasons and it'd be nice to have a simple work around rather than needing to make a PR to wrap it. I seem to recall running into similar issues with linear algebra functions that needed to be wrapped. Maybe there aren't enough cases to warrant that though?

@oxinabox
Copy link
Member

oxinabox commented Oct 20, 2020

One plus side to having this be a call-site function is that we don't need to define explicit wrappers with signatures that match the original function.

We want to define all the wrapper functions though. So using things just works.
At least for all Base functions and for libraries we commonly use this with.
I think its a tenant of the package:

  1. no runtime overhead
  2. not having the user need to do anything different, except using names for dimensions and kwargs for indexing

Providing something call-site would just be to make temporary work-arounds of missing overloads easier.
But in general such uses should be accompanied by:
TODO remove this once <link to issue>

@mcabbott
Copy link
Collaborator

#24 (or something like it) for LinearAlgebra would be nice to have.

The lighter-weight this package is, the easier it is to get 3rd-party packages to depend on it, instead of the reverse. Maybe it would be worth doing this to Tracker.jl in order to drop Requires.jl?

The fact that FunctionWrappers.jl exists suggests that it's a harder problem than expected. While we can know all our input arguments we don't know what the result of f will be which I think is the main problem.

I don't really understand FunctionWrappers.jl but it sounds like it's for harder cases. When f(args...) is type-stable, why can't apply also be be... except that Julia may give up too early? Maybe more fiddling could fix it, overwriting names = () might be a hurdle. Or maybe a @generated version could behave better.

@rofinn
Copy link
Member Author

rofinn commented Oct 21, 2020

We want to define all the wrapper functions though. So using things just works.

Sorry, I wasn't trying to suggest that this would replace the existing explicit wrappers. I agree that an explicit wrapper should always trump an autowrap call. I was just trying to point out the edge cases with inconsistent signatures that functions.jl needs deal with wouldn't be as much of an issue at the call-site.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants