Nested Tuples #11

Open
cscherrer opened this issue Dec 9, 2020 · 11 comments

Comments

@cscherrer

Hi, I just put together this little proof of concept using Setfield:
https://github.com/cscherrer/NestedTuples.jl

I wonder if it would make more sense to move it to Accessors. Also, @rafaqz, I just saw your post, and it looks like there might be some overlap in our needs. Pinging you here in case that's right :)

The problem I'm working on is a good way to represent samples in a Monte Carlo simulation. In Soss, each sample is a named tuple. And I need to manage these programmatically, so I need a clean interface.

I thought StructArrays might already do it, but from what I've seen so far, StructArrays don't have much support for nested structure. So I started down this road today thinking I might get to something more tailored for the problem.

So I guess my questions are:

  • Should I be using Accessors instead? Or maybe not yet?
  • Should some of this functionality go into Accessors itself? Or maybe abstracted differently?

I've been really impressed with the data manipulation work I've seen from you (@jw3126 and @tkf), so I hope anything I contribute can fit cleanly into that :)

@rafaqz
Member

rafaqz commented Dec 9, 2020

Flatten.jl will handle that (err, maybe I misunderstood your use case). I also have some code here that incorporates lenses and nested tuple flattening into a single process:
#9 (comment)

I need to turn that into a package or integrate it with Accessors.jl at some point, but can't really find the time right now.

@jw3126
Member

jw3126 commented Dec 9, 2020

Thanks a lot for opening this!

The problem I'm working on is a good way to represent samples in a Monte Carlo simulation. In Soss, each sample is a named tuple. And I need to manage these programmatically, so I need a clean interface.

Very excited to see Setfield used in this context. I am a probabilistic programming fan boy (by no means an expert). But some time ago, I wondered if Gen.ChoiceMaps could be replaced by something with more optics flavor.

* Should I be using Accessors instead? Or maybe not yet?

I think Accessors is stable enough so that it makes sense to use it in a new project. I think function lenses are more convenient with Accessors and they might be useful for what you are doing. Also, Accessors allows more general optics than just lenses, which might also be useful.

* Should some of this functionality go into Accessors itself? Or maybe abstracted differently?

Probably things can be abstracted differently and some parts moved to Accessors. How and which parts? No idea 😄 I think that needs exploration and I think it's easier if you do it in your own package.

If you discover specific pieces that you feel should live in Accessors, please tell us.
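
For readers unfamiliar with the package, here is a minimal sketch of the kind of optics mentioned above. It assumes Accessors.jl is installed; the `sample` value is a hypothetical stand-in for a Soss sample and does not come from this thread.

```julia
using Accessors

# Hypothetical nested named tuple standing in for a single sample.
sample = (a = 1, b = (c = 2, d = 3))

# @set returns an updated copy; `sample` itself is left unchanged.
updated = @set sample.b.d = 10

# modify applies a function at the location described by an optic.
bumped = modify(v -> v + 1, sample, @optic _.b.c)

# Elements() targets every element of a collection, so it is more
# general than a lens that focuses on exactly one location.
doubled = modify(v -> 2v, (1, 2, 3), Elements())
```

All three calls return new values rather than mutating their input, which is why they work on immutable tuples and named tuples.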

@cscherrer
Author

Thanks for the feedback. I'll try to move things over to Accessors and let you know if I hit any snags.
@rafaqz your code is really interesting, but I don't understand yet how query is set up. I think I'll need to dig into this a bit more and come back to it. ModelParameters also looks really nice; it reminds me a little of TransformVariables.

@rafaqz
Member

rafaqz commented Dec 10, 2020

Yeah, nested @generated functions are not easy to reason about. But it usually compiles away completely, so it's kind of worth the overhead. Generally, do you think something like that would take care of your use case? It sounds similar, but I don't completely understand what you need.

ModelParameters.jl is pretty much a wrapper for Flatten.jl for a specific use case. Eventually it should also be built on that code I linked to, either in Accessors.jl or the yet-to-be-written ObjectQueries.jl, because it would be nice to use optics with the nested flattening for some tasks, like I do in that example. It's just not clear to me how best to integrate these things yet.

@cscherrer
Author

Basically what I want is something like StructArrays, but with stronger support for nested (named) tuples. I'm making some progress! I used some generated function tricks to get this working, and I think it can become the core of a very efficient modify:

julia> using NestedTuples

julia> x = (a=[1,2], b=(c=([3,4], [5,6]),d=[7,8]))
(a = [1, 2], b = (c = ([3, 4], [5, 6]), d = [7, 8]))

julia> @btime leaf_setter($x)(1,2,3,4)
  0.016 ns (0 allocations: 0 bytes)
(a = 1, b = (c = (2, 3), d = 4))

@cscherrer
Author

Ok, I managed to put together a Leaves optic. The idea is that it descends into tuples and named tuples, and works with the result. All of this should be manageable at compile time.

Here's a little example:

function ind(x, j)
    f(arr) = @inbounds arr[j]
    modify(f, x, Leaves())
end

Then we can do

julia> x = (a = [1, 2], b = (c = ([3, 4], [5, 6]), d = [7, 8]));

julia> @btime ind($x, 1)
  1.901 ns (0 allocations: 0 bytes)
(a = 1, b = (c = (3, 5), d = 7))
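
To illustrate the descend-and-apply idea without any packages, here is a rough sketch in plain Julia. `mapleaves` is a hypothetical helper, not the `Leaves` implementation from NestedTuples, and no claim is made about matching its performance.

```julia
# Hypothetical helper: apply f to every leaf, where a leaf is anything
# that is neither a Tuple nor a NamedTuple.
mapleaves(f, x) = f(x)
mapleaves(f, x::Tuple) = map(y -> mapleaves(f, y), x)
# map over a NamedTuple keeps its keys, so the nesting is preserved.
mapleaves(f, x::NamedTuple) = map(y -> mapleaves(f, y), x)

x = (a = [1, 2], b = (c = ([3, 4], [5, 6]), d = [7, 8]))

firsts = mapleaves(first, x)   # (a = 1, b = (c = (3, 5), d = 7))
```

The result has the same shape as the input, with each array replaced by its first element, which mirrors the `ind(x, 1)` call above.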

As one building block for this, I had been trying to do

leaves(x, y...) = (leaves(x)..., leaves(y)...)
leaves(x::Tuple) = leaves(x...)
leaves(x::NamedTuple) = leaves(values(x)...)
leaves(x) = (x,)

I would expect that to be type-stable, but it's not (try @code_warntype leaves(x) to see). So just to try it out I used Flatten.flatten(obj, Array). This is really fast (well done @rafaqz!)

@jw3126
Member

jw3126 commented Dec 10, 2020

Cool! About the leaves example, I think it is worth opening an issue at julia.

@jw3126
Member

jw3126 commented Dec 10, 2020

I could get it inferred by doing the following, but it is probably very brittle and still worth opening an issue:

module M
leaves(x::Tuple) = cattuples(map(leaves, x))
leaves(x::NamedTuple) = cattuples(map(leaves, values(x)))
leaves(x) = (x,)

cattuples(x) = _cattuples(x...)
@inline _cattuples(x, y...) = (_cattuples(x)..., _cattuples(y)...)
@inline _cattuples(x, y) = (_cattuples(x)..., _cattuples(y)...)
@inline _cattuples(x::Tuple) = _cattuples(x...)
@inline _cattuples(x) = (x,)

x = (a = [1, 2], b = (c = ([3, 4], [5, 6]), d = [7, 8]));
using Test
@test leaves(x) == ([1, 2], [3, 4], [5, 6], [7, 8])
@inferred leaves(x)

end

@cscherrer
Author

Cool! About the leaves example, I think it is worth opening an issue at julia.

Good idea
JuliaLang/julia#38825

I could get it inferred by doing the following

Oh, that's funny. I got rid of the cattuples call I originally had, because it was faster to inline it manually. Thanks for the quick fix :)

@jw3126
Member

jw3126 commented Dec 11, 2020

Oh, that's funny. I got rid of the cattuples call I originally had, because it was faster to inline it manually. Thanks for the quick fix :)

I think factoring out cattuples is not important for inference. What I tried to do was reduce the splatting; having the explicit two-argument version

_cattuples(x, y)

was the crucial part.
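
As a rough sketch of the same "reduce splatting" idea, the variant below only ever concatenates two tuples at a time by recursing on the head and tail of a tuple. The names `leaves_sketch` and `concat2` are hypothetical and not from this thread. Whether the compiler infers it may depend on the Julia version, so it is worth checking with @inferred before relying on it.

```julia
# Concatenate exactly two tuples; this is the only place that splats.
concat2(a::Tuple, b::Tuple) = (a..., b...)

# A leaf is anything that is neither a Tuple nor a NamedTuple.
leaves_sketch(x) = (x,)
leaves_sketch(x::NamedTuple) = leaves_sketch(values(x))
leaves_sketch(::Tuple{}) = ()
# Peel off the first element and recurse on the rest.
leaves_sketch(x::Tuple) =
    concat2(leaves_sketch(first(x)), leaves_sketch(Base.tail(x)))

x = (a = [1, 2], b = (c = ([3, 4], [5, 6]), d = [7, 8]))
flat = leaves_sketch(x)   # ([1, 2], [3, 4], [5, 6], [7, 8])
```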

@cscherrer
Author

Right, Michael Abbott suggested something similar ("reduce splatting"):
JuliaLang/julia#38825 (comment)


3 participants