[WIP] make autobatching matrix multiplies more flexible #566

yoavg · 2017-05-25T23:46:29Z

This introduces two batchable versions of matmul: one in which the first argument is part of the signature, and a second in which it is not. The first version is triggered for cases where the first argument is shared with >2 matmul nodes.
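
A minimal sketch of the idea, not the code in this PR: the `MatMulNode` class, the signature tuples, and the `SHARE_THRESHOLD` name are all hypothetical, and a real signature would also encode dimensions. It only illustrates choosing between the two signatures based on how many matmul nodes share a first argument.

```python
from collections import Counter


class MatMulNode:
    """Hypothetical stand-in for a graph node; not DyNet's actual Node class."""

    def __init__(self, first_arg, second_arg):
        self.first_arg = first_arg    # id of the left operand (e.g. a weight matrix)
        self.second_arg = second_arg  # id of the right operand


# Assumption: "shared with >2 matmul nodes" means strictly more than two uses.
SHARE_THRESHOLD = 2


def matmul_signatures(nodes):
    """Return one batching signature per matmul node."""
    # Count how many matmul nodes use each node as their first argument.
    uses = Counter(n.first_arg for n in nodes)
    sigs = []
    for n in nodes:
        if uses[n.first_arg] > SHARE_THRESHOLD:
            # The first argument is part of the signature, so only nodes
            # sharing it are batched together.
            sigs.append(("matmul_shared_first", n.first_arg))
        else:
            # The first argument is not part of the signature.
            sigs.append(("matmul_any_first",))
    return sigs


nodes = [
    MatMulNode("W", "x1"),
    MatMulNode("W", "x2"),
    MatMulNode("W", "x3"),
    MatMulNode("A", "y"),
]
print(matmul_signatures(nodes))
```

Here the three nodes that multiply by `W` receive the same signature and can be batched as one group, while the node that multiplies by `A` falls back to the signature that ignores the first argument.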

Former-commit-id: 46c9a58

neubig · 2017-05-26T01:09:43Z

Hmm, I really don't like the special handling of matrix multiplies here.
Even if we do introduce something like this I think we should try to be more general.

yoavg · 2017-05-26T01:19:58Z

We can easily count how many different nodes a given node is an arg of, instead of how many matmuls. But why is it better? Where is this relevant besides matmul (and maybe affine transform in the future)?

neubig · 2017-05-26T04:15:29Z

Ones that are highly relevant in addition to matmul are affine transform, conv2d, tensor contraction, etc. (I'm probably forgetting several). All of these will be much faster if you share parameters vs. iterating over them.

Everything else with an arity over 2 is somewhat relevant. Sharing parameters will reduce the need for a memory copy at least.

Basically my main priority is either that all nodes be treated the same, or alternatively that we have a couple of equivalence classes like "benefit largely from grouping" or "don't benefit much from grouping", so when we implement a new node we can specify which one it belongs to.

neubig · 2017-05-26T13:05:02Z

I thought about this a little more. What about keeping a reference count for each node, then providing this reference count to the autobatch_profile() function? For operations where grouping is beneficial, the profile can then decide to group together arguments based on their reference count. For example, we may want to group together all arguments that have a reference count equal to the maximum of all the incoming arguments. This will work for matmul of course (the parameter matrix should have more references than others), and will work for AffineTransform as well, because the weight matrix and bias vector should both have the same reference count most of the time. How does this sound? If it seems reasonable I can try it.
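
A rough sketch of what this could look like, assuming the graph is given as a plain dict from node id to argument ids. The helper names `reference_counts` and `shared_arg_positions` are made up for illustration and are not part of DyNet; only the "group arguments whose reference count equals the maximum" rule comes from the proposal above.

```python
from collections import Counter


def reference_counts(graph):
    """Count how many times each node is used as an argument.

    graph: dict mapping node id -> list of argument node ids (hypothetical format).
    """
    counts = Counter()
    for args in graph.values():
        counts.update(args)
    return counts


def shared_arg_positions(args, counts):
    """Positions of the arguments whose reference count equals the maximum
    over this node's incoming arguments."""
    top = max(counts[a] for a in args)
    return [i for i, a in enumerate(args) if counts[a] == top]


# Three affine transforms that share a bias "b" and a weight matrix "W".
graph = {
    "h1": ["b", "W", "x1"],
    "h2": ["b", "W", "x2"],
    "h3": ["b", "W", "x3"],
}
counts = reference_counts(graph)
print(shared_arg_positions(graph["h1"], counts))
```

In this toy graph `b` and `W` are each referenced three times and every input once, so the bias and weight positions are the ones selected for grouping. Note that when all arguments have the same count, this rule selects all of them, so a real profile would probably need an extra condition for that case.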

yoavg · 2017-05-26T13:30:43Z

I don't fully understand the details of your proposal, but it sounds like extending the count from being the first arg of a matmul to being an arg of anything, plus a cleverer way of setting the threshold. Sure, sounds good to me, go for it.

Another option I thought about (which is a little less automatic) is to add a "shared" flag to nodes. Param nodes will have this on by default; for others it can be turned on. The autobatch_sig and autobatch_concat methods will look at this flag for the args and act accordingly. But if you can get your proposal to work, let's go for it.
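
A small sketch of the flag-based alternative, under heavy assumptions: the `Node` dataclass, the `param` helper, and the function signatures below are hypothetical, and only the names `autobatch_sig` and `autobatch_concat` come from the discussion. The real methods in the codebase have different signatures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node carrying a 'shared' flag."""
    name: str
    args: list = field(default_factory=list)
    shared: bool = False  # off by default for ordinary nodes


def param(name):
    """Parameter nodes have the flag on by default."""
    return Node(name, shared=True)


def autobatch_sig(op, node):
    # Shared args contribute their identity to the signature;
    # non-shared args contribute only a placeholder.
    return (op,) + tuple(a.name if a.shared else None for a in node.args)


def autobatch_concat(node):
    # 1 = concatenate this arg across the batch, 0 = shared, reuse as-is.
    return [0 if a.shared else 1 for a in node.args]


W = param("W")
n1 = Node("n1", args=[W, Node("x1")])
n2 = Node("n2", args=[W, Node("x2")])
print(autobatch_sig("matmul", n1) == autobatch_sig("matmul", n2))
print(autobatch_concat(n1))
```

Both matmuls get the same signature because they share `W`, and only the second argument is marked for concatenation. Turning the flag on for a non-parameter node would make it behave the same way, which is the "less automatic" part of this option.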

neubig · 2017-05-26T13:33:42Z

Cool, I'll give this a shot.

introduce a version of matmul which does not depend on first arg

06ec749

Former-commit-id: 46c9a58

neubig changed the title ~~introduce a version of matmul which does not depend on first arg~~ [WIP] make autobatching matrix multiplies more flexible Sep 22, 2017

neubig force-pushed the master branch from fc5ffc2 to 6416283 Compare November 7, 2017 17:37

neubig force-pushed the matmul_mult_sigs branch from 46c9a58 to 06ec749 Compare November 7, 2017 17:37
