Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a dynamic_partial_sum operator to tensorflow? #7662

Closed
zasdfgbnm opened this issue Feb 19, 2017 · 10 comments
Closed

Add a dynamic_partial_sum operator to tensorflow? #7662

zasdfgbnm opened this issue Feb 19, 2017 · 10 comments
Labels
stale This label marks the issue/pr stale - to be closed automatically if no activity stat:contribution welcome Status - Contributions welcome type:feature Feature requests

Comments

@zasdfgbnm
Copy link
Contributor

Hi,

In my application I need to do operations that dynamically sum some rows of a matrix to get a new matrix. There will be an input tensor named "index" that guides which part of the tensor to be summed.

An example is, if the input matrix is

[ [   1,   1,   1],
  [  10,  10,  10],
  [ 100, 100, 100] ]

And the index is

[ [ 0, 2 ],
  [ 2, 3 ]]

which simply says the output tensor will have two rows (because the "index" have two rows), the first row is the sum of rows with row number i that satisfies 0 <= i < 2 in the input, and the second row is the sum of rows with row number 2. So the result should be

[ [  11,  11,  11],
  [ 100, 100, 100] ]

I don't find any existing operation that does this job, so I implement it (support only 2d matrix and GPU) by myself. Since I already have an implementation, I'm not requesting a new feature here. But I do want to know that if the tensorflow team is interested in adding this operation as part of tensorflow. If the answer is yes, I will add CPU support (maybe also xla? I have no idea on how to add xla support yet), and then create a pull request for that.

@aselle
Copy link
Contributor

aselle commented Feb 20, 2017

@ebrevdo, @rmlarsen, what do you two think? What would we call such an operation. It does seem useful. I've wanted to do this type of thing once or twice, even.

@aselle aselle added type:feature Feature requests stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Feb 20, 2017
@ebrevdo
Copy link
Contributor

ebrevdo commented Feb 20, 2017 via email

@zasdfgbnm
Copy link
Contributor Author

zasdfgbnm commented Feb 20, 2017 via email

@zasdfgbnm
Copy link
Contributor Author

To be honest, numpy's numpy.ufunc.reduceat is so confusing and unnatural to me. I prefer the behavior I define here at this issue.

@aselle
Copy link
Contributor

aselle commented Mar 8, 2017

This functionality seems useful to me. Marking as contributions welcome.

@aselle aselle added stat:contribution welcome Status - Contributions welcome and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Mar 8, 2017
@girving
Copy link
Contributor

girving commented Apr 10, 2017

Isn't this a scan, a gather, and a subtraction?

@zasdfgbnm
Copy link
Contributor Author

zasdfgbnm commented Apr 10, 2017

@girving Yes, in the case of sum it is, but I don't think it is a good idea to implement it this way. Let's say you have a tensor containing 10000 rows, each row have value about ~1, and you want the sum of row number 5000 to 5005, then computationally speaking, implementing using scan, gather and subtract, you are doing a 5005-5000, which has less precision than 1+1+1+1+1.

In the case of product, it is not, because 0 * something / 0 is NaN. Neither does max and min. In my application, I only use sum, so it might be OK to use scan-gather-subtraction for me. But I'm not sure if other reduction operations like prod, min, max would be useful to others.

By the way, there are similar thing as the Op I'm implementing named tf.segment_sum, written in C++, see:
https://www.tensorflow.org/api_docs/python/tf/segment_sum
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/segment_reduction_ops.cc

@shoyer
Copy link
Contributor

shoyer commented Apr 22, 2017

To be honest, numpy's numpy.ufunc.reduceat is so confusing and unnatural to me. I prefer the behavior I define here at this issue.

Over in numpy/numpy#834, we are discussing how to replace reduceat, which I think everyone agrees was poorly designed. We have not yet settled on a replacement, however.

@github-actions
Copy link

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Mar 28, 2023
Copy link

This issue was closed because it has been inactive for 1 year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale This label marks the issue/pr stale - to be closed automatically if no activity stat:contribution welcome Status - Contributions welcome type:feature Feature requests
Projects
None yet
Development

No branches or pull requests

5 participants