
[WIP] Put the output of the model into existing NDArrays when provided #618

Open · wants to merge 1 commit into master
Conversation

@zemei (Contributor) commented Feb 6, 2021

This PR introduces an optional output NDList parameter on the Predictor::predict interface. When the output parameter is provided, the underlying engine copies the inference result into the corresponding NDArrays instead of creating new objects.

This functionality is useful for high-throughput systems because it reduces the number of memory allocations and the load on the garbage collector.

TODO:

  • The engine-specific implementation is only provided for PyTorch:
    • On the Java side, the copy is only implemented for the NDArray type.
    • The PyTorch native library side has since been refactored and is not included in this PR.
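To make the proposed flow concrete, here is a minimal, self-contained sketch of the copy-into-existing-output idea. `FakeNDArray`, `PredictSketch`, `runInference`, and `predict` are illustrative stand-ins invented for this sketch; they are not the real DJL types or the PR's actual code.

```java
import java.util.Arrays;

// Illustrative stand-in for an NDArray: a preallocated float buffer that
// inference results are copied into, instead of allocating a new array.
class FakeNDArray {
    final float[] data;

    FakeNDArray(int size) {
        this.data = new float[size];
    }

    // Copy the engine's result into this preallocated buffer.
    void copyFrom(float[] result) {
        System.arraycopy(result, 0, data, 0, result.length);
    }
}

public class PredictSketch {
    // Simulated engine inference; a real engine would produce a fresh tensor.
    static float[] runInference(float[] input) {
        float[] out = new float[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] * 2f;
        }
        return out;
    }

    // Sketch of predict(input, output): when an output is provided, the
    // result is written into it rather than returned as a new object.
    static void predict(float[] input, FakeNDArray output) {
        output.copyFrom(runInference(input));
    }

    public static void main(String[] args) {
        // The output is allocated once and reused across predictions,
        // which is the allocation/GC saving the PR is after.
        FakeNDArray reused = new FakeNDArray(3);
        predict(new float[] {1f, 2f, 3f}, reused);
        System.out.println(Arrays.toString(reused.data)); // [2.0, 4.0, 6.0]
    }
}
```

The point of the sketch is only the ownership change: the caller owns the output buffer's life cycle, and predict writes into it.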

@frankfliu (Contributor)

In general, reusing the output NDArrays is a good idea, but defining an NDList parameter at the Predictor level is debatable.

I think this can be implemented at the Translator level instead, and we can document how to reuse the NDList in a Translator implementation.

Output support at the Block level is necessary to reuse the NDList, but it may not work for all blocks; we can add TODOs to some of the Block implementations to make them reusable.

@lanking520 (Member)

@frankfliu @zemei Maybe we can tweak the ctx by implementing a cached NDArray holder. Users are responsible for controlling its life cycle.

public NDList processInput(TranslatorContext ctx, I input) {
    // getCachedNdArray is the proposed (not yet existing) accessor
    NDArray array = ctx.getCachedNdArray("name");
    ...
}

This cached NDArray would not be released until the end of the Model's life cycle, but there is a risk of a memory leak if users misuse this component.

@lanking520 lanking520 changed the title Put the output of the model into existing NDArrays when provided [WIP] Put the output of the model into existing NDArrays when provided Mar 2, 2021
Lokiiiiii pushed a commit to Lokiiiiii/djl that referenced this pull request Oct 10, 2023
@zemei zemei requested review from zachgk, frankfliu and a team as code owners April 26, 2024 19:14
@frankfliu (Contributor)

@zemei

Are you still interested in this PR? I took a close look today. The idea of reusing input/output tensors is great; however, I don't think this implementation is optimal:

  1. This optimization is only needed for inference, so we don't really need to touch so many classes; we only need to add Predictor.setOutputTensor(NDList).
  2. In this PR, the output tensor is still created in JNI and then copied into the existing NDArray, so I don't think this saves anything. I actually don't know whether it is possible to reuse the output tensor in libtorch.
  3. Reusing the input tensor might be a more feasible approach.
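Point 3 above, reusing the input tensor, can be sketched with plain Java NIO: a direct buffer is allocated once and refilled per request, so a native engine binding could read from the same memory each time. `InputReuseSketch` and `fill` are illustrative names invented for this sketch; this is not DJL's actual input path.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch of input-tensor reuse: one direct FloatBuffer is allocated up
// front and overwritten per request, avoiding per-request allocation.
public class InputReuseSketch {
    private final FloatBuffer inputBuffer;

    InputReuseSketch(int capacity) {
        // A direct buffer's backing memory could, in a native binding,
        // be handed to the engine without copying or reallocating.
        inputBuffer = ByteBuffer
                .allocateDirect(capacity * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Overwrite the reused buffer with this request's data.
    FloatBuffer fill(float[] request) {
        inputBuffer.clear();
        inputBuffer.put(request);
        inputBuffer.flip();
        return inputBuffer;
    }

    public static void main(String[] args) {
        InputReuseSketch sketch = new InputReuseSketch(4);
        FloatBuffer first = sketch.fill(new float[] {1f, 2f, 3f, 4f});
        FloatBuffer second = sketch.fill(new float[] {5f, 6f, 7f, 8f});
        System.out.println(first == second); // same buffer, no new allocation
        System.out.println(second.get(0));   // 5.0
    }
}
```

As with the output-side ideas, the caller controls the buffer's life cycle; each request simply overwrites the previous request's data.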


4 participants