
[feature] Support Edge Deployment #40

Open
gaocegege opened this issue Jul 16, 2020 · 6 comments
Assignees
Labels
priority/P2 Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@gaocegege
Member

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:

We would like to investigate whether we can support deploying models on edge servers/devices.

The workflow would be: users download the binary from the model registry and use it to serve the model at the edge.

Ref https://aws.amazon.com/cn/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

@hicaistar

hicaistar commented Jul 16, 2020

A brief flow of Amazon SageMaker Neo:

  1. Compile a model to an executable binary that can be run by the Neo runtime (a Python package)
  2. The application running on the edge device loads the Neo runtime
  3. The Neo runtime loads the model and starts serving

The Neo runtime is the core piece, and it is very small. Would it be feasible to compile the serving program together with a model into a single binary that can run on different platforms?
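The three-step flow above can be sketched in plain Python. This is an illustrative toy only, not the real Neo/DLR API: `compile_model`, `CompiledModel`, and `EdgeRuntime` are hypothetical stand-ins for the compiler output and the small runtime that the edge application loads.

```python
# Illustrative sketch of the compile -> load -> serve flow described above.
# All names here are hypothetical; this is not the real SageMaker Neo API.
import pickle


class CompiledModel:
    """Stands in for the platform-specific artifact produced by the compiler."""

    def __init__(self, weights):
        self.weights = weights


def compile_model(weights, target_arch):
    # Step 1: "compile" the model into a serialized artifact for one arch.
    return pickle.dumps({"arch": target_arch, "model": CompiledModel(weights)})


class EdgeRuntime:
    """Stands in for the small runtime loaded by the edge application."""

    def load(self, artifact):
        # Steps 2-3: the runtime loads the artifact and exposes inference.
        payload = pickle.loads(artifact)
        self.model = payload["model"]
        self.arch = payload["arch"]

    def predict(self, x):
        # Toy linear "inference" so the sketch is runnable end to end.
        return sum(w * v for w, v in zip(self.model.weights, x))


artifact = compile_model([0.5, 1.5], target_arch="arm64")
runtime = EdgeRuntime()
runtime.load(artifact)
print(runtime.predict([2.0, 4.0]))  # 0.5*2.0 + 1.5*4.0 = 7.0
```

The key property the sketch mirrors is that the runtime stays generic and tiny, while everything model- and platform-specific lives in the compiled artifact.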

@gaocegege
Member Author

Would it be feasible to compile the serving program together with a model into a single binary that can run on different platforms?

I do not think we should provide one unified binary; I think we should provide one binary per architecture. WDYT?
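The "one binary per architecture" idea could look like the following sketch, where the registry keeps one artifact per (os, arch) pair and the edge client resolves the right one at download time. The index contents and paths here are made up for illustration; only `platform.system()`/`platform.machine()` are real stdlib calls.

```python
# Hypothetical sketch: a registry index with one prebuilt binary per platform.
import platform

# Hypothetical registry index; model name, tags, and paths are illustrative.
REGISTRY_INDEX = {
    ("linux", "x86_64"): "models/resnet50/linux-amd64.bin",
    ("linux", "aarch64"): "models/resnet50/linux-arm64.bin",
    ("darwin", "arm64"): "models/resnet50/darwin-arm64.bin",
}


def resolve_artifact(os_name=None, arch=None):
    """Return the registry path of the binary matching the caller's platform."""
    os_name = os_name or platform.system().lower()
    arch = arch or platform.machine().lower()
    try:
        return REGISTRY_INDEX[(os_name, arch)]
    except KeyError:
        raise RuntimeError(f"no prebuilt binary for {os_name}/{arch}")


print(resolve_artifact("linux", "aarch64"))  # models/resnet50/linux-arm64.bin
```

Defaulting to the caller's own platform keeps the edge-side download step a one-liner while the registry carries the per-arch complexity.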

@gaocegege gaocegege added the priority/P2 Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jul 17, 2020
@hicaistar

Would it be feasible to compile the serving program together with a model into a single binary that can run on different platforms?

I do not think we should provide one unified binary; I think we should provide one binary per architecture. WDYT?

LGTM

@gaocegege
Member Author

gaocegege commented Jul 17, 2020

It may be related to kleveross/ormb#47

@gaocegege
Member Author

/assign @judgeeeeee

@xieydd

xieydd commented Aug 17, 2020

For edge devices on different platforms, there are two key points:

  • Speed is the key point.
  • Supported platforms, like iOS, Android, or NVIDIA edge devices.

There is no doubt that TVM would be the best choice, but TVM is too complicated for users and developers. We have deployed many applications on edge devices; the pipeline is:
pytorch/tensorflow -> onnx (or direct) -> mnn/ncnn/tnn (we compare the frameworks using internal tools, but an open-source benchmark such as https://github.com/AI-performance/embedded-ai.bench could also be used; the framework model is easy to convert to a binary).

Looking forward to the Klever edge deployment design.
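The pipeline above (framework model -> ONNX -> mnn/ncnn/tnn) can be expressed as two composed conversion steps. The sketch below uses pure-Python stand-ins; `export_to_onnx` and `convert_to_edge_format` are hypothetical names, not the real converter APIs (e.g. `torch.onnx.export` or the MNN converter CLI).

```python
# Pure-Python stand-ins for the conversion pipeline; names are illustrative.
def export_to_onnx(model):
    """Stand-in for a framework exporter: framework model -> ONNX graph."""
    return {"format": "onnx", "graph": model}


def convert_to_edge_format(onnx_model, backend):
    """Stand-in for an mnn/ncnn/tnn converter: ONNX -> edge runtime format."""
    assert onnx_model["format"] == "onnx", "converter expects an ONNX input"
    return {"format": backend, "graph": onnx_model["graph"]}


trained = {"layers": ["conv", "relu", "fc"]}  # pretend framework model
edge_model = convert_to_edge_format(export_to_onnx(trained), backend="mnn")
print(edge_model["format"])  # mnn
```

Keeping ONNX as the single interchange point is what lets the same trained model target any of the candidate edge backends.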
