
Add mindir format support. #1227

Open
wants to merge 11 commits into main
Conversation

Ethan-Chen-plus

馃馃憢Hi! We have added our .mindir format. Please review our commit.

@lutzroeder (Owner) commented Feb 18, 2024

@Ethan-Chen-plus can you provide some context:

  • What is mindir, what problem is it solving and for how many users?
  • Why is a new format needed given there are many similar formats?
  • Create a script in ./tools following the patterns used for other frameworks to recreate the -proto file.
  • Add a test to ./test/models.json and share an index of 50+ existing files in this format.
  • Run npm run validate

@Ethan-Chen-plus (Author)

PS I:\netron> npm run validate
eslint
pylint
test
gguf/phi-2.Q2_K.gguf
onnx/candy.onnx
keras/1151.4.keras
coreml/Exermote.mlmodel
pytorch/alexnet.ptl
pytorch/DCGAN2.pt
tf/conv-layers.pb.zip
tflite/mobilenet_v1_0.75_160_quantized.tflite
tflite/squeezenet.tflite

I have updated my code, please review.
@lutzroeder

@lutzroeder (Owner)

@Ethan-Chen-plus can you provide some context:

  • What is mindir, what problem is it solving and for how many users?
  • Why is a new format needed given there are many similar formats?
  • Please rebase to latest main.
  • The ./tools script should regenerate the files from the original .proto. When the format or generator changes, the -proto.js and -metadata.json files should automatically be regenerated from the original source by running ./tools/mindir sync schema metadata.

@Ethan-Chen-plus (Author)

@lutzroeder Thank you again; we have updated our code. Let me introduce some background information:

  • What is mindir, what problem is it solving and for how many users?
    MindIR (MindSpore IR) is a graph-based functional Intermediate Representation (IR) format developed for the MindSpore framework by Huawei. Its primary purpose is to define the logical structure of networks and the attributes of operators, facilitating the decoupling of model files from hardware platforms through a unified IR. This supports a train-once, deploy-multiple-times paradigm. A key advantage of the MindIR format is its ability to store network structures and weight parameter values, enabling deployment for inference tasks on cloud-based Serving and edge-oriented Lite platforms. The same MindIR file can be deployed across various hardware types, including Ascend AI processors, GPUs, and CPUs.

The user base for the MindIR format primarily comprises researchers, developers, and enterprises needing to deploy AI models across multiple hardware platforms. Given its compatibility with diverse hardware including Ascend AI processors, GPUs, and CPUs, MindIR is particularly well-suited for applications that require deployment both in the cloud and on edge devices. For instance, MindIR can be utilized in Huawei's Atlas 200/300/500 inference products, which are specifically designed for edge AI processing scenarios.

  • Why is a new format needed given there are many similar formats?

Compared to other similar IR formats such as ONNX and AIR, MindIR presents unique advantages and application scenarios. ONNX (Open Neural Network Exchange) is a universal format designed for the expression of machine learning models, primarily used for model transfer between different frameworks or on inference engines like TensorRT. AIR (Ascend Intermediate Representation), defined by Huawei, is an open file format tailored for machine learning to better accommodate Huawei AI processors, typically used for inference tasks on Ascend 310. MindIR, on the other hand, is a functional IR based on graph representation. It not only defines an extensible graph structure and operator IR representations but also eliminates model discrepancies across different backends. It is generally used for cross-platform inference tasks, such as executing inference on Ascend 310, GPUs, and MindSpore Lite for models trained on Ascend 910.

@Ethan-Chen-plus (Author)

@lutzroeder Hello! 👋 I noticed that eslint failed. Could you please tell me which config file eslint is using? I'll make the necessary fixes based on that config file. Thank you!

@lutzroeder lutzroeder force-pushed the main branch 4 times, most recently from 952b6ed to 2781faa Compare May 5, 2024 03:43
@Ethan-Chen-plus (Author) commented May 5, 2024

The workflow requires approval, thank you!

@lutzroeder lutzroeder force-pushed the main branch 3 times, most recently from d9a23ec to 5028ef6 Compare May 20, 2024 17:26