
Allows the user to specify a module ID to emit unique functions #1694

Open · wants to merge 7 commits into main
Conversation

@kernhanda (Contributor)

This change adds a command-line option to provide a "module id" which is then used as a suffix on all emitted symbols, thereby allowing for the linking of multiple onnx-mlir emitted object files into the same executable.

Signed-off-by: Kern Handa kern.handa@gmail.com
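
In concrete terms, a minimal sketch of the intended workflow, assuming the new flag is spelled --moduleId (the option name appears later in this thread) and with purely illustrative suffixed symbol names:

$ onnx-mlir --EmitObj --moduleId=model1 model1.onnx
$ onnx-mlir --EmitObj --moduleId=model2 model2.onnx
# The runtime hooks common to every compiled model (omQueryEntryPoints,
# omInputSignature, omOutputSignature, _entry_point_0, ...) would then
# carry the module id as a suffix, so both objects can be linked into
# one executable without symbol clashes.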

Signed-off-by: Kern Handa <kern.handa@gmail.com>
@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@AlexandreEichenberger (Collaborator)

@jenkins-droid test this please

@AlexandreEichenberger (Collaborator) left a comment

This seems to be a worthwhile addition. @gongsu832 @tungld, do you mind chiming in on whether the proposed approach makes sense?

Thanks

@tungld (Collaborator) commented Sep 15, 2022

@kernhanda Could you please elaborate on "the linking of multiple onnx-mlir emitted object files into the same executable"? Do "object files" here mean the ones produced by --EmitLib, LLVM bitcode files, or something else? It would be nice if you could give us an example of your need.

Is there any relationship between this PR and #1663? Is it that PR #1663 is not enough to use multiple compiled onnx models in the same executable?

@kernhanda (Contributor, Author) commented Sep 19, 2022

> @kernhanda Could you please elaborate on "the linking of multiple onnx-mlir emitted object files into the same executable"? Do "object files" here mean the ones produced by --EmitLib, LLVM bitcode files, or something else? It would be nice if you could give us an example of your need.
>
> Is there any relationship between this PR and #1663? Is it that PR #1663 is not enough to use multiple compiled onnx models in the same executable?

PR #1663 is not sufficient to use multiple compiled models in the same executable.

Say you have two models, model1.onnx and model2.onnx. With just the change in #1663, you might try to do the following:

# ~/code/workspace/data/models/test1 on git:7dfbbcc o [13:45:20]
$ ~/code/onnx-mlir/build/Debug/bin/onnx-mlir --EmitObj --graphName=model1 model1.onnx

# ~/code/workspace/data/models/test1 on git:7dfbbcc o [13:45:53]
$ ~/code/onnx-mlir/build/Debug/bin/onnx-mlir --EmitObj --graphName=model2 model2.onnx

# ~/code/workspace/data/models/test1 on git:7dfbbcc o [13:46:11] C:2
$ objdump --demangle --syms model1.o

model1.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 LLVMDialectModule
0000000000000000 l    d  .text  0000000000000000 .text
0000000000000000 l       .rodata.cst4   0000000000000000 .LCPI0_0
0000000000000000 l     O .data.rel.ro   0000000000000010 _entry_point_arrays
0000000000000000 l    d  .data.rel.ro   0000000000000000 .data.rel.ro
0000000000000000 g     F .text  000000000000030c model1
0000000000000000         *UND*  0000000000000000 malloc
0000000000000000         *UND*  0000000000000000 expf
0000000000000000         *UND*  0000000000000000 free
0000000000000310 g     F .text  0000000000000083 _mlir_ciface_model1
00000000000003a0 g     F .text  0000000000000196 run_model1
0000000000000000         *UND*  0000000000000000 omTensorListGetOmtArray
0000000000000000         *UND*  0000000000000000 omTensorGetDataPtr
0000000000000000         *UND*  0000000000000000 omTensorGetShape
0000000000000000         *UND*  0000000000000000 omTensorGetStrides
0000000000000000         *UND*  0000000000000000 omTensorCreateUntyped
0000000000000000         *UND*  0000000000000000 omTensorSetDataPtr
0000000000000000         *UND*  0000000000000000 omTensorSetDataType
0000000000000000         *UND*  0000000000000000 omTensorListCreateWithOwnership
0000000000000540 g     F .text  000000000000001f omQueryEntryPoints
0000000000000560 g     F .text  0000000000000024 omInputSignature
0000000000000000 g     O .rodata        000000000000000b _entry_point_0
0000000000000000         *UND*  0000000000000000 strncmp
0000000000000010 g     O .rodata        0000000000000045 _entry_point_0_in_sig
0000000000000590 g     F .text  0000000000000024 omOutputSignature
0000000000000060 g     O .rodata        0000000000000045 _entry_point_0_out_sig


# ~/code/workspace/data/models/test1 on git:7dfbbcc o [13:48:15] C:1
$ cat main.cpp
#include <stddef.h>

extern "C" {
        void* run_model1(void*);
        void* run_model2(void*);
}

int main() {
        run_model1(NULL);
        run_model2(NULL);

        return 0;
}

# ~/code/workspace/data/models/test1 on git:7dfbbcc o [13:48:14]
$ gcc -o test1_exe main.cpp model1.o model2.o  ~/code/onnx-mlir/build/Debug/lib/libcruntime.a -lm
/usr/bin/ld: model2.o: in function `omQueryEntryPoints':
LLVMDialectModule:(.text+0x370): multiple definition of `omQueryEntryPoints'; model1.o:LLVMDialectModule:(.text+0x540): first defined here
/usr/bin/ld: model2.o: in function `omInputSignature':
LLVMDialectModule:(.text+0x390): multiple definition of `omInputSignature'; model1.o:LLVMDialectModule:(.text+0x560): first defined here
/usr/bin/ld: model2.o:(.rodata+0x0): multiple definition of `_entry_point_0'; model1.o:(.rodata+0x0): first defined here
/usr/bin/ld: model2.o:(.rodata+0x10): multiple definition of `_entry_point_0_in_sig'; model1.o:(.rodata+0x10): first defined here
/usr/bin/ld: model2.o: in function `omOutputSignature':
LLVMDialectModule:(.text+0x3c0): multiple definition of `omOutputSignature'; model1.o:LLVMDialectModule:(.text+0x590): first defined here
/usr/bin/ld: model2.o:(.rodata+0x80): multiple definition of `_entry_point_0_out_sig'; model1.o:(.rodata+0x60): first defined here
collect2: error: ld returned 1 exit status

As you can see, duplicate symbols are emitted that prevent the two compiled models from being linked together. This is the motivation behind this change.
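
For contrast, a hedged sketch of the same session once each model is compiled with a distinct module id (flag spelling assumed from this PR; the suffixing behavior is as described in the summary above):

$ ~/code/onnx-mlir/build/Debug/bin/onnx-mlir --EmitObj --graphName=model1 --moduleId=model1 model1.onnx
$ ~/code/onnx-mlir/build/Debug/bin/onnx-mlir --EmitObj --graphName=model2 --moduleId=model2 model2.onnx
$ gcc -o test1_exe main.cpp model1.o model2.o ~/code/onnx-mlir/build/Debug/lib/libcruntime.a -lm
# Expected to link cleanly: omQueryEntryPoints, omInputSignature,
# omOutputSignature, and the _entry_point_* symbols are now distinct
# per object file.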

Signed-off-by: Kern Handa <kern.handa@gmail.com>
@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

Signed-off-by: Kern Handa <kern.handa@gmail.com>
@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@kernhanda (Contributor, Author)

@jenkins-droid test this please

@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@tungld (Collaborator) commented Sep 21, 2022

@jenkins-droid test this please

@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@tungld (Collaborator) commented Sep 21, 2022

@jenkins-droid test this please

@AlexandreEichenberger (Collaborator) commented Sep 21, 2022

@kernhanda
For this feature to be useful, it needs to be documented. Right now we have a few doc examples (add and mnist); I feel it might be time to have a page that gathers a collection of advice/help for certain scenarios.

Could you add a new md page, maybe docs/DeployingModelHelp.md or OnnxMlirUsageScenario.md, and provide a section on how to accommodate two models? You don't need working models, e.g. "say you have a first model X and a second model Y" that you want to combine in a binary (maybe reuse your C example there):

extern "C" {
        void* run_model1(void*);
        void* run_model2(void*);
}

int main() {
        run_model1(NULL);
        run_model2(NULL);

        return 0;
}

and list the options needed to make onnx-mlir handle this properly.

Eventually, we will add additional examples on that page.

Does that make sense?

@tungld (Collaborator) commented Sep 22, 2022

@kernhanda Thanks for the explanation! It makes sense. In addition to Alex's comments, could you add the following lit tests:

  • A test for the newly added moduleId option with the onnx-mlir command.
  • A test for krnl-to-llvm to see how entry points and signatures are generated in the presence of the onnx-mlir.moduleId attribute. (A quick manual check of the resulting symbols is sketched after this comment.)

Many thanks!
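
As a quick manual complement to those lit tests, the suffixing could be checked with the same objdump approach used earlier in the thread (the exact suffixed spelling, e.g. omQueryEntryPoints_model1, is an assumption about this PR's naming scheme):

$ onnx-mlir --EmitObj --moduleId=model1 model1.onnx
$ objdump --demangle --syms model1.o | grep omQueryEntryPoints
# Expect the suffixed name (e.g. omQueryEntryPoints_model1) rather than
# the bare omQueryEntryPoints shown in the earlier symbol table.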

@AlexandreEichenberger (Collaborator)

@kernhanda let us know when the suggestions above are addressed, and we will review the PR one more time. Thanks!
