
[Bug] How do you handle graph breaks coming from Dynamo? #431

Closed
tbaggu opened this issue Feb 23, 2024 · 6 comments
Labels: bug (Something isn't working)

Comments

@tbaggu

tbaggu commented Feb 23, 2024

Hi,

I am using this repo as a reference to implement a custom backend. During development, when I use Hugging Face models directly, I see a lot of graph breaks in the FX graph.

My understanding of the Inductor side is that each subgraph is compiled and run, its results are sent back to the CPU, and only then does the next subgraph start executing, which is time consuming. Is that correct?

So my question is: have you seen such graph breaks, and if so, how does Hidet handle them?

Similar to the case below:
https://discuss.pytorch.org/t/stitching-together-graph-breaks-for-large-compilation-units/194793/5
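
For reference, a minimal sketch of how the breaks can be counted (assuming a recent PyTorch where torch._dynamo.explain is available; its exact output format has changed across versions):

import torch

def f(x):
    y = x.sin()
    print("side effect")  # an unsupported call like print() forces a graph break
    return y.cos()

# explain() runs Dynamo tracing and reports the subgraphs and break reasons
explanation = torch._dynamo.explain(f)(torch.randn(4))
print(explanation)  # includes the graph break count and a reason for each break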

@tbaggu added the bug (Something isn't working) label Feb 23, 2024
@yaoyaoding
Member

Hi @tbaggu,

Dynamo custom backends cannot control how the torch model is partitioned and converted into FX graphs.

Torch Dynamo dispatches each FX graph to the custom backend; the backend compiles the FX graph into an executable and returns it to Dynamo. The compilation happens only once, and the compiled executable is reused many times. As long as the compiled executable is efficient, the overhead will not be very large.
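
To make that contract concrete, a minimal sketch of a callable backend (model here stands in for your own nn.Module):

import torch

# A Dynamo backend is a callable that receives one fx.GraphModule per subgraph.
# It runs once per subgraph; the callable it returns is cached by Dynamo
# and reused on later calls with compatible inputs.
def my_backend(gm: torch.fx.GraphModule, example_inputs):
    print(f"compiling a subgraph with {len(gm.graph.nodes)} nodes")
    return gm.forward  # placeholder: execute the subgraph eagerly

model_opt = torch.compile(model, backend=my_backend)  # model: your nn.Module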

@tbaggu
Author

tbaggu commented Feb 23, 2024 via email

@yaoyaoding
Member

Yes.

model_opt = torch.compile(model, backend='custom-backend')

model_opt(x)  # subgraphs are compiled and executed
model_opt(x)  # the cached compiled executables are used; no compilation at all

@tbaggu
Author

tbaggu commented Feb 23, 2024 via email

@yaoyaoding
Member

> for each subgraph result should comeback to CPU

No, the result stays on the same device (CPU or GPU) as the eager execution of the original model.
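
A quick sketch to verify this (assumes a CUDA device is available; the default inductor backend is used only as an example):

import torch

model = torch.nn.Linear(8, 8).cuda()
model_opt = torch.compile(model)

x = torch.randn(4, 8, device='cuda')
out = model_opt(x)
print(out.device)  # cuda:0 -- the output stays on the model's device, not the CPU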

@wangshangsam
Collaborator

Closing as this issue is not directly related to Hidet.

@wangshangsam closed this as not planned Feb 27, 2024