[Dynamo] make bytecode of resume function resemble natural bytecode #126630

youkaichao · 2024-05-18T23:49:21Z

cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng

pytorch-bot · 2024-05-18T23:49:24Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126630


✅ You can merge normally! (3 Unrelated Failures)

As of commit b7ad5f3 with merge base a8195f2:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

inductor-periodic / cuda12.1-py3.10-gcc9-sm86-periodic-dynamo-benchmarks / test (aot_eager_timm, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (matched linux rule in flaky-rules.json)
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
inductor-periodic / cuda12.1-py3.10-gcc9-sm86-periodic-dynamo-benchmarks / test (dynamic_aot_eager_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (similar failure)
Error response from daemon: a prune operation is already running

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu, unstable) (gh) (#126884)
eca_halonext26ts

This comment was automatically generated by Dr. CI and updates every 15 minutes.

youkaichao · 2024-05-18T23:53:48Z

When we deal with natural bytecode (Python bytecode that the Python compiler generates from source code), we can observe that co_freevars appears to be sorted by name, regardless of the order in which the variables are declared:

def helper_outer_function():
    data_ptr = None
    last_dim_size = None
    last_two_dims_size = None
    shape = None
    stride = None
    qkv_format = None
    def dummy1():
        nonlocal data_ptr, last_dim_size, last_two_dims_size, shape, stride, qkv_format
        # Access variables in the order they were declared
        _ = data_ptr
        _ = last_dim_size
        _ = last_two_dims_size
        _ = shape
        _ = stride
        _ = qkv_format
        print(data_ptr)
        print(last_dim_size)
        print(last_two_dims_size)
        print(shape)
        print(stride)
        print(qkv_format)
    dummy1()

print(helper_outer_function.__code__.co_consts[1].co_freevars)

The output is:

('data_ptr',
 'last_dim_size',
 'last_two_dims_size',
 'qkv_format',
 'shape',
 'stride')

Note that the declaration order is data_ptr, last_dim_size, last_two_dims_size, shape, stride, qkv_format, but Python stores them sorted by name.

If Dynamo-generated bytecode does not obey this rule, we cannot write source code that compiles to the same bytecode Dynamo produces, which makes it impossible to understand Dynamo bytecode by decompiling it into source code.
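A smaller check of the same observation, as a sketch: the function and variable names below are made up for illustration, and the sorted order is what I would expect on current CPython versions rather than a documented guarantee.

```python
def make_inner():
    # Closure variables deliberately declared out of alphabetical order.
    stride = 1
    data_ptr = 2
    shape = 3

    def inner():
        # Reference them in declaration order as well.
        return stride, data_ptr, shape

    return inner


inner = make_inner()
freevars = inner.__code__.co_freevars
print(freevars)

# True when the compiler stored the names in sorted order.
is_sorted = freevars == tuple(sorted(freevars))
print(is_sorted)
```

A decompiler can only reproduce a code object whose co_freevars passes this kind of check, since recompiling source always yields the sorted layout.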

youkaichao · 2024-05-18T23:58:49Z

Note: we require the new bytecode to have exactly the same co_freevars as the old one, so that constructing the new frame is faster. I proposed this in #115062.
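A plain-Python analogy for why a matching co_freevars matters (this is not Dynamo's frame-construction code, just a sketch with made-up names): closure cells are paired with co_freevars by position, so code with the same free-variable layout can reuse the existing cells unchanged.

```python
import types


def outer():
    a, b = 1, 2

    def f():
        return a + b

    return f


f = outer()

# Stand-in for "new bytecode": same code, different name, same co_freevars.
new_code = f.__code__.replace(co_name="resumed_f")
same_layout = new_code.co_freevars == f.__code__.co_freevars

# Because the layout matches, the existing closure cells can be reused as-is;
# cell i is paired with co_freevars[i].
g = types.FunctionType(new_code, f.__globals__, "resumed_f", None, f.__closure__)
result = g()

# A closure of the wrong length is rejected outright.
try:
    types.FunctionType(new_code, f.__globals__, "bad", None, f.__closure__[:1])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
```

If the new code object listed its free variables in a different order, the cells would have to be remapped before they could be reused.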

youkaichao · 2024-05-19T00:05:46Z

reference:

https://github.com/python/cpython/blob/caf6064a1bc15ac344afd78b780188e60b9c628e/Python/compile.c#L530-L534

/* Sort the keys so that we have a deterministic order on the indexes
   saved in the returned dictionary.  These indexes are used as indexes
   into the free and cell var storage.  Therefore if they aren't
   deterministic, then the generated bytecode is not deterministic.
*/

So the indexes into the free and cell variable storage follow sorted name order.
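In Python terms, the effect of that sort can be sketched roughly as follows. This is a simplified paraphrase, not a port of the C function: assign_indexes and its offset parameter are hypothetical names, and the filtering by scope that the real compiler performs is left out.

```python
def assign_indexes(names, offset=0):
    """Map each name to a storage index, in sorted-name order.

    Sorting first makes the mapping independent of the order in which
    the names were collected, which is what keeps the bytecode deterministic.
    """
    return {name: offset + i for i, name in enumerate(sorted(names))}


# The same set of names always yields the same indexes.
first = assign_indexes(["stride", "data_ptr", "shape"])
second = assign_indexes(["shape", "stride", "data_ptr"])
print(first)
print(first == second)
```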

youkaichao · 2024-05-19T00:18:55Z

TODO: figure out how to add a test for this. We need to generate a resume function with freevars.

ezyang · 2024-05-21T00:23:48Z

Trying @williamwen42 as reviewer, but this looks pretty harmless, shout if it gets lost

williamwen42

LGTM - just add a test

youkaichao · 2024-05-21T18:30:34Z

@williamwen42 thanks for the reply. Can you give me some guidance on how to add tests? In particular, I don't know how to manually instruct Dynamo to generate a resume function as I want.

williamwen42 · 2024-05-21T20:35:35Z

You can compile a function with a graph break and then search globals/locals for the resume function:

import torch

def fn(x):
    x = x + 1
    torch._dynamo.graph_break()
    x = torch.sin(x)
    return x

opt_fn = torch.compile(fn, backend="eager")
opt_fn(torch.randn(10))

for k, v in list(globals().items()):
    if k.startswith("__resume_at"):
        print(k)
        print(v)

youkaichao · 2024-05-21T20:47:35Z

This kind of resume function does not have freevars:

__resume_at_16_3.__code__.co_freevars == ()

williamwen42 · 2024-05-21T20:49:05Z

You can use a different function that has freevars - compile a function with a closure?

youkaichao · 2024-05-21T20:53:19Z

Still no freevars:

import torch

def fn(x):
    x = x + 1
    @torch.compile(backend="eager")
    def inner(x):
        x = x + 1
        torch._dynamo.graph_break()
        x = x * 2
        return x
    y = inner(torch.sin(x))
    return y

fn(torch.randn(10))

for k, v in list(globals().items()):
    if k.startswith("__resume_at"):
        print(k)
        print(v)

youkaichao · 2024-05-21T20:54:48Z

When does a resume function contain free variables?

williamwen42 · 2024-05-21T21:24:24Z

Looks like resume functions with freevars are not exposed to the global scope, so we'll need to add something like

            # expose code object for debugging purposes
            self.output.install_global_unsafe(name, new_code)

before the cg.make_function_with_closure(name, new_code, True, stack_len) line in def create_call_resume_at (symbolic_convert.py).

Then a function like this should work:

import torch

def create():
    cl = 1
    def fn(x):
        x = x + 1
        torch._dynamo.graph_break()
        x = x + cl
        return x
    return fn

fn = create()
opt_fn = torch.compile(fn, backend="eager")
print(opt_fn(torch.randn(10)))

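# With the change above, each matching global holds the resume function's
# code object, so co_freevars and co_cellvars can be read from it directly.
for k, v in list(globals().items()):
    if k.startswith("__resume_at"):
        print(k)
        print(v)
        print(v.co_freevars)
        print(v.co_cellvars)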
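A test could then assert on those code objects instead of printing them. Below is a rough sketch of the checking part only; resume_code_objects and freevars_are_sorted are hypothetical helpers, not necessarily what the test in this PR looks like, and a hand-built dict stands in for the real globals so the snippet runs without torch.

```python
import types


def resume_code_objects(namespace):
    """Collect code objects stored under names starting with "__resume_at"."""
    found = {}
    for name, value in list(namespace.items()):
        if not name.startswith("__resume_at"):
            continue
        # The global may hold a function or a raw code object.
        code = getattr(value, "__code__", value)
        if isinstance(code, types.CodeType):
            found[name] = code
    return found


def freevars_are_sorted(code):
    return code.co_freevars == tuple(sorted(code.co_freevars))


def _make_closure():
    # Stand-in for a resume function that closes over two variables.
    cl = 1
    x0 = 2

    def fn():
        return cl + x0

    return fn


fake_globals = {"__resume_at_demo": _make_closure().__code__, "other": 1}
codes = resume_code_objects(fake_globals)
all_sorted = all(freevars_are_sorted(c) for c in codes.values())
print(sorted(codes), all_sorted)
```

In a real test, the namespace would be the globals of the compiled function after calling opt_fn, and the assertion would be that all_sorted holds for every resume function found.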

youkaichao · 2024-05-21T22:53:00Z

@williamwen42 thanks for the guidance! Do you know whether the test failures on this commit are related to the changes in this PR?

williamwen42 · 2024-05-21T22:55:15Z

They don't look related, but we'll see what CI shows.

youkaichao · 2024-05-22T19:12:07Z

@williamwen42 can you take a look at whether the CI test failures are related?

williamwen42 · 2024-05-22T20:06:16Z

They don't look related - we're having some issues with CI atm.

youkaichao · 2024-05-22T20:26:38Z

@pytorchbot merge

pytorchmergebot · 2024-05-22T20:28:32Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot · 2024-05-22T20:28:52Z

Merge failed

Reason: 15 jobs have failed, first few of them are: pull / linux-focal-py3.8-clang10-onnx / test (default, 2, 2, linux.2xlarge), inductor-periodic / cuda12.1-py3.10-gcc9-sm86-periodic-dynamo-benchmarks / test (dynamo_eager_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / cuda12.1-py3.10-gcc9-sm86-periodic-dynamo-benchmarks / test (aot_eager_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / cuda12.1-py3.10-gcc9-sm86-periodic-dynamo-benchmarks / test (dynamic_aot_eager_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu), inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor, 1, 1, linux.g5.4xlarge.nvidia.gpu, unstable)


youkaichao · 2024-05-22T21:37:28Z

@williamwen42 then how can we merge this? Do we need to wait until the CI team fixes the issue?

huydhn · 2024-05-22T23:41:18Z

@pytorchbot merge -r

pytorchmergebot · 2024-05-22T23:43:08Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2024-05-22T23:43:12Z

Successfully rebased youkaichao-patch-1 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout youkaichao-patch-1 && git pull --rebase)

pytorchmergebot · 2024-05-22T23:44:26Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorch-bot bot added ciflow/inductor module: dynamo labels May 18, 2024

pytorchbot added the open source label May 19, 2024

youkaichao mentioned this pull request May 19, 2024

[Bug]: depyf.prepare_debug causes torch.compile crash with free var mismatch error thuml/depyf#28

Closed

drisspg added the oncall: pt2 label May 20, 2024

ezyang requested a review from williamwen42 May 21, 2024 00:23

mikaylagawarecki added the triaged label May 21, 2024

williamwen42 approved these changes May 21, 2024


pytorch-bot bot added the ciflow/trunk label May 22, 2024

youkaichao added the topic: not user facing label May 22, 2024

pytorchmergebot added the merging label May 22, 2024

pytorchmergebot removed the merging label May 22, 2024

youkaichao added 5 commits May 22, 2024 23:43

Update resume_execution.py

0418088

update sort tuple

94fc626

expose code object for debugging purposes

22d7a4e

add tests for resume functions

6d196fd

fix lint

b7ad5f3

pytorchmergebot force-pushed the youkaichao-patch-1 branch from 50903e5 to b7ad5f3 Compare May 22, 2024 23:43

pytorchmergebot added the merging label May 22, 2024

pytorchmergebot added the Merged label May 23, 2024

pytorchmergebot closed this in 36e7057 May 23, 2024

pytorchmergebot removed the merging label May 23, 2024

youkaichao deleted the youkaichao-patch-1 branch May 23, 2024 05:09
