Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When running many iterations, they get progressively slower until one gets stuck #5316

Open
hmijail opened this issue Apr 11, 2024 · 6 comments
Labels
during 1: program development Bad error message or documentation; IDE bug; crash compiling invalid program kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label priority: not yet Will reconsider working on this when we're looking for work

Comments

@hmijail
Copy link

hmijail commented Apr 11, 2024

Dafny version

4.6

Code to produce this issue

I'm using the file src/dafny/core/precompiled.dfy from https://github.com/ConsenSys/evm-dafny/blob/measuring/src/dafny/core/precompiled.dfy

Command to run and resulting output

dafny measure-complexity --log-format csv --isolate-assertions --iterations 10000 src/dafny/core/precompiled.dfy

What happened?

When running the given command in my MacBook Pro, the first ~200 iterations take about 2 seconds each; a dafny process uses ~200% CPU, and 4 z3 processes take 50%CPU each.
But from around iteration ~200 to ~250 the dafny process' CPU usage increases noticeably, the z3 processes' CPU usage decreases, and iterations turn slower.
Around iteration ~250, they take over 1 minute to run, and eventually one of them gets stuck and doesn't finish even after hours of waiting.
RAM usage doesn't seem to grow significantly, so it doesn't look like a memory leak.

If I run the given command without --isolate-assertions, the same evolution happens, but it takes about 2600 iterations to reach the point where one iteration gets stuck.

What type of operating system are you experiencing the problem on?

Mac

@hmijail hmijail added the kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label label Apr 11, 2024
@hmijail
Copy link
Author

hmijail commented Apr 12, 2024

I tried running smaller batches of 100 iterations each to avoid this problem, and turns out that even then eventually one iteration also gets stuck, with no CPU activity.

Can I do anything to help debug this? There are no error messages in this case nor the previous one, so would e.g. a stack sample from macOS' activity monitor help?

@keyboardDrummer
Copy link
Member

I tried running smaller batches of 100 iterations each to avoid this problem, and turns out that even then eventually one iteration also gets stuck, with no CPU activity.

Can it be reduced to a batch size of 1 and still get stuck? That would provide a good reproduction case.

If we have a reproduction that doesn't take too long to run, then we can run the problem with a profiler attached.

@hmijail
Copy link
Author

hmijail commented Apr 12, 2024

Ugh, I didn't think of that. But no, just now I repeated the same command that got stuck and now it ran to the end. Seems to be random.

It was this line:

dafny measure-complexity --random-seed 1712868753 --iterations 100 --log-format json;LogFileName=TestResults/20240412-065233_dafny46srcdafnycoreprecompiled.dfyrs1712868753i100.json src/dafny/core/precompiled.dfy

... which originally got stuck at iteration 61 (with random seed 1307108582)

@atomb
Copy link
Member

atomb commented Apr 12, 2024

Ugh, I didn't think of that. But no, just now I repeated the same command that got stuck and now it ran to the end. Seems to be random.

I don't know if this is what's going on, but we have previously had issues detecting when Z3 has crashed. I put in a fix that works under at least certain circumstances, but it's possible that there are other, similar situations that it doesn't handle well. I have seen cases where the minor changes introduced by measure-complexity are enough to make Z3 crash.

Normally, after you've been running many iterations like that, Dafny should have started as many Z3 processes as the value specified in argument to --cores. If you notice fewer running when it hangs, that might be an indication that one of them has crashed.

@hmijail
Copy link
Author

hmijail commented Apr 15, 2024

If you notice fewer running when it hangs, that might be an indication that one of them has crashed.

Actually what I'm seeing when the slow-down happens is that, even though there's 4 z3 processes always running (which is my --cores default), many do little more than appear, hang around a bit accumulating less than 1 sec of CPU time, disappear and be substituted for a new one.
While this happens, the dafny process is using over 200% CPU, and typically the kernel_task uses about the same. Maybe a lot of IPC is happening?

@keyboardDrummer
Copy link
Member

Next step should be to add the option of doing some performance related logging. Would be good to report how long Dafny spends on preparing verification (translation to Boogie and preprocessing Boogie), before getting to actually verify something.

@keyboardDrummer keyboardDrummer added during 1: program development Bad error message or documentation; IDE bug; crash compiling invalid program priority: not yet Will reconsider working on this when we're looking for work labels Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
during 1: program development Bad error message or documentation; IDE bug; crash compiling invalid program kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label priority: not yet Will reconsider working on this when we're looking for work
Projects
None yet
Development

No branches or pull requests

3 participants