Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling GenerateConsoleCtrlEvent in process_terminate creates zombie #19

Open
ghost opened this issue Jul 14, 2019 · 14 comments
Open

Calling GenerateConsoleCtrlEvent in process_terminate creates zombie #19

ghost opened this issue Jul 14, 2019 · 14 comments

Comments

@ghost
Copy link

ghost commented Jul 14, 2019

This is likely related to an old terminal issue, reported on GH here:

My setup is Windows 10, Visual Studio 2019 (16.1.5) and problem can be observed when running background.cpp reproc++ sample under VS. The observable results is that console window cannot be closed (Visual Studio Debugger Console cannot be closed if there are any child processes left, and the zombie process can't be stopped, can only be forcefully killed).

For me this doesn't repro in cmd.exe but does reproduce in Visual Studio Debugger Console as well as cmd replacements like @malxau's yori. I suspect that this is caused by a race between process->running set to false (i.e. child process terminating by itself) and platform specific process_terminate() being called (which in turn calls GenerateConsoleCtrlEvent()). In regular terminal I get lucky, in debugger console and yori I don't.

I tried a simple workaround by calling FreeConsole()/AttachConsole(child)/SetConsoleCtrlHandler(NULL, TRUE) before the GenerateConsoleCtrlEvent() call but that makes it impossible for the parent process to reattach to its console later on since the first FreeConsole() drops console's refcount to 0 an causes its release.

A dummy process would have to be spawned to hold to the parent console while it generates CTRL_BREAK_EVENT on the child but I haven't tried that yet and I'm not 100% convinced this is a correct solution to this problem.

@DaanDeMeyer
Copy link
Owner

I don't immediately see how the race condition you describe would work. running is only changed in reproc_wait which is distinct from reproc_terminate.

If microsoft/terminal#335 is the problem (which looks very plausible to me), ideally GenerateConsoleCtrlEvent would be fixed instead of working around the problem in reproc. I'd prefer to not start spawning dummy processes to keep reproc as simple as possible. I'd love to look into fixing GenerateConsoleCtrlEvent myself but unfortunately I'm swamped with other work for now.

I'm also open to other simple ways of implementing reproc_terminate on Windows but as far as I remember I didn't find any other satisfactory way of implementing this when I last looked into it.

@ghost
Copy link
Author

ghost commented Jul 17, 2019

You're obviously right, process->running is only cleared in wait. I assumed there's a timing issue causing it to be called when running in cmd but not in VS debugger console but debugged some more and it's not the case. So I have no hypothesis as to why there's a difference in behavior. Consequently GenerateConsoleCtrlEvent bug may or may not be the root cause.

The behavior remains though - in certain situations background.cpp sample creates zombie process that causes console host to hang and become unresponsive to close button. :S It seems that the simplest workaround (at least the simplest I've found so far) is to call process::stop with cleanup::wait as the first cleanup strategy. Whether this is accidental or a proper workaround, I don't know yet. As it stands cleanup::terminate is unfortunately unreliable on Windows.

I'll keep looking into it. I want to use reproc in my project but this prevents me from doing so.

@DaanDeMeyer
Copy link
Owner

Have you looked what happens when using cleanup::terminate?

@malxau
Copy link

malxau commented Jul 17, 2019

Dominik, note one difference between Yori and CMD is that Yori will call GenerateConsoleCtrlEvent, wait for 50ms to see if processes die, then call TerminateProcess. The background sample appears to just launch a child and wait, so if it's not processing console events in a timely fashion, I'd expect Yori to kill it and leave behind its child process. Is this what you're seeing? reproc doesn't look to use SetConsoleCtrlHandler anywhere, so it's depending on default Ctrl+C handling being enabled when it was launched; I tried to do this, but it's another way to end up in this state.

I don't know what the VSDebugger is doing, but leaving Ctrl+C handling off or forcibly terminating both seem plausible.

@DaanDeMeyer
Copy link
Owner

I'm wondering why the execution environment makes a difference in the behaviour. Theoretically, the same reproc code should be executed regardless of the execution environment. Does the execution environment (cmd, yori, vs debugger) affect the behaviour of Windows API calls in any way (specifically those related to child processes)?

@malxau
Copy link

malxau commented Jul 18, 2019

Yes, it can. The case I was referring to above is that SetConsoleCtrlHandler defines the handling of Ctrl+C, and that behavior is propagated to child processes. They can call the API to change it, but doing nothing ends up getting behavior defined by their environment. The handling of Ctrl+C by the console also changes depending on the current console input mode, which is configured by SetConsoleMode, which is also propagated to child processes. Worse, the console is a shared resource, so a child can modify the environment of its parent.

In theory everything I just said is specifically for Ctrl+C, not Ctrl+Break. And one goofy thing Yori does as a result is observe Ctrl+C, and generate Ctrl+Break.

I think I'll need to compile some things and try this out myself. The only thing that's bothering me is although I can have screwed it up, I can't have screwed up the VS debugger too.

@DaanDeMeyer
Copy link
Owner

reproc only generates Ctrl+Break so as far as I understand it, only the child process should be able to ignore it by calling SetConsoleCtrlHandler. @ddalek, Does cleanup::terminate actually stop the child process? If that's the case, we're sure the default handler is called and the problem might be somewhere else.

Could it maybe have something to do with handle inheritance? Although I specifically took care to not leak handles to child processes in reproc.

@malxau
Copy link

malxau commented Jul 20, 2019

@ddalek were you launching a GUI process by any chance? (link /dump can show the subsystem from the PE header.)

@ghost
Copy link
Author

ghost commented Jul 21, 2019

All processes are console ones (3 subsystem windows CUI as reported by dumpbin).

I looked at the state of various processes when running the sample app and it looks very much like what's described in terminal bugs.

Console application with background.cpp sample code spawns testchild process (both are console subsystem). Child process has PID 10556

https://i.imgur.com/gx6EpV0.png

Child gets closed via GenerateConsoleCtrlEvent() but conhost still holds a handle to that process (overlapping console window at the top shows the end of the console output; VS console adds pause at the end of debug run and Press any key prompt is clearly not there).

https://i.imgur.com/Zryb6Db.png

If I forcibly close this handle in conhost process, VS debug console runs to completion.

https://i.imgur.com/7yFcSHN.png

So unfortunately it is a problem with conhost. I'm still not sure why this bug does not reproduce in vanilla console but does under VSDC and yori.

@malxau
Copy link

malxau commented Jul 23, 2019

I'm sure you're right that it's a conhost bug, and you're probably right that it's this conhost bug, but I'm just not seeing how the conditions are satisfied right now.

If:

  • reproc didn't use CREATE_NEW_PROCESS_GROUP; or
  • the process wasn't a console process; or
  • the process was launched on a different console (eg. CREATE_NEW_CONSOLE)

then this bug would mean GenerateConsoleCtrlEvent would try to associate the child with the console reproc is on. In the latter two cases, this is incorrect and fouls everything up. But if those three conditions are false, it should already be on this console, part of a process group, and the "normal" case should exist.

It's likely that I'm just missing something here, and that there's another condition that gives rise to this bug or for some other reason one of the above conditions becomes true, but if that's the case, I'm not currently understanding why.

@ghost
Copy link
Author

ghost commented Jul 23, 2019

Ah, I see what you're saying. I'll tackle that overcomplicated build setup and try debugging conhost when I have the time then.

@malxau
Copy link

malxau commented Jul 23, 2019

Building it is straightforward; AFAIK it's just:

msbuild /p:Configuration={Debug,Release} /p:Platform=x64 <path to sln file>

This will spit out conhost, even though it will error out trying to generate Cascadia.

Unfortunately when I did this my windbg couldn't make sense of the symbol file; I think I need to upgrade windbg to understand the pdb format it's currently using.

@DaanDeMeyer
Copy link
Owner

According to this Stackoverflow answer, Python uses the same mechanisms that reproc does for stopping processes on Windows. I looked around and unfortunately CTRL+Break seems to be the only way of gracefully stopping console processes on Windows.

I don't think there's anything I can do in reproc to work around this bug except recommending to use reproc_kill instead of reproc_terminate, but that has obvious issues as well in that cleanup cannot be performed by the child process.

@Tarquiscani
Copy link

I experienced the same problem with Visual Studio Debugger Console. A simple workaround is to force Visual Studio to use the vanilla cmd.exe as a debugging console, by changing the following setting:

Tools > Options > Debugging > General > Automatically close the console when debugging stops

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants