Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dorico's VSTAudioEngine hangs in config watcher cleanup #208

Closed
steinulf opened this issue Jan 12, 2024 · 9 comments
Closed

Dorico's VSTAudioEngine hangs in config watcher cleanup #208

steinulf opened this issue Jan 12, 2024 · 9 comments
Labels
asiohost Relates to a specific ASIO Host Application.
Milestone

Comments

@steinulf
Copy link

We have a Dorico customer who uses FlexASIO in the VSTAudioEngine, and with him the audio engine frequently freezes in, because FlexASIO hangs in the main thread and somehow waits for some other thread to join.
The callstack looks like this then:

ntdll.dll!NtWaitForSingleObject�()	Unknown	Non-user code. Symbols loaded.
KERNELBASE.dll!WaitForSingleObjectEx�()	Unknown	Non-user code. Symbols loaded.
msvcp140.dll!_Thrd_join(_Thrd_t thr, int * code=0x0000000000000000) Line 56	C++	Non-user code. Symbols loaded.
[Inline Frame] FlexASIO.dll!std::thread::join() Line 130	C++	Symbols loaded.
FlexASIO.dll!flexasio::ConfigLoader::Watcher::~Watcher() Line 181	C++	Symbols loaded.
FlexASIO.dll!std::_Destroy_in_place<flexasio::FlexASIO::PreparedState>(flexasio::FlexASIO::PreparedState & _Obj) Line 298	C++	Non-user code. Symbols loaded.
[Inline Frame] FlexASIO.dll!std::_Optional_destruct_base<flexasio::FlexASIO::PreparedState,0>::reset() Line 110	C++	Symbols loaded.
FlexASIO.dll!flexasio::FlexASIO::DisposeBuffers() Line 798	C++	Symbols loaded.
[Inline Frame] FlexASIO.dll!flexasio::`anonymous-namespace'::CFlexASIO::EnterWithMethod::__l2::<lambda>() Line 186	C++	Symbols loaded.
[Inline Frame] FlexASIO.dll!flexasio::`anonymous-namespace'::CFlexASIO::EnterInitialized::__l2::<lambda>() Line 181	C++	Symbols loaded.
FlexASIO.dll!flexasio::`anonymous namespace'::CFlexASIO::Enter<void <lambda>(void)>(std::basic_string_view<char,std::char_traits<char>> context={...}, flexasio::`anonymous-namespace'::CFlexASIO::EnterInitialized::__l2::void <lambda>(void) functor=void <lambda>(void){...}) Line 153	C++	Symbols loaded.
FlexASIO.dll!flexasio::`anonymous namespace'::CFlexASIO::EnterInitialized<void <lambda>(void)>(std::basic_string_view<char,std::char_traits<char>> context, flexasio::`anonymous-namespace'::CFlexASIO::EnterWithMethod::__l2::void <lambda>(void) functor) Line 183	C++	Symbols loaded.
FlexASIO.dll!flexasio::`anonymous namespace'::CFlexASIO::EnterWithMethod<void (__cdecl flexasio::FlexASIO::*)(void)>(std::basic_string_view<char,std::char_traits<char>> context, void(flexasio::FlexASIO::*)() method=0x00007ffe64e42360) Line 187	C++	Symbols loaded.
FlexASIO.dll!flexasio::`anonymous namespace'::CFlexASIO::disposeBuffers() Line 107	C++	Symbols loaded.
baios.dll!000000000a1237fd()	Unknown	Non-user code. Cannot find or open the PDB file.
baios.dll!000000000a1220a2()	Unknown	Non-user code. Cannot find or open the PDB file.
baios.dll!000000000a12e3f5()	Unknown	Non-user code. Cannot find or open the PDB file.
baios.dll!000000000a134887()	Unknown	Non-user code. Cannot find or open the PDB file.
VSTAudioEngine5.exe!VSTMixerDevice::updateIo(bool force) Line 5984	C++	Symbols loaded.
VSTAudioEngine5.exe!VSTMixerDevice::notify(Steinberg::FMessage * m=0x000000000014f360) Line 3825	C++	Symbols loaded.
[Inline Frame] VSTAudioEngine5.exe!Steinberg::UController::onIdle::__l14::<lambda_1>::operator()(Steinberg::CmObject *) Line 796	C++	Symbols loaded.

Can this somehow be avoided? Would increasing the buffer size maybe help?
User has the latest 1.9 FlexASIO

Dorico 5 AudioEngine 5.6.20 64bit 2024.1.11 16.48.52.446-freezedump.dmp

@dechamps
Copy link
Owner

Your stack trace suggests the FlexASIO config file watcher thread got stuck, preventing FlexASIO's ASIODisposeBuffers() implementation from cleaning up the watcher. I don't think I've seen this before.

Can you provide a log?

Would increasing the buffer size maybe help?

This has nothing to do with buffer sizes.

@steinulf
Copy link
Author

Many thanks for the quick response.

The customer is currently on a deadline and therefore pretty stressed. I can ask for a log, but don't know when he can deliver.

@dechamps
Copy link
Owner

dechamps commented Jan 12, 2024

A log would definitely provide a lot more information to help troubleshoot this - it would show the exact timeline of both the main thread and the config watcher thread. It would also help determine if the joining is the true root cause or if we're dealing with broader undefined behavior/corruption (e.g. use after free on ConfigWatcher) and the stuck join is just a symptom.

Another piece of information that may be useful, and that you may be able to recover from your dump, is if you can see any thread that is running flexasio::ConfigLoader::Watcher::RunThread() (or any other flexasio:: thread really) at the time the problem is occurring. That's the thread that the main thread is stuck on joining, and its stack might reveal more information.

@steinulf
Copy link
Author

I've asked for a log but have to wait....

And in regards to Watcher::RunThread() or any other flexasio:: thread, no, I can't find it in any of the running threads.
You can also load the dump into VisualStudio and have a look.

@steinulf
Copy link
Author

I received a log file from the customer and attached here. I hope you can make something out of it. Thanks
Flexasio.zip

@dechamps
Copy link
Owner

I was unable to find a smoking gun in there I'm afraid. The host application recreates buffers 16 times. Every time the config watcher thread is joined and then recreated successfully, except for the 16th time for some reason:

2024-01-12T17:22:23.8823134-00:00 17164 11064 --- ENTERING CONTEXT: disposeBuffers()
2024-01-12T17:22:23.8823933-00:00 17164 11064 Signaling config watcher thread to stop
2024-01-12T17:22:23.8824336-00:00 17164 11064 Waiting for config watcher thread to finish
2024-01-12T17:22:23.8824754-00:00 17164 2516 Config watcher thread stopping
(hangs instead of logging "Joined config watcher thread" as it should)

The log shows the config watcher thread is correctly notified and it does return from its thread proc (consistent with your finding that the thread is not there), but somehow std::thread::join() does not return. The relevant code paths are really quite straightforward and it did work on the previous 15 times so that's quite the head scratcher.

My bet is on some kind of undefined behavior/corruption happening at some point earlier in the lifetime of the process that corrupts thread state somehow and breaks std::thread::join(). This seems… subtle. The only straightforward way this can happen that I can think of is FlexASIO overflowing the file notify buffer and stomping on the std::thread structure, but I don't see how it could do that given the code that writes to it seems obviously correct.

At this point I'm not optimistic that we will be able to get to the bottom of this without the ability to attach a debugger and step through the std::thread calls.

However…

While pondering over that code I think I may have found an actual undefined behavior bug: when the watcher is torn down, it does not cancel the ReadDirectoryChangesW() operation, which means it may end up writing to the file notify buffer after it has been freed. I'm not quite sure how that can cause the reported symptoms, but I guess it's possible it could, depending on how the memory gets reused afterwards. Since the application seems to be triggering the watcher clean up code somewhat frequently (16 times in the provided log), this would increase the likelihood that something bad may happen due to a previously closed "zombie" watcher trampling over various memory locations. I filed #209 about it, but I can't promise anything as to when I'd be able to work on it.

@dechamps dechamps changed the title Dorico's VSTAudioEngine frequently freezing in Dorico's VSTAudioEngine hangs in config watcher cleanup Jan 14, 2024
@dechamps dechamps added the asiohost Relates to a specific ASIO Host Application. label Jan 14, 2024
@steinulf
Copy link
Author

Hi Etienne, thanks for looking into the issue. No pressure, take your time and do as you can. Our customer mentioned that he wants to migrate to Mac sooner or later any way, so even if you maybe don't fix it for this customer, some other customer will surely benefit from it in the future.

@dechamps dechamps added this to the FlexASIO 1.10 milestone May 25, 2024
@dechamps
Copy link
Owner

FlexASIO 1.10 addresses #209, the suspected root cause of the crash. Please try again with 1.10 and feel free to reopen if you are still experiencing this issue.

@steinulf
Copy link
Author

Hi Etienne, thanks so much for addressing this issue. For the next weeks I'm too busy to try out the fix version, but really thank you for finally tackling that bug. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
asiohost Relates to a specific ASIO Host Application.
Projects
None yet
Development

No branches or pull requests

2 participants