New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Windows] enable XAudio2 sink in Windows desktop and code improvements #25068
base: master
Are you sure you want to change the base?
Conversation
8d98ed4
to
4c40160
Compare
Tested on Xbox:
Note hat fixes channel layout enumeration, before was:
And this is on Windows 11 PC when configured 5.1 speakers (same as Xbox with PR):
|
It runs, I don't have the hardware to test multichannel, a/v sync is the other thing that needs a lot of testing. I'll try on Windows 8 as well. One missing piece for desktop use is the handling of devices coming and going (plug or unplug headphones for example). It sounds like there is a way to query the processing quantum so maybe we don't have to guess the chunk size https://learn.microsoft.com/en-us/windows/win32/xaudio2/xaudio2-redistributable#duration-of-audio-processing-quantum |
This don't need to be handled by XAudio2 (or any other sink) directly. These system events are handled by AudioEngine (ActiveAE) and all this is already implemented and code is common for all Windows sinks: https://github.com/xbmc/xbmc/blob/master/xbmc/platform/win32/IMMNotificationClient.h xbmc/xbmc/platform/win32/IMMNotificationClient.h Lines 111 to 125 in 7c7f08a
Basically:
Sinks only needs obey what AE says. Every time that an audio device is plugged or unplugged AE is notified and enumerates all sinks and re-initializes same or other device/endpoint depending what is available.
chunk size is not guessed: The duration we want use is known and predefined in code: const unsigned int bufferLen = static_cast<int>(format.m_sampleRate * 0.02); // 20 ms chunks
m_dwFrameSize = wfxex.Format.nBlockAlign;
m_dwChunkSize = m_dwFrameSize * bufferLen;
m_dwBufferLen = m_dwChunkSize * 4; // 80 ms buffer
m_AvgBytesPerSec = wfxex.Format.nAvgBytesPerSec;
format.m_frames = bufferLen; The value you say "quantum" is not very related and seems to be the value that the API uses internally as the minimum value. That is, if we use 20 ms in Audio Engine, the XAudio API will use 2 packets of 10 ms but that is done transparently. Obviously it is better that it be a multiple but this is already true.
Yes and no. Delay of sink is perfectly know and code it's implemented in XAUDIO2_VOICE_STATE state;
m_sourceVoice->GetState(&state, 0);
double delay = (double)(m_sinkFrames - state.SamplesPlayed) / m_format.m_sampleRate;
status.SetDelay(delay);
return;
This is typically 40 ms because are 2 packets in queue and each packet is 20 ms.
|
Tested plugging/unplugging audio interface during playback and it works but has some rough edges on Windows 10. Since I haven't really tested something similar with DS or Wasapi, I'll dig more. Yes I was suggesting maybe making the chunk size a multiple of the quantum, not by coincidence but by properly querying XAudio. Unless AE prefers to buffer a different way. |
For quantum and chunksize, the Linux setup nicely documents how AE works: Short summary: https://github.com/xbmc/xbmc/blob/master/xbmc/cores/AudioEngine/Sinks/AESinkALSA.cpp#L739 |
Windows 10: noticed a couple times an issue with no sound after video pause. It's not clear if pause length is a factor or not. That wasn't on dev PC so couldn't just break into code. Will try to find more. |
Windows 8: patch below allows enumeration and activation of output. There were 3 problems:
The catch is that the WinRT enumeration has less information than MMDevice enumeration used so far. It's a generic device enumeration and only provides device id and friendly name,The isdefault property doesn't work as we need it to for audio devices. Maybe there is some other WinRT API to connect the info with the information retrieved by MMDevice and grab the rest of the info, I didn't look too deep. It's possible to test that MMDevice GetId value is contained in a WinRT device id. Not documented but doesn't seem very risky. As a possible workaround, I found a registry value that contains what we need and it fits well in the MMDevice enumeration, though I couldn't find an official name in the Windows SDK for it. That's the solution of the patch below. Another way would be to allow only the default device (null device id), which works fine. |
patch on top of your last commit 93405197a6fd1aed70639f3d2880e098c9435e89 diff --git a/xbmc/cores/AudioEngine/Sinks/AESinkXAudio.cpp b/xbmc/cores/AudioEngine/Sinks/AESinkXAudio.cpp
index 318dec1ac4..22063380b8 100644
--- a/xbmc/cores/AudioEngine/Sinks/AESinkXAudio.cpp
+++ b/xbmc/cores/AudioEngine/Sinks/AESinkXAudio.cpp
@@ -13,6 +13,7 @@
#include "cores/AudioEngine/Sinks/windows/AESinkFactoryWin.h"
#include "cores/AudioEngine/Utils/AEDeviceInfo.h"
#include "cores/AudioEngine/Utils/AEUtil.h"
+#include "utils/SystemInfo.h"
#include "utils/log.h"
#ifdef TARGET_WINDOWS_STORE
@@ -34,6 +35,47 @@ using namespace Microsoft::WRL;
namespace
{
constexpr int XAUDIO_BUFFERS_IN_QUEUE = 2;
+
+HRESULT KXAudio2Create(IXAudio2** ppXAudio2,
+ UINT32 Flags X2DEFAULT(0),
+ XAUDIO2_PROCESSOR XAudio2Processor X2DEFAULT(XAUDIO2_DEFAULT_PROCESSOR))
+{
+ typedef HRESULT(__stdcall * XAudio2CreateInfoFunc)(_Outptr_ IXAudio2**, UINT32,
+ XAUDIO2_PROCESSOR);
+ static HMODULE dll = NULL;
+ static XAudio2CreateInfoFunc XAudio2CreateFn = nullptr;
+
+ if (dll == NULL)
+ {
+ dll = LoadLibraryEx(L"xaudio2_9redist.dll", NULL, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS);
+
+ if (dll == NULL)
+ {
+ dll = LoadLibraryEx(L"xaudio2_9.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
+
+ if (dll == NULL)
+ {
+ // Windows 8 compatibility
+ dll = LoadLibraryEx(L"xaudio2_8.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
+
+ if (dll == NULL)
+ return HRESULT_FROM_WIN32(GetLastError());
+ }
+ }
+
+ XAudio2CreateFn = (XAudio2CreateInfoFunc)(void*)GetProcAddress(dll, "XAudio2Create");
+ if (XAudio2CreateFn == nullptr)
+ {
+ return HRESULT_FROM_WIN32(GetLastError());
+ }
+ }
+
+ if (XAudio2CreateFn != NULL)
+ return (*XAudio2CreateFn)(ppXAudio2, Flags, XAudio2Processor);
+ else
+ return E_FAIL;
+}
+
} // namespace
extern const char* WASAPIErrToStr(HRESULT err);
@@ -50,7 +92,7 @@ inline void SafeDestroyVoice(TVoice **ppVoice)
CAESinkXAudio::CAESinkXAudio()
{
- HRESULT hr = XAudio2Create(m_xAudio2.ReleaseAndGetAddressOf(), 0);
+ HRESULT hr = KXAudio2Create(m_xAudio2.ReleaseAndGetAddressOf(), 0);
if (FAILED(hr))
{
CLog::LogF(LOGERROR, "XAudio initialization failed.");
@@ -279,7 +321,12 @@ void CAESinkXAudio::EnumerateDevicesEx(AEDeviceInfoList &deviceInfoList, bool fo
IXAudio2SourceVoice* mSourceVoice = nullptr;
Microsoft::WRL::ComPtr<IXAudio2> xaudio2;
- hr = XAudio2Create(xaudio2.ReleaseAndGetAddressOf(), eflags);
+ // AudioCategory_Media is not accepted by Windows 8
+ const AUDIO_STREAM_CATEGORY streamCategory{
+ CSysInfo::IsWindowsVersionAtLeast(CSysInfo::WindowsVersionWin10) ? AudioCategory_Media
+ : AudioCategory_Other};
+
+ hr = KXAudio2Create(xaudio2.ReleaseAndGetAddressOf(), eflags);
if (FAILED(hr))
{
CLog::LogF(LOGERROR, "failed to activate XAudio for capability testing ({})",
@@ -310,9 +357,15 @@ void CAESinkXAudio::EnumerateDevicesEx(AEDeviceInfoList &deviceInfoList, bool fo
wfxex.Format.wFormatTag = WAVE_FORMAT_EXTENSIBLE;
wfxex.SubFormat = KSDATAFORMAT_SUBTYPE_IEEE_FLOAT;
- xaudio2->CreateMasteringVoice(&mMasterVoice, wfxex.Format.nChannels,
- wfxex.Format.nSamplesPerSec, 0, deviceId.c_str(), nullptr,
- AudioCategory_Media);
+ hr = xaudio2->CreateMasteringVoice(&mMasterVoice, wfxex.Format.nChannels,
+ wfxex.Format.nSamplesPerSec, 0, deviceId.c_str(), nullptr,
+ streamCategory);
+
+ if (FAILED(hr))
+ {
+ CLog::LogF(LOGERROR, "failed to create mastering voice ({})", WASAPIErrToStr(hr));
+ return;
+ }
for (int p = AE_FMT_FLOAT; p > AE_FMT_INVALID; p--)
{
@@ -363,11 +416,14 @@ void CAESinkXAudio::EnumerateDevicesEx(AEDeviceInfoList &deviceInfoList, bool fo
xaudio2->CreateMasteringVoice(&mMasterVoice, wfxex.Format.nChannels,
wfxex.Format.nSamplesPerSec, 0, deviceId.c_str(), nullptr,
- AudioCategory_Media);
- hr = xaudio2->CreateSourceVoice(&mSourceVoice, &wfxex.Format);
-
+ streamCategory);
if (SUCCEEDED(hr))
- deviceInfo.m_sampleRates.push_back(WASAPISampleRates[j]);
+ {
+ hr = xaudio2->CreateSourceVoice(&mSourceVoice, &wfxex.Format);
+
+ if (SUCCEEDED(hr))
+ deviceInfo.m_sampleRates.push_back(WASAPISampleRates[j]);
+ }
}
SafeDestroyVoice(&mSourceVoice);
@@ -438,11 +494,16 @@ bool CAESinkXAudio::InitializeInternal(std::string deviceId, AEAudioFormat &form
HRESULT hr;
IXAudio2MasteringVoice* pMasterVoice = nullptr;
+ // AudioCategory_Media is not accepted by Windows 8
+ const AUDIO_STREAM_CATEGORY streamCategory{
+ CSysInfo::IsWindowsVersionAtLeast(CSysInfo::WindowsVersionWin10) ? AudioCategory_Media
+ : AudioCategory_Other};
if (!bdefault)
{
- hr = m_xAudio2->CreateMasteringVoice(&pMasterVoice, wfxex.Format.nChannels, wfxex.Format.nSamplesPerSec,
- 0, device.c_str(), nullptr, AudioCategory_Media);
+ hr = m_xAudio2->CreateMasteringVoice(&pMasterVoice, wfxex.Format.nChannels,
+ wfxex.Format.nSamplesPerSec, 0, device.c_str(), nullptr,
+ streamCategory);
}
if (!pMasterVoice)
@@ -457,8 +518,9 @@ bool CAESinkXAudio::InitializeInternal(std::string deviceId, AEAudioFormat &form
// smartphone issue: providing device ID (even default ID) causes E_NOINTERFACE result
// workaround: device = nullptr will initialize default audio endpoint
- hr = m_xAudio2->CreateMasteringVoice(&pMasterVoice, wfxex.Format.nChannels, wfxex.Format.nSamplesPerSec,
- 0, 0, nullptr, AudioCategory_Media);
+ hr = m_xAudio2->CreateMasteringVoice(&pMasterVoice, wfxex.Format.nChannels,
+ wfxex.Format.nSamplesPerSec, 0, nullptr, nullptr,
+ streamCategory);
if (FAILED(hr) || !pMasterVoice)
{
CLog::LogF(LOGINFO, "Could not retrieve the default XAudio audio endpoint ({}).",
@@ -524,8 +586,9 @@ bool CAESinkXAudio::InitializeInternal(std::string deviceId, AEAudioFormat &form
wfxex.Format.nSamplesPerSec = WASAPISampleRates[i];
wfxex.Format.nAvgBytesPerSec = wfxex.Format.nSamplesPerSec * wfxex.Format.nBlockAlign;
- hr = m_xAudio2->CreateMasteringVoice(&m_masterVoice, wfxex.Format.nChannels, wfxex.Format.nSamplesPerSec,
- 0, device.c_str(), nullptr, AudioCategory_Media);
+ hr = m_xAudio2->CreateMasteringVoice(&m_masterVoice, wfxex.Format.nChannels,
+ wfxex.Format.nSamplesPerSec, 0, device.c_str(),
+ nullptr, streamCategory);
if (SUCCEEDED(hr))
{
hr = m_xAudio2->CreateSourceVoice(&m_sourceVoice, &wfxex.Format, 0, XAUDIO2_DEFAULT_FREQ_RATIO, &m_voiceCallback);
diff --git a/xbmc/cores/AudioEngine/Sinks/windows/AESinkFactoryWin32.cpp b/xbmc/cores/AudioEngine/Sinks/windows/AESinkFactoryWin32.cpp
index 3334b9ba27..d082effd1c 100644
--- a/xbmc/cores/AudioEngine/Sinks/windows/AESinkFactoryWin32.cpp
+++ b/xbmc/cores/AudioEngine/Sinks/windows/AESinkFactoryWin32.cpp
@@ -22,6 +22,8 @@ const IID IID_IAudioClient = __uuidof(IAudioClient);
DEFINE_PROPERTYKEY(PKEY_Device_FriendlyName, 0xa45c254e, 0xdf1c, 0x4efd, 0x80, 0x20, 0x67, 0xd1, 0x46, 0xa8, 0x50, 0xe0, 14);
DEFINE_PROPERTYKEY(PKEY_Device_EnumeratorName, 0xa45c254e, 0xdf1c, 0x4efd, 0x80, 0x20, 0x67, 0xd1, 0x46, 0xa8, 0x50, 0xe0, 24);
+const PROPERTYKEY PKEY_AudioEndpoint_Path{
+ {0x9c119480, 0xddc2, 0x4954, {0xa1, 0x50, 0x5b, 0xd2, 0x40, 0xd4, 0x54, 0xad}}, 1};
extern const char *WASAPIErrToStr(HRESULT err);
#define EXIT_ON_FAILURE(hr, reason) \
@@ -129,12 +131,21 @@ std::vector<RendererDetail> CAESinkFactoryWin::GetRendererDetails()
details.uiChannelMask = std::max(varName.uintVal, (unsigned int)KSAUDIO_SPEAKER_STEREO);
PropVariantClear(&varName);
+ hr = pProperty->GetValue(PKEY_AudioEndpoint_Path, &varName);
+ if (FAILED(hr))
+ {
+ CLog::LogF(LOGERROR, "Retrieval of endpoint path failed.");
+ goto failed;
+ }
+
+ details.strDeviceRawId = KODI::PLATFORM::WINDOWS::FromW(varName.pwszVal);
+ PropVariantClear(&varName);
+
if (pDevice->GetId(&pwszID) == S_OK)
{
if (wstrDDID.compare(pwszID) == 0)
details.bDefault = true;
- details.strDeviceRawId = KODI::PLATFORM::WINDOWS::FromW(pwszID);
CoTaskMemFree(pwszID);
} |
Other thing noticed, both Windows 8 and 10: there seems to be a different in the delay/latency with the other sink types. It's easy to notice with GUI sounds turned on. With DirectSound in particular you hear the click much later than with XAudio2 (feels instant). WASAPI sounded similar to DS to me, but not always for some reason. |
XAudio want device 'Path' to identify audio devices while WASAPI want device 'GUID'
Thank you very much for the diff. Added as is. Tested and works fine also in Win11 and UWP. I've changed the name from deviceRawId to devicePath which better reflects what it is. Seems Microsoft has migrated at some point from use device While in UWP deviceId propriety already returns Path (Id and Path is the same). =>a0193dd in Xbox deviceId is: Since device enumeration code is common for XAudio and WASAPI we can't replace Id by Path. As it is now, it is fine: both values are saved and it's used the one necessary in each case. |
Latency and a/v delay are different concepts. Low latency is already mentioned:
Latency is time from click in GUI to hear sound is low in XAudio because only 2 packets of 20 ms in queue (latency = 40 ms) On DirectSound now is 400 ms (it was less before Bluetooth fix). EDIT: 8 * 50 = 400 ms but as only it is filled with 7 packets = 350 ms Even with GUI sounds behaves different because is "realtime", video playback should be the same with DirectSound and XAudio. |
One thing to keep in mind with this very low buffer: While AE has absolutely no problems by design to provide samples in time. A slow computer might not. For menu sounds that is especially relevant when user did not activate keep audio alive. That way one might not hear menu sounds at all. Also when enabled AEs activeaesink will output 1000/period size samples per second. With normal audio periods being around 32 ms for AC3 or PCM it makes not much sense to go below 64 ms. I would suggest to always have at least two periods buffer as a minimum to avoid designed underrun. |
One point to note is that this has been working fine on Xbox since the beginning (and XAudio was the only sink available on Xbox). So there is no reason to change it now unless there emerge new issues on Windows desktop. Any changes to this could cause regressions on Xbox. If this had to be changed, it could be done later, this PR only aims to enable XAudio on Windows desktop with minimal functional changes: most of the changes are just code improvements that do not change the functionality, except for the differences in device enumeration that are necessary for it to work. |
No need to change anything. Just explained from a real-world scenario how often the activeaesink thread adds data. If it has no back buffer, e.g. two periods buffer, it will immediately run dry after adding the first packet. All good. |
I wasn't very satisfied with the previous solution of using an undocumented registry value. |
Maybe we can merge this and you continue later in a different PR. This way at least we would have a check point in case there are regressions. |
I've taken a look and I like the changes, I'm just saying that it would be easier to continue in a separate PR. |
That would be fine. Last thing I noticed, with an old computer I sometimes get "clicks" or "cracks" in the audio when playback resumes after skipping. Maybe it has something to do with the small buffer. |
Description
[Windows] enable XAudio2 sink in Windows desktop + code improvements
Now that Windows minimum version is Windows 8.1 is possible enable this sink also for Windows desktop as it's supported on Windows 8, 10 and 11.
Motivation and context
See: #25046 (comment)
Initially it was just a quick test to see what "would happen" when enabled XAudio on Windows desktop. It has turned out to work quite well but a multitude of things to improve have also been detected and hidden bugs have appeared in the enumeration of devices that were hidden since Xbox only uses one device.
It is not yet intended to be an alternative to DirectSound and does not replace it, it only adds XAudio as an additional option. I think that with these changes it is in a usable state although more issues will probably appear when tested on a large scale. The code improvements and general improvements are also valid for Xbox as the code is common.
Many of the changes are just code improvements with no changes in functionality. The only notable change is in the device enumeration in commit 4c40160. Also fixed issues on channels layout enumeration.
XAudio wants device Id in the from:
{0.0.0.00000000}.{4f5f9f01-868c-4a8b-b96c-d1e12e05cc92}
while the form{4F5F9F01-868C-4A8B-B96C-D1E12E05CC92}
was being used. This causes code always fallback to default device here:xbmc/xbmc/cores/AudioEngine/Sinks/AESinkXAudio.cpp
Lines 645 to 671 in 4876e9d
That was a known issue but not very important on Xbox since there is only one device anyway. For Windows Desktop this would not be acceptable because the user expects to open the device selected in settings and not always the default one i.e:
HDMI - DENON-AVR (NVIDIA High Definition Audio)
.How has this been tested?
Runtime Windows 11 x64 and Xbox Series S
All devices are enumerated correctly, channels layouts, sample rates, formats, etc.
Sound plays...
What is the effect on users?
Users have another type of sink to choose from besides DirectSound and WASAPI.
Pros:
Cons:
Screenshots (if appropriate):
Types of change
Checklist: