Starting with Windows Vista, the audio system was substantially redesigned, introducing a new set of low-level interfaces called the Core Audio APIs. These low-level APIs provide services for higher-level APIs such as Media Foundation (which is intended to replace older high-level APIs such as DirectShow), and they are characterized by low latency, high reliability, and improved security.
This article mainly introduces how to use these APIs in real-time audio and video scenarios.
The Core Audio APIs are composed of the MMDevice, WASAPI, EndpointVolume, and related APIs. In a real-time audio and video system, the MMDevice, WASAPI, and EndpointVolume APIs are the ones mainly used.
In a real-time audio and video system, my use of audio devices can be broken down into:
1. Device list management
2. Device initialization
3. Device capability management
4. Data interaction
5. Volume management
6. Device endpoint monitoring
The implementation of each of these functions is introduced below.
1. Device list management
Audio device management is implemented through the MMDevice API.
We start by creating an IMMDeviceEnumerator object, through which the related functions are called.
IMMDeviceEnumerator* ptrEnumerator;
CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL,
                 __uuidof(IMMDeviceEnumerator),
                 reinterpret_cast<void**>(&ptrEnumerator));

Through IMMDeviceEnumerator we can obtain the default device (GetDefaultAudioEndpoint), enumerate the device list (EnumAudioEndpoints, which returns an IMMDeviceCollection), open a specific device by ID (GetDevice), and register an IMMNotificationClient to be notified of device changes.
With these methods we can obtain the system default device, traverse the device list, open a specified device, and listen for device changes, which covers the device management needs of a real-time audio and video system.
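As a minimal sketch of the enumeration path (error handling, CoInitialize/CoUninitialize, and cleanup on failure omitted; the printing is only illustrative), listing the active capture endpoints with their IDs and friendly names might look like this:

// Minimal sketch: list active capture endpoints with their IDs and friendly names.
#include <windows.h>
#include <mmdeviceapi.h>
#include <functiondiscoverykeys_devpkey.h>
#include <cstdio>

void ListCaptureDevices(IMMDeviceEnumerator* ptrEnumerator)
{
    IMMDeviceCollection* pCollection = NULL;
    ptrEnumerator->EnumAudioEndpoints(eCapture, DEVICE_STATE_ACTIVE, &pCollection);

    UINT count = 0;
    pCollection->GetCount(&count);
    for (UINT i = 0; i < count; ++i)
    {
        IMMDevice* pDevice = NULL;
        pCollection->Item(i, &pDevice);

        LPWSTR pwszId = NULL;
        pDevice->GetId(&pwszId);  // endpoint ID string (the "device path")

        IPropertyStore* pProps = NULL;
        pDevice->OpenPropertyStore(STGM_READ, &pProps);
        PROPVARIANT varName;
        PropVariantInit(&varName);
        pProps->GetValue(PKEY_Device_FriendlyName, &varName);  // human-readable name

        wprintf(L"%u: %s (%s)\n", i, varName.pwszVal, pwszId);

        PropVariantClear(&varName);
        CoTaskMemFree(pwszId);
        pProps->Release();
        pDevice->Release();
    }
    pCollection->Release();
}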
2. Device initialization
Starting the audio device is a critical point for the reliability of the whole audio module. Based on the device type and the data capture mode, we can distinguish three kinds of devices: microphone capture, speaker playback, and speaker capture (loopback).
First we need an IMMDevice object, which can be obtained as described in the device list management section above.
IMMDevice* pDevice;
//GetDefault
ptrEnumerator->GetDefaultAudioEndpoint((EDataFlow)dir, (ERole)role/* eCommunications */, &pDevice);
//Get by path
ptrEnumerator->GetDevice(device_path, &pDevice);
//GetIndex
pCollection->Item(index, &pDevice);
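The IAudioClient used in the calls below is obtained from the IMMDevice via Activate; a minimal connecting sketch (error handling omitted):

IAudioClient* ptrClient = NULL;
pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL,
                  reinterpret_cast<void**>(&ptrClient));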
Then the IAudioClient obtained through the IMMDevice is used to negotiate the device format and initialize the device. Devices are generally opened in shared mode; microphone capture and speaker playback process data in event-driven mode, while speaker capture drives data processing in loopback mode. A simple example follows:
//mic capturer
ptrClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK |
AUDCLNT_STREAMFLAGS_NOPERSIST,
0,
0,
(WAVEFORMATEX*)&Wfx,
NULL);
//playout render
ptrClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
0,
0,
(WAVEFORMATEX*)&Wfx,
NULL);
//playout capturer
ptrClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_LOOPBACK,
0,
0,
(WAVEFORMATEX*)&Wfx,
NULL);
The Wfx parameter is the device format. To ensure availability, the default format is generally used (obtained via IAudioClient::GetMixFormat). If a custom format is needed, IAudioClient::IsFormatSupported can be used to test candidate formats against what the device supports.
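A minimal sketch of that negotiation, continuing from the ptrClient above (the custom format values are only illustrative; error handling omitted):

WAVEFORMATEX* pMixFormat = NULL;
ptrClient->GetMixFormat(&pMixFormat);  // device's default shared-mode format

WAVEFORMATEX desired = {};             // illustrative custom format: 16 kHz mono 16-bit PCM
desired.wFormatTag = WAVE_FORMAT_PCM;
desired.nChannels = 1;
desired.nSamplesPerSec = 16000;
desired.wBitsPerSample = 16;
desired.nBlockAlign = desired.nChannels * desired.wBitsPerSample / 8;
desired.nAvgBytesPerSec = desired.nSamplesPerSec * desired.nBlockAlign;

WAVEFORMATEX* pClosest = NULL;
HRESULT hr = ptrClient->IsFormatSupported(AUDCLNT_SHAREMODE_SHARED, &desired, &pClosest);
const WAVEFORMATEX* pUse = (hr == S_OK) ? &desired : pMixFormat;  // fall back to the mix format
// ... pass pUse as the Wfx parameter of IAudioClient::Initialize ...

if (pClosest) CoTaskMemFree(pClosest);
// CoTaskMemFree(pMixFormat) once Initialize has been called.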
3. Device capability management
For microphone devices we usually need to process their data. Some hardware and drivers support built-in noise suppression, gain control, echo cancellation, and similar functions. However, on a typical Windows machine the installed devices are too varied and uncontrollable to rely on, so software algorithms are used for most processing. To check whether a device exposes built-in processing and its parameters, use the DeviceTopology API.
IDeviceTopology* pTopo;
pDevice->Activate(__uuidof(IDeviceTopology), CLSCTX_INPROC_SERVER, 0, &pTopo);
With IDeviceTopology we can traverse the IConnector/IPart graph and obtain capability objects such as IAudioAutoGainControl and IAudioVolumeLevel, through which the corresponding features can be queried or controlled.
Note: the connector graph can be nested (and may contain cycles), so while walking from an IConnector through its IPart objects you need to check each part's type and query only the interfaces it actually supports.
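A minimal sketch of such a walk for a capture endpoint, trying to activate IAudioAutoGainControl on each part along the first incoming path (error handling and cycle protection omitted):

// Minimal sketch: starting from the endpoint connector, follow the first incoming
// path through the topology and query each part for built-in AGC.
#include <devicetopology.h>

void FindBuiltInAgc(IDeviceTopology* pTopo)
{
    IConnector* pEndpointConn = NULL;
    pTopo->GetConnector(0, &pEndpointConn);       // connector on the endpoint side

    IConnector* pDeviceConn = NULL;
    pEndpointConn->GetConnectedTo(&pDeviceConn);  // connector on the adapter device

    IPart* pPart = NULL;
    pDeviceConn->QueryInterface(__uuidof(IPart), (void**)&pPart);

    while (pPart != NULL)
    {
        IAudioAutoGainControl* pAgc = NULL;
        if (SUCCEEDED(pPart->Activate(CLSCTX_ALL, __uuidof(IAudioAutoGainControl), (void**)&pAgc)))
        {
            BOOL enabled = FALSE;
            pAgc->GetEnabled(&enabled);           // state of the device's built-in AGC
            pAgc->Release();
        }

        // Follow only the first incoming link here; a real implementation should
        // walk every link and guard against cycles in the graph.
        IPartsList* pIncoming = NULL;
        IPart* pNext = NULL;
        if (SUCCEEDED(pPart->EnumPartsIncoming(&pIncoming)) && pIncoming)
        {
            pIncoming->GetPart(0, &pNext);
            pIncoming->Release();
        }
        pPart->Release();
        pPart = pNext;
    }

    pDeviceConn->Release();
    pEndpointConn->Release();
}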
4. Data interaction
During device initialization we chose different startup modes for the different device types, and each mode drives the data differently:
Microphone capture: event-driven; the device signals the registered event whenever a captured packet is ready to be read.
Speaker playback: event-driven; the device signals the registered event whenever the render buffer has room for more data.
Speaker capture: loopback mode; no capture event is delivered, so the loop is typically driven by a timer (or by the render stream's event) and polls for available packets.
When exchanging data with the device, we obtain the service object that matches the capture mode. For capture, the IAudioCaptureClient service is used to read device data; for playback, the IAudioRenderClient service is used to obtain a buffer pointer into which playout data is written. For example:
//capturer
IAudioCaptureClient* ptrCaptureClient;  //audioin or audioout (loopback)
ptrClient->GetService(__uuidof(IAudioCaptureClient), (void**)&ptrCaptureClient);
{
    //work thread
    BYTE* pData = NULL;
    UINT32 framesAvailable = 0;
    DWORD flags = 0;
    UINT64 recPos = 0;
    UINT64 recTime = 0;
    //Wait Event
    ptrCaptureClient->GetBuffer(
        &pData,            // packet which is ready to be read
        &framesAvailable,  // #frames in the captured packet (can be zero)
        &flags,            // buffer status flags
        &recPos,           // device position of the first audio frame in the data packet
        &recTime);         // value of the performance counter at the time of recording
    //pData processing
    ptrCaptureClient->ReleaseBuffer(framesAvailable);
}

//render
IAudioRenderClient* ptrRenderClient;  //audioout
ptrClient->GetService(__uuidof(IAudioRenderClient), (void**)&ptrRenderClient);
{
    //work thread
    BYTE* pData = NULL;  //render buffer
    UINT32 bufferLength = 0;
    ptrClient->GetBufferSize(&bufferLength);
    UINT32 playBlockSize = nSamplesPerSec / 100;  //10 ms of frames
    //Wait Event
    UINT32 padding = 0;
    ptrClient->GetCurrentPadding(&padding);
    if (bufferLength - padding > playBlockSize)
    {
        ptrRenderClient->GetBuffer(playBlockSize, &pData);
        //fill pData with playout data
        ptrRenderClient->ReleaseBuffer(playBlockSize, 0);
    }
}
In actual data interaction, separate threads are needed to drive GetBuffer and ReleaseBuffer. Microphone capture and speaker playback are event-driven by the device; the event handle to respond to is registered when device initialization completes, via IAudioClient::SetEventHandle.
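A minimal sketch of the event wiring and the capture worker loop (keepRunning is an assumed shutdown flag; error handling omitted):

HANDLE hCaptureEvent = CreateEvent(NULL, FALSE, FALSE, NULL);  // auto-reset event
ptrClient->SetEventHandle(hCaptureEvent);  // must be set before Start() when AUDCLNT_STREAMFLAGS_EVENTCALLBACK is used
ptrClient->Start();

// capture work thread
while (keepRunning)
{
    DWORD wait = WaitForSingleObject(hCaptureEvent, 500);  // the device signals when a packet is ready
    if (wait != WAIT_OBJECT_0)
    {
        // timeout or error: treat as a device problem (see "Abnormal data processing" below)
        break;
    }
    // drain the available packets with GetBuffer()/ReleaseBuffer() as shown above
}

ptrClient->Stop();
CloseHandle(hCaptureEvent);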
In the overall audio and video system, the device data thread also needs to keep statistics such as data processing time and the sizes of the capture and playout buffers; these are used for device status checks and AEC delay calculation.
5. Volume management
Volume management usually only needs to handle the volume of the device that is currently selected, so IAudioEndpointVolume is generally used. It is obtained from the IMMDevice object:
IAudioEndpointVolume* pVolume;
pDevice->Activate(__uuidof(IAudioEndpointVolume), CLSCTX_ALL, NULL, reinterpret_cast<void**>(&pVolume));
With the IAudioEndpointVolume object, we can handle volume controls for the current device:
pVolume->GetMasterVolumeLevelScalar(&fLevel);
pVolume->SetMasterVolumeLevelScalar(fLevel, NULL);
BOOL mute;
pVolume->GetMute(&mute);
pVolume->SetMute(mute, NULL);
And register an IAudioEndpointVolumeCallback to monitor volume state changes:
IAudioEndpointVolumeCallback* cbSessionVolume;  // implemented by the application (see the sketch below)
pVolume->RegisterControlChangeNotify(cbSessionVolume);
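The callback object itself has to be implemented by the application; a minimal sketch (reference counting simplified, thread-safety and unregistration omitted):

// Minimal sketch of an application-side IAudioEndpointVolumeCallback.
#include <windows.h>
#include <endpointvolume.h>

class EndpointVolumeListener : public IAudioEndpointVolumeCallback
{
    LONG ref_ = 1;
public:
    // IUnknown
    ULONG STDMETHODCALLTYPE AddRef() override { return InterlockedIncrement(&ref_); }
    ULONG STDMETHODCALLTYPE Release() override
    {
        ULONG r = InterlockedDecrement(&ref_);
        if (r == 0) delete this;
        return r;
    }
    HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, void** ppv) override
    {
        if (riid == __uuidof(IUnknown) || riid == __uuidof(IAudioEndpointVolumeCallback))
        {
            *ppv = static_cast<IAudioEndpointVolumeCallback*>(this);
            AddRef();
            return S_OK;
        }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    // Called by the system whenever the endpoint volume or mute state changes.
    HRESULT STDMETHODCALLTYPE OnNotify(PAUDIO_VOLUME_NOTIFICATION_DATA pNotify) override
    {
        // pNotify->fMasterVolume is the new scalar volume, pNotify->bMuted the mute state.
        return S_OK;
    }
};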
6. Device endpoint monitoring
To monitor the state of the stream on the endpoint, IAudioSessionEvents is generally used:
IAudioSessionControl* ptrSessionControl;
ptrClient->GetService(__uuidof(IAudioSessionControl), (void**)&ptrSessionControl);
IAudioSessionEvents* notify;  // implemented by the application (see the sketch below)
ptrSessionControl->RegisterAudioSessionNotification(notify);
Through this callback we can be notified of the device's connection status, display name changes, and so on.
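The notify object likewise has to be implemented by the application. A minimal sketch, acting only on the state and disconnect notifications (IUnknown handled the same way as in the volume callback above):

// Minimal sketch of an application-side IAudioSessionEvents implementation.
#include <windows.h>
#include <audiopolicy.h>

class SessionEventsListener : public IAudioSessionEvents
{
    LONG ref_ = 1;
public:
    // IUnknown
    ULONG STDMETHODCALLTYPE AddRef() override { return InterlockedIncrement(&ref_); }
    ULONG STDMETHODCALLTYPE Release() override { ULONG r = InterlockedDecrement(&ref_); if (r == 0) delete this; return r; }
    HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, void** ppv) override
    {
        if (riid == __uuidof(IUnknown) || riid == __uuidof(IAudioSessionEvents))
        { *ppv = static_cast<IAudioSessionEvents*>(this); AddRef(); return S_OK; }
        *ppv = NULL; return E_NOINTERFACE;
    }

    HRESULT STDMETHODCALLTYPE OnStateChanged(AudioSessionState state) override
    {
        // AudioSessionStateActive / AudioSessionStateInactive / AudioSessionStateExpired
        return S_OK;
    }
    HRESULT STDMETHODCALLTYPE OnSessionDisconnected(AudioSessionDisconnectReason reason) override
    {
        // e.g. DisconnectReasonDeviceRemoval: schedule a device restart on the worker thread.
        return S_OK;
    }

    // Not used in this sketch.
    HRESULT STDMETHODCALLTYPE OnDisplayNameChanged(LPCWSTR, LPCGUID) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnIconPathChanged(LPCWSTR, LPCGUID) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnSimpleVolumeChanged(float, BOOL, LPCGUID) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnChannelVolumeChanged(DWORD, float*, DWORD, LPCGUID) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnGroupingParamChanged(LPCGUID, LPCGUID) override { return S_OK; }
};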
Some considerations:
1. Thread priority
During actual project development, the audio worker threads need special handling. The usual approach is to load the system module Avrt.dll, resolve its functions dynamically, and associate the calling thread with the "Pro Audio" task. The code:
avrt_module_ = LoadLibrary(TEXT("Avrt.dll"));
if (avrt_module_)
{
_PAvRevertMmThreadCharacteristics = (PAvRevertMmThreadCharacteristics)GetProcAddress(avrt_module_, "AvRevertMmThreadCharacteristics");
_PAvSetMmThreadCharacteristicsA = (PAvSetMmThreadCharacteristicsA)GetProcAddress(avrt_module_, "AvSetMmThreadCharacteristicsA");
_PAvSetMmThreadPriority = (PAvSetMmThreadPriority)GetProcAddress(avrt_module_, "AvSetMmThreadPriority");
}
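The function-pointer types used above are not shown in the article; they are assumed to be declared along these lines, matching the declarations in avrt.h (AVRT_PRIORITY also comes from that header):

typedef HANDLE (WINAPI* PAvSetMmThreadCharacteristicsA)(LPCSTR TaskName, LPDWORD TaskIndex);
typedef BOOL   (WINAPI* PAvRevertMmThreadCharacteristics)(HANDLE AvrtHandle);
typedef BOOL   (WINAPI* PAvSetMmThreadPriority)(HANDLE AvrtHandle, AVRT_PRIORITY Priority);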
The association is then made in the actual data processing thread:
hMmTask_ = _PAvSetMmThreadCharacteristicsA("Pro Audio", &taskIndex);
if (hMmTask_)
{
_PAvSetMmThreadPriority(hMmTask_, AVRT_PRIORITY_CRITICAL);
}
Binding the thread to this task effectively improves the scheduling reliability of the audio data processing threads.
2. Worker threads
Device initialization and release should be handled on a single dedicated thread. Some system COM objects must be released on the thread that created them, otherwise the release may crash. Volume control, monitoring, and similar operations can be handled on the caller's thread, but multithreading safety must be taken care of.
3. Device format selection
When selecting a format (sampling rate, channel count, and so on), a custom format may fail to match, or the device may even fail to initialize with a format it reported as matching. In such scenarios it is generally safest to start the device with the default format.
4. Abnormal data processing
While the data thread is processing audio, event-response timeouts, device object errors, and similar problems can occur. The usual approach is to exit the data thread and stop the device, check whether the current device is still functional, and then either restart it or fall back to the default device.
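A minimal sketch of that recovery path for the capture side (StopAndRelease, InitDevice, and StartCapture are assumed helper functions, not part of the original code):

HRESULT hr = ptrCaptureClient->GetBuffer(&pData, &framesAvailable, &flags, &recPos, &recTime);
if (hr == AUDCLNT_E_DEVICE_INVALIDATED)
{
    // The endpoint was removed or reconfigured: tear down, then restart.
    StopAndRelease();  // assumed helper: Stop() the client and release the WASAPI objects
    IMMDevice* pNewDevice = NULL;
    ptrEnumerator->GetDefaultAudioEndpoint(eCapture, eCommunications, &pNewDevice);  // fall back to the default device
    InitDevice(pNewDevice);  // assumed helper: Activate + Initialize + SetEventHandle as shown earlier
    StartCapture();          // assumed helper: Start() and resume the worker loop
}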
1. Audio API on macOS
Core Audio
Core Audio is the audio API provided by macOS, iOS, and iPadOS. It allows applications to interact with the computer's audio hardware. It was developed to give Macintosh users a high-quality audio environment and to expose the MIDI protocol to application programs.
It supports OpenAL.
1.1. History
It first appeared in Mac OS X 10.0 (Cheetah).
2. Audio API on Windows
2.1. History
- It was first introduced in Windows Vista, which added the WASAPI layer.
- Starting with Windows 10, the AudioGraph API was introduced, enabling low-latency audio playback; WASAPI was also improved to allow buffer sizes below 10 ms and low-latency modes.
To access the Windows Core Audio API in Rust, you can use the winapi crate, which provides Rust bindings for the Windows API. The Core Audio API is part of the Windows multimedia API and is declared in the mmdeviceapi.h header file.

To use the Core Audio API in Rust, you first need to define (or import) the necessary structs and enums, for example those for the audio client interface. With the necessary structs and enums in place, you can use the audio client interface to initialize the audio device and start playing audio.

The accompanying example (not reproduced here) first gets the default audio rendering device using the IMMDeviceEnumerator interface, then uses the IAudioClient interface to initialize the audio stream with a given audio format and start audio playback, and finally uses the IAudioRenderClient interface to play audio data from a file by repeatedly calling get_buffer() and release_buffer() and copying audio data into the buffer.
AudioSwitcher Core Audio API.
This package includes all the controllers and devices needed to access and manipulate Windows system audio devices.
The library can be used on any PC running Windows Vista or later, and supports both x86 and x64 runtimes.
Product | Compatible and additional computed target framework versions
---|---
.NET Framework | net40 is compatible. net403 was computed. net45 was computed. net451 was computed. net452 is compatible. net46 was computed. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 is compatible. net48 is compatible. net481 was computed.
Dependencies:
- AudioSwitcher.AudioApi (>= 3.0.0)
NuGet packages (3)
Showing the top 3 NuGet packages that depend on AudioSwitcher.AudioApi.CoreAudio:

Package | Downloads
---|---
VL.CoreLib.Windows (Windows specific VL CoreLib) | 10.1K
Matrixden.Utils | 7.8K
Matrixden.SwissArmyKnives (a library for extending .NET objects) | 4.0K
GitHub repositories (6)
Showing the top 6 popular GitHub repositories that depend on AudioSwitcher.AudioApi.CoreAudio:

Repository | Stars
---|---
xenolightning/AudioSwitcher_v1 (version 1 of Audio Switcher) | 930
sw3103/movemouse (Move Mouse is a simple piece of software designed to simulate user activity) | 631
KjetilSv/Win10As (make your Windows 10 computer IoT friendly) | 235
Anc813/MicMute (mute the default mic by clicking the tray icon or a shortcut) | 153
ADeltaX/AudioFlyout (replaces the volume/SMTC UI with a custom one; only for Windows 10 17763+) | 131
ksasao/TTSController (a library for controlling various Text-to-Speech engines in a unified way) | 125
Version | Downloads | Last updated
---|---|---
4.0.0-alpha5 | 47,245 | 10/6/2016
4.0.0-alpha3 | 1,123 | 9/29/2016
4.0.0-alpha2 | 1,103 | 7/20/2016
4.0.0-alpha1 | 1,384 | 3/2/2016
3.1.0-beta4 | 1,742 | 2/18/2016
3.1.0-beta3 | 1,097 | 2/17/2016
3.1.0-beta2 | 1,100 | 2/16/2016
3.1.0-beta1 | 1,102 | 2/16/2016
3.0.3 | 23,650 | 5/20/2023
3.0.3-beta2 | 162 | 5/20/2023
3.0.0.1 | 105,151 | 1/12/2016
3.0.0-beta8 | 1,288 | 12/26/2015
3.0.0-beta7 | 1,267 | 12/11/2015
3.0.0-beta5 | 1,314 | 12/10/2015
3.0.0-beta4 | 1,101 | 12/9/2015
3.0.0-beta3 | 1,371 | 11/30/2015
3.0.0-beta1 | 1,191 | 11/16/2015
1.7.0 | 2,977 | 6/8/2015
1.6.5 | 1,598 | 4/29/2015
1.6.4 | 1,713 | 4/14/2015
1.6.2 | 1,810 | 3/16/2015
1.6.1 | 1,727 | 3/1/2015
1.5.3 | 1,502 | 1/25/2015
1.4.0-beta1 | 1,598 | 12/14/2014
1.3.2 | 1,990 | 12/8/2014
1.2.2 | 1,843 | 11/23/2014
1.2.1 | 1,811 | 11/22/2014
1.2.0 | 3,682 | 11/22/2014
1.1.8 | 2,004 | 11/20/2014
Release notes:
- Windows 10 support
- Uses API 3.0.0 as a base
- Standard events have been replaced with observables
- Audio session support for Windows 7 and above
- Session peak value observable
- SpeakerConfig support
- Bit depth / sample rate support