Frameworks

| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 is compatible. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
Dependencies

- .NETFramework 4.8.1
  - System.Runtime.InteropServices (>= 4.3.0)
- .NETStandard 2.0
  - System.Runtime.InteropServices (>= 4.3.0)
- .NETStandard 2.1
  - System.Runtime.InteropServices (>= 4.3.0)
NuGet packages (2)
Showing the top 2 NuGet packages that depend on CoreAudio:
| Package | Downloads |
|---|---|
| HASS.Agent.Shared: Shared functions and models for the HASS.Agent platform. | 23.0K |
| LaquaiLib.Windows: Contains functionality specifically for Windows. Adds references to Windows Forms and WPF. | 1.8K |
GitHub repositories (4)
Showing the top 4 popular GitHub repositories that depend on CoreAudio:
| Repository | Stars |
|---|---|
| Librelancer/Librelancer: A re-implementation of Freelancer | 526 |
| Pdawg-bytes/GyroShell: A shell for Windows 11 (and maybe 10) that aims to allow for a much more customizable and streamlined shell experience. Fully written in C# WASDK. | 241 |
| popeen/Classic-Volume-Mixer: In Windows 11 the volume mixer was replaced by a UWP version; I preferred the old one, so I made this simple program to bring it back. | 199 |
| Fragtality/Fenix2GSX: GSX Integration for the Fenix A320 | 107 |
Versions

| Version | Downloads | Last updated |
|---|---|---|
| 1.40.0 | 3,346 | 9/12/2024 |
| 1.38.0 | 777 | 9/12/2024 |
| 1.37.0 | 12,143 | 8/19/2023 |
| 1.35.0 | 1,317 | 8/18/2023 |
| 1.32.0 | 1,325 | 8/18/2023 |
| 1.27.0 | 3,022 | 2/28/2023 |
| 1.26.0 | 1,417 | 2/28/2023 |
| 1.24.0 | 1,777 | 12/22/2022 |
| 1.22.0 | 11,868 | 11/25/2022 |
| 1.19.0 | 1,659 | 11/3/2022 |
| 1.18.0 | 2,926 | 10/23/2022 |
| 1.17.0 | 1,586 | 10/22/2022 |
| 1.16.0 | 3,596 | 7/9/2022 |
| 1.15.0 | 1,622 | 7/9/2022 |
| 1.12.0 | 9,127 | 2/18/2022 |
| 1.11.0 | 5,342 | 2/9/2022 |
| 1.10.0 | 4,672 | 9/2/2021 |
| 1.9.0 | 1,541 | 9/1/2021 |
| 1.8.0 | 1,527 | 9/1/2021 |
| 1.7.0 | 1,520 | 9/1/2021 |
| 1.6.0 | 1,549 | 8/31/2021 |
| 1.5.0 | 1,543 | 8/30/2021 |
| 1.4.0 | 1,533 | 8/30/2021 |
| 1.3.0 | 1,526 | 8/31/2021 |
| 1.1.0 | 1,595 | 8/31/2021 |
In audio and video communication, the most fundamental audio tasks are capture and playback. On Windows there are many ways to capture and play audio, and as a Windows audio application developer you can easily be overwhelmed by the available APIs, such as MME, DirectSound, WDM/KS, and Core Audio. Yet almost all audio and video communication developers choose Core Audio as the underlying API for capture and playback. In this article, we focus on Core Audio, its pros and cons, and our practical experience using it to capture and play audio on Windows.
# 1. Why Core Audio?

Let's start with a look at the pros and cons of the mainstream Windows audio APIs.
1.1 Windows Multimedia Extensions (MME/WinMM)
MME was the first standard audio API for Windows.
Advantages: The MME approach is simple to implement.
Disadvantages: Latency is a major issue. Dynamic, real-time audio (such as real-time calls or game event notifications) is hard to deliver on time; the minimum achievable latency is typically around 120 ms. In a real-time audio scenario, anything that happens more than about 10 milliseconds later than the brain expects is perceived as out of sync.
1.2 DirectSound (DirectX Audio)
DirectX is an umbrella term for a collection of COM-based multimedia APIs, including DirectSound.
Advantages:
1) It works very close to the hardware, with a minimum latency of around 60 ms, and supports higher quality audio;
2) Simple APIs make interacting with the hardware practical;
3) It brought pluggable software-based audio effects (DX Effects) and instruments (DXi Instruments) to the platform.
1.3 Windows Driver Model/Kernel Streaming (WDM/KS)
With WDM, both MME and DirectSound audio pass through a component called the kernel audio mixer (commonly known as KMixer). KMixer is a kernel-mode component responsible for mixing all system audio together, but it also introduced delays of about 30 milliseconds, and sometimes more. The WDM/KS scheme was born to eliminate the delay caused by KMixer.
Advantages: The latency can be very low, generally 1 ms to 10 ms at minimum, and in certain circumstances the application can use non-paged memory, direct hardware IRPs and RT scheduling, taking exclusive control of all sound card resources.
Disadvantages:
1) It monopolizes all resources of the sound card, so only that specific application can be heard. When multiple applications are running, the audio of the other applications is inaudible.
2) KS provides no audio input in this scheme, which means the microphone cannot be used.
Note: KMixer was deprecated starting with Vista and Windows 7, and this KS approach does not work on Vista, Windows 7, or later versions.
1.4 Audio Stream Input Output (ASIO)
ASIO was originally a professional-grade audio driver specification for Windows, developed by the German company Steinberg.
Advantages: It provides a high-quality, low-latency data path directly from the application to the sound hardware. Applications that support ASIO can bypass all processing in the Windows audio stack, minimizing the system's response time to audio streams. With ASIO, buffering can be below 10 ms depending on the settings, and below 1 ms in a well-tuned environment.
Disadvantages: If you are trying to use an audio application that only supports ASIO and your sound card is cheap and lacks ASIO support, ASIO is a problem. The actual performance of ASIO also depends on the quality of the driver supplied by the manufacturer.
1.5 Windows Core Audio
When Vista finally hit the shelves in 2007, Windows Core Audio came along with it. Starting with Vista and Windows 7, Microsoft eliminated KMixer and DMA-dependent audio I/O. All the audio APIs you knew and loved were reshuffled onto this new user-mode API, including DirectSound, which at this point completely lost support for hardware-accelerated audio.
Advantages:
1) Low-latency, glitch-resilient audio streams;
2) Improved reliability (many audio features have been moved from kernel mode to user mode);
3) Enhanced security (protected audio content is processed in a secure, low-permission process);
4) Assign specific system-wide roles (console, multimedia, and communications) to individual audio devices;
5) Software abstractions for audio endpoint devices (for example, speakers, headphones, and microphones) that users operate directly.
There are many APIs for capture and playback on Windows, but most of them sit on top of Core Audio. For real-time audio you should use an API that is closer to the hardware (ASIO or Core Audio) to reduce latency, and given ASIO's limitations, Core Audio is the more broadly applicable choice. That is why most existing Windows audio and video communication clients use the Core Audio APIs for capture and playback.
# 2. Core Audio in Detail

Windows Core Audio, not to be confused with OSX's similarly named Core Audio, is a complete redesign of the way audio is handled on Windows. Most audio components moved from kernel mode to user mode, which has had a huge impact on application stability. Almost all of the higher-level Windows audio APIs are built on top of Core Audio.
2.1 Core Audio system framework details
With the release of Core Audio, the audio system framework changed as shown below.
Figure 1 Block diagram of Audio system based on Core Audio
As you can see from the system architecture diagram, Core Audio APIs consist of four APIs — MMDevice, WASAPI, DeviceTopology, and EndpointVolume.
MMDevice API
With the MMDevice API, a client discovers audio endpoint devices, enumerates the available devices and their properties, determines their capabilities, and creates driver instances for them. It is the most fundamental Core Audio API and serves the other three APIs.
WASAPI
Through WASAPI, client applications manage the flow of audio data between the application and an audio endpoint device.
DeviceTopology API
A client can traverse the internal topology of an audio adapter device or an audio endpoint device and step through the connections that link one device to another. Using the interfaces and methods of the DeviceTopology API, clients can directly access the layout features along the data paths inside the audio adapter hardware (for example, the volume controls along the data path to an audio endpoint device).
EndpointVolume API
A client can control and monitor the volume level of an audio endpoint device.
The figure shows a simplified representation of how rendered audio data flows from most applications to the speakers. For capture, the audio data takes exactly the same path but flows in the opposite direction. As you can see from the figure, some high-level APIs (such as MME and DirectSound) encapsulate the Core Audio APIs to make certain application requirements easier to fulfill, but for real-time audio and video you need to reduce latency and use the lower-level APIs.
After passing through the API, an audio stream takes one of two paths to the audio endpoint buffer: shared mode or exclusive mode.
Shared and exclusive modes are a major improvement that Core Audio has brought.
Shared mode
Shared mode has some similarities with the old KMixer model. In shared mode, the application writes to a buffer that is handed to the system's audio engine. The audio engine is responsible for mixing all applications' audio together and sending the mix to the audio driver. As with KMixer, this introduces delay: the audio engine sometimes not only has to convert audio data but must also mix data from multiple shared-mode applications. This takes time, usually a few milliseconds, and in most cases the delay is not noticeable.
Exclusive mode
Exclusive mode
Exclusive mode is Microsoft's answer to the professional audio world. An application in exclusive mode has exclusive access to the hardware, and audio data is transferred directly from the application to the driver to the hardware. Exclusive-mode streaming bypasses the Windows audio engine entirely; it effectively locks out all other applications, and one obvious advantage over shared mode is that the latency introduced by the audio engine disappears because the engine is out of the path.
But the biggest drawback to exclusive streaming is that there is little flexibility in the audio format: only the formats natively supported by the audio adapter can be used, and if a data format conversion is required, the application has to do it itself. It is also worth pointing out that exclusive-mode streaming is not actually guaranteed to be available to the application; it is user configurable, and users can completely disable exclusive-mode audio for a given audio adapter, as shown below:
Figure 2 Audio device attribute diagram
The audio flow in the system diagram ends up in the audio adapter. Audio adapters rarely have a single input and/or output connection; in fact, the audio adapters in most modern consumer PCs support at least three types of connections: headphones, speakers, and microphones.
Throughout this chapter you have seen the phrase audio stream, which refers to a connection between an application and an audio endpoint device.
2.2 Device Management of Core Audio
2.2.1 Enumeration of devices
In an audio and video client's device list, the user typically sees the microphones and speakers available on the computer. As described above, device enumeration is handled by the MMDevice API; with it you can enumerate the devices and their attributes. First, an audio device enumerator instance is obtained through COM, and the device attributes are then obtained through the resulting IMMDeviceEnumerator object.
// Create the device enumerator (COM must already be initialized on this thread).
const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);
IMMDeviceEnumerator* pEnumerator = NULL;
hr = CoCreateInstance(CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL, IID_IMMDeviceEnumerator, (void**)&pEnumerator);

The code above yields an IMMDeviceEnumerator object. With it, the client can directly or indirectly access the rest of the MMDevice API, including IMMDevice, IMMDeviceCollection, and IMMNotificationClient, which notifies the client of audio endpoint device state changes.
// Enumerate the audio endpoints for the given data-flow direction.
IMMDeviceCollection* pCollection = NULL;
hr = pEnumerator->EnumAudioEndpoints(dataFlow, DEVICE_STATE_ACTIVE | DEVICE_STATE_DISABLED | DEVICE_STATE_UNPLUGGED, &pCollection);
// Fetch a single endpoint from the collection by index (release the interfaces when done).
IMMDevice* pEndpoint = NULL;
hr = pCollection->Item(index, &pEndpoint); // index is the device index value

Through the pCollection and pEndpoint objects, the client can call IMMDeviceCollection::GetCount to obtain the number of devices and IMMDevice::GetId to obtain the endpoint device ID. Obtaining the device name is slightly more involved: first get an IPropertyStore object through IMMDevice::OpenPropertyStore, then read the device name with IPropertyStore::GetValue. With these methods, an audio and video client can enumerate the audio endpoint devices present on the current Windows machine and their information.
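As a concrete illustration of the steps just described, here is a minimal sketch that enumerates the active render endpoints and reads each one's friendly name. The function name ListRenderDevices is illustrative, error handling is largely omitted, and PKEY_Device_FriendlyName comes from functiondiscoverykeys_devpkey.h.

```cpp
#include <stdio.h>
#include <mmdeviceapi.h>
#include <functiondiscoverykeys_devpkey.h> // PKEY_Device_FriendlyName

// Sketch: list the active render (playback) endpoints with their IDs and friendly names.
void ListRenderDevices(IMMDeviceEnumerator* pEnumerator)
{
    IMMDeviceCollection* pCollection = NULL;
    if (FAILED(pEnumerator->EnumAudioEndpoints(eRender, DEVICE_STATE_ACTIVE, &pCollection)))
        return;

    UINT count = 0;
    pCollection->GetCount(&count);

    for (UINT i = 0; i < count; i++) {
        IMMDevice* pDevice = NULL;
        if (FAILED(pCollection->Item(i, &pDevice)))
            continue;

        LPWSTR pwszId = NULL;
        pDevice->GetId(&pwszId); // endpoint device ID string

        IPropertyStore* pProps = NULL;
        if (SUCCEEDED(pDevice->OpenPropertyStore(STGM_READ, &pProps))) {
            PROPVARIANT varName;
            PropVariantInit(&varName);
            pProps->GetValue(PKEY_Device_FriendlyName, &varName); // e.g. "Speakers (High Definition Audio)"
            wprintf(L"%u: %s (%s)\n", i, varName.pwszVal, pwszId);
            PropVariantClear(&varName);
            pProps->Release();
        }

        CoTaskMemFree(pwszId);
        pDevice->Release();
    }
    pCollection->Release();
}
```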
2.2.2 Opening a specified device
After enumerating the devices, if the user does not pick a specific device, the client generally selects the system default device as the capture/playback device. For the default device, Core Audio has a dedicated interface: call IMMDeviceEnumerator::GetDefaultAudioEndpoint to open it (a short sketch follows). When the user does pick a specific device, it can be opened through IMMDeviceCollection::Item, as shown above.
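For reference, a minimal sketch of opening the default render endpoint; the role eConsole is just one possible choice, and eMultimedia or eCommunications may suit a given client better:

```cpp
// Sketch: open the default render (playback) endpoint.
IMMDevice* pDevice = NULL;
hr = pEnumerator->GetDefaultAudioEndpoint(
        eRender,   // use eCapture for the default microphone
        eConsole,  // device role: eConsole, eMultimedia or eCommunications
        &pDevice);
```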
2.2.3 Initializing devices
Device initialization is an important step on the worker thread. Here the client creates and initializes an audio stream between the audio/video client and the audio engine (for shared-mode streams) or the hardware buffer of the audio endpoint device (for exclusive-mode streams). To create an IAudioClient object with the desired interface, call IMMDevice::Activate.
// Activate an IAudioClient on the chosen endpoint device.
const IID IID_IAudioClient = __uuidof(IAudioClient);
IAudioClient* pAudioClient = NULL;
hr = pDevice->Activate(IID_IAudioClient, CLSCTX_ALL, NULL, (void**)&pAudioClient);

Once the IAudioClient object has been acquired, the device can be initialized. In the Initialize call the client must specify shared or exclusive mode for the stream, the stream creation flags, the audio data format, the buffer size, and the audio session. Audio and video clients generally choose shared mode, and capture and playback are generally event driven. The default audio data format can be obtained with IAudioClient::GetMixFormat, but that default does not necessarily match the device format parameters the client needs; IAudioClient::IsFormatSupported can then be called to find the most suitable device format for the desired channel count and sampling rate.
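A hedged sketch of that negotiation, assuming the client wants mono 16 kHz 16-bit PCM; the example values and the fallback logic are illustrative, not the only possible strategy:

```cpp
// Sketch: query the engine's mix format, then check a desired format against it.
WAVEFORMATEX* pMixFormat = NULL;
hr = pAudioClient->GetMixFormat(&pMixFormat); // default shared-mode format (free with CoTaskMemFree)

WAVEFORMATEX desired = {};
desired.wFormatTag      = WAVE_FORMAT_PCM;
desired.nChannels       = 1;      // example values: mono, 16 kHz, 16-bit PCM
desired.nSamplesPerSec  = 16000;
desired.wBitsPerSample  = 16;
desired.nBlockAlign     = desired.nChannels * desired.wBitsPerSample / 8;
desired.nAvgBytesPerSec = desired.nSamplesPerSec * desired.nBlockAlign;

WAVEFORMATEX* pClosest = NULL;
hr = pAudioClient->IsFormatSupported(AUDCLNT_SHAREMODE_SHARED, &desired, &pClosest);
if (hr == S_OK) {
    // The desired format can be used directly.
} else if (hr == S_FALSE && pClosest != NULL) {
    // The engine proposed the closest supported format in pClosest.
} else {
    // Fall back to the mix format in pMixFormat.
}
```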
// Initialize the stream in shared mode with event-driven buffering.
hr = pAudioClient->Initialize(
    AUDCLNT_SHAREMODE_SHARED,          // share mode
    AUDCLNT_STREAMFLAGS_EVENTCALLBACK, // event-driven stream
    hnsRequestedDuration,              // requested buffer duration (100-ns units)
    0,                                 // periodicity (0 for shared mode)
    pwfx,                              // negotiated audio format
    NULL);                             // audio session GUID

2.3 Volume management in Core Audio
Volume control for audio devices is primarily provided by the EndpointVolume API. It requires an IAudioEndpointVolume object, which is obtained through the IMMDevice interface.

// Activate the endpoint volume interface on the device.
IAudioEndpointVolume* pEndpointVolume = NULL;
hr = pEndpoint->Activate(__uuidof(IAudioEndpointVolume), CLSCTX_ALL, NULL, (void**)&pEndpointVolume);

The pEndpointVolume object handles both the volume controls and the mute controls.
float fLevel;
// Get the master volume (scalar, in the range 0.0 to 1.0)
pEndpointVolume->GetMasterVolumeLevelScalar(&fLevel);
// Set the master volume (the scalar value must be between 0.0 and 1.0)
fLevel = 1.0f;
pEndpointVolume->SetMasterVolumeLevelScalar(fLevel, NULL);
BOOL mute;
// Get the mute state
pEndpointVolume->GetMute(&mute);
// Set the mute state
mute = FALSE;
pEndpointVolume->SetMute(mute, NULL);

2.4 Core Audio event listening management
2.4.1 Device event listening
Device event listening monitors endpoint device state-change notifications (for example, devices being plugged in, unplugged, enabled, or disabled). By calling IMMDeviceEnumerator::RegisterEndpointNotificationCallback, the client is notified whenever a device's state changes; the callback object and the registration call are shown below.
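The callback passed to RegisterEndpointNotificationCallback must be a COM object implementing IMMNotificationClient. A minimal sketch follows; the class name DeviceNotificationClient is illustrative and the reference counting is simplified for brevity:

```cpp
// Sketch: minimal IMMNotificationClient implementation for device change notifications.
class DeviceNotificationClient : public IMMNotificationClient {
    LONG _refCount = 1;
public:
    // IUnknown
    ULONG STDMETHODCALLTYPE AddRef() override { return InterlockedIncrement(&_refCount); }
    ULONG STDMETHODCALLTYPE Release() override {
        ULONG ref = InterlockedDecrement(&_refCount);
        if (ref == 0) delete this;
        return ref;
    }
    HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, void** ppv) override {
        if (riid == __uuidof(IUnknown) || riid == __uuidof(IMMNotificationClient)) {
            *ppv = static_cast<IMMNotificationClient*>(this);
            AddRef();
            return S_OK;
        }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    // IMMNotificationClient
    HRESULT STDMETHODCALLTYPE OnDeviceStateChanged(LPCWSTR deviceId, DWORD newState) override {
        // e.g. tell the client that a device became active, disabled, not present or unplugged
        return S_OK;
    }
    HRESULT STDMETHODCALLTYPE OnDeviceAdded(LPCWSTR) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnDeviceRemoved(LPCWSTR) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnDefaultDeviceChanged(EDataFlow, ERole, LPCWSTR) override { return S_OK; }
    HRESULT STDMETHODCALLTYPE OnPropertyValueChanged(LPCWSTR, const PROPERTYKEY) override { return S_OK; }
};
```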
// Register a notification client with the enumerator (pClient must implement IMMNotificationClient,
// for example the DeviceNotificationClient sketched above).
IMMNotificationClient* pClient = new DeviceNotificationClient();
pEnumerator->RegisterEndpointNotificationCallback(pClient);

2.4.2 Volume event listening
Volume event listening is implemented through the EndpointVolume API by calling IAudioEndpointVolume::RegisterControlChangeNotify.
// Register a volume change callback; pVolume must point to an object implementing
// IAudioEndpointVolumeCallback (NULL here is only a placeholder).
IAudioEndpointVolumeCallback* pVolume = NULL;
pEndpointVolume->RegisterControlChangeNotify(pVolume);

2.5 Core Audio threading model and call flow
After the device is initialized comes the most important step: exchanging the captured/played data. But how should that data exchange work, and how should capture and playback be organized into a threading model?
2.5.1 Thread model
In real-time audio and video, capture and playback must be both timely and efficient. To prevent the two from blocking each other, a dedicated thread is generally created for each, referred to as the capture/playback thread. To keep other threads from starving them of resources, the priority of the capture/playback threads is generally set to the highest level. Low-frequency operations such as device enumeration and device initialization are generally done on a worker thread, while volume management and event listening, which are driven by user actions, are handled on a user thread.
Figure 3 Schematic diagram of the threads
2.5.2 Capture call flow
Take a look at the capture flow chart.
Figure 4 Flow chart of the capture thread
As the figure shows, microphone capture is event driven. After the device is initialized, a start event is signaled with SetEvent(startEvent) to begin microphone capture, and an IAudioCaptureClient object is obtained; microphone data is then fetched through the IAudioCaptureClient interface. The event plumbing is sketched below, followed by the capture code.
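The snippets that follow do not show how the event is wired up, so here is a hedged sketch of the typical setup and wait loop; hCaptureEvent and the capturing flag are illustrative names, and error handling is omitted:

```cpp
// Sketch: event-driven capture setup and wait loop (simplified).
HANDLE hCaptureEvent = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset event
hr = pAudioClient->SetEventHandle(hCaptureEvent);             // requires AUDCLNT_STREAMFLAGS_EVENTCALLBACK
hr = pAudioClient->Start();                                   // start capturing

bool capturing = true; // cleared by the client when capture should stop
while (capturing) {
    // The audio engine signals the event each time a new packet is ready.
    DWORD waitResult = WaitForSingleObject(hCaptureEvent, 500);
    if (waitResult == WAIT_OBJECT_0) {
        // Drain the available packets with GetBuffer/ReleaseBuffer (see the code below).
    }
}

pAudioClient->Stop();
CloseHandle(hCaptureEvent);
```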
// In the worker thread: get the IAudioCaptureClient object.
IAudioCaptureClient* pCaptureClient = NULL;
hr = pAudioClient->GetService(__uuidof(IAudioCaptureClient), (void**)&pCaptureClient);

// In the capture thread: get the microphone data.
hr = pCaptureClient->GetBuffer(
    &pData,           // packet that is ready to be read
    &framesAvailable, // number of frames in the captured packet (can be zero)
    &flags,           // buffer status flags
    &recPos,          // device position of the first audio frame in the packet
    &recTime);        // performance counter value for the first audio frame
// ProcessCaptureData(&pData); // hand the captured data off for processing
// Release the microphone data.
hr = pCaptureClient->ReleaseBuffer(framesAvailable);

2.5.3 Playback call flow
The flow chart of audio playback is as follows:
Figure 5 Flow chart of the playback thread
Speaker playback is also event driven and is likewise started by signaling a start event. The difference is that the playback side must first check how much of the device buffer is already filled: if the buffer is full, the device is not asked to accept more data for playback. When there is room in the device buffer, a buffer pointer is obtained and data received from the remote end is written to the address it points to; once the buffer is filled, the loudspeaker plays the data. A sketch of this padding check follows, then the worker-thread code.
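A hedged sketch of how that check translates into the number of frames the client may write on each pass; framesToWrite is an illustrative name, and bufferFrameCount comes from IAudioClient::GetBufferSize:

```cpp
// Sketch: compute how many frames can be written to the render buffer on this pass.
UINT32 bufferFrameCount = 0;
hr = pAudioClient->GetBufferSize(&bufferFrameCount); // total size of the endpoint buffer

UINT32 padding = 0;
hr = pAudioClient->GetCurrentPadding(&padding);      // frames still queued for playback

UINT32 framesToWrite = bufferFrameCount - padding;   // free space available right now
if (framesToWrite > 0) {
    BYTE* pData = NULL;
    hr = pRenderClient->GetBuffer(framesToWrite, &pData);
    // ... fill pData with framesToWrite frames of audio received from the remote end ...
    hr = pRenderClient->ReleaseBuffer(framesToWrite, 0);
}
```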
// In the worker thread: get the IAudioRenderClient object.
IAudioRenderClient* pRenderClient = NULL;
hr = pAudioClient->GetService(__uuidof(IAudioRenderClient), (void**)&pRenderClient);

// In the playback thread: check how much of the endpoint buffer is still queued
// (note that GetCurrentPadding belongs to IAudioClient).
UINT32 padding = 0;
hr = pAudioClient->GetCurrentPadding(&padding);
// Get a pointer to the next playBlockSize frames of the render buffer and fill it.
hr = pRenderClient->GetBuffer(playBlockSize, &pData);
RequestPlayoutData(&pData); // fill the buffer with data from the remote end
// Release the speaker data.
DWORD dwFlags(0);
hr = pRenderClient->ReleaseBuffer(playBlockSize, dwFlags);
# 3. Precautions when using Core Audio
3.1 Windows has its own timing clock. During capture, because of clock precision issues, the amount of data delivered per callback can differ from the nominal frame size. For example, at a sampling rate of 44,100 Hz a 10 ms callback should nominally deliver 441 frames (44,100 × 0.01), but in a given clock period the callback may actually deliver, say, 448 frames.
3.2 If the capture/playback thread is blocked, its processing time grows and the data retrieved for capture or playback becomes discontinuous. On the playback side, the user hears stuttering.
3.3 When GetMixFormat is used to obtain the default device format, the format is usually returned as a WAVEFORMATEX structure. However, WAVEFORMATEX has limitations: for device formats with more than two channels, higher bit-depth precision, or newer compression schemes, Microsoft recommends WAVEFORMATEXTENSIBLE for better support. IsFormatSupported can return different results for WAVEFORMATEX and WAVEFORMATEXTENSIBLE on some device drivers, so to determine the device format reliably, Microsoft recommends probing IsFormatSupported with both WAVEFORMATEX and WAVEFORMATEXTENSIBLE; a sketch of the extensible structure follows.
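As a hedged illustration of the structure involved, here is a sketch that fills in a WAVEFORMATEXTENSIBLE for 2-channel, 48 kHz, 16-bit PCM; the field values are only an example, and KSDATAFORMAT_SUBTYPE_PCM comes from ksmedia.h:

```cpp
#include <mmreg.h>
#include <ks.h>
#include <ksmedia.h> // KSDATAFORMAT_SUBTYPE_PCM, SPEAKER_* masks

// Sketch: describe 2-channel, 48 kHz, 16-bit PCM as a WAVEFORMATEXTENSIBLE.
WAVEFORMATEXTENSIBLE wfx = {};
wfx.Format.wFormatTag           = WAVE_FORMAT_EXTENSIBLE;
wfx.Format.nChannels            = 2;
wfx.Format.nSamplesPerSec       = 48000;
wfx.Format.wBitsPerSample       = 16;
wfx.Format.nBlockAlign          = wfx.Format.nChannels * wfx.Format.wBitsPerSample / 8;
wfx.Format.nAvgBytesPerSec      = wfx.Format.nSamplesPerSec * wfx.Format.nBlockAlign;
wfx.Format.cbSize               = sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX); // 22
wfx.Samples.wValidBitsPerSample = 16;
wfx.dwChannelMask               = SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT;
wfx.SubFormat                   = KSDATAFORMAT_SUBTYPE_PCM;

// Probe it the same way as a plain WAVEFORMATEX, as recommended above.
WAVEFORMATEX* pClosest = NULL;
hr = pAudioClient->IsFormatSupported(AUDCLNT_SHAREMODE_SHARED,
                                     reinterpret_cast<WAVEFORMATEX*>(&wfx),
                                     &pClosest);
```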
3.4 Audio devices have other settings as well, such as the built-in AEC (acoustic echo cancellation). The built-in AEC is configured with additional functionality through the Codec DMO interface, and the DMO may affect support for some device formats.
References:
1. Practical Digital Audio for C++ Programmers
2. Core Audio APIs – Win32 apps
3. Configuring Codec DMOs – Win32 apps
The story of why and how Core Audio .NET came to be, in three chapters
Chapter 1: The Good Ol’ Audio Mixer API
Up until Vista, Windows used the same API from Windows 95 to let third-party programs consume a series of resources (exposed by the drivers) that allowed them to discover and manipulate all the available settings on a sound card.
This API was known as the Audio Mixer API and it relied on the ability to first, enumerate all the available mixers (references to the sound cards, as exposed by their drivers) and then query them for the lines they exposed. Each line then had its own set of objects, called controls, which were a representation of the physical properties on the sound card, such as the volume of the microphone, the mute state of the CD line, etc.
A control could also expose a series of items that provided access to some advanced feature on the sound card, such as a loudness setting, or peak meter, or some other resource that was exposed by the sound card’s driver.
Sounds simple, right? Well let me tell you that it was anything but.
This API was awful, disorganized, complicated and, worst of all, back in the day it lacked documentation. It wasn't that the documentation was scarce, poorly written or short on samples; it simply didn't exist. There were mentions here and there about the members, structures and other topics, but there was nothing cohesive about it.
Microsoft’s mixapp
I’m sure Microsoft partners and affiliates of some sort had access to some privileged documentation, but we (the freelance developers) did not.
So, if you wanted to implement some sort of, for example, volume control on your application your best bet was to get a copy of the Windows SDK and check the source code for an application that it included, appropriately named mixapp.
This little program (just 61 KiB in size) was made available as a compiled binary accompanied by its C source code. Unfortunately, the code wasn’t documented!
Anyway, the little mixapp program did everything I mentioned before:
- Enumerate all the mixers
- Enumerate their lines
- Enumerate each line’s controls
- And, finally, it exposed (if any) the item controls inside each control
I had already spent a considerable amount of time fiddling with the API (without any valuable results), and seeing this (so damn small) program do what it did seemed like magic to me.
Fast forward three weeks (yep, that’s what it took me to fully understand how the API worked) and I had the first working prototype… in VB6.
Chapter 2: EQPro
The first incarnation of the wrapper I created (to allow VB6 programs to access the mixer API) was called EQPro.
EQPro, released on October 5th 1998, was an ActiveX control that let you manipulate, through a Slider, a specific volume control from any line in any of the installed sound cards. It was a hit. Actually, it was the very first program I sold on the Internet.
Then on June 26th 1999 the first wrapper for the Audio Mixer API was sold for US$75.00
For me, EQPro was more than enough as it provided all the functionality I needed for one application I was working on: the very first dual MP3 player intended for DJs. Yep, you read that right. I was, probably, the first to ever develop such an application.
xmPlayer, as in Xavier and Marino Player
But the control sold like crazy and its users wanted a lot more control, more flexibility and, above everything else, the ability to have full access to all the resources the sound card's drivers could provide.
That’s how EQPro 2.0 came to be.
Version 2.0 was no longer a user control, instead, it was distributed as a class library that exposed all the functionality in the Audio Mixer API.
From that moment on, full control over every single resource on a sound card was accessible to anyone developing in VB6.
Then, on October 8th 2003, EQPro was discontinued in favor of MixerPro, which was a considerable enhancement over the way the wrapper interfaced with the API, providing a set of useful functions to easily perform advanced queries to detect and manipulate all the resources exposed by the driver of any sound card. MixerPro made the Audio Mixer API so accessible that even AudioScience, makers of some of the most recognized professional grade audio boards in the market, decided to use it for their own mixer-related applications. If you would happen to have owned an ASI5111 PCI Sound Card (for example), then the mixer app distributed with this sound card was developed using MixerPro.
And MixerPro remained, mostly unchanged, as the most powerful ActiveX library for manipulating a sound card’s mixer for almost a decade until Microsoft released Windows Crap… I mean Vista.
Chapter 3: …and support for Core Audio was implemented
On November 10th 2004 I released a .NET version of MixerPro, called MixerProNET.
The .NET version was nothing more than a port of the VB6 version to .NET with some minor enhancements.
Fast forward 3 years and Microsoft releases Windows Vista. Among many annoyances, Vista implements a completely new mixer API which, according to Microsoft, should enhance and enrich the end-user experience… when… using the volume from the taskbar?
Like it or not, Vista, among other things, was Microsoft’s attempt to prove to the media industry that it was a safe platform for media consumption and that its users wouldn’t be able (or at least, not as easily as with XP) to steal copy protected content. Sure there are those who claim this isn’t true but I guess they aren’t developers.
The old API was convoluted but it worked and I’ve never heard any sound card manufacturer complain about it. So, why was it modified so drastically? What was the point? Who had the necessity?
And if I’m wrong, why is the infamous “Stereo Mix” device disabled by default under Vista (and Windows 7)?
Anyway, this wasn’t supposed to be a rant but an explanation of how and why MixerProNET got to have Core Audio support.
The why is easy to explain: simply because the legacy support that Microsoft left in Vista and 7 sucks. Any application relying on the Audio Mixer API under post-Vista Windows is as good as useless.
Just take a look at how the nice little mixapp behaves under Windows 7:
mixapp running under Windows 7
Where do I start…?
- First of all, mixapp cannot properly enumerate all the mixers
- The Speakers line only exposes a volume and a mute switch. This is incorrect, since the same sound card under other operating systems exposes additional controls
- The volume control is exposed as a monophonic control, even though I know perfectly well that my sound card is quite capable of independently controlling the output volume for all the channels it supports. Why else would the proprietary mixer application distributed with the sound card expose a balance control?
So what's happening here? Microsoft introduced what are known as sessions, which are nothing more than instances of programs that, in one way or another, are streaming or capturing audio through a sound card. The consequence is that the legacy mixer APIs now apply only to the current session.
Useful? Well, maybe… it's kind of nice, as an end-user, to be able to control the volume of the System Sounds independently from other programs, but that doesn't excuse the way the legacy support was implemented.
Why didn’t Microsoft simply extend the legacy API to support sessions?
When Vista was released, I learned all about these changes and I decided that I wouldn’t lift a finger to add support for the new APIs.
But then came Windows 7.
It was a pleasure to use my computer again — streamlined, smooth, fast, organized, consistent… what else can we say about this marvelous piece of technology?
Finally! Microsoft got it right.
So I guess I felt encouraged again and I began to study the new API.
A week or so later I decided that it just wasn’t worth it — sure it had many new exciting features but, again, it was unnecessarily convoluted as hell.
Fast forward a couple of years… until one day, sitting in front of my computer with nothing better to do, I decided to give it another chance.
This time, instead, I did a search to see what others had done about it and found almost nothing useful, except for an attempt to create a full Core Audio wrapper in C# by Ray Molenkamp on CodeProject.
After downloading his library and analyzing it I realized that he had already done all the hard work. The library wasn’t MixerProNET ready yet, but it was close. So, in just one sitting I implemented all the interfaces that were required for MixerProNET to be fully Vista and Windows 7 compliant. Of course, thanks to the marvelous job already done by Ray.
In less than 48 hours I had implemented all the required interfaces to query and manipulate any sound card, under a post-Vista version of Windows.
The next step was to enhance MixerProNET so that it could support the new Core Audio API and this was done by adding a series of new classes that interfaced the functions in the Core Audio library.
And that’s how MixerProNET 2.0 came to be.
MixerProNET 2.0 exposes a two-tier abstraction layer over the actual API: the first by the Core Audio .NET library and the second through the CCoreAudio class implemented in MixerProNET. This provides the end-user (one who has already had some experience with previous versions of the MixerProNET library) with a familiar interface of classes, collections and methods.
Actually, those who have never used MixerProNET will find that the way it exposes the Core Audio APIs is extremely simple and convenient, since the code behind the control does all the hard, complicated and convoluted work for you.
Let me give you an example:
A mixer device, in Core Audio, may expose what’s known as an IAudioEndPointVolume for controlling its main volume, with its own specific callbacks (for notification changes).
The same mixer may expose several additional controls (such as a control for the Line-In volume) of type IAudioVolumeLevel (among others, depending on the type of control).
A session exposes an ISimpleAudioVolume. Yet another interface to control both volume and mute states.
Why all these different interfaces to perform the exact same action? Who knows. But the fact is that MixerProNET's implementation exposes a single object regardless of its source, allowing you to query and manipulate it without having to worry about where the object came from.
Basically, this means that if you want to mute a control (a control that supports muting, that is) you just set a “Mute” property to “True” and you are done. It’s that easy.
The next version of MixerProNET will include a series of WinForm user controls that mimic the look & feel and behavior of the controls Microsoft uses in their mixer-related applications.
Here’s an example:
MixerProNET 2.5
I hope you haven’t just scrolled all the way down here without having read the article, as it explains a lot about what you are about to download.
- MixerProNET
- Core Audio .NET Wrapper (11906 downloads )
- GitHub Repository
Core Audio is a low-level API for dealing with sound in Apple’s macOS and iOS operating systems. It includes an implementation of the cross-platform OpenAL.[1]
Core Audio
| | |
|---|---|
| Developer(s) | Apple Inc. |
| Initial release | 2003 |
| Stable release | 3.2.6[citation needed] |
| Operating system | macOS, iOS |
| Type | Developer library |
| License | Proprietary |
| Website | https://developer.apple.com/documentation/coreaudio |
Apple’s Core Audio documentation states that «in creating this new architecture on Mac OS X, Apple’s objective in the audio space has been twofold. The primary goal is to deliver a high-quality, superior audio experience for Macintosh users. The second objective reflects a shift in emphasis from developers having to establish their own audio and MIDI protocols in their applications to Apple moving ahead to assume responsibility for these services on the Macintosh platform.»[2]
It was introduced in Mac OS X 10.0 (Cheetah).[3]
Core Audio supports plugins, which can generate, receive, or process audio streams; these plugins are packaged as a bundle with the extension .component.[4]
- Audio Units
- PulseAudio
- ^ «Core Audio Overview: OpenAL (Open Audio Library)». Apple Inc. February 11, 2014. Retrieved January 28, 2015.
- ^ «Audio and MIDI on Mac OS X» (PDF). Apple Computer. May 29, 2001. p. 7. Archived from the original (PDF) on March 11, 2008. Retrieved August 13, 2020.
- ^ «Apple Developer Documentation».
- ^ Singh 2006, p. 78.
- Singh, Amit (June 19, 2006). Mac OS X Internals: A Systems Approach. Addison-Wesley Professional. ISBN 978-0-13-270226-3.
- Apple’s Core Audio page