Posted By Mike Ash on January 12th, 2009
In his post announcing Pulsar, Paul mentioned that it was our first public exposure of our new technology called AHKit2. I’m going to talk a bit about what it is, why it’s here, and what it will do for you.
AudioHijackKit is the full name of one of our internal frameworks, and it’s called AHKit for short. AHKit is the high-level audio processing framework that drives Audio Hijack Pro, Airfoil, and Nicecast.
Despite the name, the framework is not only for “hijacking”, the term we use to describe grabbing the audio out of any application on your system. AHKit also provides facilities for grabbing audio from input devices, applying audio effects, showing level meters, recording to a file, and some support functions like timed actions. Pretty much every feature you see in Audio Hijack Pro’s session view corresponds directly to a feature in AHKit.
Why a new version?
AudioHijackKit still works just fine, but it was starting to feel a little limited. The main problem is that it has a very linear idea of audio processing. It processes audio with a pipeline: audio goes in one end and comes out the other, used or altered by each stage in the pipeline. For example, Airfoil’s pipeline looks something like this:
Audio Source -> License Enforcer -> Effects -> Speakers -> Output Meter
Audio starts off with the license enforcer, which is the module that overlays noise after 10 minutes if you haven’t unlocked the full version with a license key. From there, it goes to the audio effects, where it is adjusted by the equalizer and other controls. After going through the effects, the resulting audio goes out to the local and remote speakers. Finally, it goes to the output meter, which is that little moving levels bar at the bottom of the Airfoil window.
An astute reader might now wonder: why does the audio have to pass through the speakers before going to the meter? Can’t it just go straight to both? That’s due to the linear nature of the pipeline. We could switch the order around, to go to the meters and then the speakers, but that doesn’t really make any more sense. Ultimately, AHKit requires that the audio flow in a straight line, and it can’t just go out of one module and go into two others at the same time.
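To make the linear constraint concrete, here is a minimal Python sketch of a pipeline. It is not AHKit code: the names run_pipeline, effects, and PassThroughMeter are hypothetical, and audio is modeled as a plain list of floats. Each stage sees only what the previous stage handed it, so a meter has to sit inline and pass the audio along unchanged.

```python
from typing import Callable, List

Samples = List[float]
# Hypothetical stage type: takes a buffer, returns a buffer.
Stage = Callable[[Samples], Samples]


def run_pipeline(stages: List[Stage], samples: Samples) -> Samples:
    """Feed samples through each stage in order. Every stage sees only
    the output of the stage immediately before it."""
    for stage in stages:
        samples = stage(samples)
    return samples


def effects(samples: Samples) -> Samples:
    # Stand-in for the equalizer and other controls: halve the volume.
    return [s * 0.5 for s in samples]


class PassThroughMeter:
    """A meter in a linear pipeline must return the audio it receives,
    because the pipeline has no way to branch."""

    def __init__(self) -> None:
        self.peak = 0.0

    def __call__(self, samples: Samples) -> Samples:
        self.peak = max((abs(s) for s in samples), default=0.0)
        return samples


meter = PassThroughMeter()
output = run_pipeline([effects, meter], [0.5, -1.0, 0.25])
```

Reordering the list changes what the meter measures, but there is no arrangement in which one stage feeds two others at once.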
Now compare this with the graph from Pulsar. This is taken directly from a comment in Pulsar’s source code:
This follows a much more logical path. Audio comes in from the internet through the streamer. From there it goes straight to our meters. It also goes straight from the streamer to the license enforcer, and from there through the volume control to the output, which plays it through your speakers.
Of course this could have been done with the old-style pipeline too, by putting the meters inline with the rest and having them output the audio that they receive. But the idea is that this is much cleaner and simpler, and for more complex cases it opens up far greater possibilities.
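The branching described above can be sketched in a few lines of Python. This is an illustration only, not Pulsar or AHKit2 code: the Node class, its connect and push methods, the received list, and the gain value are all assumptions made up for the example. The point is simply that one node can feed any number of downstream nodes.

```python
from typing import List


class Node:
    """Hypothetical graph node: processes a buffer, then forwards the
    result to every connected downstream node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.downstream: List["Node"] = []
        # Kept only so the sketch can be inspected afterwards.
        self.received: List[List[float]] = []

    def connect(self, other: "Node") -> "Node":
        self.downstream.append(other)
        return other  # returning the target allows chained connect() calls

    def process(self, samples: List[float]) -> List[float]:
        return samples  # default: pass audio through untouched

    def push(self, samples: List[float]) -> None:
        self.received.append(samples)
        result = self.process(samples)
        for node in self.downstream:
            node.push(result)


class Volume(Node):
    def __init__(self, name: str, gain: float) -> None:
        super().__init__(name)
        self.gain = gain

    def process(self, samples: List[float]) -> List[float]:
        return [s * self.gain for s in samples]


streamer = Node("Streamer")
meters = Node("Level Meters")
enforcer = Node("License Enforcer")
volume = Volume("Volume", gain=0.5)
output = Node("Output")

streamer.connect(meters)  # branch 1: straight to the meters
streamer.connect(enforcer).connect(volume).connect(output)  # branch 2

streamer.push([1.0, -0.5])
```

After the push, the meters have seen the unmodified stream while the output has seen the volume-adjusted one, and neither branch had to pass audio through the other.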
Audio Hijack Pro offers a huge, powerful, and intimidating effects tab for every session. Using that effects tab you can mix in audio from multiple sources, apply effects to pieces of them, and pretty much just run your own mixing board. It’s a lot more flexible than this AHKit pipeline I’ve been discussing, so how does that work? It ends up doing something like this:
Input -> Effects -> Output
To AHKit the effects appear as a single monolithic module, and that module then implements all of these fancy patches and routing and such inside itself.
When we started building AHKit2, we thought about the effects system and we wondered why we needed two different systems at all. Why not just integrate everything together into one? So that’s exactly what we did. AHKit2 provides functionality similar to the Audio Hijack Pro effects tab, although even more powerful and flexible, and uses that functionality as the basis for everything.
In AHKit, there is a sort of artificial partition between effects and the rest of the processing stream. Effects actually belong to a separate framework called sw4fx, which in turn loads every effect from an external plugin. The plugin architecture used by sw4fx is rather difficult to work with, so we only rarely build new plugins. If we need to add effects processing to an application (for example, to add a balance control) and that application doesn’t already use sw4fx, it’s a bunch of trouble to add sw4fx, hook everything up, get the plugins installed in the right place, and get it all up and running.
In AHKit2, in contrast, everything is a “node”, which is an Objective-C class with a pretty simple interface. Creating new effects is as simple as creating a new node subclass. Any AHKit2-using application can use any AHKit2 node with ease. This should make it much easier for us to extend our software in the future, which in turn means that you get to see more great new features from us.
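As a rough idea of what “a new effect is just a new node subclass” might look like, here is a sketch of a balance control. Real AHKit2 nodes are Objective-C classes whose interface isn’t shown in this post, so everything here is an assumption: the example is in Python for brevity, and the Node base class, the process method, and the balance parameter are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[float, float]  # one (left, right) sample pair


class Node:
    """Hypothetical stand-in for a node base class: one method to override."""

    def process(self, frames: List[Frame]) -> List[Frame]:
        return frames


class BalanceNode(Node):
    """Attenuates one channel. balance runs from -1.0 (full left) to
    1.0 (full right); 0.0 leaves both channels untouched."""

    def __init__(self, balance: float = 0.0) -> None:
        if not -1.0 <= balance <= 1.0:
            raise ValueError("balance must be between -1.0 and 1.0")
        self.balance = balance

    def process(self, frames: List[Frame]) -> List[Frame]:
        # Moving right turns the left channel down, and vice versa.
        left_gain = 1.0 - max(self.balance, 0.0)
        right_gain = 1.0 + min(self.balance, 0.0)
        return [(left * left_gain, right * right_gain) for left, right in frames]


panned = BalanceNode(balance=0.5).process([(1.0, 1.0), (0.5, -0.5)])
```

The whole effect is one small class with one overridden method, with no plugin to build or install.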
AHKit2 got its public debut as part of Pulsar, which serves as a good testbed for the basic functionality. We’re learning how it works out in the real world and, yes, we’re finding some new bugs.
Pulsar uses only a small portion of AudioHijackKit2’s capabilities, but we have big plans. Imagine setting up a graph like this:
[Graph diagram; legible nodes: Level Meters, MP3 Recorder, MP3 Recorder, Net Streamer]
This gives you a nice interview-recording setup, with each side of the conversation going into a separate file, and the whole thing being mixed together and sent out to a remote computer over the network in real time.
AHKit2 doesn’t have all of these features yet, but the foundation is there. And while we can’t make any specific promises at this point, we hope that it will enable some great new features in our applications in the months and years to come.