Beginning Game Audio Programming [Electronic resources] (text version)

THE BEGINNINGS OF AN AUDIO ENGINE

Finally, no more concepts to learn for this chapter! Now you can start learning how to put together the first version of your audio engine. Check out the Ch3p1n_WAVPlayback sample program, which uses our budding audio engine to play a WAV file.

Exceptions in the Audio Engine

I know, you've just come through my exception handling rules and are probably sick of me rambling on and on about exceptions. But this is where it gets interesting: in this section you'll put all that theory to use and build the error-handling logic of your engine.

All of the code I'm about to explain resides in the ErrorHandling.h and ErrorHandling.cpp source files, if you want to follow along.

The CError Object

It all starts with a CError object. Most of the time, when programmers use exceptions, they don't throw primitive data types like strings or integers. Instead, they throw error classes, which contain not only the error, but some diagnostic information and context about why and when the error occurred. This book's audio engine stores that information in a CError object, outlined here:


class CError
{
public:
    CError(HRESULT hr, std::string err, std::string filename, int line) {
        SetFile(filename);
        SetError(err);
        SetLine(line);
        if (hr) SetReason(DXGetErrorString8(hr));
    }
    virtual ~CError() { }
    std::string GetFile() { return(m_File); }
    void SetFile(std::string f) { m_File = f; }
    std::string GetError() { return(m_Error); }
    void SetError(std::string f) { m_Error = f; }
    int GetLine() { return(m_Line); }
    void SetLine(int l) { m_Line = l; }
    std::string GetReason() { return(m_Reason); }
    void SetReason(std::string f) { m_Reason = f; }
    std::string GetMessageBoxString();
protected:
    std::string m_File;
    int m_Line;
    std::string m_Error;
    std::string m_Reason;
};

As you can see, CError is a pretty simple class. It keeps track of the filename of the source file where the error occurred (m_File), the line number within that source file (m_Line), an error string displayable to the user (m_Error), and an internal diagnostic message (m_Reason). It has accessors for all of its members, a shortcut constructor that will automatically call DXGetErrorString8 if needed, and a GetMessageBoxString member that formats the variables into a user-displayable text message.
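The body of GetMessageBoxString isn't shown in this section, so here is a minimal sketch of what such a formatter might look like. The ErrorInfo struct and the FormatForMessageBox function are hypothetical stand-ins (not the engine's code), chosen so the example compiles without any DirectX headers, and the message layout is an assumption.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for CError: same four fields, no DirectX dependency.
struct ErrorInfo {
    std::string file;    // source file where the error was raised
    int         line;    // line number within that file
    std::string error;   // message suitable for showing to the user
    std::string reason;  // internal diagnostic text (may be empty)
};

// Builds a multi-line message. The layout is an assumption for this sketch.
std::string FormatForMessageBox(const ErrorInfo& e)
{
    std::ostringstream out;
    out << "Error: " << e.error << "\n";
    if (!e.reason.empty())
        out << "Reason: " << e.reason << "\n";   // only when a reason exists
    out << "Location: " << e.file << ", line " << e.line << "\n";
    return out.str();
}
```

Skipping the reason line when it is empty mirrors the constructor's behavior, which only fills in m_Reason when an HRESULT was supplied.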

Error Handling Defines

I've written some #define magic to make dealing with CErrors a little easier. I based the #define code you're about to see on an excellent article by Steve Rabin called "Squeezing More Out of Assert," published in Game Programming Gems 1 (ISBN 1-58450-049-2), an excellent book I recommend for any serious game programmer (and not just because I wrote a couple of articles for it!).

Inside ErrorHandling.h are two defines called ThrowIfFailed and Throw (notice the capital T to avoid clashing with the throw keyword). I'll explain Throw first because it's easier:


#define Throw(err) { if (!ThrowCError(NULL, err, __FILE__, __LINE__)) { _asm { int 3 } } }

Throw is just shorthand for a call to the ThrowCError global function. This is how CError gets its m_File and m_Line variables—the #define automatically supplies the current filename and current line number, thanks to the built-in __FILE__ and __LINE__ macros.

The ThrowCError function is responsible for creating the CError object:


bool ThrowCError(HRESULT hr, std::string err,
    std::string filename, int line)
{
    CError e(hr, err, filename, line);
    std::string displaystr = e.GetMessageBoxString();
    displaystr += "Press ABORT to end the program, RETRY to debug, IGNORE to throw the error.";
    int result = MessageBox(NULL, displaystr.c_str(),
        "ErrorHandling.cpp", MB_ABORTRETRYIGNORE | MB_ICONSTOP);
    switch(result) {
        case IDABORT: // end the program immediately
            exit(-1); // could also use abort() here
        case IDRETRY: // break into the debugger
            return(false);
        case IDIGNORE: // continue as usual (throw the error)
            throw(e);
    }
    return(true); // just to avoid compiler warnings
}

ThrowCError's job is to present a message box with Abort, Retry, and Ignore buttons (see Figure 3.4). If the user selects Abort, the program exits immediately. If he chooses Ignore, the code throws the error as if the message box was never displayed. If he selects Retry, the function returns false. Look back at the Throw define: if ThrowCError returns false, we execute (in assembly) the int 3 instruction. The int 3 instruction fires off the debug interrupt, which, when the program's running under a debugger, causes execution to break immediately.

Figure 3.4: The message box presented by our error-handling code.

So, why not just put the int 3 directly into ThrowCError? If it were part of ThrowCError, whenever you hit Retry, you'd break into the code inside the ThrowCError function, and you'd have to move up the stack one spot to see what actually threw. With the int 3 as part of the define, when you break, you're taken immediately to the source of the error. It's a neat trick, and one for which I must give credit to Steve Rabin.
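To make the shape of this trick easier to see in isolation, here is a portable sketch. Everything in it is a stand-in invented for illustration: SketchError plays the role of CError, the g_nextChoice variable replaces the message box, and a counter replaces the actual int 3 so the example runs anywhere. It is not the engine's code.

```cpp
#include <string>

// Hypothetical stand-in for CError.
struct SketchError {
    std::string error;
    std::string file;
    int         line;
};

// Stand-in for the user's message box answer (Abort is left out here).
enum class Choice { Retry, Ignore };

static Choice g_nextChoice = Choice::Ignore;  // what the "user" will pick
static int    g_breakCount = 0;               // how many times we "broke"

// Returns false to ask the calling macro to break; otherwise throws.
bool ThrowSketchError(const std::string& err, const char* file, int line)
{
    SketchError e{err, file, line};
    if (g_nextChoice == Choice::Retry)
        return false;   // the macro breaks at its own expansion site
    throw e;            // Ignore: propagate the error as an exception
}

// The break request lives in the macro, so it fires in the caller's frame.
// __FILE__ and __LINE__ expand where the macro is used, not where it is
// defined, which is how the error learns its origin.
#define SKETCH_THROW(err)                                           \
    do {                                                            \
        if (!ThrowSketchError((err), __FILE__, __LINE__))           \
            ++g_breakCount; /* a debug build would break here */    \
    } while (0)
```

The do/while(0) wrapper is a substitution for the plain braces used in the define above; it is a common idiom that makes the macro behave like a single statement when followed by a semicolon.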

Now that you understand Throw, here's the more complicated ThrowIfFailed:


#define ThrowIfFailed(result, err) { if(FAILED(result)) { if (!ThrowCError(result, err, __FILE__, __LINE__)) { _asm { int 3 } } } }

This define does what Throw does, but only if the HRESULT you give it is an error code. If FAILED(result) returns true, the code passes the HRESULT to ThrowCError, which in turn gives it to the CError constructor. The constructor looks up the code and stores its corresponding error string in m_Reason.
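Here is a portable sketch of the same idea, under a few stated assumptions: HResult, Failed, HrError, and the k-prefixed constants are stand-ins made up for this example so that it builds without windows.h (on Windows, FAILED simply tests whether the HRESULT is negative). One deliberate difference from the define above is that this version evaluates its first argument only once, which matters if you pass a function call rather than a variable.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-ins for HRESULT and a few well-known codes.
using HResult = std::int32_t;
constexpr HResult kOk    = 0;                                  // like S_OK
constexpr HResult kFalse = 1;                                  // like S_FALSE
constexpr HResult kFail  = static_cast<HResult>(0x80004005u);  // like E_FAIL

// Same test the FAILED macro performs: failure codes are negative.
inline bool Failed(HResult hr) { return hr < 0; }

// Hypothetical error type carrying the code and the throw location.
struct HrError : std::runtime_error {
    HrError(HResult code, const std::string& msg, const char* f, int l)
        : std::runtime_error(msg), hr(code), file(f), line(l) {}
    HResult     hr;
    std::string file;
    int         line;
};

// Evaluates `result` exactly once, then throws only for failure codes.
#define SKETCH_THROW_IF_FAILED(result, err)                              \
    do {                                                                 \
        const HResult hr_ = (result);                                    \
        if (Failed(hr_)) throw HrError(hr_, (err), __FILE__, __LINE__);  \
    } while (0)
```

Note that success codes other than zero (the S_FALSE stand-in here) pass through without throwing, just as they would with FAILED.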

These two defines come in very handy, so you'll see them sprinkled liberally throughout the audio engine. All together, this small error-handling system gives you a lot of robustness. Sure, you could always add more things to it, but I've found that just this simple scheme really helps when something goes wrong.

Building CAudioManager

Exception handling toolbox firmly in hand, you're now ready to start writing CAudioManager, the class that will manage all of your game's audio. The biggest engine begins with a single header file, and this one is no exception. Here's the class declaration for CAudioManager:


class CAudioManager
{
public:
    CAudioManager();
    virtual ~CAudioManager();
    void Init(HWND hwnd, bool stereo = true, int perfchannels = 128);
    void UnInit();
    CSoundPtr LoadSound(std::string filename);
    IDirectMusicPerformance8* GetPerformance() { return(m_Performance); }
protected:
    bool m_InitGood;
    IDirectMusicLoader8* m_Loader;
    IDirectMusicPerformance8* m_Performance;
    static const int m_PerfChannels;
};

Obviously, the finished engine will have more methods than this, but this is a good start. You can see the methods for initializing and un-initializing the engine, as well as loading a sound. There's also a way to get the performance interface directly.

Internally, CAudioManager keeps track of the performance and loader interfaces, as well as an m_InitGood flag that keeps track of whether the manager has been initialized properly. The m_PerfChannels constant represents the number of performance channels, also known simplistically as the number of sounds you can play at once.

Tip

Your CD contains several source code snapshots of the audio engine as it stands for each chapter. That is, the audio manager files in the Chapter 4 directory contain this chapter's functionality, plus the Chapter 4 enhancements. This allows you to follow along as I add features to the engine, and gives you several different points along the code history at which you can diverge and start writing things your way.

If you want to take this book's audio engine and run with it as is, use the finished audio engine contained in the AudioEngine folder of your CD. It has all the features.

Init

CAudioManager uses two-phase creation. Two-phase creation is a programming term meaning that to create an object requires not only a constructor call, but also a call to an initialize method. I like to call this method Init, but I've seen other programmers call it Create, Setup, Initialize, or Reset.

Two-phase creation is a smart idea for a couple of reasons. First, it's great for consoles and low-memory platforms like PDAs. On these devices, you want to minimize the number of times your game allocates memory. So, you can use two-phase creation to allocate every object your game will ever need, once, as your program begins. Once the actual game starts, no calls to new are allowed. Instead, the code simply calls the object's Init method, which restores the object to its pristine state. It's like getting a new object without all the fuss of memory allocation.

This book's audio engine doesn't really care about that, but it does care about the second reason to use two-phase creation: control over when an object gets created. To make the engine as flexible as possible, it allows you to control exactly when it gets initialized. So, the engine uses two-phase creation.
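To illustrate the first reason (recycling objects instead of allocating), here is a minimal sketch of two-phase creation with a fixed pool. Particle and ParticlePool are hypothetical classes invented for this example; they are not part of the audio engine.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical game object using two-phase creation.
class Particle {
public:
    // Phase 1: a cheap constructor that leaves the object inactive.
    Particle() : m_Active(false), m_X(0.0f), m_Life(0) {}

    // Phase 2: (re)initialize in place, with no memory allocation.
    void Init(float x, int life) { m_X = x; m_Life = life; m_Active = true; }
    void UnInit() { m_Active = false; }

    bool  IsActive() const { return m_Active; }
    float X() const { return m_X; }

private:
    bool  m_Active;
    float m_X;
    int   m_Life;
};

// Fixed pool: all storage is allocated once, when the pool is built.
class ParticlePool {
public:
    explicit ParticlePool(std::size_t count) : m_Items(count) {}

    // Returns a recycled object, or nullptr if every slot is in use.
    Particle* Acquire(float x, int life) {
        for (Particle& p : m_Items) {
            if (!p.IsActive()) { p.Init(x, life); return &p; }
        }
        return nullptr;
    }

private:
    std::vector<Particle> m_Items;
};
```

After the pool is constructed, acquiring and releasing particles never touches the heap; a released slot is simply handed out again by the next Acquire call.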

The Init method looks like this:


void CAudioManager::Init(HWND hwnd, bool stereo, int perfchannels)
{
    HRESULT hr;
    // initialize COM
    hr = CoInitialize(NULL);
    ThrowIfFailed(hr, "CAudioManager::Init: CoInitialize failed.");
    // Create the loader
    hr = CoCreateInstance(CLSID_DirectMusicLoader, NULL, CLSCTX_INPROC,
        IID_IDirectMusicLoader8, (void**)&m_Loader);
    ThrowIfFailed(hr,
        "CAudioManager::Init: CoCreateInstance for loader failed.");
    // Create performance object
    hr = CoCreateInstance(CLSID_DirectMusicPerformance, NULL, CLSCTX_INPROC,
        IID_IDirectMusicPerformance8, (void**)&m_Performance);
    ThrowIfFailed(hr,
        "CAudioManager::Init: CoCreateInstance for performance failed.");
    // Initialize the performance and its default audiopath
    hr = m_Performance->InitAudio(NULL, NULL, hwnd,
        stereo ? DMUS_APATH_DYNAMIC_STEREO : DMUS_APATH_DYNAMIC_MONO,
        perfchannels, DMUS_AUDIOF_ALL, NULL);
    if (hr == DSERR_NODRIVER) {
        // output a warning message, then continue as usual
        MessageBox(hwnd,
            "The program could not locate your audio hardware.",
            "Audio Warning", // window caption (placeholder text)
            MB_ICONSTOP);
        return; // notice we didn't set m_InitGood true
    }
    else ThrowIfFailed(hr,
        "CAudioManager::Init: m_Performance->InitAudio failed.");
    m_InitGood = true;
}

Of the three function arguments, two are optional—only a window handle is required. If you want, though, you can also specify whether the performance's audiopath should be stereo or mono, and the maximum number of channels it should prepare for. Channels are cheap (unless they're being played), so it's okay to pick more than you need. The default is 128.

The code starts by initializing COM, and then grabs an interface to a DirectMusic loader. Once it's got the loader, it gets a performance object. Both objects are created with CoCreateInstance, a COM API function that you can use to create any COM object. Some of the other DirectX components hide all this CoCreateInstance stuff inside functions, but with DirectMusic, there are just too many COM interfaces to make that practical, so you have to deal directly with the CoCreateInstance function.

Essentially, CoCreateInstance creates an object of a certain class, and an interface used to talk to that new object. From first to last, the arguments specify a class ID (a GUID usually named starting with CLSID_), which identifies the object to make; an aggregate pointer (usually NULL); a context (usually CLSCTX_INPROC, which means the new object shares the same address space as the creator code, like a DLL); an interface ID (a GUID usually named starting with IID_), which identifies the exact interface you wish to use to talk to your newly created object; and finally, the address of a pointer that will hold the new interface.

Once the code has created all its objects, it calls the InitAudio method of the IDirectMusicPerformance8 interface it just created. InitAudio initializes the performance and sets up a default audiopath. You can see the code checking the stereo parameter and specifying either DMUS_APATH_DYNAMIC_STEREO or DMUS_APATH_DYNAMIC_MONO. I should mention that there are two other options: DMUS_APATH_DYNAMIC_3D (for creating 3D sounds, which you'll learn about in a few chapters) and DMUS_APATH_SHARED_STEREOPLUSREVERB, most commonly used for enriching music by using reverb.

Notice that the code checks InitAudio's return value for DSERR_NODRIVER. This is a special error that indicates that DirectX Audio couldn't find any sound hardware. When it sees DSERR_NODRIVER, the code pops up a message box informing the player that his audio hardware couldn't be found, then returns without setting m_InitGood.

Assuming nothing fails and, therefore, no errors are thrown, the function ends by setting the m_InitGood flag to true, so that future functions know whether the manager is in good shape to play audio.
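The two kinds of failure in Init (a missing driver is reported softly, anything else is thrown) can be sketched without DirectX like this. SketchAudioManager, DeviceStatus, and PlayBeep are hypothetical names for illustration only, and the idea that later calls quietly do nothing when the flag is false is my assumption about how the flag might be used.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical outcomes of the device-initialization step.
enum class DeviceStatus { Ok, NoDriver, OtherFailure };

class SketchAudioManager {
public:
    SketchAudioManager() : m_InitGood(false), m_Played(0) {}

    // Mirrors the shape of Init: no driver is a soft failure,
    // any other failure is thrown to the caller.
    void Init(DeviceStatus status) {
        if (status == DeviceStatus::NoDriver) {
            m_Warning = "The program could not locate your audio hardware.";
            return;                       // m_InitGood stays false
        }
        if (status == DeviceStatus::OtherFailure)
            throw std::runtime_error("Init: device initialization failed.");
        m_InitGood = true;
    }

    // Assumed usage of the flag: skip audio work when init did not succeed.
    bool PlayBeep() {
        if (!m_InitGood) return false;
        ++m_Played;
        return true;
    }

    bool IsInitGood() const { return m_InitGood; }
    const std::string& Warning() const { return m_Warning; }

private:
    bool        m_InitGood;
    int         m_Played;
    std::string m_Warning;
};
```

With this split, a machine without sound hardware can still run the game silently, while genuinely unexpected failures surface immediately.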

UnInit

The UnInit function of CAudioManager is much simpler:


void CAudioManager::UnInit()
{
    if(m_Performance != NULL) {
        m_Performance->Stop( NULL, NULL, 0, 0 );
        m_Performance->CloseDown();
        SAFE_RELEASE( m_Performance );
    }
    SAFE_RELEASE( m_Loader );
    m_InitGood = false;
}

The code begins by calling Stop and CloseDown on m_Performance, assuming it's non-NULL. It then releases the m_Performance and m_Loader objects and clears the m_InitGood flag.
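SAFE_RELEASE isn't defined in this section. The DirectX SDK sample utility headers commonly define it along the lines shown below; treat the exact definition as an assumption and check your own headers. FakeComObject is a hypothetical stand-in for a COM interface so the sketch can run without DirectX.

```cpp
#include <cstddef>

// Release the interface if the pointer is set, then clear the pointer so a
// second call is harmless.
#ifndef SAFE_RELEASE
#define SAFE_RELEASE(p) { if (p) { (p)->Release(); (p) = NULL; } }
#endif

// Hypothetical stand-in for a reference-counted COM object.
class FakeComObject {
public:
    explicit FakeComObject(int* liveCounter)
        : m_Refs(1), m_Live(liveCounter) { ++*m_Live; }

    unsigned long AddRef() { return ++m_Refs; }

    // Destroys the object when the last reference goes away.
    unsigned long Release() {
        const unsigned long left = --m_Refs;
        if (left == 0) { --*m_Live; delete this; }
        return left;
    }

private:
    ~FakeComObject() {}      // destroyed only through Release()
    unsigned long m_Refs;
    int*          m_Live;    // counts objects that are still alive
};
```

Because the macro resets the pointer to NULL, UnInit can be called more than once without releasing the same interface twice.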

LoadSound

So far, the most important method of CAudioManager is the one that loads a wave file and creates a sound object from it. Here's what that looks like:


CSoundPtr CAudioManager::LoadSound(std::string filename)
{
    HRESULT hr;
    CSound *snd = new CSound(this);
    // convert filename to wide-string
    WCHAR widefilename[MAX_PATH];
    DXUtil_ConvertGenericStringToWide( widefilename, filename.c_str());
    // tell loader to load this file
    hr = m_Loader->LoadObjectFromFile(
        CLSID_DirectMusicSegment,
        IID_IDirectMusicSegment8,
        widefilename,
        (void**) &snd->m_Segment);
    ThrowIfFailed(hr,
        "CAudioManager::LoadSound: LoadObjectFromFile failed.");
    return(CSoundPtr(snd));
}

This is much easier than it would be in DirectSound. That's because the DirectMusic loader groks wave files, making this code a two-step process. First, it converts the given filename into a wide character string (DirectMusic only operates with wide character strings). Next, it tells the loader to load the wave file by calling LoadObjectFromFile. It gives the loader the class ID (CLSID_DirectMusicSegment) and interface ID (IID_IDirectMusicSegment8) that it expects back, and the loader gladly obliges.

Tip

Think of a wave file as a one-note segment of music, played using one instrument. You still get a segment interface, but it's really just a wave file.

Once it has the segment interface, it stores it in CSound's m_Segment variable. (CAudioManager is a friend class to CSound, so it has access to CSound's protected variables.)

You'll learn about the last line of code in the next section. Just put that return line on your brain's back burner for now.

Smart Pointers

On the list of critical engine features to implement—right behind robustness and error handling—is resource management. A good engine must make it easy for its clients to create objects, use them, and dispose of them when no longer needed.

One of several good ways to do this is to use smart pointers. A smart pointer is a class that behaves like a normal pointer, but with one difference: it's smart! It is so smart that it knows when to delete objects that are no longer being used.

Many programmers use smart pointers to help ease the burden of memory management, especially when also using exceptions. If you'd like a refresher course on what a smart pointer is, consult the links on your CD.

The smart pointers for the audio engine all derive from a common base class, which also happens to be a template. This template base class is CRefCountPtr, and is implemented in RefCountPtr.h, which is based closely on code by David Harvey. On the CD, I've included a link to the article that originally accompanied the code; go there if you want to learn how it works.

CSoundPtr is a derivation for CSound pointers. I've provided a constructor and a Release method that output messages to the debug console, so you can see (in real time!) sounds getting created and destroyed.
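If you'd like to see the core idea in code, here is a minimal, single-threaded sketch of a reference-counted pointer. It is not CRefCountPtr and makes no claim about how David Harvey's code works; RefPtrSketch and TrackedSound are names invented for this example. In modern C++, std::shared_ptr provides the same behavior out of the box.

```cpp
#include <cstddef>
#include <utility>

// Minimal reference-counted pointer: the last copy to die deletes the object.
template <typename T>
class RefPtrSketch {
public:
    explicit RefPtrSketch(T* raw = nullptr)
        : m_Ptr(raw), m_Count(raw ? new std::size_t(1) : nullptr) {}

    RefPtrSketch(const RefPtrSketch& other)
        : m_Ptr(other.m_Ptr), m_Count(other.m_Count) {
        if (m_Count) ++*m_Count;               // one more owner
    }

    RefPtrSketch& operator=(RefPtrSketch other) {   // copy-and-swap
        std::swap(m_Ptr, other.m_Ptr);
        std::swap(m_Count, other.m_Count);
        return *this;                          // old value dies with `other`
    }

    ~RefPtrSketch() { Release(); }

    T* operator->() const { return m_Ptr; }
    T& operator*()  const { return *m_Ptr; }
    std::size_t UseCount() const { return m_Count ? *m_Count : 0; }

private:
    void Release() {
        if (m_Count && --*m_Count == 0) { delete m_Ptr; delete m_Count; }
        m_Ptr = nullptr;
        m_Count = nullptr;
    }

    T*           m_Ptr;
    std::size_t* m_Count;   // shared between all copies
};

// Hypothetical payload that reports its own destruction.
struct TrackedSound {
    explicit TrackedSound(int* destroyed) : m_Destroyed(destroyed) {}
    ~TrackedSound() { ++*m_Destroyed; }
    int* m_Destroyed;
};
```

The payoff for exception-based code is that an object held this way is cleaned up automatically when a throw unwinds the stack, with no explicit delete on every exit path.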

CSound

In the audio engine, a CSound object represents a sound effect. As you now know, at the core of this object is a DirectMusic segment interface which contains the wave file itself. CSound wraps this interface.

I decided that I didn't want to follow DirectMusic exactly; instead, I wanted a Play method for CSound so that I could say, "sound object, play thyself!" The alternative would have been to put the Play method inside CAudioManager, and pass a CSound as an argument. The reason I chose not to do that was mainly personal preference; having the sound play itself just seemed cleaner to me.

Playing a Sound

Putting the Play method inside CSound required me to expose the performance interface. That's why CAudioManager has a GetPerformance method. The CSound object uses that method, then calls PlaySegment, passing its own m_Segment:


bool CSound::Play()
{
    if (NULL == m_Segment) return(false);
    if (NULL == m_Manager) return(false);
    m_Segment->Download(m_Manager->GetPerformance());
    m_Manager->GetPerformance()->PlaySegment(
        m_Segment, 0, 0,
        (IDirectMusicSegmentState **)&m_SegmentState);
    return(true);
}

The heart of this method is the PlaySegment call. To get a segment to play, you have to provide the performance interface with the segment you want to play (m_Segment), any flags you want (none in this case), and the time it should start playing (zero, for "as soon as possible"). The function gives you back an IDirectMusicSegmentState interface, which you can use to query the segment as it's playing.

Downloading Sounds

Notice, in the preceding Play code, the line right before the call to PlaySegment. Before anything can be played, it must be downloaded to the synthesizer. This is very important because if you don't do it, no sound will play. You download a segment by calling its Download method, passing the interface to the performance you will eventually use to play it.

Tip

Sick of having to remember to download your segments? You can tell DirectMusic to automatically download segments when it needs to. You do this by calling the SetGlobalParam method of the IDirectMusicPerformance8 interface and passing GUID_PerfAutoDownload. Check out the DirectX docs for the details.

Determining If a Sound Is Playing

The most interesting thing you can do with the IDirectMusicSegmentState interface you get back from PlaySegment is to determine whether or not your segment is still playing. Simply call the IsPlaying method of your performance interface, and give it the segment state interface you're interested in. IsPlaying will return S_OK if the segment is still playing, or S_FALSE if the segment isn't playing.

Be careful when calling this on a segment you've just started playing. IsPlaying is sometimes too accurate; it simply tells you whether or not sound from that segment is coming out the speakers. If you've just called PlaySegment, IsPlaying may return S_FALSE because of latency: you've told the sound to play, but it hasn't made it to the speakers yet.
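One possible way to cope with that latency, sketched here purely as an illustration, is to treat a sound as playing for a short grace period after you start it. PlaybackTracker is a hypothetical helper, not part of the engine or of DirectMusic, and the length of the grace period is something you would have to tune yourself.

```cpp
// Hypothetical helper: reports "playing" during a short grace window after
// a sound was started, even if the engine has not caught up yet.
class PlaybackTracker {
public:
    explicit PlaybackTracker(unsigned graceMs)
        : m_GraceMs(graceMs), m_StartMs(0), m_Started(false) {}

    // Call right after starting the sound, with the current time in ms.
    void OnPlay(unsigned nowMs) { m_StartMs = nowMs; m_Started = true; }

    // engineSaysPlaying is the engine's own answer (S_OK mapped to true).
    bool IsPlaying(unsigned nowMs, bool engineSaysPlaying) const {
        if (!m_Started) return false;             // never asked to play
        if (engineSaysPlaying) return true;       // the engine agrees
        return (nowMs - m_StartMs) < m_GraceMs;   // still inside the window
    }

private:
    unsigned m_GraceMs;
    unsigned m_StartMs;
    bool     m_Started;
};
```

The trade-off is that a very short sound that finishes inside the grace window will still be reported as playing until the window expires.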

Unloading Sounds

Once you're done with a sound, you need to unload it by calling the Unload method of the segment interface. This unloads the segment's data from the performance.
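Since every Download should eventually be matched by an Unload, one option is to pair them with a small scope guard. The sketch below is an assumption about how you might structure that, not something the engine does; DownloadGuard is a hypothetical class, and in real code the two callbacks would wrap the segment's Download and Unload calls.

```cpp
#include <functional>
#include <utility>

// Hypothetical scope guard: runs `download` immediately and `unload` when
// the guard goes out of scope, including during exception unwinding.
class DownloadGuard {
public:
    DownloadGuard(const std::function<void()>& download,
                  std::function<void()> unload)
        : m_Unload(std::move(unload)) {
        download();   // if this throws, the destructor (and unload) never runs
    }

    ~DownloadGuard() { if (m_Unload) m_Unload(); }

    // Non-copyable, so the unload step cannot run twice.
    DownloadGuard(const DownloadGuard&) = delete;
    DownloadGuard& operator=(const DownloadGuard&) = delete;

private:
    std::function<void()> m_Unload;
};
```

This keeps the unload step from being forgotten on early returns or when an error is thrown partway through using the sound.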