[MUD-Dev] [TECH] Voice in MO* - Phoneme Decomposition and Reconstruction

John Buehler johnbue at msn.com
Thu May 23 01:53:59 New Zealand Standard Time 2002

Mike Shaver writes:
> On Tue, May 21, 2002 at 09:59:26AM -0700, John Buehler wrote:

>> With real speech, multiple people can be talking at once, and
>> their voices can overlap and be perfectly intelligible.

> While that's true in a cocktail party situation, I find that it
> isn't as true in a conference call/speakerphone situation.  I'm no
> sensory theorist, but I suspect that it's at least partially
> related to the loss of spatial cues, which make it easier for the
> brain to "demux" the different audio streams.

When I did my four-voice test, I had no problem isolating the
different voices.  I've also dealt with conference calls, and I
suspect that the difficulty there just boils down to the quality of
the sound.  But as has been mentioned elsewhere, visual cues are a
big win in a complex setting (moving speakers, off-screen speakers,
etc.).

> Games today are certainly able to do spatial audio stuff, but
> unless the listening player has a very finely tuned setup, I
> suspect they're going to lose a lot of "spatial resolution".

I'll just encourage you to read my other post on this topic.  Visual
cues are going to be needed to make up for the fact that we're
watching everything from 50 feet up in the air instead of actually
being alive in the virtual world.


