Active Speaker Tracking Technology Explained

Author: Tyler Cox

As business increasingly moves online, technology's role in facilitating communication and collaboration is becoming more critical. A meeting may be held at company headquarters in, say, Chicago, with participants joining from New York, Tokyo, and all points in between.

But as technology becomes more of a factor in ensuring smooth communication, it's also essential that the technology work seamlessly. Outfitting a conference room with an audio/video system isn't of much value if meeting hosts have to spend more time managing the technology than they do conducting business.

If you're preparing to outfit a conference room for around-the-globe collaboration, a key consideration is choosing equipment that automatically provides the best communication experience. That starts with active speaker tracking cameras.
 

The Benefits of Using Speaker Tracking Cameras 

Let's face it: What we can do with any technology is limited by its capabilities. A communication platform built on static cameras and simple microphones, for example, locks a speaker into one spot. A speaker who stands up and moves around risks moving out of camera and microphone range. Although those in the room can still see and hear the speaker, remote participants are suddenly shut out of the discussion and may miss critical information.

That's not the way many of us conduct ourselves. When leading a conference or delivering a presentation, most people like to stand up and move around the room.

With a speaker tracking camera in the room, meeting hosts are no longer anchored to one spot.

 

Understanding Active Speaker Tracking Technology

Speaker tracking cameras combine facial recognition technology with the audio input from microphones to track the speaker's location. As the speaker moves, the camera automatically follows, essentially acting as a built-in camera operator.
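
To make the idea of matching faces with audio concrete, here is a minimal sketch in Python. It assumes the camera reports each detected face with a horizontal bearing in degrees and the microphones report a single bearing for the incoming sound. The names Face, pick_speaker, and max_gap_deg are hypothetical and chosen for illustration; this is not how any particular product implements tracking.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Face:
    """A detected face (hypothetical structure for this sketch)."""
    label: str
    bearing_deg: float  # horizontal angle of the face as seen from the camera


def pick_speaker(
    faces: Sequence[Face],
    audio_bearing_deg: float,
    max_gap_deg: float = 15.0,  # assumed tolerance between face and sound direction
) -> Optional[Face]:
    """Return the face whose bearing is closest to the sound direction.

    Returns None when no face lies within max_gap_deg of the sound, so the
    camera can hold its current framing instead of jumping to a wrong target.
    Bearings are assumed to stay within the camera's field of view, so no
    wrap-around at 360 degrees is handled.
    """
    best: Optional[Face] = None
    best_gap = max_gap_deg
    for face in faces:
        gap = abs(face.bearing_deg - audio_bearing_deg)
        if gap <= best_gap:
            best, best_gap = face, gap
    return best


if __name__ == "__main__":
    room = [Face("presenter", -30.0), Face("attendee", 10.0)]
    target = pick_speaker(room, audio_bearing_deg=12.0)
    print(target.label if target else "no match")
```

Returning no match when the sound comes from a direction with no face nearby is a deliberate choice in this sketch: a door slam or hallway noise should not pull the camera away from the person who is presenting.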

Facial recognition isn't enough, though. Most speaker tracking cameras rely on voice activity detection technology to effectively track speakers as they move around the room. The better the voice detection, the more accurate the camera is in tracking a speaker's movements.

Voice activity detection is a technology that detects human speech even when background noise is present. The ADECIA Ceiling Microphone Conference System, for example, builds voice activity detection into its microphones and accounts for which frequencies are reverberant in the room, how large the room is, and how long it takes for the audio to travel from the speaker to the microphone.
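
For readers curious about what voice activity detection involves at its simplest, the sketch below flags short frames of audio as speech when they are noticeably louder than the quietest frame in the recording. This energy-threshold approach is a textbook simplification and an assumption made for illustration; it is not Yamaha's HVAD, which the article describes as also accounting for room acoustics. The function names, frame length, and threshold ratio are all hypothetical.

```python
import math
from typing import List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_voice(
    samples: Sequence[float],
    frame_len: int = 160,          # assumed: 10 ms frames at a 16 kHz sample rate
    threshold_ratio: float = 3.0,  # assumed: speech is at least 3x the noise floor
) -> List[bool]:
    """Return one flag per full frame: True where the frame looks like speech.

    The noise floor is estimated as the quietest frame in the input, so the
    input is assumed to contain at least one frame without speech.
    """
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        return []
    levels = [frame_rms(f) for f in frames]
    noise_floor = max(min(levels), 1e-9)  # guard against an all-zero frame
    return [level > threshold_ratio * noise_floor for level in levels]


if __name__ == "__main__":
    # Synthetic input: low-level noise, then a louder 200 Hz tone standing in for a voice.
    quiet = [0.01 * (-1) ** i for i in range(320)]
    loud = [0.5 * math.sin(2 * math.pi * 200 * i / 16000) for i in range(320)]
    print(detect_voice(quiet + loud + quiet[:160]))
```

A loudness threshold like this would mistake any loud sound for speech, which is why production systems look at more than energy. The sketch is meant only to show where the "is someone talking right now?" decision sits in the pipeline.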

When ADECIA is set up for the first time, it goes through an autotuning process that automatically configures the system to the environment, so it is ready to go when a meeting begins.

Consider how the delivery of education is changing. Remote learning is becoming more common, and educators face the challenge of keeping students engaged as they sit at their computers. That's where ADECIA can help.

ADECIA has narrow tracking beams that use Yamaha's Human Voice Activity Detection (HVAD) to capture audio no matter where a speaker may be in the room. Multiple participants can be picked up without worrying about microphone placement, ensuring there are no dead zones. Teachers can roam freely throughout the room to engage both in-person and online participants equally. The system can also capture student participation for a more natural experience.

If you're considering deploying a communication platform for your conference room, auditorium, or lecture hall, contact the experts at Yamaha UC. We're here to help!