Active Speaker Detection in LiveSwitch

Jacob Steele Jun 9, 2023 12:22:45 PM

 

In this guide, we will walk you through the process of detecting when a user is speaking and implementing a visual indicator for the active speaker in LiveSwitch. While the code samples provided are in JavaScript, the concepts discussed can be applied to other supported platforms as well.

 

Prerequisites

Before diving into the implementation, please ensure that the following prerequisites are met:

  • Local and Remote media objects must contain an audio stream.
  • The connection type is SFU (Selective Forwarding Unit) or P2P (Peer-to-Peer). This guide does not apply to MCU (Multipoint Control Unit) connections.

 

LocalMedia

To detect when you are speaking, we register a handler with the addOnAudioLevel method of the LocalMedia object.

let localMedia = new fm.liveswitch.LocalMedia(true, true); // audio: true, video: true.


// Start the capture process for the microphone and webcam.
await localMedia.start();
// Detect changes in audio level.
localMedia.addOnAudioLevel(async (level) => {
  // level is a fraction; multiply by 100 to compare it as a percentage.
  if (level * 100 > 3) {
    localMediaDiv.style.border = "3px solid red";
  } else {
    localMediaDiv.style.border = "none";
  }
});

The code is straightforward. We create a new LocalMedia object with audio and video capture enabled. Then, we start the LocalMedia object to begin capturing. Finally, by listening for changes in the audio level, we update the appearance of localMediaDiv (the element that wraps the local video), adding a red border when the audio level exceeds 3% and removing it otherwise.
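With a plain threshold check, the border can flicker during the short pauses between words. Here is a minimal sketch of one way to soften that, assuming level is a fraction between 0 and 1 (as the level * 100 in the sample suggests). The names createSpeakingDetector, threshold, and holdMs are hypothetical helpers for illustration and are not part of the LiveSwitch SDK.

```javascript
// Hypothetical helper (not a LiveSwitch API): keeps the "speaking" state on
// for holdMs after the last level that exceeded the threshold.
function createSpeakingDetector({ threshold = 0.03, holdMs = 500 } = {}) {
  let lastLoudAt = -Infinity; // timestamp of the last level above threshold

  // level: assumed to be a fraction (0 to 1); nowMs: current time in ms.
  return function update(level, nowMs = Date.now()) {
    if (level > threshold) {
      lastLoudAt = nowMs;
    }
    // Still "speaking" if the last loud sample is recent enough.
    return nowMs - lastLoudAt <= holdMs;
  };
}
```

Inside the addOnAudioLevel callback, you could pass each level to the detector and set the border based on the boolean it returns, instead of comparing the level directly.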

 

RemoteMedia

The process for detecting when someone else is speaking is similar to LocalMedia. We register a handler with the addOnAudioLevel method of the RemoteMedia object.

var remoteMedia = new fm.liveswitch.RemoteMedia();

var audioStream;
if (remoteConnectionInfo.getHasAudio()) {
  audioStream = new fm.liveswitch.AudioStream(remoteMedia);
}
...

remoteMedia.addOnAudioLevel(async (level) => {
  // level is a fraction; multiply by 100 to compare it as a percentage.
  if (level * 100 > 3){
    remoteMediaDiv.style.border = "3px solid red";
  } else {
    remoteMediaDiv.style.border = "none";
  }
});

Here, we create a new RemoteMedia object and attach it to an AudioStream (assuming there is an audio track on the remote connection). By listening for changes in the audio level of the remote audio stream, we update the appearance of remoteMediaDiv, adding a red border when the audio level exceeds 3% and removing it otherwise.
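When several remote participants are connected, you may want to highlight only the loudest one rather than everyone above the threshold. The following is a rough sketch of that idea, assuming each RemoteMedia handler reports its level under some participant id of your choosing. ActiveSpeakerTracker and its methods are hypothetical names, not part of the LiveSwitch SDK.

```javascript
// Hypothetical helper (not a LiveSwitch API): remembers the most recent
// audio level per participant and reports the loudest one above a threshold.
class ActiveSpeakerTracker {
  constructor(threshold = 0.03) {
    this.threshold = threshold; // assumed fraction (0 to 1), i.e. 3%
    this.levels = new Map();    // participant id -> latest level
  }

  // Record the latest level for a participant and return the current speaker.
  update(id, level) {
    this.levels.set(id, level);
    return this.getActiveSpeaker();
  }

  // Forget a participant, for example when their connection closes.
  remove(id) {
    this.levels.delete(id);
  }

  // Returns the id with the highest level above the threshold, or null.
  getActiveSpeaker() {
    let bestId = null;
    let bestLevel = this.threshold;
    for (const [id, level] of this.levels) {
      if (level > bestLevel) {
        bestId = id;
        bestLevel = level;
      }
    }
    return bestId;
  }
}
```

Each remote handler could call update with its own id and the reported level, then add the border only to the div that belongs to the returned id and clear it on the others.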

 

Tuning for production

The audio level event fires on an interval and reports the current audio level each time. You can change how often it fires by calling setAudioLevelInterval with the desired interval in milliseconds.

remoteMedia.setAudioLevelInterval(1000); // every second

localMedia.setAudioLevelInterval(1000); // every second

The one-second value above is only an illustration. Finding the right setting may take some experimentation, but an interval between 250 ms and 500 ms is typically recommended.
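At shorter intervals, individual readings can jump around quite a bit. One option, sketched below under the same assumption that level is a fraction between 0 and 1, is to smooth the readings with an exponential moving average before comparing them to the threshold. createLevelSmoother and alpha are hypothetical names for illustration, not LiveSwitch APIs.

```javascript
// Hypothetical helper (not a LiveSwitch API): exponential moving average.
// alpha close to 1 follows new readings quickly; close to 0 smooths more.
function createLevelSmoother(alpha = 0.5) {
  let smoothed = null; // no reading seen yet

  return function smooth(level) {
    // The first reading seeds the average; later readings are blended in.
    smoothed = smoothed === null
      ? level
      : alpha * level + (1 - alpha) * smoothed;
    return smoothed;
  };
}
```

You would create one smoother per media object and compare the smoothed value, rather than the raw level, against your threshold.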

 

You can also check out our live demo on CodePen!

 

Need assistance in architecting the perfect WebRTC application? Let our team help out! Get in touch with us today!