Encrypt Video and Audio Streams (Even in SFU)

Jacob Steele Sep 28, 2023 4:28:42 PM

If you're looking to enhance the security of your WebRTC application by implementing encryption for video and audio streams, this blog post is for you. This guide provides instructions and code snippets to help you understand the encryption process using LiveSwitch's SDK and integrate it into your application. Let's dive in!

 

Introduction

The encryption capability we're about to explore sets LiveSwitch apart from many other SDKs. It introduces an interesting concept but comes with certain trade-offs, including the disruption of several server-side functionalities that are commonly desired. The most prominent impact is on recording, but it also affects lesser-known functionalities such as transcoding and server-side simulcast. Essentially, any server-side video processing will no longer be feasible. This limitation arises because the server cannot decode the encrypted video stream, which also rules out MCU connections.

If you're not familiar with AES (Advanced Encryption Standard), I recommend reading up on it before continuing. It's crucial to use a cryptography library correctly. In our example, we use a different Initialization Vector (IV) for each frame. While it's unlikely that the encoder will produce two identical frames, it is still possible, and with a fixed IV identical frames would encrypt to identical ciphertext. Therefore, we randomize the IV on a per-frame basis and send it alongside the encrypted frame so the receiver can decrypt it.
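
One caveat on the IV: the snippets in this post fill it with Random.Shared, which is not a cryptographically secure generator. As a minimal sketch, assuming you want an unpredictable IV for AES in CBC mode (the default mode of Aes.Create()), the framework's RandomNumberGenerator could be used instead. The names FrameIv and NewIv are hypothetical and not part of LiveSwitch.

```csharp
using System.Security.Cryptography;

public static class FrameIv
{
    // AES has a 128-bit block size, so the IV is always 16 bytes.
    public const int IvLength = 16;

    // Returns a fresh, cryptographically random IV for a single frame.
    public static byte[] NewIv()
    {
        byte[] iv = new byte[IvLength];
        RandomNumberGenerator.Fill(iv);
        return iv;
    }
}
```

Calling FrameIv.NewIv() once per frame would be a drop-in replacement for the Random.Shared.NextBytes(IV) lines used later in this post.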

Additionally, having some knowledge about codecs is beneficial. For LiveSwitch to handle RTCP (RTP Control Protocol) feedback effectively, we need to leave the header of video frames unencrypted. For example, with VP8, which we cover below, we exclude the first 10 bytes of each frame from the encryption process. Hence, you'll notice in the code that we write these 10 bytes out before applying the AES encryption.
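
To make the split concrete, here is a small sketch of the idea: the first 10 bytes stay in the clear, and everything after them is what gets handed to AES. It works on plain byte arrays rather than LiveSwitch buffers, and the names FrameLayout, Split and ClearHeaderLength are hypothetical. It also assumes a frame is at least 10 bytes long and checks that explicitly, which the pipes later in this post do not.

```csharp
using System;

public static class FrameLayout
{
    // Number of leading bytes left unencrypted, as in the VP8 example.
    public const int ClearHeaderLength = 10;

    // Splits an encoded frame into the clear header and the payload to encrypt.
    public static (byte[] Header, byte[] Payload) Split(byte[] frame)
    {
        if (frame.Length < ClearHeaderLength)
        {
            throw new ArgumentException("Frame is shorter than the clear header.", nameof(frame));
        }

        byte[] header = frame.AsSpan(0, ClearHeaderLength).ToArray();
        byte[] payload = frame.AsSpan(ClearHeaderLength).ToArray();
        return (header, payload);
    }
}
```

On the wire, the encrypted frame in this post is then laid out as the 10-byte header, followed by the 16-byte IV, followed by the ciphertext.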

 

Video Encryption

Let's start with the most common scenario - encrypting video streams. Below is a code snippet that demonstrates the encryption workflow:

// Setup AES

using System.Security.Cryptography;
using FM.LiveSwitch;

// Setup logging 
Log.DefaultLogLevel = LogLevel.Debug;
Log.AddProvider(new ConsoleLogProvider());

// Create a Source 
// Using I420 as the source prevents having to use a YUV Image Converter.
var source = new FakeVideoSource(new VideoConfig(new Size(640, 480), 1), VideoFormat.I420);
// Encoder
var encoder = new FM.LiveSwitch.Vp8.Encoder();
// Encryptor
var encryptor = new VideoEncryptionPipe();
// Packetizer
var packetizer = new FM.LiveSwitch.Vp8.Packetizer();

// Depacketizer
var depacketizer = new FM.LiveSwitch.Vp8.Depacketizer();
// Decryptor
var decryptor = new VideoDecrytpionPipe();
// Decoder
var decoder = new FM.LiveSwitch.Vp8.Decoder();
// Sink - We don't need an image converter as our sink is fake 
var sink = new NullVideoSink(VideoFormat.I420);

var sendTrack = new VideoTrack(source).Next(encoder).Next(encryptor).Next(packetizer);
var recvTrack = new VideoTrack(depacketizer).Next(decryptor).Next(decoder).Next(sink);
source.Start();
var client1 = new Client("https://cloud.liveswitch.io/", "<APPID>");
var client2 = new Client("https://cloud.liveswitch.io/", "<APPID>");

var token1 = Token.GenerateClientRegisterToken(client1, new ChannelClaim[] { new ChannelClaim("E2EEncryptionDemo") }, "<SHAREDSECRET>");
var token2 = Token.GenerateClientRegisterToken(client2, new ChannelClaim[] { new ChannelClaim("E2EEncryptionDemo") }, "<SHAREDSECRET>");

var channels = await client1.Register(token1).AsTask();
var channel1 = channels[0];
channels = await client2.Register(token2).AsTask();
var channel2 = channels[0];

var videoStream_offer = new VideoStream(sendTrack, null);
var videoStream_answer = new VideoStream(null, recvTrack);
var connection_offer = channel1.CreatePeerConnection(client2.Info, videoStream_offer);
var connection_answer = (PeerConnection)null;
var openTasks = new List<Task>();
var openTasksReady = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
channel2.OnPeerConnectionOffer += (peerConnectionOffer) =>
{
  connection_answer = peerConnectionOffer.Accept(videoStream_answer);
  openTasks.Add(connection_answer.Open().AsTask());
  openTasksReady.SetResult(null);
};
openTasks.Add(connection_offer.Open().AsTask());
await openTasksReady.Task;

await Task.WhenAll(openTasks);

Console.ReadLine();

public class VideoEncryptionPipe : VideoPipe
{
  private byte[] key = new byte[16] { 12, 44, 22, 66, 44, 88, 43, 82, 11, 01, 54, 76, 43, 11, 23, 78 };
  public override string Label => "Video Encryption Pipe";

  public VideoEncryptionPipe()
    : base(VideoFormat.Vp8)
  {
 
  }

  private byte[] GenerateIVForFrame()
  {
    byte[] IV = new byte[16];
    Random.Shared.NextBytes(IV);
    return IV;
  }

  private VideoBuffer Encrypt(VideoBuffer buffer)
  {
    using (Aes aesAlg = Aes.Create())
    {
      aesAlg.Padding = PaddingMode.Zeros;
      byte[] IV = GenerateIVForFrame(); // Change the IV every frame.

      ICryptoTransform encryptor = aesAlg.CreateEncryptor(key, IV);

      using (MemoryStream msEncrypt = new MemoryStream())
      {
        // Write the Vp8 header outside of encryption.
        msEncrypt.Write(buffer.DataBuffer.Data, buffer.DataBuffer.Index, 10);
        msEncrypt.Write(IV);
        using (CryptoStream csEncrypt = new CryptoStream(msEncrypt, encryptor, CryptoStreamMode.Write))
        {
          // Write the data to encrypt into the encryption stream (minus the 10 byte header of Vp8)
          csEncrypt.Write(buffer.DataBuffer.Data, (buffer.DataBuffer.Index + 10), (buffer.DataBuffer.Length - 10));
          csEncrypt.FlushFinalBlock();

          var encryptedArray = msEncrypt.ToArray();
         
          // Wrap into a DataBuffer.

          var db = DataBuffer.Wrap(encryptedArray);

          // Wrap into a VideoBuffer.
          return new VideoBuffer(buffer.Width, buffer.Height, db, buffer.Format);
        }
      }
    }
  }

  protected override void DoDestroy()
  {

  }

  protected override void DoProcessFrame(VideoFrame frame, VideoBuffer inputBuffer)
  {
    var buffer = Encrypt(inputBuffer);
    frame.AddBuffer(buffer);
    this.RaiseFrame(frame);
  }
}

public class VideoDecrytpionPipe : VideoPipe
{
  private byte[] key = new byte[16] { 12, 44, 22, 66, 44, 88, 43, 82, 11, 01, 54, 76, 43, 11, 23, 78 };
  public override string Label => "Video Decryption Pipe";

  public VideoDecrytpionPipe()
    : base(VideoFormat.Vp8)
  {

  }

  protected override void DoDestroy()
  {

  }

  private VideoBuffer Decrypt(VideoBuffer buffer)
  {
    using (Aes aesAlg = Aes.Create())
    {
      aesAlg.Padding = PaddingMode.Zeros;
      byte[] iv = new byte[16];
      byte[] header = new byte[10];
     
      MemoryStream ms = new MemoryStream(buffer.DataBuffer.Data, buffer.DataBuffer.Index, buffer.DataBuffer.Length);
      ms.Read(header, 0, header.Length);
      ms.Read(iv, 0, iv.Length);

      // Create a decryptor to perform the stream transform.
      ICryptoTransform decryptor = aesAlg.CreateDecryptor(key, iv);
     
      using (ms)
      {
        using (CryptoStream csDecrypt = new CryptoStream(ms, decryptor, CryptoStreamMode.Read))
        {
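          var decrypted = new byte[buffer.DataBuffer.Length - (iv.Length + header.Length)];
          // CryptoStream.Read may return fewer bytes than requested, so read until the stream is drained.
          var decryptedCount = 0;
          int read;
          while ((read = csDecrypt.Read(decrypted, decryptedCount, decrypted.Length - decryptedCount)) > 0)
          {
            decryptedCount += read;
          }

          byte[] payload = new byte[decryptedCount + 10];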
          for (int i = 0, l = header.Length; i < l; i++)
          {
            payload[i] = header[i];
          }
          for (int i = 0, l = decryptedCount; i < l; i++)
          {
            payload[i + 10] = decrypted[i];
          }
          var db = DataBuffer.Wrap(payload);
          return new VideoBuffer(buffer.Width, buffer.Height, db, buffer.Format);
        }
      }
    }
  }

  protected override void DoProcessFrame(VideoFrame frame, VideoBuffer inputBuffer)
  {
    var buffer = Decrypt(inputBuffer);
    frame.AddBuffer(buffer);
    this.RaiseFrame(frame);
  }
}

The code above contains some complex sections, such as the offer/answer exchange and the task handling. You don't really need to worry about those. Alternatively, you could attach encryption to the encoder's frame event:

encoder.OnRaiseFrame += (frame) => { /* encrypt here */ };

Then decrypt it in the depacketizer:

depacketizer.OnRaiseFrame += (frame) => { /* decrypt here */ };

This straightforward approach eliminates the need to worry about intricate details and tasks, making it easier to implement encryption if you're using LiveSwitch's example code.

For encryption and decryption functions, you can utilize the Encrypt and Decrypt functions shared in the Encrypt and Send Files in WebRTC blog post:

byte[] key = new byte[16] { 12, 44, 22, 66, 44, 88, 43, 82, 11, 01, 54, 76, 43, 11, 23, 78 };


Func<byte[], byte[]> Decrypt = (byte[] toDecrypt) =>
{
  using (Aes aesAlg = Aes.Create())
  {
    aesAlg.Padding = PaddingMode.Zeros;
    byte[] iv = new byte[16];
    MemoryStream ms = new MemoryStream(toDecrypt);
    // The first 16 bytes are the IV written by Encrypt.
    ms.Read(iv, 0, iv.Length);
    ICryptoTransform decryptor = aesAlg.CreateDecryptor(key, iv);
    using (ms)
    {
      using (CryptoStream csDecrypt = new CryptoStream(ms, decryptor, CryptoStreamMode.Read))
      {
        var decrypted = new byte[toDecrypt.Length - iv.Length];
        // CryptoStream.Read may return fewer bytes than requested, so read until the stream is drained.
        var decryptedCount = 0;
        int read;
        while ((read = csDecrypt.Read(decrypted, decryptedCount, decrypted.Length - decryptedCount)) > 0)
        {
          decryptedCount += read;
        }
        return decrypted;
      }
    }
  }
};

Func<byte[], byte[]> Encrypt = (byte[] toEncrypt) =>
{
  using (Aes aesAlg = Aes.Create())
  {
    aesAlg.Padding = PaddingMode.Zeros;
    byte[] IV = new byte[16];
    Random.Shared.NextBytes(IV);
    ICryptoTransform encryptor = aesAlg.CreateEncryptor(key, IV);
    using (MemoryStream msEncrypt = new MemoryStream())
    {
      msEncrypt.Write(IV);
      using (CryptoStream csEncrypt = new CryptoStream(msEncrypt, encryptor, CryptoStreamMode.Write))
      {
        // Write the data to encrypt into the encryption stream.
        csEncrypt.Write(toEncrypt);
        csEncrypt.FlushFinalBlock();
        var encryptedArray = msEncrypt.ToArray();
        return encryptedArray;
      }
    }
  }
};
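
One thing to watch with these helpers: PaddingMode.Zeros pads the data up to a 16-byte boundary and does not strip that padding on decryption, so the decrypted array can be longer than the original input. The sketch below is a variant, not the code from the earlier post. It assumes PKCS7 padding is acceptable in your pipeline so that the exact original length is recovered. PayloadCrypto and its method names are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

public static class PayloadCrypto
{
    private const int IvLength = 16;

    // Output layout: [16-byte IV][AES-CBC ciphertext with PKCS7 padding].
    public static byte[] Encrypt(byte[] key, byte[] plaintext)
    {
        using Aes aes = Aes.Create();
        aes.Padding = PaddingMode.PKCS7;

        byte[] iv = new byte[IvLength];
        RandomNumberGenerator.Fill(iv); // fresh IV for every call

        using ICryptoTransform encryptor = aes.CreateEncryptor(key, iv);
        byte[] cipher = encryptor.TransformFinalBlock(plaintext, 0, plaintext.Length);

        byte[] output = new byte[IvLength + cipher.Length];
        Buffer.BlockCopy(iv, 0, output, 0, IvLength);
        Buffer.BlockCopy(cipher, 0, output, IvLength, cipher.Length);
        return output;
    }

    // Reads the IV from the first 16 bytes, then decrypts the remainder.
    public static byte[] Decrypt(byte[] key, byte[] data)
    {
        using Aes aes = Aes.Create();
        aes.Padding = PaddingMode.PKCS7;

        byte[] iv = new byte[IvLength];
        Buffer.BlockCopy(data, 0, iv, 0, IvLength);

        using ICryptoTransform decryptor = aes.CreateDecryptor(key, iv);
        // PKCS7 padding is removed here, so the original length comes back.
        return decryptor.TransformFinalBlock(data, IvLength, data.Length - IvLength);
    }
}
```

With a 5-byte input, Encrypt produces 32 bytes (the 16-byte IV plus one padded block), and Decrypt returns exactly the original 5 bytes. Both sides must use the same padding mode, so this variant cannot be mixed with the PaddingMode.Zeros helpers above.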

 

Audio Encryption

Now, let's explore the encryption process for audio streams. The code for audio stream encryption is similar to the video stream encryption code, with one notable difference. Since audio streams do not require picture loss indicators, there is no need to leave the header of the audio frames untouched. Here's a modified version of the code:

// Setup AES

using System.Security.Cryptography;
using FM.LiveSwitch;

// Setup logging
Log.DefaultLogLevel = LogLevel.Debug;
Log.AddProvider(new ConsoleLogProvider());

// Create a Source
var source = new FakeAudioSource(new AudioConfig(48000, 2));
// Encoder
var encoder = new FM.LiveSwitch.Opus.Encoder();
// Encryptor
var encryptor = new AudioEncryptionPipe(encoder.OutputFormat);
// Packetizer
var packetizer = new FM.LiveSwitch.Opus.Packetizer();

// Depacketizer
var depacketizer = new FM.LiveSwitch.Opus.Depacketizer();
// Decryptor
var decryptor = new AudioDecrytpionPipe(depacketizer.OutputFormat);
// Decoder
var decoder = new FM.LiveSwitch.Opus.Decoder();
// Sink - our sink is fake, so the decoded audio is simply discarded
var sink = new NullAudioSink();

var sendTrack = new AudioTrack(source).Next(encoder).Next(encryptor).Next(packetizer);
var recvTrack = new AudioTrack(depacketizer).Next(decryptor).Next(decoder).Next(sink);
source.Start();
var client1 = new Client("https://cloud.liveswitch.io/", "<APPID>");
var client2 = new Client("https://cloud.liveswitch.io/", "<APPID>");

var token1 = Token.GenerateClientRegisterToken(client1, new ChannelClaim[] { new ChannelClaim("E2EEncryptionDemo") }, "<SHAREDSECRET>");
var token2 = Token.GenerateClientRegisterToken(client2, new ChannelClaim[] { new ChannelClaim("E2EEncryptionDemo") }, "<SHAREDSECRET>");

var channels = await client1.Register(token1).AsTask();
var channel1 = channels[0];
channels = await client2.Register(token2).AsTask();
var channel2 = channels[0];

var audioStream_offer = new AudioStream(sendTrack, null);
var audioStream_answer = new AudioStream(null, recvTrack);
var connection_offer = channel1.CreatePeerConnection(client2.Info, audioStream_offer);
var connection_answer = (PeerConnection)null;
var openTasks = new List<Task>();
var openTasksReady = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
channel2.OnPeerConnectionOffer += (peerConnectionOffer) =>
{
  connection_answer = peerConnectionOffer.Accept(audioStream_answer);
  openTasks.Add(connection_answer.Open().AsTask());
  openTasksReady.SetResult(null);
};
openTasks.Add(connection_offer.Open().AsTask());
await openTasksReady.Task;

await Task.WhenAll(openTasks);

Console.ReadLine();

public class AudioEncryptionPipe : AudioPipe
{
  private byte[] key = new byte[16] { 12, 44, 22, 66, 44, 88, 43, 82, 11, 01, 54, 76, 43, 11, 23, 78 };
  public override string Label => "Audio Encryption Pipe";

  public AudioEncryptionPipe(AudioFormat format)
    : base(format)
  {

  }

  private AudioBuffer Encrypt(AudioBuffer buffer)
  {
    using (Aes aesAlg = Aes.Create())
    {
      aesAlg.Padding = PaddingMode.Zeros;
      byte[] IV = new byte[16];
      Random.Shared.NextBytes(IV);
      ICryptoTransform encryptor = aesAlg.CreateEncryptor(key, IV);
      using (MemoryStream msEncrypt = new MemoryStream())
      {
        msEncrypt.Write(IV);
        using (CryptoStream csEncrypt = new CryptoStream(msEncrypt, encryptor, CryptoStreamMode.Write))
        {
          // Write the whole audio buffer into the encryption stream (no header is left in the clear).
          csEncrypt.Write(buffer.DataBuffer.Data, buffer.DataBuffer.Index, buffer.DataBuffer.Length);
          csEncrypt.FlushFinalBlock();
          var encryptedArray = msEncrypt.ToArray();
          var db = DataBuffer.Wrap(encryptedArray);
          return new AudioBuffer(db, buffer.Format);
        }
      }
    }
  }

  protected override void DoDestroy()
  {
 
  }

  protected override void DoProcessFrame(AudioFrame frame, AudioBuffer inputBuffer)
  {
    var buffer = Encrypt(inputBuffer);
    frame.AddBuffer(buffer);
    this.RaiseFrame(frame);
  }
}

public class AudioDecrytpionPipe : AudioPipe
{
  private byte[] key = new byte[16] { 12, 44, 22, 66, 44, 88, 43, 82, 11, 01, 54, 76, 43, 11, 23, 78 };
  public override string Label => "Audio Decryption Pipe";

  public AudioDecrytpionPipe(AudioFormat format)
    : base(format)
  {

  }

  protected override void DoDestroy()
  {

  }

  private AudioBuffer Decrypt(AudioBuffer buffer)
  {
    using (Aes aesAlg = Aes.Create())
    {
      aesAlg.Padding = PaddingMode.Zeros;
      byte[] iv = new byte[16];
      MemoryStream ms = new MemoryStream(buffer.DataBuffer.Data, buffer.DataBuffer.Index, buffer.DataBuffer.Length);
      ms.Read(iv, 0, iv.Length);
      ICryptoTransform decryptor = aesAlg.CreateDecryptor(key, iv);
      using (ms)
      {
        using (CryptoStream csDecrypt = new CryptoStream(ms, decryptor, CryptoStreamMode.Read))
        {
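          var decrypted = new byte[buffer.DataBuffer.Length - iv.Length];
          // CryptoStream.Read may return fewer bytes than requested, so read until the stream is drained.
          var decryptedCount = 0;
          int read;
          while ((read = csDecrypt.Read(decrypted, decryptedCount, decrypted.Length - decryptedCount)) > 0)
          {
            decryptedCount += read;
          }
          var db = DataBuffer.Wrap(decrypted);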
          return new AudioBuffer(db, buffer.Format);
        }
      }
    }
  }

  protected override void DoProcessFrame(AudioFrame frame, AudioBuffer inputBuffer)
  {
    var buffer = Decrypt(inputBuffer);
    frame.AddBuffer(buffer);
    this.RaiseFrame(frame);
  }
}

The code structure is similar to the video stream encryption code, except for the absence of header extraction. The pipeline in this case is designed for audio, utilizing AudioBuffers instead of VideoBuffers.

We hope this guide helps you embark on your audio/video encryption journey with LiveSwitch. If you have any further questions or need assistance, please don't hesitate to contact the LiveSwitch team. Happy encrypting!

 
